Posted by Steve Karam
on Feb 7, 2013 in Development
The Problem: There are multiple options for moving data out of a
database, each with its own pros and cons.
XML (Extensible Markup Language):
What it is: A markup language designed for full featured
- a schema to provide structure and definitions for your document
- markup to describe your content
- broad acceptance in the business and development communities
JSON (JavaScript Object Notation):
What it is: A more human-readable serialized data interchange format
- easy to read key/value pairing
- native datatypes with schema validation
- excellent compatibility with major development platforms
- strong use in AJAX development
CSV (Comma Separated Values):
What it is: Ol' Faithful in data interchange, not very standardized
or descriptive, but still useful
- a quick format for loading structured data in delimited or fixed
- comfort for those who don't know or have the inclination to learn a
more modern format
- broad compatibility with nearly any language or database environment
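To make the comparison above a little more concrete, here is a minimal
sketch that writes one row in each of the three formats using only the
Python standard library. The sample row, the `row` element name, and the
helper function names are made up for illustration; they are not from
any particular database or tool.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample row, standing in for one record unloaded from a table.
record = {"id": 1, "name": "Steve", "topic": "YAML"}

def to_json(row):
    """Serialize the row as a JSON object of key/value pairs."""
    return json.dumps(row)

def to_csv(row):
    """Serialize the row as a header line plus one comma-separated line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return buf.getvalue()

def to_xml(row):
    """Serialize the row as a <row> element with one child per column."""
    root = ET.Element("row")
    for key, value in row.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

print(to_json(record))
print(to_csv(record), end="")
print(to_xml(record))
```

Note that the CSV and XML versions turn every value into text, while
the JSON version keeps `id` as a number. That difference is what the
"native datatypes" bullet is getting at.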
# Which brings us to YAML
What it is: Data serialization that is super easy to read and write, first
proposed in 2001 by Clark Evans.
What it ain't: YAML Ain't Markup Language # get it?
YAML is actually designed to be very close to JSON; in fact, every JSON
document is a valid YAML document (but not the other way around). The big
difference is readability. YAML is focused on being extremely human
readable. In fact, as you've probably guessed, this blog post is formatted
in YAML.
It is important to remember that YAML is not really comparable to XML.
While both of them can be used as data interchange formats, their purposes
are fundamentally different. Whereas XML is all about defining
self-describing data through markup and providing values for that data in
a variety of ways, YAML is purely focused on serializing data in a
readable and parseable format.
And I have to say, YAML just squeezes by on being a parseable format.
Combinations of characters and indentation determine the type of node
each document line is (though multiline content is possible, as is
the case in this text). In the end, the only true types of nodes which
exist in YAML are Collections and Scalars. A collection can be either a
sequence of items or a mapping of key/value pairs. Scalars are single
values such as integers, strings, dates, and so on. Very few rules are
enforced when it comes to the actual content; however, YAML is very
particular about indentation (of course), and requires that each key
within a mapping be unique. This is a bonus for relational table
loading/unloading, as a mapping keyed on the primary key cannot
legitimately repeat a key (though not every parser rejects duplicates,
so it is worth checking how yours behaves). It also forces more
organized and more understandable data for human readers.
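The uniqueness rule above is easy to sketch. Since every JSON document
is also a YAML document, this example uses Python's standard `json`
module as a stand-in for a YAML parser. That is an assumption made to
keep the sketch dependency-free; block-style YAML would need a real
YAML library. The `unique_mapping` and `load_strict` names and the
sample rows are hypothetical.

```python
import json

def unique_mapping(pairs):
    """Build a dict from (key, value) pairs, rejecting repeated keys."""
    mapping = {}
    for key, value in pairs:
        if key in mapping:
            raise ValueError(f"duplicate key: {key!r}")
        mapping[key] = value
    return mapping

def load_strict(text):
    """Parse a JSON (and therefore YAML) document, enforcing unique keys."""
    # object_pairs_hook receives every mapping as a list of (key, value)
    # pairs, so duplicates are still visible at that point.
    return json.loads(text, object_pairs_hook=unique_mapping)

# Made-up rows keyed by a primary key; each key appears once, so this loads.
rows = load_strict('{"1": {"name": "Steve"}, "2": {"name": "Clark"}}')

# A repeated primary key is rejected by the strict loader.
try:
    load_strict('{"1": {"name": "Steve"}, "1": {"name": "Clark"}}')
except ValueError as err:
    print("rejected:", err)
```

Without the hook, `json.loads` quietly keeps the last value for a
repeated key. That is the reason to check what your parser does before
leaning on the format for key integrity.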
To learn more about YAML, check out the spec and play around with it
a bit. You can even find tools like a YAML Parser online, which can
convert YAML to JSON. In fact, you can copy the contents of this blog
post (from the opening dashes to the closing dots) straight into it to
see the result.
You can also check out a quick usage with PHP that I made using this
blog text by clicking here.