YAML is a Weird Beast
“YAML is a weird beast.”
Someone’s comment on a PR months ago triggered my curiosity.
I use YAML daily for configuration files: OpenAPI specs, Helm charts, Symfony configs, Spring Boot properties… yet I’d never questioned its inner workings.
How complex could a configuration format really be?
Instead of just accepting it, I decided to dive deep and understand why YAML earned this reputation.
So I did what any curious developer would do: I implemented a YAML parser from scratch in PHP.
Is my parser 100% spec-compliant? No.
Was that the goal? Also no.
The goal was understanding. Learning what’s actually possible with YAML beyond basic key-value pairs. And what I discovered was fascinating.
Beyond Simple Configs: YAML’s Hidden Features
Tags: Extending YAML’s Type System
Did you know YAML supports tags?
Beyond predefined tags like !!str, !!int, !!map, and !!seq, you can define custom tags to represent specialized types.
Want to load environment variables? Include external files? Represent custom objects?
Tags make it possible.
database:
  host: !env DATABASE_HOST
  port: !env DATABASE_PORT
  credentials: !include credentials.yaml
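To make the idea concrete, here’s a minimal Python sketch of how a loader could dispatch custom tags to handler functions. It’s for illustration only and isn’t code from my parser: make_handlers, construct, and the (tag, value) node shape are hypothetical names, and I assume the parser has already split "!env DATABASE_HOST" into a tag and a scalar.

```python
import os

def make_handlers(environ=None):
    """Build a hypothetical tag registry; pass a dict to avoid touching os.environ."""
    environ = os.environ if environ is None else environ
    return {
        # !env NAME -> value of the environment variable NAME (always a string)
        "!env": lambda name: environ[name],
    }

def construct(node, handlers):
    """Turn an assumed (tag, value) pair into a native value."""
    tag, value = node
    if tag is None:
        return value  # untagged scalar: pass through unchanged
    if tag not in handlers:
        raise ValueError(f"no handler registered for tag {tag}")
    return handlers[tag](value)

# Usage with a fake environment, mirroring the database example above.
handlers = make_handlers({"DATABASE_HOST": "db.internal", "DATABASE_PORT": "5432"})
database = {
    "host": construct(("!env", "DATABASE_HOST"), handlers),
    "port": construct(("!env", "DATABASE_PORT"), handlers),
}
```

Note that the port comes back as a string here; a real handler would have to decide whether to convert it. Tags without a registered handler (like !include in this sketch) raise an error instead of being silently ignored.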
Anchors and Aliases: DRY Configuration
YAML lets you define reusable content with anchors (&) and reference it later with aliases (*).
This works for simple values:
First occurrence: &anchor Value
Second occurrence: *anchor
And it even works for complex structures with circular references, as long as the anchor appears before any alias that points to it:
person: &person
  name: John
  spouse:
    name: Jane
    spouse: *person
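In memory, an alias isn’t a copy: it points back at the very same node. Here’s a small Python sketch of the object graph a loader would plausibly build for a circular person/spouse pair (the variable names are mine, and no YAML library is involved):

```python
import json

# Build the cycle by hand: each mapping refers to the other.
person = {"name": "John"}
spouse = {"name": "Jane", "spouse": person}
person["spouse"] = spouse

# Following the references twice leads back to the same object, not a copy.
same_object = person["spouse"]["spouse"] is person

# Formats without references cannot express this; the standard json module
# refuses to serialize a structure that contains a cycle.
try:
    json.dumps(person)
    serializable = True
except ValueError:
    serializable = False
```

This is one reason circular references deserve caution: anything downstream that walks the structure recursively has to be aware of cycles.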
The Merge Key: Inheritance in YAML
The merge key << isn’t part of the YAML 1.2 spec (it comes from the YAML 1.1 type repository), but it’s widely supported and incredibly useful for creating configuration variants:
default: &default
  host: localhost
  port: 3306
development:
  <<: *default
  database: dev_db
production:
  <<: *default
  database: prod_db
It even supports merging multiple aliases with overrides:
---
- &CENTER { x: 1, y: 2 }
- &LEFT { x: 0, y: 2 }
- &BIG { r: 10 }
- # Merge one map
  <<: *CENTER
  r: 10
  label: center
- # Merge multiple maps
  <<: [ *CENTER, *BIG ]
  label: center/big
- # Merge multiple maps
  <<: [ *CENTER, *LEFT ]
  r: 4
  label: center/left
- # Override
  <<: [ *BIG, *LEFT, { r: 1 } ]
  x: 2
  label: center/left/small
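The precedence rules are easy to get wrong, so here’s a minimal Python sketch of the merge semantics as I understand them: keys written explicitly in the mapping win, and when several maps are merged, earlier ones in the list win over later ones. The function apply_merge is a hypothetical helper working on already-parsed plain dicts; it doesn’t handle merge keys nested inside the merged maps.

```python
def apply_merge(mapping):
    """Resolve a '<<' merge key on an already-parsed mapping (a plain dict)."""
    if "<<" not in mapping:
        return dict(mapping)
    sources = mapping["<<"]
    if isinstance(sources, dict):
        sources = [sources]  # a single map behaves like a one-element list
    merged = {}
    # Earlier maps in the list take precedence, so apply them last-to-first.
    for source in reversed(sources):
        merged.update(source)
    # Keys written explicitly in the mapping override anything merged in.
    merged.update({k: v for k, v in mapping.items() if k != "<<"})
    return merged

CENTER = {"x": 1, "y": 2}
LEFT = {"x": 0, "y": 2}
BIG = {"r": 10}

# The last entry of the example above: BIG comes first, so r stays 10
# even though a later map says r: 1; the explicit x: 2 beats LEFT's x: 0.
override = apply_merge({"<<": [BIG, LEFT, {"r": 1}], "x": 2, "label": "center/left/small"})
```

With these rules, override ends up with r: 10, x: 2, and y: 2, which shows that the inline { r: 1 } map never takes effect when it is listed after BIG.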
Multi-Document Files
A single YAML file can contain multiple documents separated by ---:
---
name: Document 1
value: 123
---
name: Document 2
value: 456
You can also use the document end marker ... to explicitly terminate a document, which lets the next document declare its own directives:
%YAML 1.2
---
name: Document 1
value: 123
...
%YAML 1.1
%TAG !mytag! tag:example.com,2000:app/
---
name: Document 2
value: !mytag!customValue value
...
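Here’s a deliberately naive Python sketch of how a stream like this could be cut into documents before any real parsing happens. split_documents is a hypothetical helper: it only recognizes marker lines that are exactly --- or ..., and ignores cases such as content following --- on the same line.

```python
def split_documents(text):
    """Naively split a YAML stream into (directives, body_lines) pairs."""
    documents = []
    directives, body = [], []
    in_body = False
    for line in text.splitlines():
        if line == "---":
            if in_body:
                # A new '---' without a preceding '...' starts the next document.
                documents.append((directives, body))
                directives, body = [], []
            in_body = True
        elif line == "...":
            # Explicit end marker: close the document and allow directives again.
            documents.append((directives, body))
            directives, body = [], []
            in_body = False
        elif not in_body and line.startswith("%"):
            directives.append(line)
        elif line.strip():
            in_body = True  # content without '---' implicitly starts a document
            body.append(line)
    if in_body:
        documents.append((directives, body))
    return documents

stream = """%YAML 1.2
---
name: Document 1
value: 123
...
%YAML 1.1
%TAG !mytag! tag:example.com,2000:app/
---
name: Document 2
value: !mytag!customValue value
...
"""
docs = split_documents(stream)
```

Each entry in docs pairs a document’s directives with its body lines, which makes it visible that the %TAG directive belongs to the second document only.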
Explicit Key Notation: When Keys Get Complex
Most YAML users never encounter explicit key notation, but it’s essential when your key itself is a complex structure:
- sun: yellow
- ? earth: blue
  : moon: white
Or in its weirdest form: an explicit key with an empty value, followed by an entry whose key is empty:
{
  ? foo :,
  : bar,
}
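If it’s hard to picture what these two snippets actually mean, here’s how I’d write the resulting data in Python. This is a hand-written sketch, not parser output, and freeze is a hypothetical helper I introduce because Python dicts can’t be used as dictionary keys directly.

```python
# { ? foo :, : bar, } -> one entry with an empty (null) value,
# and one entry with an empty (null) key.
flow_mapping = {"foo": None, None: "bar"}

def freeze(mapping):
    """Turn a dict into a hashable tuple of pairs so it can serve as a key."""
    return tuple(sorted(mapping.items()))

# ? earth: blue
# : moon: white   -> the key is itself a mapping.
complex_entry = {freeze({"earth": "blue"}): {"moon": "white"}}
```

The need for a workaround like freeze hints at why complex keys are rare in practice: many host languages have no natural way to represent a mapping used as a key.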
Weird, right?
YAML 1.2: A JSON Superset
Here’s something that surprised me: YAML 1.2 was designed as a superset of JSON.
In practice, this means valid JSON documents are also valid YAML 1.2, barring rare edge cases such as duplicate keys, which YAML rejects.
You can literally paste JSON into a YAML 1.2 parser, and it will work without any modifications.
# This is valid YAML 1.2
{
  "name": "John Doe",
  "age": 30,
  "hobbies": ["reading", "coding"],
  "address": {
    "city": "Berlin",
    "country": "Germany"
  }
}
This compatibility makes migrating from JSON to YAML straightforward and lets you adopt YAML’s features gradually while keeping JSON compatibility where needed.
This shows YAML is very flexible.
But is it too flexible?
The Real Beast: Complexity and Ambiguity
And here’s the weirdest thing about YAML: its sheer complexity and countless edge cases.
The specification itself isn’t always clear and is sometimes contradictory.
While building my parser, I had to make judgment calls on how to handle ambiguous cases.
I tried to follow the specification as closely as possible, but practical implementation sometimes required deviating from it.
That’s when I truly understood the comment from the PR I mentioned earlier.
YAML’s flexibility comes at the cost of predictability.
Conclusion
YAML is indeed a weird beast.
It’s simultaneously simple enough for basic configs and complex enough to represent intricate data structures with custom types, inheritance, and cross-references.
Would I recommend using all these features in production? Probably not.
But understanding what’s possible helps you make informed decisions about when to use YAML and when to choose something simpler.
If you’re curious about the implementation details, check out my PHP YAML parser on GitHub.
It’s not perfect, but it taught me everything I wanted to know about YAML’s quirks and complexities.
