Photo by Ferenc Almasi on Unsplash
YAML Tutorial: What Every DevOps Novice Must Know
YAML is a stepping stone to becoming a DevOps Pro
If you are on the DevOps journey, you must have crossed paths with the term YML / YAML. The abbreviation may look complex, but trust me, it is one of the easiest parts to learn and grasp.
If you are new to the DevOps path, I would recommend going through this article first to get the gist of what is going on here: The Intro to DevOps
What is YAML?
YAML Ain't Markup Language (YAML) is a data serialization language that has steadily increased in popularity over the last few years. It's often used as a format for configuration files, but its object serialization abilities also make it a viable alternative to formats like JSON. The acronym originally stood for "Yet Another Markup Language".
[Are you confused about what a markup language is? I've got you covered :) ]
Markup Languages
A markup language is a computer language that uses tags to define elements within a document. Markup languages are human readable: they contain standard words rather than typical programming syntax. The most popular ones, which I am sure you have heard of, are HTML [HyperText Markup Language] and XML [eXtensible Markup Language].
What are Serialization and Deserialization?
Serialization is the process of converting an object into a stream of bytes so that it can be stored or transmitted to memory, a database, or a file. Its main purpose is to save the state of an object so that it can be recreated when needed. The reverse process is called deserialization.
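As a rough illustration (the object and its fields are hypothetical, not taken from any real application), the state of an in-memory "user" object could be serialized to YAML text like this, and a parser could later deserialize that text to rebuild the object:

```yaml
# Serialized state of a hypothetical "user" object
user:
  name: Aditya
  age: 24
  skills:
    - docker
    - kubernetes
```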
YAML was originally shorthand for Yet Another Markup Language, but the maintainers renamed it to YAML Ain't Markup Language to place more emphasis on its data-oriented features.
Points to note
YAML is a data serialization language.
YAML is most widely used in configuration files, such as those for Docker and Kubernetes.
YAML also has applications in logging and caching.
What are the benefits of YAML?
- It is human understandable and easy to read.
- It has a strict syntax. (We have to be careful with white space and indentation.)
- It can be easily converted to JSON and XML.
- YAML gives us more power to represent complex data.
- A wide variety of parsers are available for reading YAML files.
Enough with the theory; let's dig in and get our hands dirty.
A YAML document can start with "---" (3 hyphens), which also acts as the separator between multiple documents in the same file, and it can end with "..." (3 dots). Both markers are optional when a file holds a single document.
We can store data in YAML in the following forms:
- Key-value pair
- Lists
- Block Style
---
# Representation of key-value pairs
name: "aditya"
age: "24"
---
# Representation of lists
fruits:
- apple
- mango
- pineapple
---
# Representation of the block style
# (The main difference here lies with the indentation)
cities:
  - rome
  - venice
  - naples
...
As you can see from the example above, comments start with "#". YAML has no multi-line comment syntax, so each line that needs to be commented out gets its own "#".
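A small sketch of how comments behave (the keys and the URL are illustrative only):

```yaml
# A full-line comment
name: aditya   # an inline comment after a value
# There is no block-comment syntax,
# so every commented line gets its own "#"
url: http://example.com/#section   # a "#" with no space before it is part of the value
```

Note that a "#" only starts a comment when it is at the start of a line or preceded by white space.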
Data Types in YAML
There are various data types in YAML, which can mainly be classified as follows:
- string
- integer
- float
- boolean
- null
- dates
---
# String Datatype
name: "Aditya Dhopade"
job: "Wannabe DevOps Pro"
---
# What if we want to explicitly specify to yml file to read
# it as a certain datatype ?
name: !!str "Aditya Dhopade"
---
# Integer Datatype
Years of experience: 2
age: 24
num1: !!int 54
num2: !!int -54
binaryNum: !!int 0b11001 # binary, YAML 1.1 notation
octalNum: !!int 0567 # octal in YAML 1.1; YAML 1.2 writes it as 0o567
hexNum: !!int 0x45
commaValues: !!int +450_000 # read as 450000; "_" is only a visual separator (YAML 1.1)
---
# Floating Datatype
temperature: 43.34
obtained percentage: 88.67
exponents: 6.022E+56 # it represents 6.022 x 10^56
not a number: .nan
---
# Boolean Datatype
# boolean values can be written in several forms
# for false values: false, False, FALSE, n, no, No, NO
# for true values: true, True, TRUE, y, yes, Yes, YES
# (the y/n and yes/no forms come from YAML 1.1 and are not recognized by every parser)
boolean: false
---
# null Datatype
recognition: !!null Null
# a key can also be null and can be represented by "~"
~: this is a null key here
---
# dates Datatype
date: !!timestamp 2022-09-01 # by default it would consider the UTC time
---
# What if you want to write a string across multiple lines?
gibberish: hey there i
  am nobody but here to teach # This will not work: the line break is folded into a space
---
# For the above statement to work, use the literal style "|"
gibberish: |
  Now this should work
  as we want here
---
# What if we want to write the string on multiple lines in YAML
# but have it interpreted as a single line?
more gibberish: >
  This would be printed as a single line
  Don't believe me
  Just look now
...
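One practical consequence of these data types is that the parser decides the type of an unquoted value, so quoting matters. The following is a minimal sketch with made-up keys; the exact results depend on which YAML version your parser implements:

```yaml
# Unquoted values are typed by the parser; quoted values stay strings
version_number: 1.10      # float 1.1 (the trailing zero is lost)
version_string: "1.10"    # string "1.10"
answer: no                # boolean false in YAML 1.1 parsers
country: "no"             # string "no"
zip_code: "01234"         # string; unquoted it could be read as a number
```

When in doubt, quoting a value is the safe way to make sure it is read as a string.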
These are some of the basic data types that a developer should know in order to keep up with the YML world.
There are also some advanced concepts that are worth a look.
Sequences in YML
To understand sequence styles, it helps to first understand collections. A sequence is one kind of YAML collection: an ordered list of items, indexed by sequential integers starting from zero, just like an array. A sequence can be written in two styles, block style and flow style.
Let us consider the example of the four-wheeler vehicle industry. The car companies in that market can be represented as a sequence. The following code shows how to represent them in both sequence styles.
Here we can observe both the block style and the flow style in the YAML structure.
---
# Ordered sequence of nodes in the YAML Structure
Block style: !!seq
- Tata
- Maruti
- Hyundai
- KIA
- Renault
- Honda
- Nissan
Flow style: !!seq [ Tata, Maruti, Hyundai, KIA, Renault, Honda, Nissan ]
...
If we convert it to JSON, we get the following:
{
  "Flow style": [
    "Tata",
    "Maruti",
    "Hyundai",
    "KIA",
    "Renault",
    "Honda",
    "Nissan"
  ],
  "Block style": [
    "Tata",
    "Maruti",
    "Hyundai",
    "KIA",
    "Renault",
    "Honda",
    "Nissan"
  ]
}
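Flow style is not limited to sequences. As a small sketch with illustrative keys, a mapping can also be written in flow style, which looks very much like JSON:

```yaml
# Block-style mapping
block car:
  brand: Tata
  wheels: 4
# The same mapping in flow style
flow car: { brand: Tata, wheels: 4 }
```

Both keys hold the same data; the choice between the two styles is mostly a matter of readability.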
Moving on with sequences: a sequence in which some items are left empty is called a sparse sequence.
Nested Sequence: a sequence that appears as an item inside another sequence.
Maps: The key-value pairs used in YAML / YML files are represented as maps. The concept of maps can then be extended further to nested mappings.
What if we want to represent unique values? Then we can use the set datatype.
---
# Sparse sequences
- hey
- there
-
- missing
- you
---
# Nested Sequences
-
  - mango
  - pineapple
  - apple
-
  - orange
  - lemon
  - citron
---
# Nested Mappings
name: "Aditya"
characteristics:
  working: yes
  too hard: yes
---
# Set Datatype only lets us use the Unique Values
names: !!set
  ? aditya
  ? rohan
  ? mugdha
...
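Q. What if we want an entire sequence to be the value of a key?
In such a case we can use the dictionary datatype: a sequence whose items are key-value maps. When the order of the entries matters, it can be tagged explicitly as an ordered map with !!omap.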
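For instance, a minimal sketch of an explicitly tagged ordered map might look like this (the names and ages are illustrative, and !!omap support is an assumption about your parser, since not every tool implements it):

```yaml
# Ordered map: a sequence of single-pair maps, tagged with !!omap
people: !!omap
  - aditya: 24
  - mugdha: 24
  - rahul: 34
```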
---
# dictionary
People:
  - aditya:
      age: 24
  - mugdha:
      age: 24
  - rahul:
      age: 34
---
likings:
  fav fruits: mango
  dislikes: kiwi
personal likings:
  name: Deepak
  fav fruits: mango
  dislikes: kiwi
---
# here we can see the fav fruits and the dislikes have been repeated
# to avoid this, use "Anchors" as shown below
likings: &likes # properties which need to be reused
  fav fruits: mango
  dislikes: kiwi
personal likings:
  name: Deepak
  <<: *likes # copies the entire set of properties here
other likings:
  name: Aditya
  <<: *likes
  dislikes: berries # overrides the merged value
...
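To make the effect of the merge key concrete, here is what the two maps that reuse the anchor would expand to, written out by hand. This assumes a parser that supports "<<", which is a YAML 1.1 feature and not guaranteed everywhere:

```yaml
personal likings:
  name: Deepak
  fav fruits: mango
  dislikes: kiwi
other likings:
  name: Aditya
  fav fruits: mango
  dislikes: berries   # the explicit key wins over the merged value
```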
These are some of the things we need to take care of while learning YAML. Overall, YAML is very important when writing configuration files. I hope this clears some of the fog and eases your climb up the mountain of becoming a DevOps Professional.
To dive further into YAML and how it works with Ansible and Kubernetes, you can refer to the following article.
Hope you have a good day, and stay tuned for the next one.