yan han's blog

Templating Ansible style YAML files with only variables

12 Dec 2015, by Pang Yan Han

Introduction

Suppose you are working on a project in Ansible and you want to combine some YAML files that contain variables, such as those in group_vars, into a single file. However, some or all of these YAML files make heavy use of interpolation, and you do not want to reinvent the wheel just to achieve your objective. Ideally, you want to make use of the Ansible codebase to do this.

If the above sounds like the problem you’re trying to tackle, then this post and the accompanying code on GitHub are what you need.

How to run the code:

Clone the git repository:

git clone git://github.com/yanhan/templating-ansible-style-yaml-files-with-only-variables.git


Run the setup.sh script:

./setup.sh


Put all the YAML files of interest in one folder. Let’s call this folder my_yaml_files. NOTE: if a variable referenced in these files is not defined in any of them, the program will go into an infinite loop. Then run:

. venv/bin/activate
python main.py my_yaml_files
deactivate


and all resulting variables will be dumped as JSON to the standard output.

How does this work

Below is an outline of how the codebase works.

Our sample YAML files

Now, if you look at the yaml_files folder, there are 2 files. sample_yaml_file.yml:

home_dir: /home/ubuntu
home_bin_dir: "{{ home_dir }}/bin"
march: x86_64
docs_dir: "{{ home_dir }}/docs"
start_everything: "{{ home_bin_dir }}/start-everything"
secret_doc: "{{ docs_dir }}/secret.txt"
my_user: jake
my_home: "/home/{{ my_user }}"
destination: "{{ my_home }}/python2.7"
language: english
translation_file: "{{ my_home }}/dictionary/{{ language }}.txt"
sisters_user: jane
affected_users:
- "{{ my_user }}"
- ubuntu
- "{{ sisters_user }}"


and sample_yaml_file_two.yml:

db: macey
books: "{{ my_home }}/books"


Notice the heavy use of interpolation, especially in sample_yaml_file.yml. Note that the books variable in sample_yaml_file_two.yml makes use of the my_home variable defined in sample_yaml_file.yml.

The code

I shall highlight the more important parts.

First:

import argparse
import os
import os.path
import sys
# The following line allows us to use the ansible codebase.
sys.path.append(os.path.join(os.getcwd(), "ansible", "lib"))

from ansible.errors import AnsibleUndefinedVariable
from ansible.parsing.dataloader import DataLoader
from ansible.template import Templar


Notice the sys.path.append(os.path.join(os.getcwd(), "ansible", "lib")) line and the subsequent 3 imports from various modules under the ansible package. Our code includes the Ansible codebase as a git submodule, pinned at the commit tagged v2.0.0-0.7.rc2. This explains the git submodule init and git submodule update lines in the setup.sh file.
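
One consequence of building the path from os.getcwd() is that main.py has to be run from the repository root. Below is a minimal sketch of an alternative that resolves the path relative to the script itself, assuming the same ansible/lib layout next to main.py. The helper names vendored_ansible_lib and add_to_sys_path are hypothetical and not part of the repository.

```python
import os.path
import sys


def vendored_ansible_lib(script_path):
    """Return the ansible/lib directory that sits next to script_path.

    Hypothetical helper: resolves relative to the script's own location
    instead of the current working directory.
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, "ansible", "lib")


def add_to_sys_path(path):
    """Append path to sys.path only if it is not already there."""
    if path not in sys.path:
        sys.path.append(path)


# In main.py this would be called as:
#   add_to_sys_path(vendored_ansible_lib(__file__))
```

With something like this in place, the script could be started from any directory, at the cost of a few extra lines.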

Next:

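  # Load variables from the YAML files
  yaml_files_dir = os.path.join(args.yaml_files_dir)
  var_files = [
    os.path.join(yaml_files_dir, file_name)
    for file_name in os.listdir(yaml_files_dir)
  ]
  dl = DataLoader()
  vars_to_template = dict()
  for var_file in var_files:
    # load_from_file parses one YAML file into a dict of its variables
    vars_to_template.update(dl.load_from_file(var_file))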


For simplicity, we require that all the YAML files we’re trying to template, including those containing variables required by those files, be placed in one directory. This lets us easily load them using an instance of the ansible.parsing.dataloader.DataLoader class provided by Ansible; specifically, we make use of its load_from_file method. After this part, vars_to_template is a dict of all variables in all the YAML files, waiting to be templated:

{
u'my_user': u'jake',
u'march': u'x86_64',
u'language': u'english',
u'my_home': u'/home/{{ my_user }}',
u'db': u'macey',
u'home_dir': u'/home/ubuntu',
u'sisters_user': u'jane',
u'secret_doc': u'{{ docs_dir }}/secret.txt',
u'affected_users': [u'{{ my_user }}', u'ubuntu', u'{{ sisters_user }}'],
u'books': u'{{ my_home }}/books',
u'translation_file': u'{{ my_home }}/dictionary/{{ language }}.txt',
u'home_bin_dir': u'{{ home_dir }}/bin',
u'destination': u'{{ my_home }}/python2.7',
u'start_everything': u'{{ home_bin_dir }}/start-everything',
u'docs_dir': u'{{ home_dir }}/docs'
}
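
Note that os.listdir returns every entry in the directory, in arbitrary order, so stray non-YAML files would also be handed to the loader and the order in which files are merged is not fixed. Here is a small sketch of a stricter listing, assuming only files ending in .yml or .yaml are wanted. list_yaml_files is a hypothetical helper, not something found in main.py.

```python
import os
import os.path


def list_yaml_files(yaml_files_dir):
    """Return sorted paths of the .yml / .yaml files directly in yaml_files_dir."""
    paths = []
    for file_name in sorted(os.listdir(yaml_files_dir)):
        path = os.path.join(yaml_files_dir, file_name)
        # Skip subdirectories and anything that is not a YAML file.
        if os.path.isfile(path) and file_name.endswith((".yml", ".yaml")):
            paths.append(path)
    return paths
```

If the per-file dicts are merged with dict.update, sorting also makes it predictable which file wins when two files define the same variable, since a later update overwrites an earlier one.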


Moving on:

  templar = Templar(loader=dl)
  result_vars = dict()

  while vars_to_template:
    successfully_templated_vars = []
    for var_name, value in vars_to_template.items():
      try:
        templated_value = templar.template(value)
        result_vars[var_name] = templated_value
        successfully_templated_vars.append(var_name)
        templar.set_available_variables(result_vars.copy())
      except AnsibleUndefinedVariable:
        pass
    for var_name in successfully_templated_vars:
      del vars_to_template[var_name]


We make use of an instance of the ansible.template.Templar class to perform the templating. The Templar.template method may raise an AnsibleUndefinedVariable exception if the value it is templating requires a variable that’s not found in its available variables. At this point, we have all the variables, both those that require templating and those that do not. We could perform a topological sort of the variables so that each one is templated only after all its dependencies have been fulfilled (assuming that all required variables are present), but we’re too lazy to do it.
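
For the curious, the topological sort mentioned above could look roughly like the following. This is a sketch under simplifying assumptions, not what main.py does. It only recognises plain {{ name }} references via a regular expression (no filters, attribute access, or other Jinja2 expressions), it relies on graphlib from the Python 3.9+ standard library, and the helper names find_references and templating_order are made up for illustration.

```python
import re
from graphlib import TopologicalSorter

# Matches simple references such as "{{ my_home }}"; an assumption that
# ignores filters and other Jinja2 expressions.
REFERENCE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_references(value):
    """Collect the variable names referenced inside a str or a list of values."""
    if isinstance(value, str):
        return set(REFERENCE.findall(value))
    if isinstance(value, list):
        refs = set()
        for item in value:
            refs |= find_references(item)
        return refs
    return set()


def templating_order(variables):
    """Return variable names ordered so dependencies come before dependents."""
    graph = {name: find_references(value) for name, value in variables.items()}
    # static_order() raises graphlib.CycleError on circular references.
    order = TopologicalSorter(graph).static_order()
    # Referenced-but-undefined names also appear in the graph; drop them here.
    return [name for name in order if name in variables]
```

With such an order in hand, each variable could be templated exactly once, in sequence, instead of being retried across several passes.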

Instead, we maintain result_vars, a dict of all variables that have been successfully templated (including those that needed no templating to begin with), and vars_to_template, which holds the variables loaded from the YAML files that are yet to be templated. Using the while loop, we repeatedly try to template all variables in vars_to_template until we run out of variables. Each successfully templated variable is added to result_vars, and we use the Templar.set_available_variables method to make it available the next time we call Templar.template. In addition, we append the name of that variable to a list named successfully_templated_vars. At the end of each iteration of the while loop, we remove all successfully templated variables from vars_to_template. In the event that there are missing variables required for templating, the while loop will not terminate.
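
The non-terminating case can be avoided by checking whether a pass over vars_to_template made any progress. The sketch below illustrates the idea with a stand-in for Templar.template: the render function is a hypothetical regex-based substitute that raises KeyError on an undefined variable, so the example runs on Python 3 without Ansible. With the real Templar, the except clause would catch AnsibleUndefinedVariable instead.

```python
import re

REFERENCE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(value, available):
    """Hypothetical stand-in for Templar.template; raises KeyError if a name is undefined."""
    if isinstance(value, list):
        return [render(item, available) for item in value]
    if not isinstance(value, str):
        return value
    return REFERENCE.sub(lambda m: str(available[m.group(1)]), value)


def template_all(vars_to_template):
    """Template variables pass by pass; fail once a pass makes no progress."""
    pending = dict(vars_to_template)
    result_vars = {}
    while pending:
        templated = []
        for var_name, value in pending.items():
            try:
                result_vars[var_name] = render(value, result_vars)
                templated.append(var_name)
            except KeyError:
                pass  # a dependency is not available yet; retry next pass
        if not templated:
            # Nothing could be resolved: the rest depend on missing
            # or circularly defined variables.
            raise ValueError("cannot template: " + ", ".join(sorted(pending)))
        for var_name in templated:
            del pending[var_name]
    return result_vars
```

The only real change from the loop above is the "if not templated" check, which turns an endless loop into an error naming the variables that could not be resolved.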

Outputting the variables

At this point, if we print the result_vars dict, we will see this:

{
u'my_user': u'jake',
u'march': u'x86_64',
u'language': u'english',
u'my_home': u'/home/jake',
u'db': u'macey',
u'home_dir': u'/home/ubuntu',
u'sisters_user': u'jane',
u'secret_doc': u'/home/ubuntu/docs/secret.txt',
u'affected_users': [u'jake', u'ubuntu', u'jane'],
u'books': u'/home/jake/books',
u'translation_file': u'/home/jake/dictionary/english.txt',
u'home_bin_dir': u'/home/ubuntu/bin',
u'destination': u'/home/jake/python2.7',
u'start_everything': u'/home/ubuntu/bin/start-everything',
u'docs_dir': u'/home/ubuntu/docs'
}


While it may seem that the “most atomic” values are unicode strings, that is in fact not always the case. Many of them are ansible.parsing.yaml.objects.AnsibleUnicode objects. So if we use yaml.dump to serialize result_vars, we’ll see some rather intimidating-looking output like the following:

? !!python/object/new:ansible.parsing.yaml.objects.AnsibleUnicode
  args: [!!python/unicode 'affected_users']
  state: {_column_number: 1, _data_source: /home/philip/templating-ansible-style-yaml-files-with-only-variables/yaml_files/sample_yaml_file.yml,
    _line_number: 13}
: - !!python/unicode 'jake'
  - !!python/object/new:ansible.parsing.yaml.objects.AnsibleUnicode
    args: [!!python/unicode 'ubuntu']
    state: {_column_number: 5, _data_source: /home/philip/templating-ansible-style-yaml-files-with-only-variables/yaml_files/sample_yaml_file.yml,
      _line_number: 15}
  - !!python/unicode 'jane'


Hence in main.py, I chose to use JSON as the default output format, which yields something a lot more friendly to humans:

{
"my_user": "jake",
"march": "x86_64",
"language": "english",
"my_home": "/home/jake",
"db": "macey",
"home_dir": "/home/ubuntu",
"sisters_user": "jane",
"secret_doc": "/home/ubuntu/docs/secret.txt",
"affected_users": ["jake", "ubuntu", "jane"],
"books": "/home/jake/books",
"translation_file": "/home/jake/dictionary/english.txt",
"home_bin_dir": "/home/ubuntu/bin",
"destination": "/home/jake/python2.7",
"start_everything": "/home/ubuntu/bin/start-everything",
"docs_dir": "/home/ubuntu/docs"
}
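
JSON comes out clean because json.dumps treats an instance of a string subclass as an ordinary string and ignores any extra attributes on it. The sketch below demonstrates this on Python 3, assuming AnsibleUnicode behaves like a string subclass in this respect. PositionedStr is a hypothetical stand-in that carries a line number, not the actual Ansible class.

```python
import json


class PositionedStr(str):
    """Hypothetical stand-in for AnsibleUnicode: a str carrying source position data."""

    def __new__(cls, text, line_number=None):
        obj = super().__new__(cls, text)
        obj.line_number = line_number
        return obj


result_vars = {
    PositionedStr("my_user", line_number=7): PositionedStr("jake", line_number=7),
    PositionedStr("affected_users", line_number=13): [
        PositionedStr("jake"),
        PositionedStr("ubuntu", line_number=15),
    ],
}

# json.dumps serializes str subclasses as plain JSON strings, so the
# line_number attribute never reaches the output.
as_json = json.dumps(result_vars, indent=2, sort_keys=True)
print(as_json)
```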


Mission accomplished.