In this post we’re going to look at Ansible variables and facts, but mostly at variables (because I haven’t worked with facts much, to be honest). This is the second part of our series titled Ansible and AWS and builds on the first, so if you get lost make sure to have a look at Ansible and AWS – Part 1.
Sooner or later you’ll have several machines that look mostly alike except for their environment, size, web URL, and so on. I’ve come across this when running the same server in separate environments, say, development, staging, and production. They will have different URLs and different amounts of CPU or RAM (which can drive certain configuration values). Or, let’s say each machine backs up data to a given S3 bucket, and that backup script you wrote needs the bucket name. A perfect use case for an Ansible variable.
So let’s quickly look at the four primary locations for a variable, and then I’ll share several rules of thumb I use as to where to put one:
- in ansible_hosts
- in group_vars
- in host_vars
- in the playbook
Now, I did say primary locations because there are other places; for now we’re ignoring variables and defaults that are included in roles or provided on the command line. For the canonical reference on variables, see the official documentation.
ansible_hosts
Don’t put variables here. Moving on.
I kid, somewhat. When first starting out you may see ansible_hosts files that look like this:
[code lang=text]
[all]
18.188.72.168 HOSTNAME=ansible-helloworld OPERATING_ENVIRONMENT=staging
[/code]
Terrible form, but let’s try it out in our playbook (see Ansible and AWS – Part 1 and the sample GitHub repository). We’re going to add another task to our playbook that creates a file on the server based upon a template. The templating language is Jinja2 (don’t worry, subsequent posts will go into Jinja and Ansible templates in much greater detail). First, create a new directory (inside your ansible-helloworld directory) called templates. Ansible looks for this specific directory name, so don’t call it template or anything else:
[code lang=text]
# cd ansible-helloworld
# mkdir templates
[/code]
Inside of templates create a file called environment.j2 (.j2 is the extension used by Jinja2 templates) and populate it with the following content:
[code lang=text]
# Created by Ansible
OPERATING_ENVIRONMENT='{{ OPERATING_ENVIRONMENT }}'
[/code]
Note! I personally prefer any file on a server that was created by Ansible to say as much right at the beginning. So many times people have gone on to a server and edited a file without realizing that it was generated and stands a good chance of being overwritten. We could do an entire article on what a good header for such a file might look like. Hell, I might just do that!
Then, in your playbook add the following task to the end:
[code lang=text]
# Create /etc/environment
- name: Create /etc/environment
  template:
    src: environment.j2
    dest: /etc/environment
[/code]
Remember that YAML is indentation sensitive, so if you paste this into your playbook, make sure it is properly aligned with the rest of your tasks.
One last step! Locate your hostname task and change the hardcoded ansible-helloworld (that’s our hostname) to "{{ HOSTNAME }}", like this:
[code lang=text]
# Set our hostname
- name: Set our hostname
  hostname:
    name: "{{ HOSTNAME }}"
[/code]
If you’re quick on the uptake, you should already know what is going to happen when this playbook is executed. The variable HOSTNAME in the ansible_hosts file is going to be applied in the hostname task, and the OPERATING_ENVIRONMENT variable will be applied in the template task.
Go ahead and run the playbook:
[code lang=text]
PLAY [all] *********************************************************************
TASK [Gathering Facts] *********************************************************
ok: [18.188.72.168]
TASK [Set our hostname] ********************************************************
ok: [18.188.72.168]
TASK [Install base packages] ***************************************************
ok: [18.188.72.168] => (item=[u'htop', u'zsh'])
TASK [Create /etc/environment] *************************************************
changed: [18.188.72.168]
PLAY RECAP *********************************************************************
18.188.72.168 : ok=4 changed=1 unreachable=0 failed=0
[/code]
Because there is a new task (the template task), and our hostname didn’t change, we see changed=1.
If you look at /etc/environment on the server, you should see:
[code lang=text]
cat /etc/environment
# Created by Ansible
OPERATING_ENVIRONMENT='staging'
[/code]
Nice.
Let’s change our hostname in ansible_hosts to simply helloworld:
[code lang=text]
[all]
18.188.72.168 HOSTNAME=helloworld OPERATING_ENVIRONMENT=staging
[/code]
Rerunning the playbook will change the hostname to helloworld, since that is the new value of the HOSTNAME variable.
Group Variables
Notice that in our ansible_hosts file there is the [all] tag? Well, we can change that to split hosts up into groups. Let’s rewrite our ansible_hosts file to look like this:
[code lang=text]
[staging]
18.188.72.168 HOSTNAME=helloworld
[/code]
and then create a directory (inside your ansible-helloworld directory) called group_vars. Then in group_vars create a file called staging.yml and in it put:
[code lang=text]
---
OPERATING_ENVIRONMENT: staging
[/code]
and run your playbook. If you’re following along verbatim, nothing should change. All we’ve done is extract variables common to staging servers (like our OPERATING_ENVIRONMENT variable) into a single file that will apply to all of the hosts in the staging group.
Try renaming staging.yml to somethingelse.yml and rerunning. You should get an error regarding an undefined variable, since Ansible isn’t able to find OPERATING_ENVIRONMENT. When you supply a properly named group_vars file (staging.yml), it is able to look it up.
Finally, for our group variables, notice the syntax changed from the ansible_hosts “ini-style” syntax to a YAML file. This is important to note!
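To see how this extends to a second environment, here is a minimal sketch of a counterpart file. It assumes a production group in ansible_hosts and a file named group_vars/production.yml, neither of which is part of this article’s repository; as the renaming experiment showed, the filename has to match the group name.

```yaml
---
# group_vars/production.yml (hypothetical; assumes a [production] group
# exists in ansible_hosts alongside [staging])
OPERATING_ENVIRONMENT: production
```

With both files in place, the same template task should render staging on hosts in the staging group and production on hosts in the production group.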
Host Variables
Now let’s take a look at host variables and how to use them. Create a directory called host_vars, again inside the ansible-helloworld directory. In it, create a directory named exactly as your host is defined in ansible_hosts. My server is being referenced as 18.188.72.168, but since AWS provides an FQDN, I’ll switch to that to demonstrate that an entry in ansible_hosts can be an IP address, an alias in /etc/hosts, or an FQDN resolvable by DNS. I’m going to change my ansible_hosts to this:
[code lang=text]
[staging]
ec2-18-188-72-168.us-east-2.compute.amazonaws.com
[/code]
and then create a file called vars.yml in host_vars/ec2-18-188-72-168.us-east-2.compute.amazonaws.com and place the following content:
[code lang=text]
---
HOSTNAME: helloworld
[/code]
Try the playbook out!
Take Note
Have you noticed that we’ve eliminated the variables from our ansible_hosts file? You may feel otherwise, but I’m a strong proponent of ansible_hosts files that consist of nothing more than groups of hostnames. You might think at first that you’ll have just a couple of variables per host, but odds are this will grow to a dozen or more quickly! Think about the various things that are configurable on an environment or host basis:
- what monitoring environment will this host report to?
- where are syslogs being sent?
- what S3 bucket is used for backing up configuration files (that aren’t in Ansible)?
- what are the credentials to that bucket?
- does the host (or group) have custom nameservers?
- how is application-specific configuration handled?
And so on. Trust me. Get into the good habit of creating group_vars and host_vars directories and start putting your variables there.
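As a rough illustration of how quickly this grows, here is what group_vars/staging.yml might look like once a few of the questions above are answered. Only OPERATING_ENVIRONMENT comes from this article; every other variable name and value is made up for the sketch.

```yaml
---
# group_vars/staging.yml (a sketch; only OPERATING_ENVIRONMENT is from this
# article, every other name and value is a hypothetical placeholder)
OPERATING_ENVIRONMENT: staging

# Which monitoring server hosts in this group report to
MONITORING_SERVER: monitor.staging.example.com

# Where syslogs are shipped
SYSLOG_ENDPOINT: logs.staging.example.com:514

# S3 bucket used by the backup script
BACKUP_S3_BUCKET: example-staging-backups

# Custom nameservers for the group
NAMESERVERS:
  - 10.0.0.2
  - 10.0.0.3
```

Credentials are left out of the sketch on purpose. Even so, five settings in, it is easy to see why cramming these onto one line per host in ansible_hosts gets unwieldy.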
Variables In Playbooks (and Other Places)
You can also put variables in your playbooks, but I rarely do. If it is in the playbook it’s really less of a variable and more of a strict setting, since it will override anything from your ansible_hosts file, group_vars, or host_vars. If you’re set on it, try this out. In your playbook, add vars: like this:
[code lang=text]
  become_method: sudo
  vars:
    HOSTNAME: from_my_playbook
    OPERATING_ENVIRONMENT: from_my_playbook
  tasks:
[/code]
That is, nestle it in between the become_method and tasks keys. Run it and you’ll see that both your hostname and /etc/environment file change.
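If you’d like to confirm which value won without logging in to the server, one option is a throwaway debug task at the end of the playbook. This is only a sketch, and the task name is arbitrary:

```yaml
# Temporary task: prints the values Ansible resolved for this host
- name: Show resolved variables
  debug:
    msg: "HOSTNAME={{ HOSTNAME }} OPERATING_ENVIRONMENT={{ OPERATING_ENVIRONMENT }}"
```

With the vars: block in place, both values should read from_my_playbook; remove the block and they should fall back to whatever host_vars and group_vars define.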
Facts
Ansible also provides the ability to reference facts that it has gathered. To be sure, I don’t use this feature as often, but sometimes I’ll need the host’s IP address (perhaps to put it in a configuration file), or how many CPUs it has. Seriously, check out the documentation of all of the awesome facts you can reference in your playbook. Here are some that you may find yourself using, especially if your playbooks require subtle tweaks to support different distributions, amount of memory, CPU, etc.:
- ansible_distribution
- ansible_distribution_release
- ansible_distribution_version
- ansible_eth0 (and properties of it)
- ansible_memtotal_mb
- ansible_processor_cores
- ansible_processor_count
You can use these in your playbook in the same manner as variables:
[code lang=text]
- name: Total Server RAM (MB)
  debug:
    msg: "Total Server RAM: {{ ansible_memtotal_mb }} MB"
[/code]
This is a debug task that will just print out a message:
[code lang=text]
TASK [Total Server RAM (MB)] ***************************************************
ok: [ec2-18-188-72-168.us-east-2.compute.amazonaws.com] => {
"msg": "Total Server RAM: 990 MB"
}
[/code]
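Facts can also drive conditionals, which is where distribution-specific tweaks come in. A minimal sketch, assuming an Ubuntu host like the one used in this series and reusing the htop package from the base packages task; the task itself is illustrative and not part of the repository:

```yaml
# Only runs on hosts whose gathered facts report Ubuntu
- name: Install htop on Ubuntu hosts
  apt:
    name: htop
    state: present
  when: ansible_distribution == "Ubuntu"
```

Note that when: takes a bare expression, without the {{ }} braces used in templates and task arguments.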
Let’s make use of facts in our /etc/environment template, templates/environment.j2:
[code lang=text]
# Created by Ansible ({{ ansible_date_time.date }} {{ ansible_date_time.time }} {{ ansible_date_time.tz }})
OPERATING_ENVIRONMENT='{{ OPERATING_ENVIRONMENT }}'
[/code]
Notice here we’re using the ansible_date_time fact to record in our generated /etc/environment file when the file was created. After running:
[code lang=text]
ubuntu@helloworld:~$ cat /etc/environment
# Created by Ansible (2018-05-11 11:56:48 UTC)
OPERATING_ENVIRONMENT='staging'
[/code]
Final Remarks
This post might seem at first glance to be a little less “meaty” than Part 1, but variables and facts will comprise a large part of your future playbooks (we didn’t even talk about how they are used in roles). As promised, here are some guidelines I use when deciding where to put variable definitions:
If the hosts for a given Ansible playbook can be organized into logical groups (such as staging and production), and there is a set of variables that will be common to all of the staging servers, and likewise common to all of the production servers, these variables are good candidates to put into group_vars. Examples here might be:
- the endpoints for log shipping
- the IP addresses for name servers
- the IP of a monitoring server
- AWS region information (for example, if staging is in one region and production is in another)
Or, let’s say you run some type of monitoring traps that send e-mails on certain events. You might want to send staging alerts to staging-alerts@iachieved.it vs. production-alerts@iachieved.it. Ansible group variables might come in handy here.
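A minimal sketch of that idea, assuming a hypothetical ALERT_EMAIL variable and a hypothetical destination file, neither of which exists in the article’s repository:

```yaml
# Assumed definitions (ALERT_EMAIL is a hypothetical variable name):
#   group_vars/staging.yml:     ALERT_EMAIL: staging-alerts@iachieved.it
#   group_vars/production.yml:  ALERT_EMAIL: production-alerts@iachieved.it

# One task serves both environments; the host's group decides the value
- name: Record the alert recipient for this environment
  copy:
    content: "ALERT_EMAIL={{ ALERT_EMAIL }}\n"
    dest: /etc/alert-recipient  # hypothetical path
```

The task never mentions an environment by name, which is the point: the playbook stays the same and only the group files differ.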
If the variable is specific to a host, then obviously you’d put the information in host_vars. I prefer to explicitly set server hostnames, and so HOSTNAME goes into the vars.yml for the host. Application-specific information for that particular host goes into host_vars as well.
I’m hard pressed to think of when I’d want to explicitly put a variable in either ansible_hosts or the playbook itself; in ansible_hosts it just clutters up the file, and in the playbook it’s effectively a constant.
Now, make no mistake: you will, over time, refactor your playbooks and the locations and groupings of variables. If you’re a perfectionist and lose sleep over whether you’ve spent enough time in discussions over how to architect your playbooks, well, I feel sad for you. It’ll never be right the first time, so make the best decision you can and sleep well knowing it can be refactored as your environment or needs change.
Getting the Code
You can find the finished playbook for this article on the part2 branch of this GitHub repository.