Ansible and AWS – Part 2

DevOps ToolChain, WikiPedia, CC BY-SA 4.0

In this post we’re going to look at Ansible variables and facts, but mostly at variables (because I haven’t worked with facts much to be honest). This is the second part of our series titled Ansible and AWS and adds to the first, so if you get lost make sure and have a look at Ansible and AWS – Part 1.

At some point you’re going to get to a point where you have two machines that mostly look like one another except for their environment, size, web URL, etc. I’ve come across this when having two servers in separate environments, say, development, staging, and production. They will have different URLs, different amounts of CPU or RAM (which can drive certain configuration values). Or, let’s say each machine backs up data to a given S3 bucket, and that backup script you wrote needs the bucket name. A perfect usecase for an Ansible variable.

So let’s quickly look at the four primary locations for a variable, and then I’ll share several rules of thumb I use as to where to put one:

  • in ansible_hosts
  • in group_vars
  • in host_vars
  • in the playbook

Now I did say primary locations because there are other places; for now we’re ignoring variables and defaults that are included in roles or provided on the commandline. For the canonical reference on variables, see the official documentation.

ansible_hosts

Don’t put variables here. Moving on.

I kid, somewhat. When first starting out you may see ansible_hosts files that look like this:

[all]
18.188.72.168 HOSTNAME=ansible-helloworld OPERATING_ENVIRONMENT=staging

Terrible form, but let’s try it out in our playbook (see Ansible and AWS – Part 1) and the sample Github repository. We’re going to add another task to our playbook that creates a file on the server based upon a template. The templating language is Jinja2 (don’t worry, subsequent posts will go into Jinja and Ansible templates in much greater detail). First, create a new directory (inside your ansible-helloworld directory) called templates. This is a specific name for Ansible, so don’t name it template or something else:

# cd ansible-helloworld
# mkdir templates

Inside of templates create a file called environment.j2 (.j2 is the extension used by Jinja2 templates) and populate it with the following content:

# Created by Ansible

OPERATING_ENVIRONMENT='{{ OPERATING_ENVIRONMENT }}'

Note! I personally prefer any file on a server that was created by Ansible to say as much right at the beginning. So many times people have gone on to a server and edited a file without realizing that it was generated and stands a good chance to be overwritten. We could do an entire article on what a good header for a such a file might look like. Hell, I might just do that!

Then, in your playbook add the following task to the end:

Remember that YAML is indentation sensitive, so if you paste this into your playbook, make sure it is properly aligned with the rest of your tasks.

One last step! Locate your hostname task and change the hardcoded ansible-helloworld (that’s our hostname), to "{{ HOSTNAME }}", like this:

If you’re quick on the uptake, you should already know what is going to happen when this playbook is executed. The variable HOSTNAME in the ansible_hosts file is going to be applied in the hostname task, and the OPERATING_ENVIRONMENT variable will be applied in the template task.

Go ahead and run the playbook:

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: [18.188.72.168]

TASK [Set our hostname] ********************************************************
ok: [18.188.72.168]

TASK [Install base packages] ***************************************************
ok: [18.188.72.168] => (item=[u'htop', u'zsh'])

TASK [Create /etc/environment] *************************************************
changed: [18.188.72.168]

PLAY RECAP *********************************************************************
18.188.72.168              : ok=4    changed=1    unreachable=0    failed=0

Because there is a new task (the template task), and our hostname didn’t change, we see changed=1.

If you look at /etc/environment on the server, you should see:

cat /etc/environment
# Created by Ansible

OPERATING_ENVIRONMENT='staging'

Nice.

Let’s change our hostname in ansible_hosts to simply helloworld:

[all]
18.188.72.168 HOSTNAME=helloworld OPERATING_ENVIRONMENT=staging

Rerunning the playbook will change the hostname to helloworld, since that is the new value of the HOSTNAME variable.

Group Variables

Notice in our ansible_hosts file there is the [all] tag? Well, we can change to that split hosts up. Let’s rewrite our ansible_hosts file to look like this:

[staging]
18.188.72.168 HOSTNAME=helloworld

and then create a directory (inside your ansible-helloworld directory) called group_vars. Then in group_vars create a file called staging.yml and in it put:

---
OPERATING_ENVIRONMENT:  staging

and run your playbook. If you’re following along verbatim nothing should happen. All we’ve done is extracted variables common to staging servers (like our OPERATING_ENVIRONMENT variable) into a single file that will apply to all of the hosts in the staging group.

Try renaming staging.yml to somethingelse.yml and rerunning. You should get an error regarding an undefined variable, since Ansible wasn’t able to find OPERATING_ENVIRONMENT. When you supplied a properly named group_vars file (staging.yml) it is able to look it up.

Finally, for our group variables, notice the syntax changed from the ansible_hosts “ini-style” syntax to a YAML file. This is important to note!

Host Variables

Now let’s take a look at host variables and how to use them. Create a directory called host_vars, again, inside the ansible-helloworld directory. In it create a directory named the same as how your host is defined in ansible_hosts. My server is being referenced as 18.188.72.168, but since AWS provides an FQDN, I’ll switch to that to demonstrate how what is in ansible_hosts can be an IP address, alias in /etc/hosts, or an FQDN resolvable by DNS. I’m going to change my ansible_hosts to this:

[staging]
ec2-18-188-72-168.us-east-2.compute.amazonaws.com

and then create a file called vars.yml in host_vars/ec2-18-188-72-168.us-east-2.compute.amazonaws.com and place the following content:

---
HOSTNAME:  helloworld

Try the playbook out!

Take Note

Have you noticed that we’ve eliminated the variables from our ansible_hosts file? You may feel otherwise, but I’m a strong proponent of ansible_hosts files that consist of nothing more than groups of hostnames. You might think at first that you’ll have just a couple of variables per host, but odds are this will grow to a dozen or more quickly! Think about the various things that are configurable on an environment or host basis:

  • what monitoring environment will this host report to?
  • where are syslogs being sent?
  • what S3 bucket is used for backing up configuration files (that aren’t in Ansible)?
  • what are the credentials to that bucket?
  • does the host (or group) have custom nameservers?
  • how is application-specific configuration handled?

And so on. Trust me. Get into the good habit of creating a group_vars and host_vars directories and start putting your variables there.

Variables In Playbooks (and Other Places)

You can also put variables in your playbooks, but I rarely do. If it is in the playbook it’s really less of a variable and more of a strict setting, since it will override anything from your ansible_hosts file, group_vars or host_vars. If you’re set on it, try this out. In your playbook, add vars: like this:

That is, nestle it in between the become_method and tasks. Run it and you’ll see that both your hostname and /etc/environment file changes.

Facts

Ansible also provides the ability to reference facts that it has gathered. To be sure, I don’t use this feature as often, but sometimes I’ll need the host’s IP address (perhaps to put it in a configuration file), or how many CPUs it has. Seriously, check out the documentation of all of the awesome facts you can reference in your playbook. Here are some that you may find yourself using, especially if your playbooks require subtle tweaks to support different distributions, amount of memory, CPU, etc.:

  • ansible_distribution_name
  • ansible_distribution_release
  • ansible_distribution_version
  • ansible_eth0 (and properties of it)
  • ansible_memtotal_mb
  • ansible_processor_cores
  • ansible_processor_count

You can use these in your playbook in the same manner as variables:

This is a debug task will just print out a message:

TASK [Total Server RAM (MB)] ***************************************************
ok: [ec2-18-188-72-168.us-east-2.compute.amazonaws.com] => {
    "msg": "Total Server RAM:  990 MB"
}

Let’s make use of facts in our /etc/environment template:

templates/environment.j2:

# Created by Ansible ({{ ansible_date_time.date }} {{ ansible_date_time.time }} {{ ansible_date_time.tz }})

OPERATING_ENVIRONMENT='{{ OPERATING_ENVIRONMENT }}'

Notice here we’re using the ansible_date_time fact to include in our generated /etc/environment file when the file was created. After running:

ubuntu@helloworld:~$ cat /etc/environment
# Created by Ansible (2018-05-11 11:56:48 UTC)

OPERATING_ENVIRONMENT='staging'

Final Remarks

This post might seem at first glance to be a little less “meaty” than Part 1, but variables and facts will comprise a large part of your future playbooks (we didn’t even talk about how they are used in roles). As promised here are some guidelines I use when deciding where to put variable definitions:

If the hosts for a given Ansible playbook can be organized into logical groups (such as staging and production), and there are a set of variables that will be common to all of the staging servers, and likewise common to all of the production servers, these variables are a good candidate to put into group_vars. Examples here might be:

  • the endpoints for log shipping
  • the IP addresses for name servers
  • the IP of a monitoring server
  • AWS region information (for example, if staging is in one region and production is in another)

Or, let’s say you run some type of monitor traps that send e-mails on certain events. You might want to send staging alerts to staging-alerts@iachieved.it vs. production-alerts@iachieved.it. Ansible group variables might come in handy here.

If the variable is specific to a host, then obviously you’d put the information in host_vars. I prefer to explicitly set server hostnames, and so HOSTNAME goes into the vars.yml for the host. Application-specific information for that specific host, put it into host_vars.

I’m hard pressed to think of when I’d want to explicitly put a variable in either ansible_hosts or the playbook itself; in ansible_hosts it just clutters up the file and in the playbook it’s effectively a constant.

Now, make no mistake: you will, over time, refactor your playbooks and the locations and groupings of variables. If you’re a perfectionist and lose sleep over whether you’ve spent enough time in discussions over how to architect your playbooks, well, I feel sad for you. It’ll never be right the first time, so make the best decision you can and sleep well knowing it can be refactored as your environment or needs change.

Getting the Code

You can find the finished playbook for this article on the part2 branch of this Github repository.

Leave a Reply

Your email address will not be published. Required fields are marked *