Software Development Tips and Tricks


Ansible Vault IDs

There are times when you’ll want not only separate vault files for development, staging, and production, but also separate passwords for each of those vaults. Enter vault IDs, a feature of Ansible 2.4 (and later).

I had a bit of trouble getting this configured correctly, so I wanted to share my setup in hopes you find it useful as well.

First, we’ll create three separate files that contain our vault passwords. These files should not be checked into revision control, but instead reside in your protected home directory or some other secure location. These files will contain plaintext passwords that will be used to encrypt and decrypt your Ansible vaults. Our files are as follows:

  • ~/.vault-pass.common
  • ~/.vault-pass.staging
  • ~/.vault-pass.production

As you can already guess we’re going to have three separate passwords for our vaults, one each for common credentials we want to encrypt (for example, an API key that is used to communicate with a third party service and is used for all environments), and our staging and production environments. We’ll keep it simple for the contents of each password file:

Obligatory Warning: Do not use these passwords in your environment; create a strong password for each instead. To create a strong password you might try something like:

Once you’ve created your three vault password files, add the following to the [defaults] section of your ansible.cfg:

vault_identity_list = common@~/.vault-pass.common, staging@~/.vault-pass.staging, production@~/.vault-pass.production

It’s important to note here that your ansible.cfg vault identity list will be consulted when you execute your Ansible playbooks. If the first password won’t open the vault, it will move on to the next one, until one of them works (or, conversely, doesn’t).

Encrypting Your Vaults

To encrypt your vault file you must now explicitly choose which id to encrypt with. For example,


we will encrypt with our common vault id, like this:

# ansible-vault encrypt --encrypt-vault-id common common_vault
Encryption successful

Run head -1 on the resulting file and notice that the vault id used to encrypt is in the header:

If you are in the same directory as your ansible.cfg file, go ahead and view it with ansible-vault view common_vault. Your first identity file (.vault-pass.common) will be consulted for the password. If, however, you are not in the same directory with your ansible.cfg file, you’ll be prompted for the vault password. To make this global, you’ll want to place the vault_identity_list in your ~/.ansible.cfg file.

Repeat the process for other vault files, making sure to specify the id you want to encrypt with:

For a staging vault file:

For a production vault file:

Now you can view any of these files without providing your vault password, since Ansible will locate the right password through ansible.cfg. The same goes for running ansible-playbook! Take care, though: if you decrypt a file and intend to re-encrypt it, you must provide an id to use with the --encrypt-vault-id option!

A Bug, I Think

I haven’t filed this with the Ansible team, but I think this might be a bug. If you are in the same directory as your ansible.cfg (or the identity list is in ~/.ansible.cfg), using --ask-vault (an abbreviation of --ask-vault-pass) to require a password on the command line will ignore the password you type if Ansible can find a working one in your vault_identity_list password files. I find this to be counterintuitive: if you explicitly request a password prompt, the password entered should be the one that is attempted, and none other. For example:

# ansible-vault --ask-vault view common_vault
Vault password:

If I type anything other than the actual password for the common identity, I should get an error. Instead Ansible will happily find the password in ~/.vault-pass.common and view the file anyway.

Some Additional Thoughts

I wanted to take a moment to address a comment posted on this article, which can be summarized as:

What’s the point of encrypting services passwords in a vault which you check in to a repository, then pass around a shared vault-passwords file that decrypts them outside of the repository, rather than simply sharing a properties file that has the passwords to the services? It just seems like an extra layer of obfuscation rather than actually more secure.

First, to be clear, a “shared vault-passwords file” is not passed around: only the development operations engineer(s) or a secured build server are permitted to have the vault passwords. Second, with this technique, you have a minimal number of passwords that are stored in plain text. True, these passwords are capable of unlocking any vaults encrypted with them, but this is true of any master password. Finally, I disagree with the assertion that this is an “extra layer of obfuscation.” If that were the case, any encryption scheme that had a master password (which is what utilizing an Ansible vault password file is) could be considered obfuscation. In the end, this technique is used to accomplish these goals:

  • permit separate sets of services passwords for different environments, i.e., staging and production
  • allow for submitting those services passwords in an encrypted format into a repository (the key here is that these are submitted to a known location alongside the rest of the configuration)
  • allow for decryption of those vaults in a secured environment such as a development operations user account or build server account


Ansible and AWS – Part 5

In Part 5 of our series, we’ll explore provisioning users and groups with Ansible on our AWS servers.

Anyone who has had to add users to an operating environment knows how complex things can get in a hurry. LDAP, Active Directory, and other technologies are designed to provide a centralized repository of users, groups, and access rules. Or, for Linux systems, you can skip that complexity and provision users directly on the server. If you have a lot of servers, Ansible can easily be used to add and delete users and provision access controls.

Now, if you come from an “enterprise” background you might protest and assert that LDAP is the only way to manage users across your servers. You’re certainly entitled to your opinion. But if you’re managing a few dozen or so machines, there’s nothing wrong (in my book) with straight up Linux user provisioning.

Regardless of the technology used, thought must still be given to how your users will be organized, and what permissions users will be given. For example, you might have operations personnel that require sudo access on all servers. Some of your developers may be given the title architect, which provides them the luxury of sudo as well on certain servers. Or, you might have a test group that is granted sudo access on test servers, but not on staging servers. And so on. The point is, none of LDAP, Active Directory, or Ansible negates your responsibility of giving thought to how users and groups are organized and setting a policy around it.

So, let’s put together a team that we’ll give different privileges on different systems. Our hypothetical team looks like this:


We’ve decided that access on a given server (or environment) will follow these rules:

Environment | Access policy
production | Only architects and operations get access to the environment, and they get sudo access
staging | All users except tptesters get access to the environment; only architects, operations, and developers get sudo access
test | All users get access to the environment, and with the exception of tptesters, they get sudo access
operations | Only operations get access to their environment, and they get sudo access

Now, let’s look at how we can enforce these rules with Ansible!

users Role

We’re going to introduce Ansible roles in this post. This is by no means a complete tutorial on roles; for that you might want to check out the Ansible documentation.

Note: git clone https://github.com/iachievedit/ansible-helloworld to get the example repository for this series, and switch to part4 (git checkout part4) to pick up where we left off.

Let’s get started by creating our roles directory in ansible-helloworld.

# git clone https://github.com/iachievedit/ansible-helloworld
# cd ansible-helloworld
# git checkout part4
# mkdir roles

Now we’re going to use the command ansible-galaxy to create a template (not to be confused with a Jinja2 template!) for our first role.

# cd roles
# ansible-galaxy init users
- users was created successfully

Drop in to the users directory that was just created and you’ll see:

# cd users
# ls
README.md files     meta      templates vars
defaults  handlers  tasks     tests

We’ll be working in three directories: vars, files, and tasks. In roles/users/vars/main.yml add the following:

Recall in previous tutorials our variables definitions were simple tag-value pairs (like HOSTNAME: helloworld). In this post we’re going to take advantage of the fact that variables can be complex types that include lists and dictionaries.
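A minimal sketch of what such a variables file might contain. The variable names usergroups and users, and the entry for alice, come from this post, and the group names are taken from the access rules above. The user bob is a hypothetical placeholder.

```yaml
# roles/users/vars/main.yml (illustrative sketch)

# A simple list of group names
usergroups:
  - architects
  - operations
  - developers
  - testers
  - tptesters

# A list of dictionaries, one per user
users:
  - name: alice
    group: architects
  - name: bob            # hypothetical user, for illustration only
    group: operations
```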

Now, let’s create our users role tasks. We’ll start with creating our groups. In roles/users/tasks/main.yml:
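A task along these lines would create each group. This is a sketch, assuming the usergroups list is defined in the role's variables.

```yaml
# roles/users/tasks/main.yml (illustrative sketch)
- name: Create groups
  group:
    name: "{{ item }}"        # each entry of usergroups in turn
    state: present
  loop: "{{ usergroups }}"
```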

There’s another new Ansible keyword in use here, loop. loop will take the items in the list given by usergroups and iterate over them, with each item being “plugged in” to item. The Python equivalent might look like:

Loops are powerful constructs in roles and playbooks, so make sure to read the Ansible documentation to see everything that can be accomplished with them. Also check out Chris Torgalson’s Untangling Ansible’s Loops, a great overview of Ansible loops and how to leverage them in your playbooks. It also turns out that post uses loops and various constructs to provision users, so definitely compare and contrast the different approaches!

Our next Ansible task will create the users and place them in the appropriate group.

Here it’s important to note that users is being looped over (that is, every item in the list users), and that we’re using a dot-notation to access values inside item. For the first entry in the list, item would have:

item.name = alice
item.group = architects
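The user task described here might be sketched as follows, assuming each entry of users has the name and group keys shown above.

```yaml
- name: Create users
  user:
    name: "{{ item.name }}"     # e.g. alice
    group: "{{ item.group }}"   # primary group, e.g. architects
    state: present
  loop: "{{ users }}"
```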

Now, we could have chosen to allow for multiple groups for each user, in which case we might have defined something like:
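As a sketch, a definition that allows several groups per user could use a groups list in place of the single group key. Here bob and his memberships are hypothetical.

```yaml
users:
  - name: alice
    groups:
      - architects
  - name: bob              # hypothetical user, for illustration only
    groups:
      - operations
      - developers
```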

That looks pretty good so we’ll stick with that for the final product.

With our user definitions in hand, let’s create an appropriate task to create them in the correct environment. There are two more keywords to introduce: block and when. Let’s take a look:

The block keyword allows us to group a set of tasks together, and is frequently used in conjunction with some type of “scoping” keyword. In this example, we’re using the when keyword to execute the block when a certain condition is met. The tags keyword is another “scoping” keyword that is useful with a block.

Our when conditional indicates that the block will run only if the following conditions are met:

  • the host is in the production group (as defined in ansible_hosts)
  • the user is in either the architects or operations group

The syntax for specifying this logic looks a little contrived, but it’s quite simple and uses in to return true if a given value is in the specified list. 'production' in group_names is true if the group_names list contains the value production. Likewise for item.groups, but in this case we use the or conditional to add the user to the server if their groups value contains either architects or operations.
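The block and when combination described above might be sketched like this. It assumes the users list carries a groups list per user. The loop sits on the task inside the block, because a block itself cannot be looped; the when on the block is inherited by the task and evaluated once per item.

```yaml
- block:
    - name: Create production users
      user:
        name: "{{ item.name }}"
        groups: "{{ item.groups }}"
        state: present
      loop: "{{ users }}"
  # Inherited by every task in the block and checked for each loop item
  when: "'production' in group_names and ('architects' in item.groups or 'operations' in item.groups)"
```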

We’re not quite done! We want our architects and operations groups to have sudo access on the production servers, so we add the following to our block:
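One way to grant a group sudo access is a lineinfile task against /etc/sudoers. This is a sketch only: the NOPASSWD policy is an assumption, and the validate step follows the visudo pattern from the lineinfile documentation so that a malformed line is never written.

```yaml
- name: Give architects sudo access
  lineinfile:
    path: /etc/sudoers
    state: present
    regexp: '^%architects\s'
    line: '%architects ALL=(ALL) NOPASSWD: ALL'   # assumed policy
    validate: 'visudo -cf %s'

- name: Give operations sudo access
  lineinfile:
    path: /etc/sudoers
    state: present
    regexp: '^%operations\s'
    line: '%operations ALL=(ALL) NOPASSWD: ALL'   # assumed policy
    validate: 'visudo -cf %s'
```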

Combining everything together, for production we have:

SSH Keys

Users on our servers will gain access through SSH keys. To add them:

Another new module! authorized_key will edit the ~/.ssh/authorized_keys file of the given user and add the key specified in the key parameter. The lookup function will go and get the key contents from a file (the first argument) given in the location {{ ssh_keys/item.name }}, which will expand to our user’s name.
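A sketch of such a task, assuming each user's public key is stored in the role's files directory under ssh_keys/ in a file named after the user:

```yaml
- name: Add SSH public keys
  authorized_key:
    user: "{{ item.name }}"
    # 'file' is the lookup plugin; the path is resolved against the role's files directory
    key: "{{ lookup('file', 'ssh_keys/' + item.name) }}"
    state: present
  loop: "{{ users }}"
```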

Note that the lookup function searches the files directory in our role. That is, we have the following:


We do not encrypt public keys (if someone complains you didn’t encrypt their public key, slap them, it’ll make you feel better).


It was years into my career before I realized there was more to life than ksh. No joke, I didn’t realize there was anything but! Today there are a variety of shells, bash, zsh and fish just to name a few. I’ve also learned that an individual’s shell of choice is often as sacrosanct as their choice of editor. So let’s add the ability to set the user’s shell of preference.

First, we need to specify the list of shells we’re going to support. In roles/users/vars/main.yml we’ll add:

bash is already present on our Ubuntu system, so no need to explicitly add it.
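The list of supported shells might be declared as follows. The variable name shells is an assumption, and the entries are the Ubuntu package names for the shells mentioned above.

```yaml
# roles/users/vars/main.yml (illustrative sketch)
shells:
  - zsh
  - fish
  - ksh
```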

Now, in our role task, we add the following before any users are created.
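A sketch of that task, assuming a hypothetical shells variable that lists the package names to install:

```yaml
- name: Install supported shells
  apt:
    name: "{{ item }}"
    state: present
  loop: "{{ shells }}"    # hypothetical variable holding the shell packages
```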

This will ensure all of the shell options we give users are properly installed on the server.

Back to roles/users/vars/main.yml, let’s set the shells of our users:

A different shell for everyone!

Then, again in our role task, we update any addition of a user to include their shell:
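A sketch of the updated user task, assuming each user entry gains a shell key that holds the full path to the shell (for example shell: /bin/zsh):

```yaml
- name: Create users with their preferred shell
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
    shell: "{{ item.shell }}"   # assumed key, e.g. /bin/zsh
    state: present
  loop: "{{ users }}"
```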

Quite simple and elegant.

Editor’s Prerogative: Since this is my blog, and you’re reading it, I’ll give you my personal editor preference. Emacs (or an editor with Emacs keybindings, like Sublime Text) for writing (prose or code), and Vim for editing configuration files. No joke.


Our production environment had a simple rule: only architects and operations are allowed to log in, and both get sudo access. Our staging environment is a bit more complicated: all users except tptesters get access to the environment, but only architects, operations, and developers get sudo access. Moreover, we want to have a single lineinfile task and use with_items in it to add the appropriate lines. Unfortunately this isn’t as easy as it sounds, as having with_items in the lineinfile task interferes with our loop tasks. So, we create a separate task specifically for our sudoers updates, and in the end have:

Again, note that we first use a block to create our users and authorized_keys updates for the staging group, only doing so for architects, operations, developers, and testers. The second task adds the appropriate lines in the sudoers file.
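The staging rules might be sketched as a block plus a separate sudoers task. This assumes users with groups and shell keys and public keys under ssh_keys/. The sudoers task here uses loop over a literal list of group names, and the NOPASSWD policy is an assumption.

```yaml
- block:
    - name: Create staging users
      user:
        name: "{{ item.name }}"
        groups: "{{ item.groups }}"
        shell: "{{ item.shell }}"
        state: present
      loop: "{{ users }}"

    - name: Add staging SSH public keys
      authorized_key:
        user: "{{ item.name }}"
        key: "{{ lookup('file', 'ssh_keys/' + item.name) }}"
      loop: "{{ users }}"
  # Everyone except tptesters gets an account on staging
  when: >-
    'staging' in group_names and
    ('architects' in item.groups or 'operations' in item.groups or
     'developers' in item.groups or 'testers' in item.groups)

# Separate task: only these three groups get sudo on staging
- name: Give staging sudo access
  lineinfile:
    path: /etc/sudoers
    state: present
    regexp: "^%{{ item }}\\s"
    line: "%{{ item }} ALL=(ALL) NOPASSWD: ALL"   # assumed policy
    validate: 'visudo -cf %s'
  loop:
    - architects
    - operations
    - developers
  when: "'staging' in group_names"
```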

Deleting Users (or Groups)

We have a way to add users; we’ll also need a way to remove them (my telecom background comes through when I say “deprovision”).

In roles/users/vars/main.yml we’ll add a new variable deletedusers which contains a list of user names that are to be removed from a server. While we’re at it, let’s add a section for groups that we want to delete as well.

We can then update our user task:

As with the users, we’ll loop over deletedusers and use the absent state to remove the user from the system. Finally, any groups that require deletion can be removed as well with state: absent on the group task.
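The removal tasks might be sketched as follows. deletedusers is the variable named above, while deletedgroups is a hypothetical name for the list of groups to delete. Users are removed first, since a group that is still some user's primary group cannot be deleted.

```yaml
- name: Remove deleted users
  user:
    name: "{{ item }}"
    state: absent
  loop: "{{ deletedusers }}"

- name: Remove deleted groups
  group:
    name: "{{ item }}"
    state: absent
  loop: "{{ deletedgroups }}"   # hypothetical variable name
```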

One last note with the user task with Ansible; we’ve only scratched the surface of its capabilities. There are a variety of parameters that can be set such as expires, home, and of particular interest, remove. Specifying remove: yes will delete the user’s home directory (and files in it), along with their mail spool. If you truly want to be sure and nuke the user from orbit, specify remove: yes in your user task for deletion.


If you go and look at the part5 branch of the GitHub repository, you’ll see that we’ve heavily refactored the main.yml file to rely on include statements. Like good code, a single playbook or Ansible task file shouldn’t be too incredibly long. In the end, our roles/users/tasks/main.yml looks like this:
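A refactored task file of this kind might take roughly the following shape. All of the included file names are hypothetical; the part5 branch of the repository has the actual layout.

```yaml
# roles/users/tasks/main.yml (shape only; included file names are hypothetical)
- include_tasks: shells.yml
- include_tasks: groups.yml
- include_tasks: production.yml
- include_tasks: staging.yml
- include_tasks: deleted.yml
```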

Hopefully this post has given you some thoughts on how to leverage Ansible for adding and deleting users on your servers. In a future post we might look at how to do the same thing, but with using LDAP and Ansible together.

This Series

Each post in this series is building upon the last. If you missed something, here are the previous posts. We’ve also put everything on this Github repository with branches that contain all of the changes from one part to the next.

To get the most out of walking through this tutorial on your own, download the repository and check out part4 to build your way through to part5.


Ansible and AWS – Part 4

So far in our series we’ve covered some fundamental Ansible basics. So fundamental, in fact, that we really haven’t shared anything that hasn’t been written or covered before. In this post I hope to change that with an example of creating an AWS RDS database (MySQL-powered) solely within an Ansible playbook.

If you’re new to this series, we’re building up an Ansible playbook one step at a time, starting with Part 1. Check out the Github repository that accompanies this series to come up to speed. The final result of this post will be on the part4 branch.


First, some prerequisites, you’ll need to:

  • generate a MySQL password
  • pick a MySQL administrator username
  • create an IAM user in AWS for RDS access

I love this page for one-liner password creation. Here’s one that works nicely on macOS:

date +%s | shasum -a 256 | base64 | head -c 32 ; echo

Whatever you generate will go in your vault file, and for this post we’ll create a new vault file for our staging group.

IAM User

We’re going to use an AWS IAM user with programmatic access to create the RDS database for us. I’m just going to name mine RDSAdministrator and directly attach the policy AmazonRDSFullAccess. Capture your access key and secret access key for use in the vault.

Reorganizing Our Variables

I mentioned in Part 2 that like code, playbooks will invariably be refactored. We’re going to refactor some bits now!

There’s a special group named all (we’ve used it before), and we’re going to use it with our RDS IAM credentials. This is worth paying particularly close attention to.

In group_vars we’ll create two directories, all and staging. staging.yml that we had before will be renamed to staging/vars.yml. all will have a special file named all.yml and both all and staging directories will contain a vault file, so in the end we have something like this:

group_vars
    |-all
    |    |-all.yml
    |    +-vault
    +-staging
         |-vars.yml
         +-vault

This is important. The use of all and all.yml is another bit of magic. Don’t rename all.yml to vars.yml, it will not work!

Let’s recap what variables we’re placing in each file and why:

Variable | File | Rationale
AWS_RDS_ACCESS_KEY | group_vars/all/all.yml | IAM access credentials to RDS will apply to all groups (staging, production, etc.)
AWS_RDS_SECRET_KEY | group_vars/all/all.yml | IAM access credentials to RDS will apply to all groups (staging, production, etc.)
AWS_RDS_SECURITY_GROUP | group_vars/all/all.yml | Our security group allowing access to created MySQL databases will apply to all Ansible groups
OPERATING_ENVIRONMENT | group_vars/staging/vars.yml | The OPERATING_ENVIRONMENT is set to staging for all servers in that group
MYSQL_ADMIN_USERNAME | group_vars/staging/vars.yml | Staging servers will utilize the same MySQL credentials
MYSQL_ADMIN_PASSWORD | group_vars/staging/vars.yml | Staging servers will utilize the same MySQL credentials
HOSTNAME | host_vars//vars.yml | Each host has its own hostname, and thus this goes into the host_vars
AWS_S3_ACCESS_KEY | host_vars//vars.yml | IAM access credentials to read S3 buckets are currently limited to a single host
AWS_S3_SECRET_KEY | host_vars//vars.yml | IAM access credentials to read S3 buckets are currently limited to a single host

This is our current organization. Over time we may decide to limit the RDS IAM credentials to a specific host, or move the S3 IAM credentials to the staging group.
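As a sketch, group_vars/all/all.yml could reference the encrypted values through the VAULT_ prefix convention from Part 3. The VAULT_ variable names and the security group id below are placeholders, not values from this series.

```yaml
# group_vars/all/all.yml (illustrative sketch)
AWS_RDS_ACCESS_KEY: "{{ VAULT_AWS_RDS_ACCESS_KEY }}"   # defined in group_vars/all/vault
AWS_RDS_SECRET_KEY: "{{ VAULT_AWS_RDS_SECRET_KEY }}"   # defined in group_vars/all/vault
AWS_RDS_SECURITY_GROUP: sg-0123456789abcdef0           # placeholder id for mysql-sg
```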


Amazon’s Relational Database Service is a powerful tool for instantiating a hosted relational database. With several different types of database engines to choose from I expect increased usage of RDS, especially by smaller shops that need to reserve their budget for focusing on applications and less on managing infrastructure items such as databases.

If you’ve never used RDS before, I highly suggest you go through the AWS Management Console and create one by hand before using an Ansible playbook to create one.

MySQL Access Security Group

We’re going to cheat here a bit, only because I haven’t had the opportunity to work with EC2 security groups with Ansible. But to continue on we’re going to need a security group that we’ll apply to our database instance to allow traffic on port 3306 from our EC2 instances.

I’ve created a group called mysql-sg and set the Inbound rule to accept traffic on port 3306 from any IP in the subnet (which covers my account’s default VPC address range).


Note: If you have abandoned the default VPC in favor of a highly organized set of VPCs for staging, production, etc. environments, you’ll want to adjust this.

Pip and Boto

The Ansible AWS tasks rely on the boto package for interfacing with the AWS APIs (which are quite extensive). We will want to utilize the latest boto package available and install it through pip rather than apt-get install. The following tasks accomplish this for us:
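A minimal sketch of such tasks, assuming an Ubuntu release where the pip package is named python-pip:

```yaml
- name: Install pip
  apt:
    name: python-pip      # assumed package name for this Ubuntu release
    state: present

- name: Install the latest boto through pip
  pip:
    name: boto
    state: latest
```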

rds Module

Finally, we’ve come to the all-important RDS module! Our previous Ansible modules have been straightforward with just a couple of parameters (think apt, hostname, etc.). The rds module requires a bit more configuration.

Here’s the task and then we’ll discuss each of the parameters.

This is the minimal number of parameters I’ve found to be required to get a functioning RDS database up and running.

The aws_access_key and aws_secret_key parameters are self-explanatory, and we provide the RDS IAM credentials from our encrypted vault.

The command parameter used here is create, i.e., we are creating a new RDS database. db_engine is set to MySQL since we’re going to use MySQL as our RDS database engine. Let’s take a look at the rest:

  • region – like many AWS services, RDS databases are located in specific regions. us-east-2 is the region we’re using in this series. For a complete list of regions available for RDS, see Amazon’s documentation.
  • instance_name – our RDS instance needs a name
  • instance_type – the RDS instance type. db.t2.micro is a small free-tier eligible type. For a complete list of instance classes (types), see Amazon’s documentation.
  • username – our MySQL administrator username
  • password – our MySQL administrator password
  • size – the size, in GB, of our database. 10GB is the smallest database size one can create (try going lower)

The next two parameters, publicly_accessible and vpc_security_groups, dictate who can connect to our RDS database instance. We don’t want our database to be publicly accessible; only instances within our VPC should be able to communicate with this database, so we set publicly_accessible to no. However, without specifying a security group for our instance, no one will be able to talk to it. Hence, we specify that our security group created above is to be applied with the vpc_security_groups parameter.

Creating an RDS database takes time. For our db.t2.micro instance, from initial creation to availability took about 10 minutes. To prevent an error such as Timeout waiting for RDS resource staging-mysql we increased our wait_timeout from 300 seconds (5 minutes) to 900 seconds (15 minutes).
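Putting the parameters discussed above together, the task might look like the following sketch. It uses the rds module as it shipped with Ansible when this series was written and takes its values from the text; wait: yes is an assumption implied by the use of wait_timeout.

```yaml
- name: Create the RDS MySQL instance
  rds:
    aws_access_key: "{{ AWS_RDS_ACCESS_KEY }}"
    aws_secret_key: "{{ AWS_RDS_SECRET_KEY }}"
    command: create
    db_engine: MySQL
    region: us-east-2
    instance_name: staging-mysql
    instance_type: db.t2.micro
    username: "{{ MYSQL_ADMIN_USERNAME }}"
    password: "{{ MYSQL_ADMIN_PASSWORD }}"
    size: 10                                   # in GB
    publicly_accessible: no
    vpc_security_groups: "{{ AWS_RDS_SECURITY_GROUP }}"
    wait: yes                                  # assumed; block until the instance is available
    wait_timeout: 900                          # 15 minutes
  register: rds                                # keep the result for later tasks
```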


You’ll notice we snuck in a new parameter to our rds task called register. register is the Ansible method of storing the result of a task in order to use it later in the playbook. The rds task provides a considerable amount of detailed output that is useful to us, not the least of which is the endpoint name of the RDS database we just created. That’s important to know! We’ll use this to create a database in our new MySQL instance with the mysql_db task.

Like the rds task, mysql_db has a few dependencies. For it to function you’ll want to add the mysql-python pip package, which in turn requires libmysqlclient-dev (not including libmysqlclient-dev will throw a mysql_config not found error). This can be accomplished with:
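A sketch of tasks that would install these dependencies, using the package names given above:

```yaml
- name: Install MySQL client packages
  apt:
    name:
      - libmysqlclient-dev   # provides mysql_config, needed to build mysql-python
      - mysql-client
    state: present

- name: Install mysql-python through pip
  pip:
    name: mysql-python
```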

Including mysql-client was not, strictly speaking, required, but it’s a handy package to have installed on a server that accesses a database.

Creating a MySQL database with Ansible

Now that we’ve created our RDS instance, let’s create a database in that instance with Ansible’s mysql_db task.

Remember that register: rds on the rds task is what allows us to use rds.instance.endpoint (for grins you might put in a debug task to see all of the information the rds task returns). We provide our MySQL username and password (this is to login to the RDS instance), and the name of the database we want to create. present is the default for the state parameter, but I like to include it anyway to be explicit in the playbook as to what I’m trying to accomplish.
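A sketch of the mysql_db task, together with the optional debug task mentioned above. The database name helloworld is a hypothetical placeholder.

```yaml
- name: Show everything the rds task returned
  debug:
    var: rds

- name: Create the application database
  mysql_db:
    login_host: "{{ rds.instance.endpoint }}"
    login_user: "{{ MYSQL_ADMIN_USERNAME }}"
    login_password: "{{ MYSQL_ADMIN_PASSWORD }}"
    name: helloworld        # hypothetical database name
    state: present
```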

There are a lot of parameters for the mysql_db module, so make sure and check out the Ansible documentation.

Configuration Information

Undoubtedly we’re going to provide our software with access to the database, and knowing its endpoint is important. Let’s save this information to our /etc/environment by adding the following line to templates/environment.j2:

Note that since we’re going to be referencing the output of the rds task in this template we will want to move writing out /etc/environment to the end of the playbook.

If desired, we can also create a .my.cnf file for the ubuntu user that provides automated access to the database. Now, it’s up to you to decide whether or not you’re comfortable with the idea of the password included in .my.cnf; if you aren’t, simply remove it. To create .my.cnf we’ll introduce one additional new module, blockinfile.

blockinfile allows you to add blocks of text to either an existing file, or if you specify create: yes as shown here, will create the file and add the provided content (given in the block parameter).
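A blockinfile task for .my.cnf might be sketched as follows. The [client] section with host, user, and password keys is an assumption based on what the mysql client reads, and the restrictive mode is a suggested precaution because the file holds a password.

```yaml
- name: Create .my.cnf for the ubuntu user
  blockinfile:
    path: /home/ubuntu/.my.cnf
    create: yes
    owner: ubuntu
    group: ubuntu
    mode: '0600'
    block: |
      [client]
      host={{ rds.instance.endpoint }}
      user={{ MYSQL_ADMIN_USERNAME }}
      password="{{ MYSQL_ADMIN_PASSWORD }}"
```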

You can see how running this task performs:

ubuntu@helloworld:~$ cat .my.cnf
password="<Our MySQL administrator password>"

ubuntu@helloworld:~$ mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 83
Server version: 5.6.39-log MySQL Community Server (GPL)


We’ve covered a lot of ground in this post! Several new modules, refactoring our variables, using Amazon’s RDS service. Whew! Here’s a quick recap of the new modules that you’ll want to study:

  • rds – you can create, modify, and delete RDS instances in your AWS account through this powerful task
  • register – the register keyword allows you to capture the output of a task and use that output in subsequent tasks in your playbook
  • mysql_db – once your RDS instance is created, add a new database to it with the mysql_db task
  • blockinfile – another great task to have in your Ansible arsenal, blockinfile can add blocks of text to existing files or create new ones

This Series

Each post in this series is building upon the last. If you missed something, here are the previous posts. We’ve also put everything on this Github repository with branches that contain all of the changes from one part to the next.


Ansible and AWS – Part 3

This is the third article in a series looking at utilizing Ansible with AWS. So far the focus has been Ansible itself, and we’ll continue that in this post and look at the Ansible Vault. The final version of the Ansible playbook is on the part3 branch of this repository.

You will frequently come across occasions where you want to submit something that needs to be kept secret to a code repository. API keys, passwords, credentials, or other sensitive information that really is a part of the configuration of a server or application and should be revision controlled, but shouldn’t be exposed to “the world”, are great candidates for things to lock up in an Ansible vault. I’ll illustrate several different applications of a vault, and then share a great technique for diffing vault files.

Vault It!

Any file can be encrypted (vaulted) with the ansible-vault command. now.txt contains:

Now is the time for all good men to come to the aid of their country.

Let’s encrypt it with ansible-vault:

# ansible-vault encrypt now.txt
New Vault password:

Type in a good password and you’ll be prompted to confirm it. If the passwords match you’ll see Encryption successful. Now, cat the file:

# cat now.txt

You’ll notice that ansible-vault did not change the name of your file or add an extension. Thus from looking at the filename you can’t tell this file is a “vault” file.

To edit the file, use ansible-vault edit now.txt. You’ll be prompted for the password (I hope you remember it!), and if provided successfully ansible-vault will open the file with the default editor (on my Mac it’s vi but you can set the EDITOR shell variable explicitly). Make any changes you like and then save them in your editor. ansible-vault will reencrypt the file for you automatically.

ansible-vault view is another handy command; if you just want to look at the contents of the file and not edit them use ansible-vault view FILENAME and supply the password.

Practical Use

The most practical use of the vault capability is to protect sensitive information such as API keys, etc. We’re going to use it for two applications: encrypt AWS IAM credentials for accessing an S3 bucket, and an SSL certificate key.

First though, let’s talk about where to put the actual variables. We could put them in vars.yml and then encrypt the entire file. This is overkill. Better is to create a file called vault (so we know it’s a vault!) and put our values there. But, we’re going to use a little indirection technique like this:


By the way, to be clear, the vault file here is located in our host_vars directory for the target host, alongside our vars.yml for that host.


Again, this is the vars.yml file we created previously for our host.

Our playbooks and template files will reference AWS_ACCESS_KEY and AWS_SECRET_KEY which in turn will get their values from the vault. In this manner you can quickly view your vars.yml and know which values you get from the vault (hence the prefix VAULT_). NB: This is purely a convention I’ve come to find to be extremely useful!
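The indirection described above might be sketched as two files. The key values are obviously placeholders, and the VAULT_ names simply follow the prefix convention described in this post.

```yaml
# host_vars (vault file for the target host, encrypted with ansible-vault)
VAULT_AWS_ACCESS_KEY: REPLACE_WITH_ACCESS_KEY
VAULT_AWS_SECRET_KEY: REPLACE_WITH_SECRET_KEY
---
# host_vars (vars.yml for the same host, left in plain text)
AWS_ACCESS_KEY: "{{ VAULT_AWS_ACCESS_KEY }}"
AWS_SECRET_KEY: "{{ VAULT_AWS_SECRET_KEY }}"
```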

The Real Thing

To access S3 buckets we’re going to use the s3cmd package, so let’s add it to our playbook apt task:

Then, in our AWS console we’ll create a new IAM user named S3Reader and give it Programmatic access. For this user we can Attach existing policies directly and use the existing policy AmazonS3ReadOnlyAccess. You should be given both an access key and a secret key. We’ll plug these both in to our vault file and then encrypt it with ansible-vault encrypt.

Even though s3cmd is installed, it needs a configuration file .s3cfg. We’re going to create one with our credentials (the ones just obtained), again through the use of the template task. Create a file in the templates directory called s3cfg.j2 (make note we aren’t putting a dot in front of the file) with the following content:

access_key={{ AWS_ACCESS_KEY }}
secret_key={{ AWS_SECRET_KEY }}

Now, add the template task that will create the .s3cfg file:
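A sketch of such a template task. Note that the template module calls the owning user owner; the mode shown is a suggested precaution since the file holds credentials.

```yaml
- name: Create .s3cfg for the ubuntu user
  template:
    src: s3cfg.j2
    dest: /home/ubuntu/.s3cfg
    owner: ubuntu
    group: ubuntu
    mode: '0600'      # assumed; keeps the credentials private to the user
```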

Note the use of the owner and group parameters for this task. The file is being created in the /home/ubuntu/ directory and since our Ansible playbook is issuing commands as the root user (the become_user directive in the playbook we’ve glossed over thus far), it would have, by default, set the owner and group of the file to root. We don’t want that, so we explicitly call out that the owner and group should be set to ubuntu.

Execute your playbook, this time including --ask-vault to ensure you get prompted for your vault password (we’ll show you in a minute how to set this in a configuration file).

# ansible-playbook playbook.yml --ask-vault

If everything was done correctly you’ll have a new file .s3cfg in your /home/ubuntu directory that has your S3 credentials (the ones you placed in the vault file). Indeed, we can now run the s3cmd ls:

ubuntu@helloworld:~$ s3cmd ls
2016-10-01 14:06  s3://elasticbeanstalk-us-east-1-124186967073
2017-10-04 00:20  s3://iachievedit-cbbucket
2015-12-21 23:47  s3://iachievedit-repos



If you run a website you’re undoubtedly familiar with SSL certificates and the need to have one (or more). You’re also familiar with the private key for your certificate (unless you started your career with Let’s Encrypt and it’s abstracted some of the “magic”). That private key is, strictly speaking, a part of the server configuration, and we want everything on our server to be revision controlled, including that private key. So let’s vault it.

I have a private key named key.pem and the contents look like what you’d expect:

# cat key.pem

We don’t want to submit this as-is to revision control, so let’s encrypt it with ansible-vault and put it in a directory called files.

# ansible-vault encrypt $HOME/key.pem
# cd ansible-helloworld
# mkdir files
# cp $HOME/key.pem files/

Note here that you will want to use the same vault password as you did for the vault file. Also, the directory files (like templates) is a special name to Ansible. Don’t call it file or my_files, etc., just files!

Our key.pem will be copied to /etc/ssl/private, so let’s create a task for that in the playbook:
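Something along these lines should do it. The task name, ownership, and mode are assumptions on my part; the copy module decrypts vault-encrypted source files by default, so no extra option is needed for that.

```yaml
- name: Install the SSL private key
  copy:
    src: key.pem                     # looked up in the files directory
    dest: /etc/ssl/private/key.pem
    owner: root                      # assumption
    group: root                      # assumption
    mode: "0600"                     # assumption: readable by root only
```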

If you run the playbook now and then check /etc/ssl/private/key.pem on your server you’ll see that Ansible decrypted the file and stored the decrypted contents; precisely what we wanted!

Your Vault Password

If you grow tired of supplying the vault password at the --ask-vault-pass prompt each time you run a playbook, or even when using ansible-vault edit, you can save the password in plain text in your home directory or other secure location. This is an example of a file you would not submit to revision control!

# export ANSIBLE_VAULT_PASSWORD_FILE=~/.ansible_vault_pass.txt
# ansible-playbook playbook.yml

The file ~/.ansible_vault_pass.txt contains the vault password in the clear. Add the environment variable ANSIBLE_VAULT_PASSWORD_FILE to your shellrc file (e.g., .zshrc, .bashrc, etc.) to make things even simpler.

Vault Diffs

I found this one weird trick just the other day and it’s a lifesaver if you are responsible for reviewing changes to API keys, passwords, etc. Courtesy of Mark Longair on Stack Overflow.

We have two “vaulted” files in our repository, a file named vault in our host_vars directory and key.pem. We’d like to easily diff them for changes, so following along with the Stack Overflow post, create a .gitattributes file with:

host_vars/*/vault diff=ansible-vault merge=binary
files/key.pem diff=ansible-vault merge=binary

Then, set the diff driver for files with that diff=ansible-vault attribute:

# git config --global diff.ansible-vault.textconv "ansible-vault view"

With this, if you make changes to your vaulted files, a git diff will automatically show you the actual diff vs. the encrypted diff (which is, obviously, unreadable).

Final Thoughts

Credentials, API keys, passwords, private keys. It’s been hammered in your head to not submit those to revision control systems, and rightfully so if they aren’t encrypted. But these pieces of information are critical to the successful build of a server or environment, and part of the charm that Ansible brings is the ability to rebuild a server (or sets of servers) and not worry about whether or not you missed a step. With the Ansible vault, however, you can encrypt this information and include it alongside the rest of your Ansible playbook; supply the vault password and watch the magic happen.

This Series

Each post in this series builds upon the last. If you missed something, here are the previous posts. We’ve also put everything on this GitHub repository with branches that contain all of the changes from one part to the next.


Ansible and AWS – Part 2

In this post we’re going to look at Ansible variables and facts, but mostly at variables (because I haven’t worked with facts much, to be honest). This is the second part of our series titled Ansible and AWS and adds to the first, so if you get lost make sure to have a look at Ansible and AWS – Part 1.

At some point you’re going to have machines that mostly look like one another except for their environment, size, web URL, etc. I’ve come across this when having servers in separate environments, say, development, staging, and production. They will have different URLs and different amounts of CPU or RAM (which can drive certain configuration values). Or, let’s say each machine backs up data to a given S3 bucket, and that backup script you wrote needs the bucket name. A perfect use case for an Ansible variable.

So let’s quickly look at the four primary locations for a variable, and then I’ll share several rules of thumb I use as to where to put one:

  • in ansible_hosts
  • in group_vars
  • in host_vars
  • in the playbook

Now I did say primary locations because there are other places; for now we’re ignoring variables and defaults that are included in roles or provided on the command line. For the canonical reference on variables, see the official documentation.


ansible_hosts

Don’t put variables here. Moving on.

I kid, somewhat. When first starting out you may see ansible_hosts files that look like this:

[all] HOSTNAME=ansible-helloworld OPERATING_ENVIRONMENT=staging

Terrible form, but let’s try it out in our playbook (see Ansible and AWS – Part 1 and the sample GitHub repository). We’re going to add another task to our playbook that creates a file on the server based upon a template. The templating language is Jinja2 (don’t worry, subsequent posts will go into Jinja and Ansible templates in much greater detail). First, create a new directory (inside your ansible-helloworld directory) called templates. This is a special name to Ansible, so don’t name it template or something else:

# cd ansible-helloworld
# mkdir templates

Inside of templates create a file called environment.j2 (.j2 is the extension used by Jinja2 templates) and populate it with the following content:

# Created by Ansible


Note! I personally prefer any file on a server that was created by Ansible to say as much right at the beginning. So many times people have logged on to a server and edited a file without realizing that it was generated and stands a good chance of being overwritten. We could do an entire article on what a good header for such a file might look like. Hell, I might just do that!

Then, in your playbook add the following task to the end:
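A sketch of that task, using the task name that shows up in the playbook output further down:

```yaml
- name: Create /etc/environment
  template:
    src: environment.j2      # found in the templates directory
    dest: /etc/environment
```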

Remember that YAML is indentation sensitive, so if you paste this into your playbook, make sure it is properly aligned with the rest of your tasks.

One last step! Locate your hostname task and change the hardcoded ansible-helloworld (that’s our hostname), to "{{ HOSTNAME }}", like this:
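Assuming the task still has the name it was given in Part 1, the result would look something like:

```yaml
- name: Set our hostname
  hostname:
    name: "{{ HOSTNAME }}"   # was the hardcoded ansible-helloworld
```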

If you’re quick on the uptake, you should already know what is going to happen when this playbook is executed. The variable HOSTNAME in the ansible_hosts file is going to be applied in the hostname task, and the OPERATING_ENVIRONMENT variable will be applied in the template task.

Go ahead and run the playbook:

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: []

TASK [Set our hostname] ********************************************************
ok: []

TASK [Install base packages] ***************************************************
ok: [] => (item=[u'htop', u'zsh'])

TASK [Create /etc/environment] *************************************************
changed: []

PLAY RECAP *********************************************************************
              : ok=4    changed=1    unreachable=0    failed=0

Because there is a new task (the template task), and our hostname didn’t change, we see changed=1.

If you look at /etc/environment on the server, you should see:

cat /etc/environment
# Created by Ansible



Let’s change our hostname in ansible_hosts to simply helloworld:


Rerunning the playbook will change the hostname to helloworld, since that is the new value of the HOSTNAME variable.

Group Variables

Notice in our ansible_hosts file there is the [all] tag? Well, we can change that to split hosts up into groups. Let’s rewrite our ansible_hosts file to look like this:

[staging] HOSTNAME=helloworld

and then create a directory (inside your ansible-helloworld directory) called group_vars. Then in group_vars create a file called staging.yml and in it put:


and run your playbook. If you’re following along verbatim nothing should happen. All we’ve done is extract variables common to staging servers (like our OPERATING_ENVIRONMENT variable) into a single file that will apply to all of the hosts in the staging group.
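For reference, a sketch of what group_vars/staging.yml would hold at this point, assuming OPERATING_ENVIRONMENT is the only variable being moved out of ansible_hosts:

```yaml
# group_vars/staging.yml: applies to every host in the [staging] group
OPERATING_ENVIRONMENT: staging
```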

Try renaming staging.yml to somethingelse.yml and rerunning. You should get an error regarding an undefined variable, since Ansible wasn’t able to find OPERATING_ENVIRONMENT. When you supplied a properly named group_vars file (staging.yml) it is able to look it up.

Finally, for our group variables, notice the syntax changed from the ansible_hosts “ini-style” syntax to a YAML file. This is important to note!

Host Variables

Now let’s take a look at host variables and how to use them. Create a directory called host_vars, again, inside the ansible-helloworld directory. In it create a directory named the same as how your host is defined in ansible_hosts. My server is currently referenced by its IP address, but since AWS provides an FQDN, I’ll switch to that to demonstrate that an entry in ansible_hosts can be an IP address, an alias in /etc/hosts, or an FQDN resolvable by DNS. I’m going to change my ansible_hosts to this:


and then create a file called vars.yml in host_vars/ec2-18-188-72-168.us-east-2.compute.amazonaws.com and place the following content:

HOSTNAME:  helloworld

Try the playbook out!

Take Note

Have you noticed that we’ve eliminated the variables from our ansible_hosts file? You may feel otherwise, but I’m a strong proponent of ansible_hosts files that consist of nothing more than groups of hostnames. You might think at first that you’ll have just a couple of variables per host, but odds are this will grow to a dozen or more quickly! Think about the various things that are configurable on an environment or host basis:

  • what monitoring environment will this host report to?
  • where are syslogs being sent?
  • what S3 bucket is used for backing up configuration files (that aren’t in Ansible)?
  • what are the credentials to that bucket?
  • does the host (or group) have custom nameservers?
  • how is application-specific configuration handled?

And so on. Trust me. Get into the good habit of creating group_vars and host_vars directories and start putting your variables there.

Variables In Playbooks (and Other Places)

You can also put variables in your playbooks, but I rarely do. If it is in the playbook it’s really less of a variable and more of a strict setting, since it will override anything from your ansible_hosts file, group_vars or host_vars. If you’re set on it, try this out. In your playbook, add vars: like this:

That is, nestle it in between the become_method and tasks. Run it and you’ll see that both your hostname and /etc/environment file change.
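A sketch of where vars: sits in the play. The two values are hypothetical, chosen only so the override is easy to spot, and the play-level settings are assumed to match the playbook from Part 1.

```yaml
- hosts: all
  remote_user: ubuntu
  become: true
  become_user: root
  become_method: sudo
  vars:
    HOSTNAME: playbook-hostname            # hypothetical value
    OPERATING_ENVIRONMENT: playbook-env    # hypothetical value
  tasks:
    - name: Set our hostname
      hostname:
        name: "{{ HOSTNAME }}"
```

Because play-level vars take precedence over inventory, group_vars, and host_vars, these values win no matter what the other files say.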


Ansible also provides the ability to reference facts that it has gathered. To be sure, I don’t use this feature as often, but sometimes I’ll need the host’s IP address (perhaps to put it in a configuration file), or how many CPUs it has. Seriously, check out the documentation of all of the awesome facts you can reference in your playbook. Here are some that you may find yourself using, especially if your playbooks require subtle tweaks to support different distributions, amount of memory, CPU, etc.:

  • ansible_distribution
  • ansible_distribution_release
  • ansible_distribution_version
  • ansible_eth0 (and properties of it)
  • ansible_memtotal_mb
  • ansible_processor_cores
  • ansible_processor_count

You can use these in your playbook in the same manner as variables:
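For example, a sketch of a task that reports the ansible_memtotal_mb fact, with the task name and message matched to the output shown below:

```yaml
- name: Total Server RAM (MB)
  debug:
    msg: "Total Server RAM:  {{ ansible_memtotal_mb }} MB"
```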

This is a debug task that will just print out a message:

TASK [Total Server RAM (MB)] ***************************************************
ok: [ec2-18-188-72-168.us-east-2.compute.amazonaws.com] => {
    "msg": "Total Server RAM:  990 MB"
}

Let’s make use of facts in our /etc/environment template:


# Created by Ansible ({{ ansible_date_time.date }} {{ ansible_date_time.time }} {{ ansible_date_time.tz }})


Notice that we’re using the ansible_date_time fact to record in the generated /etc/environment file when it was created. After running:

ubuntu@helloworld:~$ cat /etc/environment
# Created by Ansible (2018-05-11 11:56:48 UTC)


Final Remarks

This post might seem at first glance to be a little less “meaty” than Part 1, but variables and facts will comprise a large part of your future playbooks (we didn’t even talk about how they are used in roles). As promised here are some guidelines I use when deciding where to put variable definitions:

If the hosts for a given Ansible playbook can be organized into logical groups (such as staging and production), and there is a set of variables that will be common to all of the staging servers, and likewise common to all of the production servers, these variables are a good candidate to put into group_vars. Examples here might be:

  • the endpoints for log shipping
  • the IP addresses for name servers
  • the IP of a monitoring server
  • AWS region information (for example, if staging is in one region and production is in another)

Or, let’s say you run some type of monitor traps that send e-mails on certain events. You might want to send staging alerts to staging-alerts@iachieved.it vs. production-alerts@iachieved.it. Ansible group variables might come in handy here.
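As a sketch of that alerting example, each group file could define the same variable with a different value. ALERT_EMAIL is a hypothetical name; use whatever your monitoring templates expect.

```yaml
# group_vars/staging.yml
ALERT_EMAIL: staging-alerts@iachieved.it

# group_vars/production.yml would define the same key with the other address:
#   ALERT_EMAIL: production-alerts@iachieved.it
```

A template can then reference "{{ ALERT_EMAIL }}" once and pick up the right address for whichever group the host belongs to.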

If the variable is specific to a host, then obviously you’d put the information in host_vars. I prefer to explicitly set server hostnames, and so HOSTNAME goes into the vars.yml for the host. Application-specific information for that specific host goes into host_vars as well.

I’m hard pressed to think of when I’d want to explicitly put a variable in either ansible_hosts or the playbook itself; in ansible_hosts it just clutters up the file and in the playbook it’s effectively a constant.

Now, make no mistake: you will, over time, refactor your playbooks and the locations and groupings of variables. If you’re a perfectionist and lose sleep over whether you’ve spent enough time in discussions over how to architect your playbooks, well, I feel sad for you. It’ll never be right the first time, so make the best decision you can and sleep well knowing it can be refactored as your environment or needs change.

Getting the Code

You can find the finished playbook for this article on the part2 branch of this GitHub repository.


Ansible and AWS – Part 1

It’s been some time since I’ve posted to this blog, which is a shame, because I do indeed enjoy writing as well as sharing what I’ve learned with the hope it helps someone else. The truth is, however, that writing quality articles takes a lot of time. I also suffer from that thought that surely someone else out there has written on a given topic and done a much better job than I could do. And the reality is, someone probably has, but that shouldn’t stop me from doing something I enjoy.

So with that, here’s the first of what I hope to be several more articles on how to harness the power of Ansible with AWS. I firmly believe in learning by doing and I also believe that at first, you should take few shortcuts. What I mean by that is this: are you the type of person that’s saddened by the fact that some schools allow calculators in grade school? Or, even better, not bothering to teach someone how to use a library? And by that I mean how to get off of your ass, go to the library, navigate the stacks, and find a wealth of information you just might not find on Google? To be fair, I’m not really the Get off my lawn! type, but it does irk me somewhat when I see people automating things they don’t have a fundamental understanding of in the first place.

With that, my approach will be to start off with doing some things manually. Then I’ll illustrate how to take the manual steps and automate them. Sometimes it isn’t worth automating something; usually it’s something you do infrequently enough and the energy required to automate it outweighs the benefit. I usually use the DevOps equivalent of the three times rule. If I’ve done something by hand three times in a row, it’s a good candidate for automation, but not necessarily before then.


Ansible is just so wonderful. If you’ve never heard of it, it falls into the general class of automation tools that allow you to codify a set of instructions for doing or building something. I tried to make that as abstract as possible, because the something could be:

  • installing a web server application
  • updating a configuration file
  • adding a new user to a set of servers
  • applying a new software patch
  • building a virtual machine

and so on. It’s almost as if these sets of instructions are like playbooks or recipes or cookbooks. So much so that many of the popular tools call their files variations of these phrases. Ansible uses tasks, roles, and playbooks.

Installing Ansible

To use a tool you’ll have to install it. I develop on a Mac and can tell you that brew install ansible works great (if you don’t have Brew installed on your Mac you should be ashamed of yourself). As of this writing I’m using Ansible version 2.5.2, as that is what brew installed.

Using Ansible

Ansible is easy to use, though at first it may not seem like it. I know you want to cut to the chase and just figure out how to automate that thing and you want it now. Why do I have to do all this stuff just to do this thing I think should be easy! I can hear you say it because I say it all the time. That initial learning curve is always a pain in the ass.

Here are some simple hints to get started as quickly as possible. First, create a folder called ansible-helloworld and use it as home base. Then, you’re going to create two files: ansible.cfg and ansible_hosts.




Now, for ansible_hosts, this is where you put the names of your servers that you’re going to be applying your tasks to. We don’t have any hosts yet, so we need to create one. You can, of course, apply Ansible tasks to your own Mac, but we won’t do that. You could also install VirtualBox and create a virtual machine to play with, but we’ll skip that too and go straight to AWS.

Amazon Web Services

Like Ansible, AWS is awesome. I have on my bookshelf a copy of The Cloud at Your Service, where I was first introduced to AWS. This is awesome, I thought. Rather than hosting a webserver from this old Sony Vaio (whatever happened to that thing anyway?), now I can create a VM in the cloud. Forgive those that are middle-aged and still look back at where computing was, and where it is today, and think wow.

Since AWS was my first love, I’ve tended to stick with it, and to be honest, for good reason. The product has become cheaper over the years even while adding more services such as S3, Route 53, VPCs (there was a time VPCs didn’t exist), and now things like RDS and much more. Yes, there are other cloud computing platforms out there (Azure, Google, Linode, Digital Ocean, Scaleway, and many more), and given the need I could go and probably master them all, but for now I’m quite content with AWS. If you find yourself using another platform, Ansible will still work, you just may need to tweak some of the techniques and steps presented here.

Finally, this isn’t an AWS tutorial by any means, though we will get into some handy Ansible modules that take some of the drudgery out of creating instances and databases (in which case if you’re on another platform you will definitely need to find some other module that does the equivalent). So if you don’t know how to create an EC2 instance and set some basic security groups, you might start here.

One last note before we get back to Ansible: I’ll be using Ubuntu Server 16.04, also known as Xenial Xerus. You may prefer to work with CentOS, Red Hat Enterprise, or plain Debian. While Ansible is generally distribution agnostic, some things may not work exactly the same.

Back to Ansible

Alright, I hope you created an EC2 instance in AWS. I created one in the Ohio region (us-east-2), and as promised used Ubuntu 16.04 LTS (ami-916f59f4). For basic tutorials go ahead and use the t2.micro instance, it’s a lot cheaper. You should also have ensured that you have access to this instance, that is, your public SSH key was installed on this new instance and your security groups permit you to access it (protecting your EC2 infrastructure with security groups is another great topic that I hope to get to one day).

My instance was given a public IP address and my key was installed, so connecting over SSH as the ubuntu user logged me right in:

ssh ubuntu@
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-1052-aws x86_64)

Hot damn.

One of the unfortunate things about the default AMI for Ubuntu 16.04 is that it doesn’t include python on it. Now granted, Python has only been around since 1991, but still, you’d think it was worth including as a default package in an Ubuntu instance. This is important because Ansible requires python to be on the target instance (we’re talking specifically about Linux instances here). So, until we show you how to create your own AMI, we need to install Python on this VM. Easy enough:

ubuntu@ip-172-31-22-141:~$ sudo apt-get install python
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python-minimal
  python2.7 python2.7-minimal
Suggested packages:
  python-doc python-tk python2.7-doc binutils binfmt-support
The following NEW packages will be installed:
  libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python
  python-minimal python2.7 python2.7-minimal
0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.
Need to get 3,877 kB of archives.
After this operation, 16.6 MB of additional disk space will be used.
Do you want to continue? [Y/n]

Of course we entered Y.

Now for the fun part, some Ansible!

Your ansible_hosts File

In your ansible_hosts file just put this:


Now, that’s the public IP address of my instance, and you aren’t going to use that, you’ll use your instance’s IP. The point is your ansible_hosts file needs either an IP or FQDN of the machine(s) that you’re going to be working with. It’s that simple really.

Your First Playbook

You’re ready for your first playbook. Call it whatever you like; we’ll just use playbook.yml for now. And here we go:
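A minimal sketch of such a playbook, pieced together from the directives and the single hostname task discussed below. The exact become settings are assumptions.

```yaml
---
- hosts: all
  remote_user: ubuntu      # the default user on Ubuntu AMIs
  become: true             # assumption: escalate privileges for the tasks
  become_user: root
  become_method: sudo
  tasks:
    - name: Set our hostname
      hostname:
        name: ansible-helloworld
```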


That’s it! See you in Part 2!

Ansible Modules

Just kidding. We’ll add a bit more to our first playbook, but first, let’s take a look at the syntax of the file. The extension .yml is on purpose as Ansible roles, tasks, variable configurations, playbooks, etc. are all written using YAML, a “human friendly data serialization standard for all programming languages.” YAML’s syntax is indentation-driven (like Python), so you need to make sure everything is properly indented or you’ll get incomprehensible errors. For a quick check you can install something like yamllint (available on the Mac with brew install yamllint).

For a brief moment we’ll ignore the hosts and remote_user tags and focus instead on the tasks section. We have one task listed here: hostname. Ansible provides great reference documentation to look up all of the things you can do with this module. This is a simple one as it only takes one parameter, name.

Now, don’t confuse the name parameter of the hostname module with the name parameter of the task. This used to trip me up. It’s easier to see task entries when there is more than one, so let’s do that. I’m going to be very specific on how this is written (and then tell you not to do it):
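Here is a sketch of that deliberately awkward version, reconstructed from the description that follows, so treat the exact layout as an assumption:

```yaml
tasks:
  - name: Set our hostname
    hostname:
      name: ansible-helloworld
  - apt:
      name: htop            # name parameter of the apt module (the package)
      state: present
      update_cache: true
    name: Install htop      # name of the task itself, awkwardly placed last
```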

Notice how the apt module (which also has great reference documentation) is the second list entry under tasks; we know this because of the dash. But then there is some indentation (two spaces) and three key-value pairs (name, state, and update_cache). Then the indentation is backed out and we have name. The second name here (which has the value Install htop) is associated with the task at hand (in our case, apt).

If you’re confused, don’t worry. YAML (and any serialization syntax) can be baffling and bewildering. But if you follow some basic conventions and keep playbooks simple (which will involve breaking things down over time into roles and other tasks), it’ll be easy to read.

Now, what you aren’t supposed to do is put the name parameter for the task at the bottom. Clean that up!
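A sketch of the tidied version, with the task name leading each entry. This is the same content as before, only reordered:

```yaml
tasks:
  - name: Set our hostname
    hostname:
      name: ansible-helloworld
  - name: Install htop
    apt:
      name: htop
      state: present
      update_cache: true
```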

Much better. Now, let’s run it!

ansible-playbook playbook.yml

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: []

TASK [Set our hostname] ********************************************************
changed: []

TASK [Install htop] ************************************************************
changed: []

PLAY RECAP *********************************************************************
              : ok=3    changed=2    unreachable=0    failed=0


In general, an operation is idempotent if it produces the same result even after being executed multiple times. This is an important property to strive for in Ansible playbooks, though my experience is that it isn’t always achievable. It may appear on the surface that our above playbook is idempotent though it isn’t. For argument’s sake let’s look at what we would desire from an idempotent playbook:

  • if the hostname isn’t set to ansible-helloworld, set it to ansible-helloworld
  • if htop isn’t installed, install it, otherwise do nothing

On the first pass of the playbook we see changed=2, indicating that two tasks changed something on the server. Indeed, we set the hostname and then installed htop. Let’s run it again:

PLAY RECAP *********************************************************************
              : ok=3    changed=0    unreachable=0    failed=0

Now, nothing has changed. Run it again!

PLAY RECAP *********************************************************************
              : ok=3    changed=0    unreachable=0    failed=0

Success! An idempotent playbook.

Until update_cache: true triggers an upgrade of the htop application from the underlying repository. Strictly speaking this one line prevents the playbook from being idempotent, and if it were critical that the server didn’t receive any apt-get updates, removing update_cache can help but would likely be insufficient.

Final Thoughts

We didn’t cover the remote_user, become, etc. directives in our playbook, but that’s okay. If you remove them the playbook won’t work at all. Let’s add one last interesting twist to the playbook, and that’s installing more than htop to the server. I happen to like Z Shell (and even more, Oh My Zsh) installed on my servers. So let’s install it.

We could create a separate apt task (separate from the one installing htop) to install zsh, but that seems a bit silly. Let’s use the with_items capability like this:
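A sketch of the apt task rewritten with with_items, keeping the task name that appears in the output below. Note that more recent Ansible releases prefer passing the list straight to name, so take this as the Ansible 2.5-era form:

```yaml
- name: Install htop
  apt:
    name: "{{ item }}"     # replaced with each entry of with_items in turn
    state: present
    update_cache: true
  with_items:
    - htop
    - zsh
```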

Confusing syntax alert! Believe it or not, it took me some time before this syntax felt natural, because it’s sort of like writing for loops in Apache Ant. It’s clunky, mainly because you’re taking a markup-type language and trying to create logical constructs in it. It just feels weird at first.

The first thing that’s weird is the "{{ item }}" syntax. What’s with the quotes? There weren’t quotes before. Why those braces? Two braces? Not one brace? What is item? Who sets that? Is it magic?

Try removing the quotes. Go ahead. You’ll be sorry you did when you’re greeted with We could be wrong, but this one looks like it might be an issue with missing quotes. Always quote template expression brackets when they start a value. Assholes. So the quotes need to stay there.

The braces mark off template interpolation. Whatever the variable item is set to will be used. The with_items parameter supplies, in turn, each of the items, substituting the items (no pun intended) in the list into the item variable. So in this way you can see how the playbook will run now:

# ansible-playbook playbook.yml

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: []

TASK [Set our hostname] ********************************************************
ok: []

TASK [Install htop] ************************************************************
changed: [] => (item=[u'htop', u'zsh'])

PLAY RECAP *********************************************************************
              : ok=3    changed=1    unreachable=0    failed=0

That’s it for Part 1 of this series! If you’ve never worked with Ansible before, I hope this was helpful in getting you started; yes, there are other tutorials out there on Ansible but I wanted to lay the foundation for a series of articles that will walk you through how to go from a simple playbook that installs a couple of packages to roles that can install, configure, and back up MySQL databases and much, much more.

If you have any suggestions for things you’d like to see in an article, let me know by posting a comment or on Twitter @iachievedit.



SSH Hardening

DevOps ToolChain, WikiPedia, CC BY-SA 4.0

Why Hardening

Hardening, as I define it, is the process of applying best practices to make it harder for others to obtain unauthorized access to a given service or to the data transmitted by the service. In this post we’ll take a look at hardening SSH access to our server, as well as making it more difficult for others to potentially snoop our SSH traffic.

We’ll be using a fresh AWS EC2 instance running Ubuntu 16.04 for our examples. If you’re running a virtual server in Azure, Digital Ocean, or some other hosting provider, you’ll want to check out how the equivalent of AWS security groups are configured. And of course, these techniques can also be applied to non-virtual systems.

AWS Security Groups

The first step in hardening your SSH server is applying a more restrictive security group to your instance. Think of AWS security groups as custom firewalls you can apply to your instance. Even better, these custom firewalls can apply source-based filtering rules that only allow traffic from subnets or hosts you specify.

Subnet-based filtering allows rules like “Only allow SSH traffic from my development team’s network.” If your internal network is segregated and configured such that developers must authenticate to receive an IP address, an additional safeguard is added.

A host-based rule will only allow traffic from a given host (strictly speaking, a given IP address; if a host is behind a NAT then any hosts also behind that NAT will be allowed). This is what we’ll use.


In the above example, we’ve created a security group private-ssh-sg and added a single Inbound rule that allows traffic on port 22 from a specific IP address. This will effectively only allow packets whose source IP is specified in that rule to reach port 22 of the instance.
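If you would rather keep the rule in a playbook than click through the console, a sketch using Ansible’s ec2_group module might look like the following (newer Ansible releases ship the equivalent module in the amazon.aws collection). The region, the description, and the source address are assumptions; 203.0.113.10 is a documentation-only placeholder, so substitute your own IP.

```yaml
- hosts: localhost
  connection: local          # talks to the AWS API from your own machine
  gather_facts: false
  tasks:
    - name: Allow SSH from a single trusted address
      ec2_group:
        name: private-ssh-sg
        description: SSH from one trusted IP only
        region: us-east-2                  # assumption
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 203.0.113.10/32       # placeholder address
```

The /32 suffix restricts the rule to exactly one source address, which is the host-based filtering described above.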

SSH Cipher Strength

Another technique you can use to harden your SSH server is ensuring that the latest strong key exchange protocols, ciphers, and message authentication code (MAC) algorithms are utilized.

We’ve used several references as a guide to hardening SSH, including Mozilla’s OpenSSH Guidelines as well as ssh-audit, a nice tool designed specifically for this task. Using ssh-audit is as easy as

$ git clone https://github.com/arthepsy/ssh-audit
$ cd ssh-audit
$ ./ssh-audit.py your.ip.address.here


Our first pass uncovers a number of issues:

  • use of weak elliptic curves in the key exchange algorithms supported
  • use of weak elliptic curves in the host-key algorithms supported

ssh-audit goes on to recommend the key exchange, host key, and MAC algorithms to remove.

Let’s look at changes we want to make to our /etc/ssh/sshd_config file:
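As a rough sketch of the kind of changes involved, the function below prints a candidate set of directives. The algorithm lists are an assumption, modeled on Mozilla’s modern OpenSSH guidelines with the NIST curves dropped; they are not necessarily the exact set used for this instance, and hardened_sshd_options is a hypothetical name.

```shell
#!/bin/sh
# Print a candidate set of hardening directives for sshd_config.
# The algorithm lists are an assumption modeled on Mozilla's
# "modern" OpenSSH guidelines minus the NIST curves.
hardened_sshd_options() {
    cat <<'EOF'
# Offer only the Ed25519 and RSA host keys (no ECDSA or DSA)
HostKey /etc/ssh/ssh_host_ed25519_key
HostKey /etc/ssh/ssh_host_rsa_key

KexAlgorithms curve25519-sha256@libssh.org,diffie-hellman-group-exchange-sha256
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,aes128-gcm@openssh.com,aes256-ctr,aes192-ctr,aes128-ctr
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com,umac-128-etm@openssh.com
EOF
}
```

One way to use it is to write the output to a scratch file, merge the lines into /etc/ssh/sshd_config by hand (removing any existing HostKey lines for ECDSA or DSA keys), and run sudo sshd -t to check the syntax before restarting the daemon.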

Restart your SSH daemon with sudo systemctl restart ssh. Note: It’s a good idea to keep a second terminal logged in, in case you bork your SSH configuration and lock yourself out of your instance.

Once we’ve updated our sshd_config configuration, it’s time to run an audit against it.


Nice! Strong key exchange, encryption, and MAC algorithms all around.

If you’re looking for a simple SSH daemon configuration, look no further:

Two-Factor Authentication

There are different interpretations as to what constitutes two-factor or multi-factor authentication. Many believe that two-factor authentication implies an additional authentication code delivered via text message or provided by a key fob. Others may consider the steps taken to obtain access to a given computing resource as a part of the authentication steps (e.g., to obtain access to a given server you must get past the security guard, provide a retinal scan, and so on). In this example, we’re going to use the former interpretation.


We’ve chosen to use Authy in this example for two-factor authentication using a time-based one time password. To get started, install the Authy application on your phone (iOS or Android) and follow the quick-start prompts.

After you’ve successfully set up the application on your phone, you can download the app to your desktop or add it to Google Chrome.

Getting Your EC2 Instance Ready

Using the authentication code provided by Authy as an additional authentication step for SSH logins requires installing the libpam-google-authenticator module and configuring both SSH and PAM.

Install the module with sudo apt-get install libpam-google-authenticator.

Now, as a user that needs to use two-factor authentication, run google-authenticator to get set up. The application will generate several prompts, the first of which is “Do you want authentication tokens to be time-based”, to which you’ll answer “yes”.

google-authenticator will then generate a QR code that you can scan with the Authy phone application, as well as a secret key that can be used with the phone, desktop, or browser application. When using the desktop application I prefer to just copy and paste the secret key.

There are additional prompts from the application to follow:

Do you want me to update your "/home/ubuntu/.google_authenticator" file (y/n) y

Do you want to disallow multiple uses of the same authentication
token? This restricts you to one login about every 30s, but it increases
your chances to notice or even prevent man-in-the-middle attacks (y/n) y

By default, tokens are good for 30 seconds and in order to compensate for
possible time-skew between the client and the server, we allow an extra
token before and after the current time. If you experience problems with poor
time synchronization, you can increase the window from its default
size of 1:30min to about 4min. Do you want to do so (y/n) n

If the computer that you are logging into isn't hardened against brute-force
login attempts, you can enable rate-limiting for the authentication module.
By default, this limits attackers to no more than 3 login attempts every 30s.
Do you want to enable rate-limiting (y/n) y

NB: It is a good idea to save your emergency scratch codes in the event you lose access to the devices that are generating your OTPs.

Now that you’ve configured the authenticator, it’s time to update sshd_config to consult the PAM Google Authenticator module when a user attempts to log in.

Open /etc/ssh/sshd_config as root, and set the following:

ChallengeResponseAuthentication yes
PasswordAuthentication no
AuthenticationMethods publickey,keyboard-interactive
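If you would rather script these three settings than edit the file by hand, a helper along these lines could do it. It is only a sketch: set_sshd_option is a hypothetical name, and it assumes each directive sits on its own line in the usual "Key value" form, possibly commented out.

```shell
#!/bin/sh
# Set a directive in an sshd_config-style file. An existing line
# for the key (commented out or not) is rewritten in place;
# otherwise the directive is appended. A .bak copy is kept.
set_sshd_option() {
    file="$1"
    key="$2"
    value="$3"
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        sed -i.bak -E \
            "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|" \
            "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}
```

Run as root, the three calls would be set_sshd_option /etc/ssh/sshd_config ChallengeResponseAuthentication yes, then the same for PasswordAuthentication no and AuthenticationMethods publickey,keyboard-interactive. Running sshd -t afterwards checks the file for syntax errors before you restart.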

Restart ssh with sudo systemctl restart ssh.

Now, in /etc/pam.d/sshd replace the line @include common-auth with auth required pam_google_authenticator.so.
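The same replacement can be sketched with sed. The function name is hypothetical, and the file path is passed in so that you can try it on a copy before touching the real /etc/pam.d/sshd; it assumes the include line reads exactly @include common-auth, as it does in the stock Ubuntu file.

```shell
#!/bin/sh
# Swap the common-auth include for the Google Authenticator module
# in a PAM service file. A .bak copy of the original is kept.
enable_google_authenticator() {
    pam_file="$1"
    sed -i.bak \
        's/^@include common-auth$/auth required pam_google_authenticator.so/' \
        "$pam_file"
}
```

For example, cp /etc/pam.d/sshd /tmp/sshd.pam followed by enable_google_authenticator /tmp/sshd.pam lets you inspect the result before applying it to the real file as root.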

Once the sshd PAM module has been configured in this manner users will be challenged for a two-factor authentication code, so it’s important that every user on the system be configured with the google-authenticator application.

Test your login!

Try logging in via ssh with the user you’ve just configured for two-factor authentication. You should receive a prompt requesting a verification code (after your key is authenticated).


Enter the code displayed on your Authy app (note that all of your Authy apps will display the same code for the same application configuration) to log in.


One Last Thought on Authy

While writing this post I was doing additional research on two-factor authentication implementations; while Authy supports time-based one time passwords, it supports additional methods that require access to their infrastructure. If you don’t care for providing your cellphone number (and a lot of people don’t), try out Authenticator, a Chrome plugin that doesn’t require any account setup.

You’ll notice if you try out different applications that you can use the same secret key and each application will generate the same code at the same time, hence the time-based one time password.
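The reason different applications agree is that a time-based code depends on only two inputs: the shared secret and a counter derived from the clock. A small sketch of the counter arithmetic, assuming the standard 30-second step from RFC 6238 (the function names are made up for illustration):

```shell
#!/bin/sh
# TOTP (RFC 6238) derives each code from the shared secret and a
# counter: the number of 30-second steps since the Unix epoch.
totp_counter() {
    epoch="$1"
    step="${2:-30}"
    echo $(( epoch / step ))
}

# Seconds left before the current code rolls over to the next one.
totp_seconds_left() {
    epoch="$1"
    step="${2:-30}"
    echo $(( step - epoch % step ))
}
```

Running totp_counter "$(date +%s)" on two machines with synchronized clocks gives the same counter, so any application holding the same secret computes the same code. If you happen to have oathtool installed (an assumption, it is not part of this setup), oathtool --totp -b with your secret key should print the code your apps are showing.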


There is no such thing as perfect security save turning off the computer, disconnecting all of its cables, putting it in a trunk, filling that with cement, and tossing it into the Pacific. Even that might not be perfect. In the absence of perfection we can put as many barriers as possible between our server and those who shouldn’t have access to it. In this post we’ve taken a look at several of those barriers:

  • source-based firewall rules that only allow access on port 22 from a specific IP or subnet
  • hardened key exchange algorithms, ciphers, and MACs for SSH
  • two-factor authentication that requires both public key authentication as well as an OTP code

We did not cover additional techniques such as configuring SSH to listen on a different port; I’ve found that despite explaining that this is done primarily to minimize port-scanning chatter (who wants to sift through auth or fail2ban logs with script kiddie traffic) it never fails to incite the crowd of folks who just learned the phrase security through obscurity to gather up their pitchforks.

If you have any additional recommendations regarding SSH security hardening, please leave a comment!


An ARM Build Farm with Jenkins

There was a time when setting up a Continuous Integration server took a lot of work. I personally have spent several days wrestling with getting CruiseControl installed, configured, and working just right. Today it is much more straightforward and, for the most part, a simple apt-get install jenkins is all it takes to get a functional Jenkins CI server up and running.

In this tutorial we’re going to look at using Jenkins to set up a build farm of ARM systems. My personal interest in doing so is to support the Swift on ARM group of folks working to get Swift 3.0 support for ARMv7 devices such as the Raspberry Pi 2, BeagleBone Black, etc. While support for cross-compiling Swift is maturing there is still a desire to natively compile on an ARM system.

The Jenkins Build Server

To make things simple we’re going to focus on installing Jenkins on an Ubuntu 14.04 server. Using the instructions on the Jenkins Wiki for an Ubuntu 14.04 system:

The Jenkins daemon will start automatically upon installation. Once it does, open a browser and go to http://YOUR_HOST:8080/, where YOUR_HOST is the server you just installed Jenkins on.



The initial password can be found on the Jenkins server with sudo cat /var/lib/jenkins/secrets/initialAdminPassword, and will be something like “54d5f68be4554b5c8316689728721b37”. Paste it in and click Continue.

You’ll be prompted to choose whether or not to install suggested plugins or specifically select plugins. Go with Install suggested plugins for now. The next screen will be all of the plugins being loaded in. Once complete you’ll move on to creating the first admin user:

Create an Admin User

After entering the admin user information, you’ll see that Jenkins is ready!



Click that Start using Jenkins button and (wait for it), start using Jenkins.

Build Agents

We’re interested in using Jenkins to perform native builds on ARM systems. To that end we’ll add slave nodes to our build server.

First, let’s make sure we have an ARM system to compile on. I’ve become a big fan of using Scaleway for spinning up ARM systems. If you have your own ARM device such as a BeagleBoard or Raspberry Pi, those will work as well; just be aware that compiling on single-core ARM devices with limited RAM can be, well, painful. With Scaleway you can spin up quad-core ARM systems with 2 GB of RAM for about $3.50 a month.

I’m going to assume you have an ARM system (either from Scaleway or a physical device that your Jenkins host can communicate with). Let’s look at what it takes to get things configured. There’s a bit of a dance to do to ensure that the master and slave can communicate with each other. In short we need to:

  • Install Java on the ARM system so the slave agent can run
  • Create a build user on the ARM system
  • Create a public/private key pair on the build server
  • Add the build server’s public key to the ARM build user’s authorized_keys file

1. Install Java and create a build user

Your ARM device is going to need Java to run the Jenkins slave agent, so on the ARM system run sudo apt-get install openjdk-7-jre-headless.

Let’s also set up the build user while we’re here with sudo adduser build. Follow the prompts to create the user appropriately.

2. Create a public/private key pair on the build server

On the build server, su to the jenkins user and cd to its home directory. Run ssh-keygen to create the key pair.

3. Add the public key to the authorized_keys file

Now, add the ~/.ssh/id_rsa.pub contents of the Jenkins server jenkins user to the /home/build/.ssh/authorized_keys file of the ARM system build user.
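Steps 2 and 3 can be sketched as two small shell functions. This is an illustration under assumptions: the function names are hypothetical, the key is created with an empty passphrase so Jenkins can use it unattended, and 4096-bit RSA is just one reasonable choice.

```shell
#!/bin/sh
# Step 2 (on the build server, as the jenkins user): create a key
# pair without a passphrase.
make_jenkins_keypair() {
    key_file="${1:-$HOME/.ssh/id_rsa}"
    mkdir -p "$(dirname "$key_file")" &&
    ssh-keygen -q -t rsa -b 4096 -N '' -f "$key_file"
}

# Step 3 (on the ARM system, as the build user): append a public
# key to authorized_keys unless it is already there, and set the
# permissions sshd expects.
authorize_key() {
    pub_key_file="$1"
    ssh_dir="${2:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" &&
    touch "$ssh_dir/authorized_keys" &&
    chmod 600 "$ssh_dir/authorized_keys" || return 1
    key="$(cat "$pub_key_file")"
    if ! grep -qxF -e "$key" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$key" >> "$ssh_dir/authorized_keys"
    fi
}
```

After copying id_rsa.pub over to the ARM system, the build user would run something like authorize_key /tmp/id_rsa.pub. If password logins are enabled for the build user, ssh-copy-id build@YOUR_ARM_HOST from the jenkins account does the same job in one step.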

Adding the Build Slave Node

Now, let’s add the ARM device as a build slave. Under Manage Jenkins go to the Manage Nodes link.

Manage Nodes

Click on New Node and we’ll give it a name like arm1. Select the Permanent Agent radio button and then click OK.

You’ll move on to a screen that requires some more information (typical!). We’re interested in:

  • Remote root directory
  • Labels
  • Launch method

For Remote root directory we’ll use the home directory of our new build user, so /home/build. For the label, type in arm. It will become apparent as to why in a bit.

For Launch method choose Launch slave agents on Unix machines via SSH. In the host field enter your ARM system’s hostname or IP address that is accessible from your Jenkins build server.

Launch method

Now, click on the Add button (the one with the key) to go to the Add Credentials configuration page. For the Kind choose SSH Username with private key. For the Username we’ll use build. For the Private Key select From the Jenkins master ~/.ssh/. Recall earlier when we had you create a public/private key pair for the jenkins user on the master? This is why. When trying to contact the slave agent the Jenkins system will load the private key from the ~/.ssh/ directory of the jenkins user. The slave agent will be ready with the public key in the authorized_keys file.

Adding Credentials

Click Add to add the credentials, and then Save to save the new node configuration.

You should be taken back to the Node Management page where you’ll see your new node (probably in an offline state):

arm1 Offline

You can either click Refresh status or go to the node’s logs by clicking on the node name arm1 and then selecting Log. If everything was configured properly you’ll see:

[04/27/16 17:01:42] [SSH] Starting slave process: cd "/home/build" && java  -jar slave.jar
<===[JENKINS REMOTING CAPACITY]===>channel started
Slave.jar version: 2.56
This is a Unix agent
Evacuated stdout
Agent successfully connected and online

Now let’s create a new job and use our ARM agent to build opencv.

Building OpenCV

We know we’re going to be using git to check out the OpenCV source tree from GitHub, and there are a number of other dependencies required for building, so we need to install all of them on our ARM system:

sudo apt-get install build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev

OpenCV is traditionally built with the following commands:

These will be our basic build steps, except that we’ll use make -j4 to take advantage of our 4-core ARM build agent.
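A rough sketch of such build steps as a shell function follows. It assumes the conventional CMake out-of-tree flow with a build directory inside the checkout; the DRY_RUN switch and the function name are hypothetical additions for illustration, not part of OpenCV or Jenkins.

```shell
#!/bin/sh
# Sketch of an "Execute shell" build step for the OpenCV job.
build_opencv() {
    src_dir="$1"          # path to the OpenCV checkout
    jobs="${2:-4}"        # parallel make jobs; 4 suits a quad-core agent

    # DRY_RUN is a hypothetical switch for this sketch: print the
    # commands instead of running them.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "mkdir -p $src_dir/build"
        echo "cd $src_dir/build"
        echo "cmake -D CMAKE_BUILD_TYPE=Release .."
        echo "make -j$jobs"
        return 0
    fi

    # Run in a subshell so the caller's working directory is kept.
    (
        mkdir -p "$src_dir/build" &&
        cd "$src_dir/build" &&
        cmake -D CMAKE_BUILD_TYPE=Release .. &&
        make -j"$jobs"
    )
}
```

In a Jenkins Execute shell step this could be invoked as build_opencv "$WORKSPACE" 4, since Jenkins exports the job's checkout directory in the WORKSPACE variable.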

Creating the Build Project

To create a new build project, click on Create New Item in the main Jenkins menu, and choose Freestyle project. We’ll name our project OpenCV 3.0. Click OK at the bottom of the options.

Creating a new Freestyle Project

You’ll be on the General tab to get started. For now we’re interested in the general project options, Source Code Management section, and Build section.

Since we want to build the OpenCV project on the ARM system we are going to restrict the project to only build on those nodes matching our arm label.

Restrict Project

For OpenCV 3.0 we’ll choose Git as the SCM type, and then enter the URL to the OpenCV source, https://github.com/itseez/opencv/. Jenkins allows you to specify the branch to build on, and we’ll leave the default master.

GIT for OpenCV

For the actual build we will enter a single Execute shell build step. It should be noted that this is an example only. There are a number of ways to configure and script out a Jenkins project; each project will have different build steps to fit the needs of the underlying task at hand.

Configure and Build OpenCV

Save your project, and then click Build Now! In the Build History pane you’ll see a flashing blue dot (hopefully). Click on the dot to go to the build page where you can select to look at the console output of the build. If everything was set up correctly you should now see OpenCV building away on your ARM build agent!

Building on arm1

OpenCV 3.0 Build Log

In our example OpenCV took 42 minutes to build on a quad-core ARM system from Scaleway. Not bad.

Final Thoughts

This was clearly not a comprehensive Jenkins tutorial. Folks familiar with the craft know that setting up and configuring continuous integration servers and build projects to produce traceable artifacts is a discipline unto itself. However, it should be clear that creating a compile farm consisting of slave build agents does not have to be complicated.


Using Monit with Node.js and PM2

As I’ve said before, I’m a bit of a whore when it comes to learning new languages and development frameworks. So it comes as no surprise to myself that at some point I’d start looking at Node.js and JavaScript.

I have another confession to make. I hate JavaScript, even more so than PHP. Nothing about the language is appealing to me, whether it’s the rules about scoping and the use of var or the bizarre mechanism by which classes are declared (not to mention there are several ways). Semicolons are optional, kind of, but not really. I know plenty of developers who enjoy programming in Objective-C, Python, Ruby, etc.; I have never met anyone who says “Me? I love JavaScript!” Well, perhaps a web UI developer whose only other “language” is CSS or HTML. In fact, a lot of people go out of their way to articulate why JavaScript sucks.

So along comes Node.js, which we can all agree is the new hotness. I’m not sure why it is so appealing. JavaScript on the server! Event-driven programming! Everything is asynchronous and nothing blocks! Okay, great. I didn’t really ask for JavaScript on the server, and event-driven programming is not new. When you develop iOS applications you’re developing in an event-driven environment. Python developers have had the Twisted framework for years. The venerable X system is built upon an event loop. Reading the Node.js hype online one would think event-driven callback execution was invented in the 21st century.

Of course, the Node.js community is reinventing the wheel in other areas as well. What do the following have in common: brew, apt-get, rpm, gem, easy_install, pip? Every last one is a “package manager” of some sort, aimed at making your life easy, automagically downloading and installing software along with all of its various dependencies onto your system. A new framework is nothing without a package manager that it can call its own, thus the Node.js world gives us the Node Package Manager, or npm. That’s fine. I like to think of myself as a “full-stack developer”, so if I need to learn a new package manager and all of its quirks, so be it.

Unfortunately it didn’t stop there. Node.js has its own collection of application “management” utilities; you know, those helper utilities that aim to provide an “environment” in which to run your application. Apparently Forever was popular for some time until it was displaced by PM2, a “Production process manager for Node.js / io.js applications”.

I’m not quite sure when it became en vogue to release version 0 software for production environments, but I suppose it’s all arbitrary (hell, Node.js is what, 0.12?) But true to a version 0 software release, PM2 has given me nothing but fits in creating a system that consistently brings up a Node.js application upon reboot. Particularly in machine-to-machine (M2M) applications this is important; there is frequently no opportunity to ssh into a device that’s on the cellular network and installed out in an oil field tank site. The system must be rock-solid and touch free once it’s installed in the field.

To date the most pernicious bug I’ve come across with PM2 is it completely eating the dump.pm2 file that it ostensibly uses to “resurrect” the environment that was operating. A number of people have reported this issue as well. If I can’t rely on PM2 to consistently restart my Node.js application, I need something to watch the watchers. So who watches the watchers? Monit of course.

Because PM2 refused to cooperate I decided to utilize monit to ensure my Node.js application process was up and running, even after a reboot. In this configuration file example I am checking my process pid (located in the /root/.pm2/pids directory) and then using pm2 start and pm2 stop as the start and stop actions.

NB: Monit executes its scripts with a bare-bones environment. If you are ever stumped by why your actions “work on the command line but not with monit”, see this Stack Overflow post. In the case of PM2, it is critical that the PM2_HOME environment variable be set prior to calling pm2.

The first iteration of my monit configuration looked like this:

If only this were sufficient, but it isn’t.

For some reason PM2 insists on appending a process ID to the pidfile filename (perhaps for clustering where you need a bunch of processes of the same name), so a simple pidfile check won’t suffice. Other folks even went to the Monit lists looking for wildcard pidfile support and quoted PM2 as the reason why they felt they needed it.

So, now our monit configuration takes advantage of the matching directive and looks like this:
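A sketch of what a matching-based check can look like, written as a shell function that prints the Monit stanza. Everything specific here is an assumption for illustration: the function name, the /usr/bin/pm2 path, and whatever application name and script path you pass in. It also assumes node is on the minimal PATH that Monit provides.

```shell
#!/bin/sh
# Emit a Monit "check process" stanza that finds the application by
# matching its command line instead of relying on a pidfile.
# All names and paths passed in are placeholders.
monit_pm2_check() {
    name="$1"        # Monit service name and PM2 app name
    pattern="$2"     # regex matched against the process command line
    script="$3"      # application entry point handed to pm2
    pm2_home="${4:-/root/.pm2}"
    cat <<EOF
check process $name matching "$pattern"
  start program = "/usr/bin/env PM2_HOME=$pm2_home /usr/bin/pm2 start $script --name $name"
  stop program  = "/usr/bin/env PM2_HOME=$pm2_home /usr/bin/pm2 stop $name"
EOF
}
```

For example, monit_pm2_check myapp "node /opt/myapp/app.js" /opt/myapp/app.js prints a stanza you could save under Monit's configuration directory. The right pattern depends on how your process shows up in ps, and monit -t will check the resulting file for syntax errors before you reload.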

Granted, we should not be running as root here. Future iterations will move the applications to a non-privileged user, but for now this gives us a system that successfully restarts our Node.js applications after a reboot. PM2 is a promising tool and definitely takes care of a lot of mundane tasks we need to accomplish to daemonize an application; unfortunately it is a little rough around the edges when it comes to consistently surviving system restarts. Don’t take my word for it: read the GitHub issues.


A rolling stone gathers no moss. The more things stay the same, the more things change (or is it the other way around?). I have nothing against new frameworks, but there are times when being an early adopter requires one to pull out a tried-and-true application to get the job done. In this case our old friend Monit helps us fill in the gaps while Node.js and PM2 mature.