Ansible Best-Practices | Blog der Telekom MMS

I have been working with Ansible since March 2014, starting with version 1.1.

At work I wrote many playbooks and roles to configure operating systems, applications and continuous delivery pipelines. I managed AWS instances, VMware clusters and Xen hosts with Ansible.

I also maintain some Open Source Ansible roles on GitHub and wrote some Dockerfiles for operating system images that include Ansible (mainly to test the roles).

Over time my team and I gathered some best practices that we try to follow when writing and running Ansible code in production. This post shows what I consider these best practices to be, and why.

Of course I’m not the first to write about Ansible Best-Practices. There are other resources that were very informative and helped me immensely:

On writing Ansible playbooks

Name your tasks and plays

When writing tasks and plays in Ansible, naming them is optional. However, you should always give your tasks and plays useful names. When you run a playbook without named tasks, you’ll see the following output:

PLAY [localhost] ********************************

TASK [include_vars] ********************************
ok: [localhost]

TASK [yum] ********************************
ok: [localhost]

When debugging failed tasks, it’s really helpful to know which task failed and what it was supposed to do. Assigning names to your tasks gives the following output:

PLAY [Create a new virtual machine] ********************************

TASK [Include vmware-credentials] ********************************
ok: [localhost]

TASK [Install required packages with yum] ********************************
ok: [localhost]

That’s more helpful, isn’t it?
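
A playbook that produces named output like this could look roughly as follows. This is only a minimal sketch: the play and task names are taken from the output above, while the variables file path and the package name are assumptions made up for illustration.

```yaml
- name: Create a new virtual machine
  hosts: localhost
  tasks:
    # vars/vmware-credentials.yml is a hypothetical file name
    - name: Include vmware-credentials
      include_vars:
        file: vars/vmware-credentials.yml

    # python-pyvmomi is only an example package, not taken from the original playbook
    - name: Install required packages with yum
      yum:
        name: python-pyvmomi
        state: present
```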

Variables in your task names

Try to be expressive when writing task names. Include as much information as necessary. A good way to do this is to use variables in your task names. For example, if you want to see which host a task is currently running against, you can include a variable in the task name.

Suppose you have the following task, that includes the inventory_hostname variable in its name:

- name: Create vmware snapshot of {{ inventory_hostname }}
  vmware_guest_snapshot:
    hostname: "{{ vcenter_host }}"
    username: "{{ vcenter_user }}"
    password: "{{ vcenter_pass }}"
    datacenter: "{{ vcenter_dc }}"
    folder: "/vm/"
    name: "{{ inventory_hostname }}"
    description: "before os updates"
    state: present
    snapshot_name: "snap_{{ '%Y-%m-%d-%M' | strftime }}"
  delegate_to: 127.0.0.1

Running it on a host will produce the following output:

TASK [Create vmware snapshot of webserver01] ********************************
ok: [webserver01]

That’s a neat way to determine what the playbook is currently doing.

Omitting superfluous information

One thing you don’t have to do is include the name of the role in the task name. Ansible does that automatically. See here:

Playbook:

- name: configure sudoers
  hosts: all
  roles:
  - role: sudoers

Tasks-file of the role:

- name: create sudoers.d-directory if it does not exist
  file:
    path: /etc/sudoers.d/
    owner: root
    group: root
    mode: "0750"
    state: directory

And the output of the play:

ansible-playbook playbooks/sudoers.yml

PLAY [configure sudoers] ********************************

TASK [sudoers : create sudoers.d-directory if it does not exist] ********************************
ok: [webserver01]

Observe how it includes the role-name in the task description without it being explicitly defined!

Use Modules Before Run Commands

This one should be obvious, but for people who come from a classic admin background and are new to Ansible it often is not:
Ansible is batteries-included and comes with more than 1000 modules to help manage systems. Most of the time it is neither necessary nor useful to fall back to shell commands instead of using modules.

Here’s a simple example. Instead of doing this:

- name: install htop
  hosts: all
  tasks:
    - command: "yum install htop -y"

do this:

- name: install software
  hosts: all
  tasks:
    - name: "install htop"
      yum:
        name: "htop"
        state: present

Ansible is helpful in detecting when you should use modules instead of commands. It detects these uses and prints a warning. When running the above task with command Ansible prints:

TASK [command] ***********************
 [WARNING]: Consider using yum module rather than running yum

Use copy or template-module instead of lineinfile

It’s often necessary to change single lines in files. When they have to do this, many people use the lineinfile or blockinfile modules.

However, over the years I learned that most of the time you should not use these modules when you want to change files. Instead, use the template or copy module to manage not just single lines but the whole file.

The reason for that is twofold. First when using lineinfile you often have to use regex. Now you have two problems. More seriously, using regex is often okay, if the regex is simple (or you and the people using your playbooks are experienced with regex)!

The second reason is that you have to know and remember that this particular line in this config-file is managed by Ansible. If you manage the whole file with template you can use the ansible_managed-variable to show that the file is under Ansible control.

Here’s an example. Instead of this:

- lineinfile:
    path: /etc/selinux/config
    regexp: '^SELINUX='
    line: 'SELINUX=enforcing'

use this:

- copy:
    src: "etc/selinux/config"
    dest: "/etc/selinux/config"

or this:

- template:
    src: "etc/selinux/config.j2"
    dest: "/etc/selinux/config"

with the template file looking like this:

# {{ansible_managed}}
SELINUX=enforcing
SELINUXTYPE=targeted

Bonus: you can use a variable for the selinux-state and simply change it on servers where selinux should not be in enforcing state.
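
A minimal sketch of that bonus, assuming a variable named selinux_state (a hypothetical name, as are the file locations in the comments). The corresponding line in the template would then read SELINUX={{ selinux_state }} instead of the hard-coded value.

```yaml
---
# group_vars/all: default for every host (selinux_state is a hypothetical variable name)
selinux_state: enforcing
---
# host_vars/webserver02: override for a single host that should not enforce
selinux_state: permissive
---
# the task stays the same for all hosts; only the variable differs
- name: write selinux config
  template:
    src: "etc/selinux/config.j2"
    dest: "/etc/selinux/config"
    owner: "root"
    group: "root"
    mode: "0644"
```

The three YAML documents stand for three separate files; they are shown in one block only to keep the example short.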

Be explicit when writing tasks

What I mean by being explicit when writing Ansible tasks is best shown with an example.
Instead of writing this:

- name: copy files
  hosts: all
  tasks:
    - name: "copy file to server"
      copy:
        src: "foo"
        dest: "/etc/foo/bar/"

better write it like this:

- name: copy files
  hosts: all
  tasks:
    - name: "copy file to server"
      copy:
        src: "foo"
        dest: "/etc/foo/bar/"
        owner: "root"
        group: "root"
        mode: "0644"

Again, there are two reasons for this. The first is technical: when you don’t explicitly declare the owner and group of the file, a newly created file will be owned by the user Ansible runs as on the target host (the remote user, or the become user if privilege escalation is used). That is not always desirable and is easily avoided by being explicit.

The second reason is more of an organizational or “people” reason. When people use your playbook or role, they may not always know the defaults of the modules you use or what you want to achieve with the tasks. When you are explicit in your tasks, there’s less room for guessing and interpretation.

On documenting tasks

Naming your tasks is important to understand what they are doing but often it is more important to document why the task is doing what it does. If it’s not directly obvious what the task does, simply write some comments on top of the task to explain in more detail what’s happening and why:

# the typo3cms-binary is a console to execute common TYPO3-related tasks
# the console is installed with composer
# the path to the binary is relative to the docroot
# docs: https://docs.typo3.org/typo3cms/extensions/typo3_console/CommandReference
# typo3cms language:update
# Update language file for each extension
- name: update typo3 languages
  tags: typo3cms
  command: "vendor/bin/typo3cms language:update"
  args:
    chdir: "{{build_root}}"

If you have to use the command, shell or raw modules instead of the “correct” modules, document why you cannot use the correct modules:

# the svn-module does not support adding files, so we have to use the command-module
- name: add build to svn repo
  command: "svn add build.tar.gz"

Thanks mikeoquinn for this suggestion!

How to write variables

Prefix your variables

There are some things you should consider when writing variables for your roles. The first thing is that you should prefix them with the name of the role. This makes it easier to know where the variable is used.

Here’s an example. Imagine you’re writing a role to install and configure the Apache web-server (you probably don’t have to). The role is named apache. Now you want to create a variable that configures the default Listen-port.

You’ll probably do it like this:

listen_port: 443

However you should do it like this:

apache_listen_port: 443

Besides the reason mentioned earlier, this removes ambiguity: you know for certain that the variable belongs to the apache role. Another role for some other kind of software might also define a listen port. With prefixed variables this is not a problem, because the prefix effectively gives each role its own namespace.

By the way, Puppet and Chef have the advantage here, having namespaces for their roles. Ansible is not designed this way.
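
To illustrate, here is a small sketch of how two roles can define a listen port side by side without clashing. The haproxy role and the port values other than 443 are assumptions for illustration only.

```yaml
---
# roles/apache/defaults/main.yml
apache_listen_port: 443
---
# roles/haproxy/defaults/main.yml (hypothetical second role)
haproxy_listen_port: 8443
---
# group_vars/webservers: overrides only the apache value, haproxy is untouched
apache_listen_port: 8080
```

With an unprefixed listen_port in both roles, the override in group_vars would silently apply to both of them.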

On writing and using variables

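If a value starts with a variable, it has to be quoted. Otherwise the YAML parser reads the opening curly brace as the start of a dictionary and the playbook fails to load.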

The following example won’t work:

- name: install software
  hosts: all
  tasks:
    - name: "install packages"
      yum:
        name: {{ item }}
        state: present
      with_items:
        - htop

This however works:

- name: install software
  hosts: all
  tasks:
    - name: "install packages"
      yum:
        name: "{{ item }}"
        state: present
      with_items:
        - htop

You could also use single quotes and omit the spaces between the curly braces and the variable name. However, I found the style above to be the most readable. The most important thing is to stick to one style.
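
For comparison, a short sketch of the alternative spellings. The second task and its path are hypothetical and only there to show that quoting is required when the value starts with a curly brace, not whenever a variable appears somewhere in it.

```yaml
- name: install software
  hosts: all
  tasks:
    - name: "install packages"
      yum:
        name: "{{ item }}"
        # equivalent spellings:
        # name: '{{ item }}'
        # name: "{{item}}"
        state: present
      with_items:
        - htop

    # hypothetical task: the value does not start with a brace, so YAML accepts it unquoted
    - name: create a per-host directory
      file:
        path: /var/tmp/reports-{{ inventory_hostname }}
        state: directory
```

Quoting such values anyway does no harm and keeps the style consistent.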

Do not show sensitive data in Ansible output

If you use the template-module and there are passwords or other sensitive data in the file, you do not want these to be shown in the Ansible output. That’s what the no_log-option is for. If added to a task, the output will not be logged.

Here’s an example playbook:

- name: copy information
  hosts: localhost
  tasks:
    - name: Copy super sensitive information to host
      template:
        src: "secret.j2"
        dest: "/etc/secret"

- name: copy information

hosts: localhost

tasks:

- name: Copy super sensitive information to host

template:

src: "secret.j2"

dest: "/etc/secret"

Without no_log: true the output will look like this:

PLAY [test] **********************************************************************************************************************************************************************************

TASK [Copy super sensitive information to host] **********************************************************************************************************************************************
--- before
+++ after: /tmp/tmpS6ymZC/secret.j2
@@ -0,0 +1,1 @@
+PASSWORD=secret

changed: [webserver01]

With no_log: true it will look like this:

PLAY [test] **********************************************************************************************************************************************************************************

TASK [Copy super sensitive information to host] **********************************************************************************************************************************************
--- before
+++ after: /tmp/tmp23CKjm/secret.j2
@@ -0,0 +1,1 @@
+ [[ Diff output has been hidden because 'no_log: true' was specified for this result ]]

changed: [localhost]
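
For reference, this is roughly what the task looks like with the option set. It is the example playbook from above with no_log added; the owner, group and mode values are assumptions added to keep the task explicit.

```yaml
- name: copy information
  hosts: localhost
  tasks:
    - name: Copy super sensitive information to host
      template:
        src: "secret.j2"
        dest: "/etc/secret"
        # ownership and mode are assumed values, not part of the original example
        owner: "root"
        group: "root"
        mode: "0600"
      # hide the task result (including the diff) from the Ansible output
      no_log: true
```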

To keep sensitive data in your playbooks and roles secret, use ansible-vault. There’s extensive documentation from Ansible with good examples so I won’t cover this topic here.

On writing and (re-)using roles

Before writing playbooks and roles, it’s always a good idea to check whether somebody else has already done the work for you. For most common software there’s already a role on Ansible Galaxy. When searching for roles there, sort by stargazers (and maybe downloads) to find the most popular (and hopefully well-maintained) roles.
There are some people and organizations that provide many high-quality roles: geerlingguy, jdauphant, ANXS and (shameless plug) dev-sec all provide great roles.
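
One way to reuse such roles is to list them in a requirements file. The following is a minimal sketch: the role name and the file location are taken from the directory listing later in this post, and the commented-out version key is only a placeholder.

```yaml
# roles/requirements.yml
- src: dev-sec.ssh-hardening
  # version: <pin a released tag here>
```

Running ansible-galaxy install -r roles/requirements.yml -p roles/galaxy then downloads the listed roles into roles/galaxy.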

When you create your own role, use ansible-galaxy init to create the initial directory layout and stick to it. Then, if you follow the best practices mentioned here, your role should be ready to publish on Ansible Galaxy and GitHub.

On documenting roles

When documenting roles, it’s best to use the template created by ansible-galaxy init. There you have to describe the role and its function, list and explain the variables used, name the required dependencies and provide examples. I always try to add some more documentation of the variables in the form of a table, providing the variable name, the default value and an explanation of the variable:

| Name | Default Value | Description |
| --- | --- | --- |
| `network_ipv6_enable` | false | true if IPv6 is needed |
| `ssh_remote_hosts` | [] | one or more hosts and their custom options for the ssh-client. Default is empty. See examples in `defaults/main.yml`. |
| `ssh_allow_root_with_key` | false | false to disable root login altogether. Set to true to allow root to login via key-based mechanism. |
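
The matching defaults file could look like this. It is a sketch that simply mirrors the table; the comments repeat the descriptions so that the file and the README stay in sync.

```yaml
# defaults/main.yml

# true if IPv6 is needed
network_ipv6_enable: false

# one or more hosts and their custom options for the ssh-client
ssh_remote_hosts: []

# false disables root login altogether, true allows root to log in with a key
ssh_allow_root_with_key: false
```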

Other best practice considerations

The Ansible directory structure

When structuring your Ansible directory, you’re really free to do what you want. Ansible provides some sane examples in its documentation. This directory can also be a git-repository that gets used by Jenkins or AWX.

In every project we try to use the same structure which looks something like this:

.
├── ansible.cfg
├── ansible_modules
├── group_vars
│   ├── webservers
│   └── all
├── hosts
│   ├── webserver01
│   └── webserver02
├── host_vars
├── modules
├── playbooks
│   └── ansible-cmdb.yml
└── roles
    ├── requirements.yml
    ├── galaxy
    │   └── dev-sec.ssh-hardening
    └── auditd
        ├── files
        │   ├── auditd.conf
        │   └── audit.yml
        ├── handlers
        │   └── main.yml
        ├── meta
        │   └── main.yml
        └── tasks
            └── main.yml

The ansible.cfg file

The ansible.cfg has mostly default values. The ones that need to be changed to accommodate the above directory structure are the following:

[defaults]
inventory  = ./hosts
library    = ./ansible_modules/
roles_path = ./roles:./roles/galaxy

There are more topics I’d like to cover, and I’ll update this article once I have written down my thoughts on them:

mono-repo vs. one repo per role
encryption
testing roles
making every value a variable
auditing

Sebastian Gumprich

Linux and open source enthusiast. Coming from traditional operations, I now build bridges between operations and development and, along the way, dive into the cloud-native landscape.
