Temporarily disabling an ansible task in a playbook

I have a playbook that provisions a vagrant box. Among other things, it installs docker:

- name: install docker
  apt:
    name: "docker.io"
    state: latest

For reasons, I wanted this temporarily disabled in repeated runs of the playbook. Enter the when: statement:

- name: install docker
  apt:
    name: "docker.io"
    state: latest
  # Do this for example when you choose docker to be 
  # installed from a vagrant provisioner but do not 
  # want to remove the task from the playbook.
  when: 0 > 1

You can of course switch from the condition of 0 > 1 to a variable that takes the values true or false and even drive this from the command line by setting the value of the variable.
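
A minimal sketch of the variable-driven version could look like the following. The variable name install_docker is my own choice for illustration and not part of the original playbook; it defaults to true, so the task runs unless told otherwise.

```yaml
---
- hosts: all
  become: true
  tasks:
    - name: install docker
      apt:
        name: "docker.io"
        state: latest
      # install_docker is a hypothetical variable; the task is skipped
      # only when it is explicitly set to a false value
      when: install_docker | default(true) | bool
```

To skip the task for a single run, set the variable on the command line, for example ansible-playbook playbook.yml -e install_docker=false (playbook.yml is a placeholder name). The bool filter is there because values passed this way arrive as strings.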

aws ssm describe-instance-information for quick ansible dynamic inventory

The aws ssm agent is very useful when working both with EC2 instances and with machinery outside AWS. Once you add an outside instance by installing and configuring the SSM agent, be it on-premises or a VM at another provider, you can tag it for further granularity:

aws ssm add-tags-to-resource --resource-type ManagedInstance --resource-id mi-WXYZWXYZ --tags Key=onpremise,Value=true --region eu-west-1

Here mi-WXYZWXYZ is the instance ID you see in the SSM managed instances list (alternatively, aws ssm describe-instance-information returns this list along with lots of other information).

It may be the case that you sometimes want to apply a certain change with ansible to those machines that live outside AWS. Yes, you can run ansible playbooks via SSM directly, but this requires ansible to be installed on said machines. If you need the simplest of dynamic inventories, so that you can run $ ansible -u user -i ./lala all -m ping, here is the crudest version of ./lala, one that happily ignores the --list argument:

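#!/bin/bash
# Crude dynamic inventory: every tagged SSM instance lands in the "all" group.
# --output json makes the CLI emit a properly quoted JSON array of addresses.
printf '{ "all": { "hosts": %s } }\n' \
"$(aws ssm describe-instance-information --region eu-west-1 --filters Key=tag:onpremise,Values=true --query "InstanceInformationList[].IPAddress" --output json)"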

You can go all the way scripting something like this for a proper solution though.

Why printf instead of echo above? Because jpmens suggested so.

ansible, timezone and JDK8

While one might think that they can change the timezone of a machine with ansible with:

  tasks:
  - name: set /etc/localtime
    timezone:
      name: UTC

with some JDK apps it is not enough, because the JVM looks at /etc/timezone. So you need to update that file too, maybe in a cleaner way than:

  - name: set /etc/timezone
    shell: echo UTC > /etc/timezone
    args:
      creates: /etc/timezone
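
One possibly cleaner variant, sketched here under the assumption that the file should hold nothing but the zone name, is the copy module with inline content. The shell task above is skipped whenever /etc/timezone already exists, because of creates:, whereas this one rewrites the file whenever its content differs from UTC.

```yaml
---
- hosts: all
  become: true
  tasks:
    - name: set /etc/timezone
      copy:
        # the trailing newline keeps the file in its usual one-line format
        content: "UTC\n"
        dest: /etc/timezone
        owner: root
        group: root
        mode: "0644"
```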

ansible, docker-compose, iptables and DOCKER-USER

NOTE: manipulating DOCKER-USER is beyond anyone’s sanity. The information below seems to work sometimes (like when I wrote the post) and not at other times. That is why you will find posts with similar advice on the Net that may or may not work for you. I plan to revisit this and figure out what is wrong, which makes the following information only temporarily correct.

When you want to run ZooNavigator, the recommendation to get you started is via this docker-compose.yml. However, Docker manages your iptables (unless you go the --iptables=false way) and certain ports will be left wide open. This may not be what you want. Docker provides the DOCKER-USER chain for user-defined rules that are not affected by service restarts, and this is where you want to work. Most of my googling resulted in recipes that did not work, because their final rule was to deny anything from 0.0.0.0/0 after having allowed whatever was to be whitelisted. I solved this in the following example playbook, and the rules worked like a charm. Others who find themselves in the same situation may want to give it a shot:

---
- name: maintain the DOCKER-USER access list
  hosts: zoonavigators
  vars:
    - wl_hosts:
      - "172.31.0.1"
      - "172.31.0.2"
    - wl_ports:
      - "7070"
      - "7071"
  tasks:

  - name: check for iptables-services
    yum:
      name: iptables-services
      state: latest

  - name: enable iptables-services
    service:
      name: iptables
      enabled: yes
      state: started

  - name: flush DOCKER-USER
    iptables:
      chain: DOCKER-USER
      flush: true

  - name: whitelist for DOCKER-USER
    iptables:
      chain: DOCKER-USER
      protocol: tcp
      ctstate: NEW
      syn: match
      source: "{{ item[0] }}"
      destination_port: "{{ item[1] }}"
      jump: ACCEPT
    with_nested:
      - "{{ wl_hosts }}"
      - "{{ wl_ports }}"

  - name: drop non whitelisted connections to DOCKER-USER
    iptables:
      chain: DOCKER-USER
      protocol: tcp
      #source: "0.0.0.0/0"
      destination_port: "{{ item }}"
      jump: DROP
    with_items:
      - "{{ wl_ports }}"

  - name: save new iptables
    command:
      /usr/libexec/iptables/iptables.init save

Line 46 of the playbook, the commented-out source: entry in the drop task, is the key. The obvious choice would have been source: "0.0.0.0/0" but this did not work for me.

vagrant, ansible local and docker

This is a minor annoyance to people who want to work with docker on their vagrant boxes and provision them with the ansible_local provisioner.

To have docker installed in your box, you simply need to enable the docker provisioner in your Vagrantfile:

config.vm.provision "docker", run: "once"

Since you’re using the ansible_local provisioner, you might skip this and write a task that installs docker from get.docker.com or wherever it suits you anyway, but I prefer this as vagrant knows how to best install docker onto itself.

Now obviously you can have the provisioner pull images for you, but for whatever crazy reason you want to pass most, if not all, of the provisioning to ansible. And thus you want to use, among others, the docker_image module. So you write something like:

- name: install python-docker
  become: true
  apt:
    name: python-docker
    state: present

- name: install docker images
  docker_image:
    name: busybox

Well, this is going to greet you with an error message when you up the machine for the first time:

TASK [install docker images] ***************************************************
fatal: [default]: FAILED! => {"changed": false, "msg": "Error connecting: Error while fetching server API version: ('Connection aborted.', error(13, 'Permission denied'))"}
to retry, use: --limit @/vagrant/ansible.retry

Whereas when you happily run vagrant provision right away:

TASK [install docker images] ***************************************************
changed: [default]

Why does this happen? Because even though the installation of docker makes the vagrant user a member of group docker, the membership only becomes effective at the next login.

The quickest way to bypass this is to make that part of your first run of ansible provisioning as super user:

- name: install docker images
  become: true  
  docker_image:
    name: busybox

I am using the docker_image module only as an example here for lack of a better example with other docker modules on a Saturday morning. Pulling images is something that is of course very easy to do with the vagrant docker provisioner itself.

default: Running ansible-playbook…

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: [default]

TASK [install python-docker] ***************************************************
changed: [default]

TASK [install docker images] ***************************************************
changed: [default]

PLAY RECAP *********************************************************************
default : ok=3 changed=2 unreachable=0 failed=0

Vagrant, ansible_local and pip

I try to provision my Vagrant boxes with the ansible_local provisioner. The other day I was using the pip ansible module while I was booting the box, but was getting errors while installing packages. It turns out that the pip version I had when I created the environment needed an upgrade. Sure you can run a pip install pip --upgrade from the command line, but how do you do so within a playbook? Pretty easy it seems:

---
- hosts: all
  tasks:
    - name: create the needed virtual environment and upgrade pip
      pip:
        chdir: /home/vagrant
        virtualenv: work
        virtualenv_command: /usr/bin/python3 -mvenv
        name: pip
        extra_args: --upgrade

    - name: now install the requirements
      pip:
        chdir: /home/vagrant
        virtualenv: work
        virtualenv_command: /usr/bin/python3 -mvenv
        requirements: /vagrant/requirements.txt

I hope it helps you too.

ansible_managed is a string that can be inserted into files written by Ansible’s config templating system. You put the macro string # {{ ansible_managed }} in your jinja2 template and it gets expanded to something meaningful like:

# Ansible managed: /path/to/file/template/hosts.j2 modified on 2014-09-24 10:52:51 by username on hostname

You get a good idea of where the file came from. Unfortunately, templates work only with ansible playbooks and not with the direct ansible command. But even when you use the copy module outside a playbook it is a good practice to put a comment that includes {{ ansible_managed }} at the beginning of the file. It serves as a handy reminder on how this file got installed in the first place. And in the future, if you make a template and a playbook work with it, you’re already set.
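
As a rough sketch of the playbook side, the task below installs a file from a template. The path templates/hosts.j2 and the destination /etc/hosts are assumptions made for illustration; the only requirement is that the template carries the # {{ ansible_managed }} comment near its top.

```yaml
---
# Assumes templates/hosts.j2 exists next to this playbook and begins with
# the comment line:  # {{ ansible_managed }}
- hosts: all
  become: true
  tasks:
    - name: install /etc/hosts from a template
      template:
        src: templates/hosts.j2
        dest: /etc/hosts
        owner: root
        group: root
        mode: "0644"
```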

On ansible and the script module

Ansible offers the convenience of running scripts on remote servers. But as the documentation notes:

It is usually preferable to write Ansible modules than pushing scripts. Convert your script to an Ansible module for bonus points!

There is a reason for this. Usually you have ansible run a script on your behalf when what you want to do is not achievable via a module or some combination of modules in a playbook. In extreme circumstances you will need to run a script via ansible when the receiving computer has no Python installed.

But there is a problem with running scripts this way: They are opaque.

A playbook that is applied to your machines is actually a model of that part of the machines that you want to manage. And ansible is your sensor that deals with the situation when things go sour.

It is very easy to write a script that does one thing well to one machine and does not check for failure. Now apply this to 100 or 500 machines that are similar, yet have some subtle differences between them. Can you imagine what a rewrite your script needs in order to account for all corner cases? And if you make it bullet-proof, congratulations! You’re half-way through to making your own incompatible version of ansible.

Having said that, I am guilty of running scripts instead of describing work to be done in a playbook. This mostly involves stuff that needs to be executed from a login shell (hello rvm!) which means the script begins with #!/bin/bash. However, in order to exercise better control in such situations, I do not run more than one command, plus checks for return codes, in each script. This breaks the script down into many smaller ones, but gives me a better view when something goes wrong: instead of one script directive, my playbooks have five or six in a row.

You may not have described an accurate model of what you want to do using a playbook’s markup, but at least the name: directive of every single task is accurate enough to let you know what is executing. The alternative is issuing one larger script, waiting to see whether it succeeded and, if it did not, trying to find out from which point exactly to roll back (if rolling back is possible).

So the new rule is:

When pushing a script through ansible, it should execute one command only plus any checks needed for return status.
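
In playbook form the rule might look like the sketch below. The group name appservers and the script file names are hypothetical; each script is assumed to run a single command and exit non-zero when that command fails, so the task name tells you exactly which step broke.

```yaml
---
- hosts: appservers
  tasks:
    # Every script below is a hypothetical file that runs one command
    # and returns a non-zero exit code if that command fails.
    - name: install rvm
      script: scripts/install-rvm.sh

    - name: install ruby via rvm
      script: scripts/install-ruby.sh

    - name: install bundler
      script: scripts/install-bundler.sh
```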

ping in Ansible playbooks

The ping module documentation says that it does not make sense in playbooks and is useful only with /usr/bin/ansible. Well, I think there is a case where you can include it in a playbook, and that is when you disable fact gathering. I really want to know whether something is wrong with connecting to a server before I start executing the whole playbook and am left with a half-played one to redo. So, at least for the number of hosts that I apply this to, it does not hurt to have this as the first task:

---
- hosts: whatever
  user: whoever
  gather_facts: no
  tasks:
  - name: ping all hosts
    ping:

The fact gathering phase implicitly runs the setup module. If your play does not make use of facts, you may want to disable it and use ping, just to check that ansible can talk to the hosts over ssh before feeding it work to do.

Are all the servers running the latest version? Ansible to the rescue

Beyond a certain number of servers, it is impossible to remember whether they are all current or not, or even to keep checking a documentation wiki page to find out. So how can one use ansible to find the answer? Enter the setup module. Assuming an all-Debian installation one could run:

ansible debian-machines -m setup --tree /tmp/inventory
cd /tmp/inventory
grep ansible_distribution_version * | grep -v '7\.2'

This will list Debian machines not running 7.2 (Wheezy). You can build more complex versions of the above to match your infrastructure.
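
The same check can be sketched as a playbook that relies on the facts gathered at the start of the play. The group name debian-machines comes from the command above; expected_version is a variable name I made up for illustration.

```yaml
---
- hosts: debian-machines
  vars:
    # expected_version is a hypothetical variable; quoted so it stays a string
    expected_version: "7.2"
  tasks:
    - name: report machines that are not on the expected release
      debug:
        msg: "{{ inventory_hostname }} runs {{ ansible_distribution }} {{ ansible_distribution_version }}"
      when: ansible_distribution_version != expected_version
```

Hosts on the expected release show up as skipped, so the ones that print a message are the ones to look at.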

PS: Many thanks to @laserllama and @jpmens.