Ansible Automation

Agentless IT automation — playbooks, inventory, roles, vault & CI/CD integration

01

Overview

Ansible is an open-source, agentless automation tool that uses SSH (or WinRM/SSH for Windows) to configure systems, deploy software, and orchestrate complex workflows. Everything is defined in YAML — no proprietary DSL, no compiled agents, no daemons running on managed nodes. You write a playbook, run it, and Ansible connects to your targets over SSH, executes tasks, and reports back.

The core design principle is idempotency: running the same playbook multiple times produces the same result. If a package is already installed, Ansible skips it. If a file already has the correct content, Ansible leaves it alone. This makes Ansible safe to re-run and suitable for drift correction.
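As a concrete illustration (a minimal sketch — the host group and package are placeholders), each of these tasks reports "changed" on the first run and "ok" on every run after, because the module checks current state before acting:

```yaml
# idempotency-demo.yml -- placeholder host group "webservers"
- name: Idempotency demo
  hosts: webservers
  become: true
  tasks:
    - name: Ensure nginx is installed          # first run: changed; later runs: ok
      ansible.builtin.apt:
        name: nginx
        state: present

    - name: Ensure a sysctl line is present    # only writes if the line is missing
      ansible.builtin.lineinfile:
        path: /etc/sysctl.conf
        line: "net.core.somaxconn=1024"
```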

Red Hat Ansible vs community

Ansible exists in two forms. The community project (ansible-core) is free, open-source, and maintained on GitHub. Red Hat Ansible Automation Platform (AAP) is the commercial product that bundles ansible-core with AWX/AAP Controller (web UI, RBAC, scheduling), Automation Hub (curated content), and enterprise support. Most teams start with community Ansible and move to AAP when they need centralized management, audit trails, or role-based access for multiple teams.

Why Ansible is popular

Strengths

  • Agentless — Nothing to install on managed nodes. Just SSH and Python.
  • YAML-based — Playbooks are human-readable, version-controllable, and reviewable in PRs
  • Low barrier to entry — A sysadmin can be productive in hours, not weeks
  • Massive module library — Thousands of modules for cloud, networking, containers, databases, security
  • Idempotent by default — Safe to re-run, enables drift correction
  • Works everywhere — Linux, Windows, network devices, cloud APIs, containers
  • Red Hat backing — Enterprise support, certified content, long-term roadmap

Considerations

  • Performance at scale — SSH-based execution is slower than agent-based tools for 1000+ nodes
  • State management — No built-in state file (unlike Terraform). You describe desired state, but Ansible does not track what it previously did.
  • Error handling — YAML playbooks can get complex with deep conditional logic and error recovery
  • Windows support — Works via WinRM or SSH (officially supported since ansible-core 2.18). Improving rapidly, especially with OpenSSH built into Windows Server 2025, but still not as mature as Linux support
  • Secret management — Ansible Vault is basic; many teams pair it with HashiCorp Vault or cloud KMS
  • No built-in drift detection — Ansible enforces state when run but does not continuously monitor for drift between runs. Event-Driven Ansible (EDA) can help by triggering remediation playbooks in response to external events.

Ansible vs other tools

| Feature        | Ansible                           | Terraform                    | Puppet                              | Chef          |
|----------------|-----------------------------------|------------------------------|-------------------------------------|---------------|
| Architecture   | Agentless (SSH/WinRM)             | Agentless (API)              | Agent-based (with agentless option) | Agent-based   |
| Language       | YAML                              | HCL                          | Puppet DSL (Ruby)                   | Ruby DSL      |
| Primary use    | Config mgmt + orchestration       | Infrastructure provisioning  | Config mgmt                         | Config mgmt   |
| State          | Stateless (desired state per run) | State file                   | Agent reports                       | Agent reports |
| Learning curve | Low                               | Medium                       | High                                | High          |
| Idempotency    | Module-level                      | Built-in                     | Built-in                            | Built-in      |

Positioning

Ansible and Terraform are complementary, not competing. Terraform provisions infrastructure (VMs, networks, load balancers). Ansible configures what runs on that infrastructure (packages, users, services, files). A common pattern is Terraform to create the VMs, then Ansible to configure them. Trying to use Ansible for cloud infrastructure provisioning or Terraform for OS-level configuration leads to pain.

02

How Ansible Works

Ansible follows a push-based model. You run ansible-playbook on a control node (your laptop, a CI runner, a bastion host), and it pushes configuration to managed nodes over SSH. There is no central server, no agent, no pull schedule. You decide when to run it.

Architecture

+--------------------------------------------------+
|                   Control Node                   |
|        (laptop, CI runner, bastion host)         |
|                                                  |
|      ansible-playbook site.yml -i inventory      |
|                                                  |
|  +------------+  +----------+  +-----------+     |
|  | Playbook   |  | Inventory|  | ansible.  |     |
|  | (YAML)     |  | (hosts)  |  | cfg       |     |
|  +------------+  +----------+  +-----------+     |
+----------+---------------------------------------+
           |
           |  SSH (Linux/Windows) / WinRM (Windows)
           |
     +-----+-----+-----+-----+
     |     |     |     |     |
     v     v     v     v     v
  +----+ +----+ +----+ +----+ +----+
  |Node| |Node| |Node| |Node| |Node|
  | 1  | | 2  | | 3  | | 4  | | 5  |
  +----+ +----+ +----+ +----+ +----+

      (Only requirement: Python + SSH)

Execution flow

When you run ansible-playbook site.yml, this is what happens under the hood:

  1. Parse — Ansible reads the playbook YAML, resolves variables, loads roles, and builds a list of plays
  2. Inventory — Reads the inventory file (or dynamic inventory script) to determine which hosts to target
  3. Fact gathering — Connects to each host via SSH and runs the setup module to collect system facts (OS, IP, memory, disk, etc.)
  4. Task execution — For each task in each play, Ansible generates a small Python script, copies it to the remote host via SFTP/SCP, executes it, captures the output, and removes the script
  5. Result collection — Each task returns JSON with status (changed, ok, failed, skipped). Ansible aggregates results and proceeds to the next task.
  6. Handler notification — If a task reports "changed" and notifies a handler (e.g., restart nginx), the handler runs at the end of the play
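The flow above can be observed with a tiny playbook — the handler only fires if the template task actually reports "changed" (filenames and service names here are illustrative):

```yaml
# flow-demo.yml -- illustrative filenames and service name
- name: Demonstrate the task -> handler flow
  hosts: webservers
  gather_facts: true              # step 3: runs the setup module first
  tasks:
    - name: Render application config   # step 4: module shipped to host, executed
      ansible.builtin.template:
        src: app.conf.j2
        dest: /etc/app.conf
      notify: Restart app               # step 6: queued only on "changed"
  handlers:
    - name: Restart app                 # runs once, at the end of the play
      ansible.builtin.service:
        name: app
        state: restarted
```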

Modules and plugins

Modules are the units of work. Each task calls one module (e.g., apt, copy, service). Modules are idempotent — they check current state and only make changes if needed. Modules execute on the remote host.

Plugins extend Ansible's core behavior and run on the control node. Types include connection plugins (SSH, WinRM, Docker), lookup plugins (read from files, environment, Vault), callback plugins (custom output formatting), and filter plugins (Jinja2 filters for data transformation).
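For example, lookup and filter plugins both run on the control node, before anything ships to the remote host — a quick sketch:

```yaml
- name: Plugins run on the control node
  hosts: webservers
  tasks:
    # lookup plugin: reads data locally on the control node, not the target
    - name: Show a value read via the env lookup
      ansible.builtin.debug:
        msg: "{{ lookup('env', 'USER') }} ran this play"

    # filter plugin: transforms data locally before the task executes
    - name: Show webserver hosts joined with a Jinja2 filter
      ansible.builtin.debug:
        msg: "{{ groups['webservers'] | join(', ') }}"
```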

Python requirement

Ansible modules are Python scripts that execute on the target. Most modules require Python 3 on managed nodes (Python 2 support was dropped after ansible-core 2.16). The exact minimum Python version depends on your ansible-core release — check the ansible-core support matrix for details. Python is usually already present on modern Linux distributions. For minimal or embedded systems without Python, Ansible provides the raw module which sends raw shell commands without requiring Python, and the script module which copies and executes a script in any language.
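A common bootstrap pattern (a sketch — the install command assumes a Debian-family target; adjust for your distribution) uses raw to install Python first, then switches to normal modules:

```yaml
- name: Bootstrap hosts that lack Python
  hosts: new_hosts
  gather_facts: false             # fact gathering would fail without Python
  become: true
  tasks:
    - name: Install Python with raw (no Python required on the target)
      ansible.builtin.raw: test -e /usr/bin/python3 || (apt-get -y update && apt-get -y install python3)
      changed_when: false

    - name: Gather facts now that Python exists
      ansible.builtin.setup:
```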

Key insight

Ansible is fundamentally an SSH automation framework. Everything it does could be done manually by SSHing to each host and running commands. Ansible provides structure (playbooks), safety (idempotency), scale (parallel execution across hundreds of hosts), and repeatability (version-controlled YAML). If SSH works, Ansible works.

03

Inventory

The inventory defines which hosts Ansible manages and how to connect to them. It can be a static file (INI or YAML format), a dynamic script that queries a cloud API, or a plugin that reads from an external source. The inventory also defines groups, which let you target subsets of hosts with specific plays.

INI format (traditional)

# inventory/hosts.ini

[webservers]
web1.example.com
web2.example.com
web3.example.com ansible_port=2222

[dbservers]
db1.example.com ansible_user=postgres
db2.example.com ansible_user=postgres

[loadbalancers]
lb1.example.com

# Group of groups
[production:children]
webservers
dbservers
loadbalancers

# Variables for all hosts in a group
[webservers:vars]
http_port=8080
max_connections=1000

[all:vars]
ansible_python_interpreter=/usr/bin/python3

YAML format (preferred)

# inventory/hosts.yml
all:
  vars:
    ansible_python_interpreter: /usr/bin/python3
  children:
    production:
      children:
        webservers:
          vars:
            http_port: 8080
            max_connections: 1000
          hosts:
            web1.example.com:
            web2.example.com:
            web3.example.com:
              ansible_port: 2222
        dbservers:
          vars:
            ansible_user: postgres
          hosts:
            db1.example.com:
            db2.example.com:
        loadbalancers:
          hosts:
            lb1.example.com:

Dynamic inventory

For cloud environments where hosts are ephemeral, static files become stale immediately. Dynamic inventory plugins query cloud APIs in real time to build the host list.

# inventory/aws_ec2.yml (dynamic inventory plugin)
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
  - us-west-2
keyed_groups:
  - key: tags.Environment
    prefix: env
  - key: tags.Role
    prefix: role
  - key: placement.availability_zone
    prefix: az
filters:
  instance-state-name: running
  "tag:ManagedBy": ansible
compose:
  ansible_host: private_ip_address
# Test dynamic inventory
ansible-inventory -i inventory/aws_ec2.yml --graph
ansible-inventory -i inventory/aws_ec2.yml --list

group_vars and host_vars

Variables can be defined per-group or per-host in separate files. Ansible automatically loads them based on directory structure:

# Directory structure
inventory/
  hosts.yml
  group_vars/
    all.yml            # Variables for every host
    webservers.yml     # Variables for webservers group
    dbservers.yml      # Variables for dbservers group
    production.yml     # Variables for production group
  host_vars/
    web1.example.com.yml   # Variables for this specific host
    db1.example.com.yml
# inventory/group_vars/webservers.yml
nginx_version: "1.28"
ssl_certificate_path: /etc/ssl/certs/app.crt
worker_processes: auto
worker_connections: 4096

Inventory patterns

# Target specific groups or hosts
ansible-playbook site.yml -i inventory/ -l webservers       # only webservers
ansible-playbook site.yml -i inventory/ -l 'webservers:&production'  # intersection
ansible-playbook site.yml -i inventory/ -l 'webservers:!web3.example.com'  # exclude
ansible-playbook site.yml -i inventory/ -l '*.example.com'  # wildcard
Recommendation

Use YAML format for inventory — it is consistent with playbooks and supports complex data structures. Use group_vars/host_vars directories rather than inline variables in the inventory file. This keeps secrets separate (you can vault-encrypt individual var files) and makes the inventory readable. For cloud environments, always use dynamic inventory — static files for ephemeral VMs are a maintenance nightmare.

04

Playbooks

A playbook is a YAML file containing one or more plays. Each play targets a group of hosts and defines a list of tasks to execute. Tasks call modules, and the order of tasks in a play is the order of execution. Playbooks are the core of Ansible — they are the automation scripts that define your infrastructure as code.

Playbook structure

# site.yml - a realistic multi-task playbook
---
- name: Configure web servers
  hosts: webservers
  become: true
  gather_facts: true
  vars:
    app_port: 8080
    app_user: appuser

  pre_tasks:
    - name: Update apt cache
      apt:
        update_cache: true
        cache_valid_time: 3600

  tasks:
    - name: Install required packages
      apt:
        name:
          - nginx
          - python3-pip
          - certbot
        state: present

    - name: Create application user
      user:
        name: "{{ app_user }}"
        shell: /bin/bash
        create_home: true
        system: true

    - name: Deploy nginx configuration
      template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/sites-available/default
        owner: root
        group: root
        mode: '0644'
      notify: Reload nginx

    - name: Deploy application config
      template:
        src: templates/app.conf.j2
        dest: "/home/{{ app_user }}/app.conf"
        owner: "{{ app_user }}"
        mode: '0600'
      notify: Restart application

    - name: Ensure nginx is enabled and running
      service:
        name: nginx
        state: started
        enabled: true

    - name: Open firewall ports
      ufw:
        rule: allow
        port: "{{ item }}"
        proto: tcp
      loop:
        - '80'
        - '443'
        - "{{ app_port }}"

  handlers:
    - name: Reload nginx
      service:
        name: nginx
        state: reloaded

    - name: Restart application
      systemd:
        name: myapp
        state: restarted
        daemon_reload: true

- name: Configure database servers
  hosts: dbservers
  become: true
  roles:
    - role: geerlingguy.postgresql
      vars:
        postgresql_version: "16"
        postgresql_databases:
          - name: myapp
        postgresql_users:
          - name: myapp
            password: "{{ vault_db_password }}"

Key playbook concepts

Tasks

Tasks are the individual actions. Each task calls one module. Tasks run in order, and Ansible stops on the first failure (unless ignore_errors: true is set). Tasks should have a descriptive name for readability in output.

Handlers

Handlers are tasks that only run when notified by another task that reported "changed". They run once at the end of the play, regardless of how many tasks notify them. Common use: restarting a service after config changes.

Become

become: true escalates privileges (sudo). Can be set at the play level or per-task. Use become_user to become a specific user. The connecting user must have sudo access on the target.
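A minimal sketch of play-level vs per-task escalation (the package, path, and user are placeholders):

```yaml
- name: Privilege escalation examples
  hosts: webservers
  become: false                    # default: run as the connecting SSH user
  tasks:
    - name: Install a package as root
      ansible.builtin.apt:
        name: htop
        state: present
      become: true                 # sudo to root for this task only

    - name: Run a migration as the app user
      ansible.builtin.command: /opt/app/bin/migrate   # hypothetical path
      become: true
      become_user: appuser         # sudo to a specific non-root user
```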

Tags

Tags let you run a subset of tasks. Add tags: [deploy, config] to tasks, then run with --tags deploy. Use --skip-tags to exclude. Tags are essential for large playbooks where you want to run only specific parts.
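A sketch of tagging and selective runs (task content is illustrative):

```yaml
tasks:
  - name: Deploy application artifact
    ansible.builtin.copy:
      src: app.tar.gz
      dest: /opt/app/app.tar.gz
    tags: [deploy]

  - name: Render application config
    ansible.builtin.template:
      src: app.conf.j2
      dest: /etc/app.conf
    tags: [deploy, config]

# Run only config tasks:      ansible-playbook site.yml --tags config
# Everything except deploy:   ansible-playbook site.yml --skip-tags deploy
```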

Includes and imports

# Import is static (resolved at parse time)
- import_tasks: tasks/common.yml

# Include is dynamic (resolved at runtime, supports loops and conditionals)
- include_tasks: "tasks/{{ ansible_os_family | lower }}.yml"

# Import a playbook
- import_playbook: webservers.yml
- import_playbook: dbservers.yml
Import vs Include

import_* is static — resolved at playbook parse time. Tags and conditions on an import apply to all tasks inside it. include_* is dynamic — resolved at runtime. This means you can use variables in the filename, but tags on the include statement itself do not propagate to tasks within the included file. To push tags into included tasks, nest them under the apply keyword of include_tasks. Use imports for static structure, includes for dynamic/conditional loading.
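The apply pattern, written out (db.yml is a placeholder filename):

```yaml
- name: Include DB tasks and push the db tag onto every included task
  ansible.builtin.include_tasks:
    file: db.yml
    apply:
      tags: [db]
  tags: [db]    # the include itself must also carry the tag so --tags db reaches it
```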

05

Variables & Facts

Variables in Ansible come from many sources, and understanding variable precedence is critical. Ansible has 22 levels of variable precedence. When the same variable is defined in multiple places, the highest-precedence source wins.

Variable precedence (simplified, highest wins)

| Priority | Source                             | Notes                                                      |
|----------|------------------------------------|------------------------------------------------------------|
| Highest  | --extra-vars (-e)                  | Command line. Always wins. Use for overrides and CI/CD.    |
| High     | Task vars (block/task level)       | Scoped to specific tasks                                   |
| High     | include_vars / set_fact            | Runtime-defined variables                                  |
| Medium   | Play vars, vars_files, vars_prompt | Defined in the playbook                                    |
| Medium   | Host facts (ansible_*)             | Gathered from target system                                |
| Low-Med  | host_vars/*                        | Per-host variable files                                    |
| Low-Med  | group_vars/*                       | Per-group variable files (child groups override parents)   |
| Low      | Inventory variables                | Defined inline in inventory                                |
| Low      | Role defaults (defaults/main.yml)  | Designed to be overridden. Lowest role-level precedence.   |
| Lowest   | Command line defaults              | Ansible configuration defaults                             |
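A concrete example of the precedence ladder — the same variable defined at three levels (values are illustrative); the extra var wins, and the group var beats the role default:

```yaml
# roles/webserver/defaults/main.yml  (low precedence -- designed to be overridden)
http_port: 80

# inventory/group_vars/webservers.yml  (overrides the role default)
http_port: 8080

# Command line (highest precedence -- wins regardless of the files above):
#   ansible-playbook site.yml -e "http_port=9090"    ->  http_port is 9090
```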

Ansible facts

When gather_facts: true (the default), Ansible runs the setup module on each host to collect system information. Facts are available as variables prefixed with ansible_:

# Common facts
ansible_hostname          # web1
ansible_fqdn              # web1.example.com
ansible_distribution      # Ubuntu
ansible_distribution_version  # 22.04
ansible_os_family         # Debian
ansible_memtotal_mb       # 8192
ansible_processor_vcpus   # 4
ansible_default_ipv4.address  # 10.0.1.50
ansible_devices            # disk info
ansible_mounts             # mounted filesystems

# Use facts in templates and conditionals
- name: Install packages (Debian)
  apt:
    name: nginx
    state: present
  when: ansible_os_family == "Debian"

- name: Install packages (RedHat)
  dnf:
    name: nginx
    state: present
  when: ansible_os_family == "RedHat"

Registered variables

- name: Check if application is running
  command: systemctl is-active myapp
  register: app_status
  ignore_errors: true

- name: Start application if not running
  service:
    name: myapp
    state: started
  when: app_status.rc != 0

- name: Debug output
  debug:
    msg: "App status: {{ app_status.stdout }}, return code: {{ app_status.rc }}"

Jinja2 templating

# Variable interpolation
message: "Hello {{ username }}"

# Filters
ip_list: "{{ groups['webservers'] | map('extract', hostvars, 'ansible_host') | list }}"
config_hash: "{{ lookup('file', 'app.conf') | hash('sha256') }}"
default_value: "{{ custom_port | default(8080) }}"

# Conditionals in templates (Jinja2)
{% if environment == 'production' %}
log_level: warn
{% else %}
log_level: debug
{% endif %}

# Loops in templates
{% for host in groups['webservers'] %}
server {{ hostvars[host]['ansible_host'] }}:{{ http_port }};
{% endfor %}

Magic variables

Ansible provides special built-in variables that are always available:

  • inventory_hostname — The name of the current host as defined in inventory
  • groups — Dictionary of all groups and their host lists
  • hostvars — Dictionary of all host variables (access another host's vars)
  • play_hosts — List of hosts in the current play
  • ansible_play_batch — List of hosts in the current batch (respects serial)
  • role_path — Path to the current role directory
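For example, a typical cross-host lookup combining groups and hostvars (variable and file names are illustrative):

```yaml
- name: Point each web server at the first database host
  ansible.builtin.template:
    src: app.conf.j2
    dest: /etc/app.conf
  vars:
    db_host: "{{ hostvars[groups['dbservers'][0]]['ansible_default_ipv4']['address'] }}"

- name: Show which host this task is currently running for
  ansible.builtin.debug:
    var: inventory_hostname
```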
Variable Debugging

When a variable has an unexpected value, run the playbook with increased verbosity (ansible-playbook site.yml --check -vvv) and inspect resolved values with the debug module — debug: var: my_variable is your best friend. For complex precedence issues, remember: extra vars always win, role defaults always lose. Everything else is a spectrum in between.

06

Roles & Galaxy

Roles are Ansible's mechanism for organizing playbook content into reusable, shareable units. A role bundles tasks, handlers, templates, files, variables, and defaults into a standard directory structure. Instead of a 500-line playbook, you have small, focused roles that can be composed together.

Role directory structure

roles/
  webserver/
    tasks/
      main.yml          # Entry point - task list
      install.yml        # Included by main.yml
      configure.yml
    handlers/
      main.yml          # Handlers (restart services, etc.)
    templates/
      nginx.conf.j2     # Jinja2 templates
      vhost.conf.j2
    files/
      index.html        # Static files to copy
    vars/
      main.yml          # Role variables (high precedence)
    defaults/
      main.yml          # Default variables (low precedence, meant to be overridden)
    meta/
      main.yml          # Role metadata, dependencies, Galaxy info
    tests/
      test.yml          # Test playbook
    README.md

Using roles in playbooks

---
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    # Simple role inclusion
    - webserver

    # Role with variables
    - role: webserver
      vars:
        nginx_port: 8080
        ssl_enabled: true

    # Role with conditional
    - role: monitoring
      when: enable_monitoring | default(true)

    # Role with tags
    - role: security
      tags: [security, hardening]

Ansible Galaxy

Ansible Galaxy is the public repository for community-shared roles and collections. Instead of writing everything from scratch, you can use battle-tested roles from the community.

# Install a role from Galaxy
ansible-galaxy install geerlingguy.docker
ansible-galaxy install geerlingguy.postgresql

# Install a collection
ansible-galaxy collection install community.general
ansible-galaxy collection install amazon.aws

# Install from a requirements file
ansible-galaxy install -r requirements.yml
ansible-galaxy collection install -r requirements.yml
# requirements.yml
roles:
  - name: geerlingguy.docker
    version: "7.1.0"
  - name: geerlingguy.postgresql
    version: "4.0.3"
  - name: geerlingguy.certbot
    version: "5.1.0"

collections:
  - name: community.general
    version: ">=8.0.0"
  - name: amazon.aws
    version: ">=7.0.0"
  - name: ansible.posix
    version: ">=1.5.0"

Collections vs roles

Roles

  • Bundle tasks, templates, handlers, and variables
  • One role = one purpose (e.g., install nginx)
  • Can contain custom modules (in library/) and plugins, but collections are the preferred distribution format for reusable modules/plugins
  • Installed to ~/.ansible/roles/ or project roles/
  • Simpler, focused on playbook organization

Collections

  • Bundle roles, modules, plugins, and playbooks together
  • Namespaced: amazon.aws, community.general
  • Can contain custom modules and plugins
  • The modern distribution format for Ansible content
  • Installed to ~/.ansible/collections/
Best Practice

Pin versions in requirements.yml. An unpinned Galaxy role can break your playbook when the author pushes a breaking change. Use version constraints (version: "7.1.0" or version: ">=7.0.0,<8.0.0") and test upgrades explicitly. Treat Galaxy roles the same way you treat library dependencies in application code — pin, test, upgrade deliberately.

07

Modules & Plugins

Ansible ships with thousands of modules. Knowing which module to use for a given task is the difference between clean, idempotent automation and fragile shell scripts wrapped in YAML. Here are the modules you will use most often.

Essential modules

| Module          | Purpose                        | Example                                                |
|-----------------|--------------------------------|--------------------------------------------------------|
| apt / yum / dnf | Package management             | apt: name=nginx state=present                          |
| copy            | Copy files to remote           | copy: src=app.conf dest=/etc/app.conf                  |
| template        | Deploy Jinja2 templates        | template: src=nginx.conf.j2 dest=/etc/nginx/nginx.conf |
| file            | Manage files/dirs/links        | file: path=/data state=directory mode='0755'           |
| service / systemd | Manage services              | service: name=nginx state=started enabled=true         |
| user            | Manage user accounts           | user: name=deploy shell=/bin/bash                      |
| lineinfile      | Ensure a line in a file        | lineinfile: path=/etc/hosts line="10.0.1.5 db1"        |
| uri             | HTTP requests                  | uri: url=https://api.example.com method=GET            |
| command         | Run a command (no shell)       | command: /usr/bin/myapp --init                         |
| shell           | Run via shell (pipes, redirects) | shell: cat /etc/hosts \| grep db                     |

When to use command/shell vs dedicated modules

Anti-Pattern

Do not use command or shell when a dedicated module exists. For example, shell: apt-get install -y nginx is not idempotent — it runs every time. apt: name=nginx state=present checks first and only installs if needed. Use command/shell only when no module exists for your use case, and always add creates, removes, or when conditions to make them idempotent.

# BAD - not idempotent, runs every time
- name: Install nginx
  shell: apt-get install -y nginx

# GOOD - idempotent, checks state first
- name: Install nginx
  apt:
    name: nginx
    state: present

# ACCEPTABLE - command with idempotency guard
- name: Initialize the application database
  command: /opt/myapp/bin/init-db.sh
  args:
    creates: /opt/myapp/data/.initialized  # Skip if this file exists

# ACCEPTABLE - shell with conditional
- name: Check if cluster is healthy
  shell: kubectl get nodes | grep -c Ready
  register: node_count
  changed_when: false  # This is a read-only check

Template module deep dive

The template module is one of Ansible's most powerful features. It takes a Jinja2 template file and renders it with Ansible variables, then deploys the result to the remote host.

# templates/nginx.conf.j2
upstream app_servers {
{% for host in groups['webservers'] %}
    server {{ hostvars[host]['ansible_host'] }}:{{ app_port }};
{% endfor %}
}

server {
    listen {{ nginx_port | default(80) }};
    server_name {{ server_name }};

    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
{% if ssl_enabled | default(false) %}
    listen 443 ssl;
    ssl_certificate {{ ssl_cert_path }};
    ssl_certificate_key {{ ssl_key_path }};
{% endif %}
}

Custom modules

When no built-in module fits your needs, you can write custom modules in Python. Place them in a library/ directory next to your playbook or in a collection. Custom modules receive arguments as JSON, do their work, and return JSON results with changed, failed, and msg fields. Use the ansible.module_utils.basic.AnsibleModule class for argument parsing, check mode support, and result handling.

Module Documentation

Use ansible-doc <module_name> to see full documentation, examples, and return values for any module. For example, ansible-doc template shows all parameters, defaults, and usage examples. This is faster than searching the web and works offline.

08

CI/CD Integration

Running Ansible in CI/CD pipelines is the standard way to automate deployments. The pattern is straightforward: your pipeline checks out the playbook repo, installs Ansible, and runs ansible-playbook with the appropriate inventory and vault credentials. The challenge is managing SSH keys, secrets, and inventory in a CI environment.

GitHub Actions

# .github/workflows/deploy.yml
name: Deploy Application
on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Ansible
        run: |
          pip install ansible boto3

      - name: Install Galaxy dependencies
        run: ansible-galaxy install -r requirements.yml

      - name: Set up SSH key
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/id_rsa
          chmod 600 ~/.ssh/id_rsa
          ssh-keyscan -H ${{ secrets.DEPLOY_HOST }} >> ~/.ssh/known_hosts

      - name: Run playbook
        env:
          ANSIBLE_VAULT_PASSWORD: ${{ secrets.VAULT_PASSWORD }}
        run: |
          echo "$ANSIBLE_VAULT_PASSWORD" > .vault_pass
          ansible-playbook site.yml \
            -i inventory/production/ \
            --vault-password-file .vault_pass \
            -e "app_version=${{ github.sha }}"
          rm -f .vault_pass

GitLab CI

# .gitlab-ci.yml
stages:
  - lint
  - deploy

lint:
  stage: lint
  image: python:3.11
  script:
    - pip install ansible-lint
    - ansible-lint site.yml

deploy_staging:
  stage: deploy
  image: python:3.11
  environment:
    name: staging
  before_script:
    - pip install ansible boto3
    - ansible-galaxy install -r requirements.yml
    - mkdir -p ~/.ssh
    - echo "$SSH_PRIVATE_KEY" > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    - echo "$VAULT_PASSWORD" > .vault_pass
  script:
    - ansible-playbook site.yml
        -i inventory/staging/
        --vault-password-file .vault_pass
        -e "app_version=${CI_COMMIT_SHA}"
  after_script:
    - rm -f .vault_pass ~/.ssh/id_rsa
  only:
    - main

Ansible in Docker

# Dockerfile for an Ansible runner
FROM python:3.11-slim

RUN pip install --no-cache-dir \
    ansible-core \
    boto3 \
    jmespath \
    ansible-lint

COPY requirements.yml /ansible/
RUN ansible-galaxy install -r /ansible/requirements.yml && \
    ansible-galaxy collection install -r /ansible/requirements.yml

WORKDIR /ansible
ENTRYPOINT ["ansible-playbook"]
# Run Ansible from Docker
docker run --rm \
  -v $(pwd):/ansible \
  -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
  -e ANSIBLE_HOST_KEY_CHECKING=false \
  my-ansible-runner site.yml -i inventory/production/

AWX / Ansible Automation Platform

For teams that need a web UI, RBAC, job scheduling, and audit trails, AWX (open-source) or Ansible Automation Platform (Red Hat commercial) provides a centralized platform. AWX stores credentials securely, manages inventories, and provides a REST API for triggering playbook runs from other systems. It is essentially "Jenkins for Ansible" but purpose-built.

Secrets in CI/CD

Never store vault passwords or SSH keys in your repository. Use your CI/CD platform's secret management (GitHub Secrets, GitLab CI Variables, etc.). The vault password should be injected at runtime via an environment variable or a temporary file that is cleaned up after the run. For production, consider integrating with HashiCorp Vault or cloud KMS to fetch secrets dynamically during playbook execution using lookup plugins.

09

Ansible Vault

Ansible Vault provides encryption for sensitive data such as passwords, API keys, and certificates. It uses AES-256 symmetric encryption. You encrypt files or individual strings with a vault password, commit the encrypted content to version control, and provide the vault password at runtime to decrypt.

Encrypting files

# Encrypt an entire file
ansible-vault encrypt group_vars/production/secrets.yml

# Create a new encrypted file
ansible-vault create group_vars/production/secrets.yml

# Edit an encrypted file (decrypts in-place for editing)
ansible-vault edit group_vars/production/secrets.yml

# View encrypted file contents
ansible-vault view group_vars/production/secrets.yml

# Decrypt a file permanently
ansible-vault decrypt group_vars/production/secrets.yml

# Re-key (change the vault password)
ansible-vault rekey group_vars/production/secrets.yml

Encrypting individual strings

# Encrypt a single string (inline in a YAML file)
ansible-vault encrypt_string 'SuperSecretPassword123' --name 'db_password'

# Output (paste this into your vars file):
# db_password: !vault |
#   $ANSIBLE_VAULT;1.1;AES256
#   62313365396662343061393464336163...
# group_vars/production/secrets.yml (mix of plain and encrypted)
app_environment: production
app_debug: false
db_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  62313365396662343061393464336163383764356462376564656232...
api_key: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  33356134653765633035313038376432336531303365616438...

Vault IDs (multiple passwords)

Vault IDs let you use different passwords for different environments or sensitivity levels:

# Encrypt with a vault ID
ansible-vault encrypt --vault-id prod@prompt group_vars/production/secrets.yml
ansible-vault encrypt --vault-id dev@prompt group_vars/staging/secrets.yml

# Use a password file per environment
ansible-vault encrypt --vault-id prod@.vault_pass_prod secrets.yml

# Run playbook with multiple vault IDs
ansible-playbook site.yml \
  --vault-id dev@.vault_pass_dev \
  --vault-id prod@.vault_pass_prod

Using vault in playbooks

# Provide vault password interactively
ansible-playbook site.yml --ask-vault-pass

# Provide vault password from a file
ansible-playbook site.yml --vault-password-file .vault_pass

# Provide vault password from an environment variable (CI/CD pattern)
echo "$VAULT_PASSWORD" > /tmp/vault_pass
ansible-playbook site.yml --vault-password-file /tmp/vault_pass
rm -f /tmp/vault_pass

# Or use a script that outputs the password
ansible-playbook site.yml --vault-password-file get_vault_pass.sh
Vault Limitations

Ansible Vault is file-level encryption, not a secrets manager. It does not support access control, audit logs, secret rotation, or dynamic secrets. For production environments, pair Vault-encrypted files for static config with a proper secrets manager (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault) for dynamic secrets. Use the community.hashi_vault.hashi_vault lookup plugin to fetch secrets at runtime without storing them in files at all.
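A sketch of that runtime-lookup pattern (the Vault path, field name, and template file are assumptions — adjust to your Vault layout; the lookup reads VAULT_ADDR and VAULT_TOKEN from the environment by default):

```yaml
- name: Fetch a secret from HashiCorp Vault at runtime
  hosts: webservers
  vars:
    # secret=<path>:<field> -- path and field here are placeholders
    db_password: "{{ lookup('community.hashi_vault.hashi_vault', 'secret=secret/data/myapp:password') }}"
  tasks:
    - name: Use the secret without ever writing it to a vars file
      ansible.builtin.template:
        src: db.conf.j2
        dest: /etc/db.conf
        mode: '0600'
      no_log: true    # keep the rendered value out of Ansible output
```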

10

Best Practices

Recommended directory layout

ansible-project/
  ansible.cfg              # Project-level Ansible configuration
  site.yml                 # Main playbook (imports others)
  webservers.yml           # Playbook for web tier
  dbservers.yml            # Playbook for database tier
  requirements.yml         # Galaxy role/collection dependencies
  inventory/
    production/
      hosts.yml            # Production inventory
      group_vars/
        all.yml
        webservers.yml
        dbservers/
          main.yml
          vault.yml        # Encrypted secrets
      host_vars/
    staging/
      hosts.yml
      group_vars/
  roles/
    common/                # Shared role (NTP, users, packages)
    webserver/             # Web server role
    database/              # Database role
  playbooks/               # Additional playbooks
    rolling-update.yml
    backup.yml
  templates/               # Global templates (if not in roles)
  files/                   # Global static files
  library/                 # Custom modules
  filter_plugins/          # Custom Jinja2 filters

Idempotency checklist

  • Use dedicated modules instead of command/shell whenever possible
  • When using command/shell, add creates:, removes:, or when: guards
  • Mark read-only commands with changed_when: false
  • Use state: present / state: absent instead of install/remove commands
  • Run the playbook twice: a properly idempotent playbook reports zero changes on the second run. Use --check --diff to preview what a run would change without touching the system
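The guard patterns above look like this in practice (the script paths and the marker file are illustrative examples):

```yaml
# Illustrative tasks applying the idempotency checklist
- name: Install nginx with a state, not an install command
  ansible.builtin.package:
    name: nginx
    state: present

- name: Initialize the database only once
  ansible.builtin.command: /usr/local/bin/init-db.sh
  args:
    creates: /var/lib/app/.initialized   # skipped when the marker file exists

- name: Read the current schema version (read-only)
  ansible.builtin.command: /usr/local/bin/schema-version.sh
  register: schema_version
  changed_when: false                    # never report this as a change
```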

Check mode and diff mode

# Dry run - show what WOULD change without making changes
ansible-playbook site.yml --check

# Diff mode - show the exact changes (file diffs)
ansible-playbook site.yml --check --diff

# Combine with limit for safe testing
ansible-playbook site.yml --check --diff --limit web1.example.com

# Some tasks don't support check mode - mark them:
# check_mode: false  (always run for real, even under --check)
# check_mode: true   (always run in check mode - never makes changes)
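A per-task check_mode override looks like this (the status script is just an example of a read-only command):

```yaml
- name: Query cluster status (read-only, safe during --check)
  ansible.builtin.command: /usr/local/bin/cluster-status.sh
  check_mode: false     # run for real even when the play uses --check
  changed_when: false   # a read-only query is never a change
```

This pattern lets later tasks use the registered output even during a dry run, where the command would otherwise be skipped.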

Linting with ansible-lint

# Install
pip install ansible-lint

# Run against a playbook
ansible-lint site.yml

# Run against all YAML in the project
ansible-lint

# Common rules it catches:
# - Using command/shell instead of a dedicated module
# - Missing name on tasks
# - Using deprecated syntax
# - Trailing whitespace
# - Risky file permissions
# - Using bare variables in when clauses

# .ansible-lint (configuration file)
skip_list:
  - yaml[line-length]
  - name[casing]
warn_list:
  - experimental
exclude_paths:
  - .cache/
  - .github/
  - molecule/
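To run the linter on every PR, a minimal CI job can look like this sketch, shown as a hypothetical GitHub Actions workflow (adapt the equivalent to your CI system):

```yaml
# .github/workflows/lint.yml (example workflow)
name: lint
on: [pull_request]
jobs:
  ansible-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install ansible-lint
      - run: ansible-lint
```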

Testing with Molecule

Molecule is the standard testing framework for Ansible roles. It creates ephemeral test instances (Docker containers, VMs, cloud instances), runs your role against them, and verifies the result with Testinfra tests or Ansible assert tasks.

# Initialize Molecule for an existing role
cd roles/webserver
molecule init scenario  # then set the driver in molecule.yml (e.g. docker via molecule-plugins)

# Run the full test lifecycle
molecule test
# This runs (abridged): create -> converge -> idempotence -> verify -> destroy

# Run individual steps for development
molecule create       # Spin up test containers
molecule converge     # Run the role
molecule idempotence  # Run again, verify zero changes
molecule verify       # Run verification tests
molecule destroy      # Clean up
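A molecule.yml for the Docker driver might look like this sketch (Docker support comes from the molecule-plugins package; the container image is an example):

```yaml
# roles/webserver/molecule/default/molecule.yml (illustrative)
driver:
  name: docker
platforms:
  - name: instance
    image: registry.access.redhat.com/ubi9/ubi-init
    command: /sbin/init        # systemd-capable image for service tests
    privileged: true
provisioner:
  name: ansible
verifier:
  name: ansible                # verify.yml with assert tasks
```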
Golden Rule

A good Ansible project should pass three tests: (1) ansible-lint reports no errors, (2) --check --diff on a configured system shows zero changes (proving idempotency), and (3) molecule test passes on a clean system (proving the role works from scratch). If all three pass, you have automation you can trust.

11

Consultant's Checklist

When assessing or setting up Ansible automation for a client, verify the following:

Foundation

  • Playbooks and roles are in version control (Git)
  • Inventory is organized by environment (production, staging, dev)
  • Secrets are encrypted with Ansible Vault or external KMS
  • SSH key management is centralized (no shared keys)
  • ansible.cfg is project-scoped, not global

Quality

  • ansible-lint runs in CI on every PR
  • Roles have Molecule tests
  • Playbooks are idempotent (second run = zero changes)
  • No raw command/shell where modules exist
  • Templates use {{ ansible_managed }} header comment

Organization

  • Roles are small and focused (one role = one concern)
  • Galaxy dependencies are pinned in requirements.yml
  • Variables follow naming conventions (role-prefixed)
  • group_vars/host_vars are used instead of inline variables
  • Tags are used for selective execution
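A pinned requirements.yml looks like this (the role and collection names and version numbers are illustrative):

```yaml
# requirements.yml - pin versions so runs are reproducible
roles:
  - name: geerlingguy.nginx
    version: "3.1.4"
collections:
  - name: community.general
    version: ">=8.0.0,<9.0.0"
  - name: ansible.posix
    version: "1.5.4"
```

Install with `ansible-galaxy install -r requirements.yml`, and re-run it in CI so every environment resolves the same versions.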

Operations

  • CI/CD pipeline runs playbooks (not humans from laptops)
  • Rolling deployments use serial: to limit blast radius
  • --check --diff is run before applying changes to production
  • Callback plugins or AWX provide run history and audit trail
  • Dynamic inventory is used for cloud environments
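A rolling-update play combining several of these points might look like this sketch (the group, package, and service names are examples):

```yaml
# Hypothetical rolling update: serial: 2 limits the blast radius
# by updating two hosts at a time.
- name: Roll out a new application version
  hosts: webservers
  serial: 2
  max_fail_percentage: 25    # abort the play if a batch fails badly
  tasks:
    - name: Install the pinned application version
      ansible.builtin.package:
        name: "myapp-{{ app_version | default('1.2.3') }}"
        state: present
      notify: Restart myapp
  handlers:
    - name: Restart myapp
      ansible.builtin.service:
        name: myapp
        state: restarted
```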
Maturity Progression

Level 1: Ad-hoc playbooks run manually from a developer's laptop. Level 2: Playbooks in Git, manual execution from a bastion host. Level 3: CI/CD runs playbooks automatically, ansible-lint in PR checks, Vault for secrets. Level 4: AWX/AAP for centralized management, Molecule tests for all roles, dynamic inventory, full audit trail. Most teams should aim for Level 3 as a baseline. Level 4 is for organizations with multiple teams sharing Ansible automation.