User Data and Bootstrap

What is User Data in AWS?

User Data is custom configuration passed to an EC2 instance at launch.
It automates bootstrapping tasks, especially during the first boot.

Common Use Cases

- Installing software
- Setting environment variables
- Configuring system settings
- Starting services automatically

Note

User data is typically written as a shell script or as a cloud-init configuration (cloud-config).

Warning

⚠️ By default it runs only on the instance's first boot, not on later reboots or stop/start cycles, unless cloud-init is configured otherwise.
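Because the script may run again if cloud-init is configured to rerun it, it can help to make each setup step safe to repeat. The following is a minimal sketch of one way to do that with marker files. The names `run_once` and `MARKER_DIR` are hypothetical and not part of cloud-init, and the demo uses a temporary directory where a real instance would need a persistent one.

```shell
#!/bin/bash
# Sketch: run a setup step at most once, even if user data is executed again.
# MARKER_DIR and run_once are hypothetical names, not part of cloud-init.
MARKER_DIR="${MARKER_DIR:-$(mktemp -d)}"   # temp dir for the demo; use a persistent dir on an instance

run_once() {
  local name="$1"; shift
  local marker="$MARKER_DIR/$name.done"
  if [ -e "$marker" ]; then
    echo "skip: $name already completed"
    return 0
  fi
  # Run the step and record the marker only if it succeeded.
  "$@" && touch "$marker"
}

# The second call is skipped because the first one left a marker file.
run_once greet echo "Setting up my app..."
run_once greet echo "Setting up my app..."
```

A step that fails leaves no marker, so it is retried on the next run.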

Sample User Data Script

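#!/bin/bash
# User data scripts run as root, so no sudo is needed.
yum install -y docker            # install Docker from the distro repositories
systemctl enable --now docker    # start Docker now and on every later boot
echo "Setting up my app..."      # output ends up in /var/log/cloud-init-output.log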

How to Check User Data Logs

🧑‍💻 SSH into the EC2 instance.
📂 Run these commands:

cat /var/log/cloud-init-output.log
cat /var/log/cloud-init.log
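These logs can be long, so filtering for likely failures is often quicker than reading them top to bottom. A small sketch follows, assuming a hypothetical helper named `find_boot_errors`. The search pattern is a guess at what failures look like and may need tuning for your scripts.

```shell
#!/bin/bash
# Sketch: print numbered lines from a cloud-init log that look like failures.
# find_boot_errors is a hypothetical helper; the pattern is an assumption.
find_boot_errors() {
  local log="${1:-/var/log/cloud-init-output.log}"
  grep -n -i -E 'error|fail|traceback' "$log"
}
```

On the instance, `find_boot_errors` scans the default output log, and `find_boot_errors /var/log/cloud-init.log` scans the more detailed one. An empty result only means the pattern did not match, not that the boot succeeded.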

Alternative (New Interface)

In the EC2 console, select the instance and navigate to 👉 Actions > Monitor and troubleshoot > Get system log

Chef RunList Integration

Example User Data with Chef

{
  "devices": {
    "sdd": {
      "file_system": "ext4",
      "mount_point": "/apps/proman",
      "mount_opts": "nodev,nosuid"
    }
  },
  "Chef_RunList": [
    "role[gtis_apphosting_ecs_cluster_role_v1840]",
    "role[b-chef-client-wrapper]",
    "role[is_jaas_secrets_linux-install]",
    "role[barclays_proman_csm_wrapper-prod-role]",
    "role[barclays_proman_cloudwatch_agent-mid-linux-prod]",
    "role[barclays_proman_cloudwatch_agent-cwa-evg-prod]"
  ],
  "build_mode": "normal",
  "ecs.config": {
    "ECS_CLUSTER": "proman-prod-ecs-01-Cluster",
    "ECS_ENABLE_CONTAINER_METADATA": "true"
  }
}

Explanation of Each Field

Field	Description
devices	Specifies custom mount points and options for a volume
Chef_RunList	A list of roles assigned via Chef that define what gets installed/configured on the instance
build_mode	Indicates if the build is “normal” or any special variant like testing or DR
ecs.config	ECS-specific settings, such as which ECS cluster the instance should join and whether container metadata is enabled
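The ECS container agent reads its settings as KEY=value lines from /etc/ecs/ecs.config. How the tooling behind this user data turns the "ecs.config" field into that file is not shown here, so the following is only a sketch of the end result. It uses a hypothetical helper named `write_ecs_config` and a temporary file in place of the real path.

```shell
#!/bin/bash
# Sketch: write ECS agent settings as KEY=value lines, one per line.
# write_ecs_config is a hypothetical helper; the real tooling that consumes
# the "ecs.config" field is not shown in this note.
write_ecs_config() {
  local target="$1"; shift
  mkdir -p "$(dirname "$target")"
  printf '%s\n' "$@" > "$target"
}

# Demo against a temp file; on an instance the target would be /etc/ecs/ecs.config.
demo_file="$(mktemp)"
write_ecs_config "$demo_file" \
  "ECS_CLUSTER=proman-prod-ecs-01-Cluster" \
  "ECS_ENABLE_CONTAINER_METADATA=true"
cat "$demo_file"
```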

What is Chef_RunList?

Chef_RunList is the name this user data gives to a Chef run list. The run list is a key concept in Chef, a configuration management tool used to automate infrastructure. It is essentially an ordered to-do list for the node (your EC2 instance) that tells Chef which roles or recipes to apply, and in which order.

How it works

1. Chef Client runs on the instance (usually via a script or systemd service).
2. It contacts the Chef Server and pulls down:
   - The run list (from your Chef_RunList)
   - All the roles and recipes associated with it
3. Chef then applies them in order. Each role may configure:
   - Software packages
   - Environment variables
   - ECS agents
   - Monitoring tools
   - Secrets or JAAS files
4. Logs go wherever log_location in client.rb points, often /var/log/chef/client.log or the system logs when Chef runs as a service. A failed run typically also leaves a stack trace under /var/chef/cache.
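One common way to hand a run list to Chef Client is a JSON attributes file passed with the `-j` option. How the in-house tooling translates Chef_RunList into an actual Chef run is not shown in this note, so the following is only a sketch. The helper `write_first_boot_json` and the file name first-boot.json are illustrative choices.

```shell
#!/bin/bash
# Sketch: build a JSON attributes file that chef-client can read with -j.
# write_first_boot_json is a hypothetical helper; role names are passed as arguments.
write_first_boot_json() {
  local target="$1"; shift
  local items="" role
  for role in "$@"; do
    # Append each role as a quoted "role[...]" entry, comma-separated.
    items="${items:+$items, }\"role[$role]\""
  done
  printf '{ "run_list": [ %s ] }\n' "$items" > "$target"
}

# Demo with two of the roles from the example above, written to a temp file.
demo_json="$(mktemp)"
write_first_boot_json "$demo_json" b-chef-client-wrapper is_jaas_secrets_linux-install
cat "$demo_json"

# On a node with Chef installed, the run would then be started with something like:
#   chef-client -j /etc/chef/first-boot.json
```

Chef applies the entries in the order they appear in the list, so the order of the arguments matters.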

How does the instance know where the Chef Server is?

The Chef Client needs a few config files to know:

- Where to connect (Chef Server URL/IP)
- Which node it is
- Which validation key or certs to use

These details are usually in:

/etc/chef/client.rb

Example of client.rb

log_level :info
log_location STDOUT
chef_server_url "https://chef.barclays.net/organizations/prod"
validation_client_name "prod-validator"
node_name "ldwpsr04sujeoy2"

How does this file get there?

1. Pre-baked in the AMI ✅
   - Most likely in enterprise setups
   - The AMI already has /etc/chef/client.rb and the required certs
2. Injected at runtime via User Data or cloud-init
   - A bootstrap script writes the file during first boot
3. Using knife bootstrap
   - Applies when the instance is bootstrapped manually or programmatically with knife, which writes the file on the node
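For the runtime-injection option, a bootstrap script can render client.rb itself. Below is a minimal sketch under that assumption. The helper `write_client_rb` is hypothetical, the values are the ones from the client.rb example above, and the demo writes to a temporary directory, not to /etc/chef.

```shell
#!/bin/bash
# Sketch: render client.rb from a bootstrap script.
# write_client_rb is a hypothetical helper, not part of Chef.
write_client_rb() {
  local target="$1" server_url="$2" validator="$3" node="$4"
  mkdir -p "$(dirname "$target")"
  cat > "$target" <<EOF
log_level :info
log_location STDOUT
chef_server_url "$server_url"
validation_client_name "$validator"
node_name "$node"
EOF
}

# Demo in a temp directory; on an instance the target would be /etc/chef/client.rb.
demo_dir="$(mktemp -d)"
write_client_rb "$demo_dir/client.rb" \
  "https://chef.barclays.net/organizations/prod" "prod-validator" "ldwpsr04sujeoy2"
cat "$demo_dir/client.rb"
```

The keys and certificates still have to reach the instance some other way, since they should not be embedded in plain-text user data.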

Authentication

Chef uses:

- A validator key, used only the first time the node registers with the Chef Server
- A per-node client key (/etc/chef/client.pem) for every authenticated request after that
- HTTPS to secure communication with the server
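When debugging a node that cannot authenticate, a quick check of which key files are present can help. The following sketch assumes the usual default file names (client.pem and validation.pem) in /etc/chef. The helper `chef_auth_mode` is hypothetical.

```shell
#!/bin/bash
# Sketch: report which credential a node would authenticate with.
# chef_auth_mode is a hypothetical helper; the file names are assumed defaults.
chef_auth_mode() {
  local dir="${1:-/etc/chef}"
  if [ -f "$dir/client.pem" ]; then
    echo "client-key"        # node has already registered
  elif [ -f "$dir/validation.pem" ]; then
    echo "validator-key"     # first run: node still has to register
  else
    echo "none"              # no usable key found in this directory
  fi
}

chef_auth_mode
```

Passing a directory, as in `chef_auth_mode /some/other/dir`, checks a non-default location.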

User Data vs Chef Trade-offs

Aspect	User Data	Chef Recipes/Roles
🔧 Flexibility	Limited to basic scripting	Full config management DSL
♻️ Reusability	Hard to reuse across instances	Easily reusable via roles/cookbooks
📚 Maintainability	Becomes messy FAST	Organized into versioned cookbooks
🔄 Updates	Need to relaunch instance or SSH in	Can update configs via Chef runs
👥 Team collaboration	Not ideal for large teams	Designed for collab & governance
📦 Package/Service Management	Manual	Native resources: package, service, template, etc.
🧪 Testing/Validation	Almost none	Test Kitchen, linting, CI/CD possible
🔐 Secrets management	Manual or via external tools	Chef Vault / Encrypted Data Bags

When is User Data Enough?

Use user data alone if:

- It’s a one-off test box or dev environment
- Setup is super simple (e.g., install NGINX + run app)
- You’re prototyping

When to use Chef (or config mgmt)?

Go with Chef when:

- You have multiple instances or environments
- You need consistent, tested configuration
- You want version control, modular design, and rollback
- Your org already uses Chef

Hybrid approach

A popular pattern:

- Use user data to bootstrap Chef
- Let Chef handle the rest

This keeps your base AMI clean and setup logic manageable.
