User Data and Bootstrap

What is User Data in AWS?

  • User Data is custom configuration passed to an EC2 instance at launch.
  • It automates bootstrapping tasks, especially during the first boot.

Common Use Cases

  • Installing software
  • Setting environment variables
  • Configuring system settings
  • Starting services automatically

Note

Typically written as a shell script or cloud-init configuration.

Warning

⚠️ By default, it runs only on the first boot of the instance, not on later reboots or stop/start cycles, unless cloud-init is configured otherwise.
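
If user data is configured to run on every boot (for example through cloud-init's `[scripts-user, always]` setting), one-time steps need their own guard. The following is a minimal sketch of such a guard, not a standard AWS or cloud-init feature: `run_once` and the marker directory `/var/lib/bootstrap-markers` are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# Hypothetical marker directory; any persistent, root-writable path works.
MARKER_DIR="${MARKER_DIR:-/var/lib/bootstrap-markers}"

# run_once NAME COMMAND [ARGS...]
# Runs COMMAND only if no marker for NAME exists yet.
run_once() {
  local name="$1"
  shift
  mkdir -p "$MARKER_DIR"
  if [ -e "$MARKER_DIR/$name.done" ]; then
    echo "skip: $name already completed"
    return 0
  fi
  # The marker is written only when the command succeeds,
  # so a failed step is retried on the next boot.
  "$@" && touch "$MARKER_DIR/$name.done"
}
```

A step would then be wrapped as `run_once install-docker yum install -y docker`.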

Sample User Data Script

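#!/bin/bash
# User data runs as root, so sudo is not needed.
set -euo pipefail
yum install -y docker
# Start Docker now and on every subsequent boot.
systemctl enable --now docker
echo "Setting up my app..."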

How to Check User Data Logs

🧑‍💻 SSH into the EC2 instance. 📂 Run these commands:

cat /var/log/cloud-init-output.log
cat /var/log/cloud-init.log
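
Reading the whole log is slow when only the failures matter. As a rough sketch, the helper below filters the output log for lines that look like errors. The function name `show_bootstrap_errors` and the keyword list are assumptions for illustration, not part of cloud-init.

```shell
#!/bin/bash
# show_bootstrap_errors [LOG_FILE]
# Prints lines (prefixed with their line number) that look like failures.
show_bootstrap_errors() {
  local log_file="${1:-/var/log/cloud-init-output.log}"
  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 2
  fi
  # Case-insensitive match; the keyword list is an assumption, tune it per workload.
  grep -n -i -E 'error|fail|traceback' "$log_file" \
    || echo "no obvious errors in $log_file"
}
```

Called without arguments it scans /var/log/cloud-init-output.log; pass another path to scan a different log.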

Alternative (New Interface)

In the EC2 console, select the instance and navigate to 👉 Actions > Monitor and troubleshoot > Get system log


Chef RunList Integration

Example User Data with Chef

{
  "devices": {
    "sdd": {
      "file_system": "ext4",
      "mount_point": "/apps/proman",
      "mount_opts": "nodev,nosuid"
    }
  },
  "Chef_RunList": [
    "role[gtis_apphosting_ecs_cluster_role_v1840]",
    "role[b-chef-client-wrapper]",
    "role[is_jaas_secrets_linux-install]",
    "role[barclays_proman_csm_wrapper-prod-role]",
    "role[barclays_proman_cloudwatch_agent-mid-linux-prod]",
    "role[barclays_proman_cloudwatch_agent-cwa-evg-prod]"
  ],
  "build_mode": "normal",
  "ecs.config": {
    "ECS_CLUSTER": "proman-prod-ecs-01-Cluster",
    "ECS_ENABLE_CONTAINER_METADATA": "true"
  }
}

Explanation of Each Field

Field        | Description
devices      | Per-device settings for an attached volume (here sdd): file system, mount point, and mount options
Chef_RunList | A list of roles assigned via Chef that define what gets installed/configured on the instance
build_mode   | Indicates whether the build is “normal” or a special variant such as testing or DR
ecs.config   | ECS-specific settings, such as which ECS cluster the instance should join
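
These notes do not show how the JSON is consumed on the instance; presumably an in-house bootstrap agent reads it. As an illustrative sketch only, the hypothetical helper below pulls the `role[...]` entries out of such a document and joins them with commas, the form chef-client accepts for a run-list. It assumes `role[...]` strings appear only inside Chef_RunList; a real JSON parser would be more robust.

```shell
#!/bin/bash
# extract_run_list: reads user data JSON on stdin and prints every
# role[...] entry as one comma-separated string, in original order.
extract_run_list() {
  # grep -o emits one match per line; paste joins the lines with commas.
  grep -o 'role\[[^]]*]' | paste -sd, -
}
```

For example, `extract_run_list < userdata.json`, where userdata.json is a placeholder for wherever the JSON has been saved.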

What is Chef_RunList?

Chef_RunList holds the node's run-list, a key concept in Chef, the configuration management tool used to automate infrastructure. It's basically an ordered to-do list for the node (your EC2 instance) that tells Chef which roles or recipes to apply, and in what order.

How it works

  1. Chef Client runs on the instance (usually via a script or systemd service)
  2. It contacts the Chef Server and pulls down:
    • The run list (from your Chef_RunList)
    • All the roles and recipes associated with it
  3. Chef then applies them in order. Each role may configure:
    • Software packages
    • Environment variables
    • ECS agents
    • Monitoring tools
    • Secrets or JAAS files
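  4. Logs go to the configured log_location (commonly /var/log/chef/client.log) or the system logs; cached cookbooks and failure stack traces usually sit under /var/chef/cache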
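
The steps above roughly correspond to a single chef-client invocation. The sketch below only prints that command so it can be reviewed before use. `chef_client_command` is a hypothetical helper, and the default log path is an assumption; `--once`, `-r`, and `-L` are existing chef-client options.

```shell
#!/bin/bash
# chef_client_command RUN_LIST [LOG_FILE]
# Prints (does not execute) a one-off chef-client invocation.
chef_client_command() {
  local run_list="$1"
  local log_file="${2:-/var/log/chef/client.log}"  # assumed log location
  # --once : single run instead of the daemon interval
  # -r     : run-list items, applied in the order given
  # -L     : log file location
  printf 'chef-client --once -r %s -L %s\n' "'$run_list'" "$log_file"
}
```

Note that `-r` replaces the run-list stored for the node on the Chef Server, whereas `-o` overrides it for a single run only. When the server already holds the right run-list, neither flag is needed.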

How does the instance know where the Chef Server is?

The Chef Client needs a few config files to know:

  • Where to connect (Chef Server URL/IP)
  • Which node it is
  • Which validation key or certs to use

These details are usually in:

/etc/chef/client.rb

Example of client.rb

log_level :info
log_location STDOUT
chef_server_url "https://chef.barclays.net/organizations/prod"
validation_client_name "prod-validator"
node_name "ldwpsr04sujeoy2"

How does this file get there?

  1. Pre-baked in the AMI ✅

    • Most likely in enterprise setups
    • The AMI already has /etc/chef/client.rb and required certs
  2. Injected at runtime via User Data or cloud-init

    • A bootstrap script writes client.rb and places the keys during first boot
  3. Using knife bootstrap

    • knife connects to the instance (manually or from automation), installs Chef Client, and writes client.rb for it
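
For the runtime-injection option, a user data script could generate the file itself. This is a minimal sketch under assumptions: `write_client_rb` is a hypothetical helper, the caller supplies the server URL and node name, and keys are delivered separately.

```shell
#!/bin/bash
# write_client_rb CONFIG_DIR SERVER_URL NODE_NAME
# Writes a minimal client.rb from values supplied by the caller.
write_client_rb() {
  local config_dir="$1" server_url="$2" node_name="$3"
  mkdir -p "$config_dir"
  cat > "$config_dir/client.rb" <<EOF
log_level :info
log_location STDOUT
chef_server_url "$server_url"
node_name "$node_name"
EOF
  # The config usually sits next to private keys, so restrict it to the owner.
  chmod 600 "$config_dir/client.rb"
}
```

On an instance the call might look like `write_client_rb /etc/chef "$CHEF_URL" "$(hostname)"`, where `CHEF_URL` is a placeholder variable.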

Authentication

Chef uses:

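  • A validator key (typically /etc/chef/validation.pem) on the first run, to register the node and obtain its own client key
  • The client key (/etc/chef/client.pem) to sign every request after that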
  • HTTPS to securely communicate with the server
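
When debugging a node that cannot authenticate, it helps to see which credential is actually present. The hypothetical helper below is a sketch that reports this from the files on disk; it assumes the default file names client.pem and validation.pem.

```shell
#!/bin/bash
# chef_auth_mode [CONFIG_DIR]
# Reports which credential chef-client would rely on, based on files present.
chef_auth_mode() {
  local config_dir="${1:-/etc/chef}"
  if [ -f "$config_dir/client.pem" ]; then
    echo "client-key"   # node is already registered
  elif [ -f "$config_dir/validation.pem" ]; then
    echo "validator"    # first run: register, then receive client.pem
  else
    echo "none"         # no credential found; the run will fail to authenticate
  fi
}
```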

User Data vs Chef Trade-offs

Aspect                        | User Data                            | Chef Recipes/Roles
🔧 Flexibility                | Limited to basic scripting           | Full config management DSL
♻️ Reusability                | Hard to reuse across instances       | Easily reusable via roles/cookbooks
📚 Maintainability            | Becomes messy fast                   | Organized into versioned cookbooks
🔄 Updates                    | Need to relaunch instance or SSH in  | Can update configs via Chef runs
👥 Team collaboration         | Not ideal for large teams            | Designed for collab & governance
📦 Package/Service Management | Manual                               | Native resources: package, service, template, etc.
🧪 Testing/Validation         | Almost none                          | Test Kitchen, linting, CI/CD possible
🔐 Secrets management         | Manual or via external tools         | Chef Vault / Encrypted Data Bags

When is User Data Enough?

Use only user data if:

  • It's a one-off test box or dev environment
  • Setup is super simple (e.g., install NGINX + run app)
  • You're prototyping

When to use Chef (or config mgmt)?

Go with Chef when:

  • You have multiple instances or environments
  • You need consistent, tested configuration
  • You want version control, modular design, and rollback
  • Your org already uses Chef

Hybrid approach

A popular pattern:

  • Use user data to bootstrap Chef
  • Let Chef handle the rest

This keeps your base AMI clean and setup logic manageable.
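
A user data script following this pattern can stay very small. The sketch below is one possible shape, not a prescribed setup: it assumes client.rb and keys are already in place, uses the public Omnitruck installer URL (an enterprise setup would more likely use an internal mirror or a pre-baked AMI), and defaults to a dry run that only prints the steps. `run` and `bootstrap_chef` are hypothetical names.

```shell
#!/bin/bash
# DRY_RUN=1 (the default in this sketch) prints each step instead of running it.
# Set DRY_RUN=0 on a real instance.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bootstrap_chef() {
  # 1. Install Chef Client only if the AMI does not already ship it.
  if ! command -v chef-client >/dev/null 2>&1; then
    run bash -c 'curl -L https://omnitruck.chef.io/install.sh | bash'
  fi
  # 2. Hand everything else to Chef; it reads /etc/chef/client.rb by default.
  run chef-client --once
}

bootstrap_chef
```

Everything beyond these two steps (packages, agents, secrets) then lives in cookbooks rather than in user data.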