User Data and Bootstrap
What is User Data in AWS?
- User Data is custom configuration passed to an EC2 instance at launch.
- It automates bootstrapping tasks, especially during the first boot.
Common Use Cases
- Installing software
- Setting environment variables
- Configuring system settings
- Starting services automatically
Note
Typically written as a shell script or cloud-init configuration.
Warning
⚠️ It runs only on the first launch unless configured otherwise.
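If user data is configured to run on every boot, any step that must happen only once needs its own guard. A minimal sketch using a marker file; the function name `run_once` and the marker directory are hypothetical and not part of cloud-init:

```shell
# Hypothetical helper: run a command only if its marker file does not exist yet.
# MARKER_DIR is an assumed location; any directory that survives reboots works.
MARKER_DIR="${MARKER_DIR:-/var/lib/bootstrap-markers}"

run_once() {
  local name="$1"
  shift
  local marker="$MARKER_DIR/$name.done"
  if [ -e "$marker" ]; then
    echo "skip: $name (already done)"
    return 0
  fi
  mkdir -p "$MARKER_DIR" || return 1
  # Record success only if the command itself succeeded.
  "$@" && touch "$marker"
}

# Example (not executed here):
# run_once install-docker yum install -y docker
```

Because the marker is written only after the command succeeds, a failed step is retried on the next boot instead of being silently skipped.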
Sample User Data Script
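#!/bin/bash
# Install Docker (assumes a yum-based AMI such as Amazon Linux) and start it.
# User data runs as root, so sudo is not needed.
yum install -y docker
systemctl start docker
echo "Setting up my app..."
How to Check User Data Logs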
SSH into the EC2 instance and run these commands:
cat /var/log/cloud-init-output.log
cat /var/log/cloud-init.log
Alternative (New Interface)
In the EC2 console, select the instance, then navigate to Actions > Monitor and troubleshoot > Get system log.
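When a bootstrap fails, the relevant lines are often buried in a long log. A small sketch for pulling out likely failures; the helper name `scan_bootstrap_log` and the search pattern are assumptions, so adjust them to whatever your scripts actually print:

```shell
# Hypothetical helper: print numbered lines that look like failures.
# Defaults to the cloud-init output log mentioned above.
scan_bootstrap_log() {
  local log="${1:-/var/log/cloud-init-output.log}"
  if [ ! -r "$log" ]; then
    echo "cannot read $log" >&2
    return 2
  fi
  # -i: case-insensitive, -n: show line numbers, -E: extended regex
  grep -inE 'error|fail|traceback' "$log"
}

# Example (not executed here):
# scan_bootstrap_log /var/log/cloud-init-output.log
```

Note that `grep` exits with status 1 when nothing matches, so a non-zero exit from this helper does not necessarily mean something went wrong.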
Chef RunList Integration
Example User Data with Chef
{
  "devices": {
    "sdd": {
      "file_system": "ext4",
      "mount_point": "/apps/proman",
      "mount_opts": "nodev,nosuid"
    }
  },
  "Chef_RunList": [
    "role[gtis_apphosting_ecs_cluster_role_v1840]",
    "role[b-chef-client-wrapper]",
    "role[is_jaas_secrets_linux-install]",
    "role[barclays_proman_csm_wrapper-prod-role]",
    "role[barclays_proman_cloudwatch_agent-mid-linux-prod]",
    "role[barclays_proman_cloudwatch_agent-cwa-evg-prod]"
  ],
  "build_mode": "normal",
  "ecs.config": {
    "ECS_CLUSTER": "proman-prod-ecs-01-Cluster",
    "ECS_ENABLE_CONTAINER_METADATA": "true"
  }
}
Explanation of Each Field
| Field | Description |
|---|---|
| devices | Specifies custom mount points and options for a volume |
| Chef_RunList | A list of roles assigned via Chef that define what gets installed/configured on the instance |
| build_mode | Indicates whether the build is "normal" or a special variant such as testing or DR |
| ecs.config | ECS-specific settings, such as which ECS cluster the instance should join |
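The ecs.config field maps naturally onto the ECS agent's configuration file, which on ECS container instances is /etc/ecs/ecs.config and holds one KEY=value pair per line. How the Chef roles in this example actually consume the field is not shown here, so the following is only a sketch of the end result; the function name is hypothetical:

```shell
# Hypothetical helper: write ECS agent settings as KEY=value lines.
# $1 = destination file, $2 = cluster name
write_ecs_config() {
  local dest="$1" cluster="$2"
  mkdir -p "$(dirname "$dest")" || return 1
  cat > "$dest" <<EOF
ECS_CLUSTER=${cluster}
ECS_ENABLE_CONTAINER_METADATA=true
EOF
}

# Example (not executed here; needs root):
# write_ecs_config /etc/ecs/ecs.config proman-prod-ecs-01-Cluster
```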
What is Chef_RunList?
The Chef_RunList key holds the run-list, a core concept in Chef, a configuration management tool used to automate infrastructure. It's basically a to-do list for the node (your EC2 instance) that tells Chef which roles or recipes to apply, and in what order.
How it works
- The Chef client runs on the instance (usually via a script or a systemd service)
- It contacts the Chef Server and pulls down:
  - The run list (from your Chef_RunList)
  - All the roles and recipes associated with it
- Chef then applies them in order. Each role may configure:
  - Software packages
  - Environment variables
  - ECS agents
  - Monitoring tools
  - Secrets or JAAS files
- Logs go wherever log_location in client.rb points (STDOUT in the example below) or to the system logs; failure stack traces are typically written under /var/chef/cache
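The ordering described above can be made concrete with Chef's first-boot JSON convention, where `chef-client -j` reads a `run_list` array from a file. How the Chef_RunList key is turned into a run-list in this particular environment is not shown here, so treat this as a sketch; the function name and file path are hypothetical, and it assumes role names contain no characters that need JSON escaping:

```shell
# Hypothetical helper: print a first-boot JSON document for the given
# run-list items, preserving their order.
first_boot_json() {
  local item sep=""
  printf '{"run_list": ['
  for item in "$@"; do
    printf '%s"%s"' "$sep" "$item"
    sep=", "
  done
  printf ']}\n'
}

# Example (not executed here; needs a configured Chef client):
# first_boot_json 'role[b-chef-client-wrapper]' > /etc/chef/first-boot.json
# chef-client -j /etc/chef/first-boot.json
```

Chef expands each role into its own run-list and applies the results in the order the items appear, which is why the order of the arguments matters.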
How does the instance know where the Chef Server is?
The Chef Client needs a few config files to know:
- Where to connect (Chef Server URL/IP)
- Which node it is
- Which validation key or certs to use
These details are usually in:
/etc/chef/client.rb
Example of client.rb
log_level :info
log_location STDOUT
chef_server_url "https://chef.barclays.net/organizations/prod"
validation_client_name "prod-validator"
node_name "ldwpsr04sujeoy2"
How does this file get there?
- Pre-baked in the AMI
  - Most likely in enterprise setups
  - The AMI already has /etc/chef/client.rb and required certs
- Injected at runtime via User Data or cloud-init
  - You could have a bootstrap script
- Using knife bootstrap
  - If the instance is manually or programmatically bootstrapped using knife
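For the runtime-injection option, a bootstrap script could render client.rb itself. A minimal sketch, assuming the same settings as the example file; the function name and its argument order are hypothetical, and the validation key or certificates would still have to be delivered separately:

```shell
# Hypothetical helper: render a minimal client.rb into a directory.
# $1 = config dir, $2 = Chef Server URL, $3 = node name, $4 = validator name
write_client_rb() {
  local dir="$1" url="$2" node="$3" validator="$4"
  mkdir -p "$dir" || return 1
  cat > "$dir/client.rb" <<EOF
log_level :info
log_location STDOUT
chef_server_url "${url}"
validation_client_name "${validator}"
node_name "${node}"
EOF
}

# Example (not executed here; needs root):
# write_client_rb /etc/chef "https://chef.barclays.net/organizations/prod" "$(hostname)" prod-validator
```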
Authentication
Chef uses:
- A validator key, used only the first time, to register the node with the server
- A client key (/etc/chef/client.pem) for secure auth on every run after that
- HTTPS to securely communicate with the server
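A quick way to see which credential a node would use is to check which key files are present. A sketch, assuming the conventional file names client.pem and validation.pem under /etc/chef; the function name and the labels it prints are hypothetical:

```shell
# Hypothetical helper: report which credential a chef-client run would rely on.
# Assumes the conventional file names client.pem and validation.pem.
chef_auth_mode() {
  local dir="${1:-/etc/chef}"
  if [ -s "$dir/client.pem" ]; then
    echo "client-key"       # node is already registered
  elif [ -s "$dir/validation.pem" ]; then
    echo "validator-key"    # first run: node still has to register
  else
    echo "none"
    return 1
  fi
}

# Example (not executed here):
# chef_auth_mode /etc/chef
```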
User Data vs Chef Trade-offs
| Aspect | User Data | Chef Recipes/Roles |
|---|---|---|
| Flexibility | Limited to basic scripting | Full config management DSL |
| Reusability | Hard to reuse across instances | Easily reusable via roles/cookbooks |
| Maintainability | Becomes messy FAST | Organized into versioned cookbooks |
| Updates | Need to relaunch instance or SSH in | Can update configs via Chef runs |
| Team collaboration | Not ideal for large teams | Designed for collab & governance |
| Package/Service Management | Manual | Native resources: package, service, template, etc. |
| Testing/Validation | Almost none | Test Kitchen, linting, CI/CD possible |
| Secrets management | Manual or via external tools | Chef Vault / Encrypted Data Bags |
When is User Data Enough?
Use user data alone if:
- It's a one-off test box or dev environment
- Setup is super simple (e.g., install NGINX + run app)
- You're prototyping
When to use Chef (or config mgmt)?
Go with Chef when:
- You have multiple instances or environments
- You need consistent, tested configuration
- You want version control, modular design, and rollback
- Your org already uses Chef
Hybrid approach
A popular pattern:
- Use user data to bootstrap Chef
- Let Chef handle the rest
This keeps your base AMI clean and setup logic manageable.
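A sketch of what such a hand-off script might look like. Everything specific here is an assumption: the first-boot.json path, the DRY_RUN switch, and the premise that chef-client and /etc/chef/client.rb are already on the instance (for example, baked into the AMI):

```shell
#!/bin/bash
# Hypothetical user data hand-off: do the bare minimum, then let Chef take over.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bootstrap_with_chef() {
  local json="${1:-/etc/chef/first-boot.json}"
  # Fail early if the Chef client is missing (check skipped in dry-run mode).
  if [ "${DRY_RUN:-0}" != "1" ] && ! command -v chef-client >/dev/null 2>&1; then
    echo "chef-client not found; is it baked into the AMI?" >&2
    return 1
  fi
  # -j passes node attributes, including run_list, for this first run.
  run chef-client -j "$json"
}

# Example (not executed here):
# bootstrap_with_chef /etc/chef/first-boot.json
```

Keeping the script this small is the point of the pattern: anything beyond starting Chef belongs in a cookbook, where it can be versioned and tested.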