AWS (SAA-C03)

Scaling

ELB

ALB

Works at application layer (layer 7)
ALB target groups can be:
- EC2 instances
- ECS tasks
- Lambda functions
- Private IP addresses
An ALB has listeners, each bound to a specific protocol and port, and each listener can route traffic to different target groups using listener rules
health check is done at the target group level using HTTP and HTTPS protocols
cross zone load balancing is enabled by default
Cannot attach elastic IP to ALB
An internet-facing ALB must be placed in public subnets; an internal ALB can use private subnets
Also supports gRPC protocol
supports Weighted Target Groups routing
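
The listener-rule routing described above can be sketched in a few lines. This is a simplified model, not the ALB API: the `Rule` class, the `route` function, and the target group names are hypothetical, and a path prefix stands in for the richer conditions a real rule can carry. It assumes rules are evaluated in priority order (lowest number first) with a default action as the fallback.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    priority: int      # lower number is evaluated first
    path_prefix: str   # simplified stand-in for a path condition
    target_group: str  # hypothetical target group name


def route(rules: List[Rule], path: str, default_target_group: str) -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if path.startswith(rule.path_prefix):
            return rule.target_group
    return default_target_group


rules = [Rule(20, "/img", "static-tg"), Rule(10, "/api", "api-tg")]
```

Calling `route(rules, "/api/users", "web-tg")` picks `api-tg`, while an unmatched path such as `/` falls through to the default `web-tg`.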

NLB

Works at transport layer (layer 4)
extreme performance (can handle millions of requests per second)
TCP and UDP protocols
has one static IP per AZ which can also be elastic IP
NLB target groups can be:
- EC2 instances
- Private IP addresses
- ALBs
health check can be done via TCP, HTTP, HTTPS protocols
cross zone load balancing is disabled by default

GWLB

Works at network layer (layer 3)
Routes traffic to third-party virtual appliances (for example, for security analysis) before it reaches the servers
Uses the GENEVE protocol on port 6081
GWLB target groups can be:
- EC2 instances
- Private IP addresses
Cross zone load balancing is disabled by default

Cross-zone Load Balancing

Distributes traffic evenly across all registered targets in all enabled Availability Zones, rather than only across the targets in the load balancer node's own AZ
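
As a rough worked example of what cross-zone load balancing changes, the sketch below computes the fraction of total traffic that a single target receives. It assumes each AZ's load balancer node receives an equal share of incoming traffic; the function name and AZ labels are hypothetical.

```python
from typing import Dict


def traffic_share_per_target(targets_per_az: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Fraction of total traffic that one target in each AZ receives."""
    active = {az: n for az, n in targets_per_az.items() if n > 0}
    if not active:
        raise ValueError("at least one AZ must have a registered target")
    if cross_zone:
        # Every target in every AZ gets the same share.
        total_targets = sum(active.values())
        return {az: 1 / total_targets for az in active}
    # Each AZ gets an equal share, split among its own targets only.
    az_share = 1 / len(active)
    return {az: az_share / n for az, n in active.items()}
```

With 2 targets in one AZ and 8 in another, every target gets 10% when cross-zone is on. When it is off, each target in the small AZ gets 25% and each target in the large AZ gets 6.25%.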

ELBs have security groups too

ELBs are region bound

Sticky Sessions

ensures that the same client is always routed to the same instance
supported by CLB, ALB and NLB
ALB uses cookies which have expiration date that can be controlled

Cookies

Application based cookies
- custom cookies: defined by application and name cannot be AWSALB, AWSALBAPP or AWSALBTG
- application cookies: defined by load balancer and name is AWSALBAPP
Load balancer generated cookies / Duration based cookies:
- generated by load balancer
- name is AWSALB
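
The naming restriction on custom cookies can be checked with a small helper. This is only a sketch: the function name is hypothetical, and it does an exact string comparison against the three reserved names listed above.

```python
RESERVED_COOKIE_NAMES = {"AWSALB", "AWSALBAPP", "AWSALBTG"}


def is_valid_custom_cookie_name(name: str) -> bool:
    """A custom application cookie must be non-empty and not use a reserved name."""
    return bool(name) and name not in RESERVED_COOKIE_NAMES
```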

SSL/TLS

Server name indication (SNI) is the extension of TLS protocol that enables client to specify the domain name it wants to reach through a single server endpoint

Connection Draining / Deregistration Delay

time allowed for instances to complete in-flight requests before they are deregistered
new requests are not sent to the draining instance but instead routed to other healthy instances
can set between 0-3600 seconds (default is 300)
can be disabled by setting it to 0

ASG

ASG uses launch templates to manage EC2 instances
it scales using scaling policies
ASG can use cloudwatch alarms as triggers to scale the instances
EC2 instances can be put into standby state to temporarily remove them from ASG

Scaling Policies

Dynamic scaling
- Target tracking policy
- Simple/step scaling
Scheduled scaling
Predictive scaling

Launch template

Only a launch template can be used to provision capacity across multiple instance types using both On-Demand Instances and Spot Instances to achieve the desired scale, performance, and cost

Termination Policy in order

Based on instance allocation strategy
Oldest Launch Configuration
Oldest Launch Template
Next Billing Hour

Instance states

Pending
InService
Terminating
Terminated
Standby

Lifecycle Hooks

autoscaling:EC2_INSTANCE_LAUNCHING
autoscaling:EC2_INSTANCE_TERMINATING

autoscaling:EC2_INSTANCE_LAUNCHING

When Amazon EC2 Auto Scaling responds to a scale-out event, it launches one or more instances
These instances start in the Pending state
If you added an autoscaling:EC2_INSTANCE_LAUNCHING lifecycle hook to your Auto Scaling group, the instances move from the Pending state to the Pending:Wait state
After you complete the lifecycle action, the instances enter the Pending:Proceed state
When the instances are fully configured, they are attached to the Auto Scaling group and they enter the InService state

autoscaling:EC2_INSTANCE_TERMINATING

When Amazon EC2 Auto Scaling responds to a scale-in event, it terminates one or more instances
These instances are detached from the Auto Scaling group and enter the Terminating state
If you added an autoscaling:EC2_INSTANCE_TERMINATING lifecycle hook to your Auto Scaling group, the instances move from the Terminating state to the Terminating:Wait state
After you complete the lifecycle action, the instances enter the Terminating:Proceed state
When the instances are fully terminated, they enter the Terminated state
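
The two state sequences above can be summarized as a small sketch. The function names are hypothetical; the state names are the ones listed in these notes.

```python
from typing import List


def launch_states(has_launch_hook: bool) -> List[str]:
    """States an instance passes through on scale-out."""
    states = ["Pending"]
    if has_launch_hook:
        # The hook pauses the instance until the lifecycle action completes.
        states += ["Pending:Wait", "Pending:Proceed"]
    states.append("InService")
    return states


def terminate_states(has_terminate_hook: bool) -> List[str]:
    """States an instance passes through on scale-in."""
    states = ["Terminating"]
    if has_terminate_hook:
        states += ["Terminating:Wait", "Terminating:Proceed"]
    states.append("Terminated")
    return states
```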

Cooldown period

ensures that the Auto Scaling group does not launch or terminate additional EC2 instances before the previous scaling activity takes effect
default is 300secs (5mins)

Databases

DynamoDB

Serverless
Fully managed, highly available NoSQL database with replication across multiple AZs
Millions of requests per second, trillions of rows, 100s of TB of storage

RDS

RDS storage scales automatically within set maximum storage threshold
Automatically scales the storage if:
- free storage is less than 10% of allocated storage
- low storage lasts at least 5 mins
- 6 hrs have passed since last modification
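
The three conditions combine with a logical AND, bounded by the maximum storage threshold. A minimal sketch, with a hypothetical function name and parameters:

```python
def should_autoscale_storage(
    allocated_gib: float,
    free_gib: float,
    low_storage_minutes: float,
    hours_since_last_modification: float,
    max_storage_gib: float,
) -> bool:
    """True when every storage autoscaling condition from the notes holds."""
    return (
        allocated_gib < max_storage_gib            # still below the configured maximum
        and free_gib < 0.10 * allocated_gib        # free storage under 10% of allocated
        and low_storage_minutes >= 5               # low storage has lasted at least 5 minutes
        and hours_since_last_modification >= 6     # 6 hours since the last modification
    )
```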

Read Replicas

up to 15 replicas
support within AZ, cross AZ or cross region
replication is ASYNC, so replicas can lag behind the source
each replica can be promoted to its own standalone DB
each replica has a different endpoint, so the application has to manage which endpoint it calls
for RDS, read replicas do not incur data transfer fees within the same region
Read replicas can also be used for disaster recovery, although replication is ASYNC

Multi-AZ

RDS db can be replicated multi AZ for disaster recovery
same DNS endpoint for all multi-AZ replicas
automatic failover standby
cannot be used for read scaling because the standby is passive and serves no traffic
replication is SYNC

RDS Custom

Managed Oracle and Microsoft SQL Server Database with OS and database customization
RDS: entire database and the OS to be managed by AWS
RDS Custom: full admin access to the underlying OS and the database
Can SSH into underlying EC2 instance

Backup

Auto backup
- daily full backup
- transaction logs are backed up every 5 mins
- restore to any point in time, from the oldest backup up to 5 mins ago
- can set 1 to 35 days of retention, 0 to disable backup
Manual backup
- take db snapshot
- retention for as long as the user wants
Can create backup and snapshots in multi-AZ
A stopped RDS DB still incurs storage costs

Encrypting un-encrypted RDS database

Take a snapshot of the database
Copy it as an encrypted snapshot
Restore a database from the encrypted snapshot
Terminate the previous database

Enhanced Monitoring

Monitor the operating system of your DB instance in real time
When you want to see how different processes or threads use the CPU, Enhanced Monitoring metrics are useful

IAM DB Authentication

works with MySQL and PostgreSQL
An authentication token is a string of characters that you use instead of a password
it’s valid for 15 minutes before it expires

Ways to use SSL encryption

Force SSL
Encrypt from client side

Force SSL

Set the rds.force_ssl parameter to true to force connections to use SSL
The rds.force_ssl parameter is static, so after you change the value, you must reboot your DB instance for the change to take effect

Encrypt from client side

This sets up an SSL connection from a specific client computer, and you must do work on the client to encrypt connections
Must obtain certificates for the client computer, import certificates on the client computer, and then encrypt the connections from the client computer

RDS Proxy for RDS and Aurora

Serverless, autoscaling, highly available (multi-AZ)
RDS Proxy is never publicly accessible (must be accessed from VPC)

Aurora

proprietary technology from AWS
Aurora storage automatically grows in increments of 10GB, up to 128 TB
up to 15 replicas
sub 10ms replica lag
Aurora costs around 20% more than RDS
shared storage volume with up to 6 copies of the data across 3 AZs
self-healing with peer-to-peer replication
Master(read-write) + up to 15 read-only replicas
1 write endpoint + 1 load balanced reader endpoint
support cross region replication
support read replica auto scaling

Custom Endpoint

can create custom endpoint from subset of read replicas
good for analytics or dev testing env

Aurora Serverless

Automated database instantiation and auto-scaling based on actual usage
pay per second
Cannot change from provisioned to serverless

Global Aurora

1 primary read-write region
up to 5 secondary read-only regions
less than 1 second replication lag
up to 16 read replicas per each secondary region
Promoting another region (for disaster recovery) has an RTO of < 1 minute

DB Cloning

faster than snapshot-and-restore
initially, cloned DB access data from the same storage volume as original DB
when new data or updated data come, use new storage volume
useful for staging db creation from the original prod db

Backup (Aurora)

Auto backup
- 1 to 35 days (can’t be disabled)
Manual backup
- take db snapshot
- retention for as long as the user wants

Read replicas failover priority

Watch the tier (smaller number, higher priority)
Watch the size (larger, the higher priority)
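
The two-step priority rule amounts to a sort on tier (ascending) and then size (descending). A minimal sketch, where `Replica` and its numeric `size_rank` field are hypothetical stand-ins for real instance classes:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    name: str
    tier: int       # smaller number means higher priority
    size_rank: int  # hypothetical numeric size; larger means a bigger instance


def pick_failover_target(replicas: List[Replica]) -> Replica:
    """Choose the lowest tier first, then the largest size within that tier."""
    if not replicas:
        raise ValueError("no replicas available for failover")
    return min(replicas, key=lambda r: (r.tier, -r.size_rank))
```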

Aurora MySQL Native Function

Can create a native function or a stored procedure that invokes a Lambda function whenever a row in a table is modified in the database

Failover Scenarios

Single Instance

Aurora will attempt to create a new DB Instance in the same Availability Zone as the original instance
This replacement of the original instance is done on a best-effort basis and may not succeed, for example, if there is an issue that is broadly affecting the Availability Zone

Read Replica

Amazon Aurora flips the canonical name record (CNAME) for your DB Instance to point at the healthy replica, which in turn is promoted to become the new primary
Start-to-finish failover typically completes within 30 seconds

Aurora Serverless

Aurora will automatically recreate the DB instance in a different AZ

IAM DB Authentication

works with MySQL and PostgreSQL
An authentication token is a string of characters that you use instead of a password
it’s valid for 15 minutes before it expires

ElastiCache

to get managed Redis or Memcached
Redis: used for gaming leaderboards, application cache, geospatial data
Memcached: used for use cases like DB cache or user session store
Redis’s sorted set can be used for leaderboard ranking use cases
HIPAA-compatible
Have multi-AZ configuration
Can have up to 5 read replicas across multiple AZs

Neptune

Graph DB

DocumentDB

AWS service for MongoDB

Keyspaces

AWS service for Apache Cassandra

DNS

Route53

A highly available, scalable, fully managed and Authoritative DNS
The only AWS service which provides 100% availability SLA

Record Types

A - map to ipv4
AAAA - map to ipv6
CNAME - map to another domain name (can’t be root or top node namespace or zone apex)
Alias - can map root or top nodes to AWS resources (eg; alb endpoints) (extension of A or AAAA type)
NS - name servers for the hosted zones (for dns traffic routing)

Name Servers

Physical servers that resolve the DNS requests by looking at the records stored in hosted zones
NS record in a hosted zone route the DNS request traffic to name servers

Cost

$0.50 per month per hosted zone

Hosted Zones

Public
Private (within VPC)

Routing Policies

Simple
Weighted
Latency-based
Failover
Geolocation
Geoproximity
IP-based routing
Multi-value

Failover

active-active
active-passive

active-active

Both systems are running and serving traffic; if one fails, the other takes over its share

active-passive

Only one system serves traffic while the other stays on standby until a failover occurs

s3 static website routing

To route to an S3 static website using Route53, the name of the S3 bucket must be the same as the domain name

Containerization

ECS

Launch Types

EC2
Fargate

EC2 Launch Type

Must provision & maintain the infrastructure (the EC2 instances)
Each EC2 Instance must run the ECS Agent to register in the ECS Cluster

Fargate Launch Type

No need to provision the infrastructure (no EC2 instances to manage)

IAM Roles

EC2 Instance Profile
ECS Task Role

Data Volumes

EBS volumes of each EC2 instance
Can use EFS
Fargate+EFS = Serverless

AWS Application Auto Scaling

Automatically increase/decrease the desired number of ECS tasks

Scaling Methods

Target Tracking
Step Scaling
Scheduled Scaling

Cluster Capacity Auto Scaling

Use ECS Cluster Capacity Provider to automatically provision and scale the infrastructure for your ECS tasks
Capacity Provider paired with an Auto Scaling Group

ECR

Store and manage Docker images on AWS
Fully integrated with ECS, backed by Amazon S3

EKS

EKS supports EC2 if you want to deploy worker nodes or Fargate to deploy serverless containers

ECS Anywhere and EKS Anywhere

Extends AWS ECS and EKS functionality to run containers on any infrastructure, including on-premises servers, edge devices, or virtual machines outside AWS
Allows organizations to use ECS and EKS as the orchestration layer for hybrid or multi-cloud deployments

AWS App Runner

Fully managed service designed to automatically deploy and scale web applications and APIs from source code or a container image, with minimal configuration
No infrastructure experience required, just need source code or container image
Automatic code building, deploying, scaling, highly available, load balancer, encryption

AWS Elastic Beanstalk

Platform-as-a-Service (PaaS) that makes it easy to deploy, manage, and scale web applications and services
Manages the infrastructure (compute, storage, networking) but still allows customization if needed
Provides real-time monitoring of application health, resource usage, and logs

Serverless

Services

Lambda
DynamoDB
Cognito
API Gateway
S3
SNS and SQS
Kinesis
Aurora Serverless
Step Functions
Fargate

Lambda

Pay per request and compute time
Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time per month
Outside of a VPC by default
If assigned a VPC and subnet, lambda will create ENI in the subnet/VPC
Can be invoked by using lambda function URL

Pricing

Pay per call
- First 1,000,000 requests are free
- $0.20 per 1 million requests thereafter ($0.0000002 per request)
Pay per duration
- 400,000 GB-seconds of compute time per month for FREE
- 400,000 seconds if function is 1GB RAM
- 3,200,000 seconds if function is 128 MB RAM
- After that, about $1.00 per 60,000 GB-seconds ($0.0000166667 per GB-second on x86; varies by region)
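
The free-tier figures above follow from dividing the 400,000 GB-seconds allowance by the function's memory in GB. A short sketch of that arithmetic (the function name is hypothetical):

```python
FREE_GB_SECONDS_PER_MONTH = 400_000


def free_seconds_per_month(memory_mb: int) -> float:
    """Seconds of free compute per month for a function with the given memory size."""
    memory_gb = memory_mb / 1024
    return FREE_GB_SECONDS_PER_MONTH / memory_gb
```

For example, `free_seconds_per_month(1024)` gives 400,000 seconds and `free_seconds_per_month(128)` gives 3,200,000 seconds, matching the two bullets above.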

Execution

Memory allocation: 128 MB – 10GB (1 MB increments)
Maximum execution time: 900 seconds (15 minutes)
Environment variables (4 KB)
Disk capacity in the “function container” (in /tmp): 512 MB to 10GB
Concurrent executions: 1000 per region (can be increased)

Deployment

Lambda function deployment size (compressed .zip): 50 MB
Size of uncompressed deployment (code + dependencies): 250 MB
Can use the /tmp directory to load other files at startup
Size of environment variables: 4 KB

Lambda SnapStart for Java

Lambda initializes the function at publish time
Takes a snapshot of memory and disk state of the initialized function
Snapshot is cached for low-latency access

Running Container Images

Container image must implement the Lambda Runtime API; the AWS-provided base images built for Lambda already do

API Gateway

Endpoint Types

Edge-optimized
Regional
Private

Edge-optimized

Requests are routed through the CloudFront Edge locations (improves latency)
The API Gateway still lives in only one region

Regional

For clients within the same region
Could manually combine with CloudFront (more control over the caching strategies and the distribution)

Private

Can only be accessed from own VPC using an interface VPC endpoint (ENI)
Have to use a resource policy to define access

User Authentication

IAM Roles (useful for internal applications)
Cognito (identity for external users – example mobile users)
Custom Authorizer (your own logic)
Custom Domain Name HTTPS security through integration with AWS Certificate Manager (ACM)

Supports API Caching and Request Throttling too

Step Functions

Build serverless visual workflow to orchestrate your Lambda functions

AWS Cognito

Give users an identity to interact with the web or mobile application on AWS

Cognito User Pool

Sign in functionality for app users
Create a serverless database of users for the web & mobile apps
Integrate with API Gateway & Application Load Balancer

Cognito Identity Pool (Federated Identity)

Provide AWS credentials to users so they can access AWS resources directly
Integrate with Cognito User Pools as an identity provider
Get identities for “users” so they obtain temporary AWS credentials

Data Analytics

Amazon Athena

Serverless query service to analyze data stored in Amazon S3
Supports CSV, JSON, ORC, Avro, and Parquet
$5.00 per TB of data scanned
Commonly used with Amazon QuickSight for reporting/dashboards
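
Because Athena is billed per TB scanned, a query's cost can be estimated from the bytes it reads. The sketch below uses the $5.00 per TB figure from these notes and assumes 1 TB = 1024^4 bytes; the function name is hypothetical and any per-query minimum charge is ignored.

```python
PRICE_PER_TB_USD = 5.00
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabyte


def athena_query_cost(bytes_scanned: int) -> float:
    """Estimated cost in USD for a query that scans the given number of bytes."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD
```

This is also why columnar formats and compression reduce cost: they reduce the bytes scanned, and the cost scales linearly with that number.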

Federated Query

To run SQL queries across data stored in relational, non-relational, object, and custom data sources (AWS or on-premises)
Uses Data Source Connectors that run on AWS Lambda to run Federated Queries
Store the results back in Amazon S3

Performance Improvement

Use columnar data (Apache Parquet or ORC) for cost-savings
Compress data for smaller retrievals
Partition datasets in S3 for easy querying on virtual columns
Use larger files (> 128 MB) to minimize overhead

RedShift

based on PostgreSQL, but used for OLAP (online analytical processing: analytics and data warehousing)
10x better performance than other data warehouses, scale to PBs of data
Columnar storage of data (instead of row based) & parallel query engine

Modes

Provisioned Cluster
Serverless Cluster

Provisioned Cluster

Choose instance types in advance
Can reserve instances for cost savings

Redshift Clusters

Leader Node
Compute Node

Leader Node

for query planning, results aggregation

Compute Node

for performing the queries, send results to leader

Snapshots and DR

Snapshots are point-in-time backups of a cluster, stored internally in S3
can restore a snapshot into a new cluster
Automated snapshots are taken every 8 hours, after every 5 GB of data changes, or on a schedule
Set retention between 1 to 35 days
Can manually take snapshots too
Can enable cross-region snapshots

Data Loading into RedShift

with Kinesis Data Firehose
S3 using the COPY command
- without enhanced VPC routing
- with enhanced VPC routing
EC2 instance using a JDBC driver

RedShift Spectrum

to run query on data stored in s3 without loading the data

Amazon OpenSearch

Successor to Elasticsearch
common to use OpenSearch as a complement to another database as a database search API
Ingestion from Kinesis Data Firehose, AWS IoT, and CloudWatch Logs
Comes with OpenSearch Dashboards for visualization

Modes

Managed Cluster
Serverless Cluster

Amazon EMR

Amazon Elastic MapReduce
The clusters can be made of hundreds of EC2 instances with autoscaling and can be integrated with spot instances
EMR comes bundled with Apache Spark, HBase, Presto, Flink
EMR takes care of all the provisioning and configuration

Node Types

Master Node
Core Node
Task Node

Master Node

Manage the cluster, coordinate, manage health – long running

Core Node

Run tasks and store data – long running

Task Node

Just to run tasks – usually Spot

Purchasing Options

On demand
Reserved (min 1 yr)
Spot Instances

Modes

Long running cluster
Transient cluster

Amazon QuickSight

Serverless machine learning-powered BI service to create interactive dashboards
In-memory computation using SPICE engine if data is imported into QuickSight
Define Users and Groups (separate from IAM)

AWS Glue

managed ETL service

Glue Job Bookmarks

prevent re-processing old data

Glue Elastic Views

Combine and replicate data across multiple data stores using SQL
No custom code, Glue monitors for changes in the source data, serverless
Leverages a “virtual table” (materialized view)

Glue DataBrew

Prebuilt transformations

Glue Studio

GUI for ETL jobs

Glue Streaming ETL

for streaming data
built on Apache Spark Structured Streaming
compatible with Kinesis Data Streams, Kafka, MSK

AWS Lake Formation

To build data lake
Created data lakes are stored in s3
Built on top of AWS Glue
Can be used to consolidate data from multiple accounts into a single account as a central datalake

MSK (Amazon Managed Streaming for Kafka)

Alternative to Amazon Kinesis

MSK Serverless

Run Apache Kafka on MSK without managing the capacity
MSK automatically provisions resources and scales compute & storage

AWS Data Exchange

service that makes it easy to find, subscribe to, and use third-party data in the AWS cloud

AWS Data Pipeline

enables you to automate the movement, transformation, and processing of data across different AWS services and on-premises data sources
useful for creating complex data workflows that involve scheduling, dependency management, and data transformations

Monitoring

CloudWatch

CloudWatch Metrics

CloudWatch provides metrics for every service in AWS
Metrics belong to namespaces (eg: S3, ECS, EC2,…)
Dimension is an attribute of a metric (eg: instance id, environment, etc…)
Up to 30 dimensions per metric
Can create CloudWatch Custom Metrics

Metric Streams

Continually stream CloudWatch metrics to a destination of your choice, with near-real-time delivery and low latency (to Kinesis Data Firehose, 3rd party service providers)
Option to filter metrics to only stream a subset of them

Cloudwatch Logs

organized into log groups and log streams
Can define log expiration policies (never expire, 1 day to 10 years…)
Logs are encrypted by default
Can setup KMS-based encryption with your own keys

Can send logs to

Amazon S3 (exports)
Kinesis Data Streams
Kinesis Data Firehose
AWS Lambda
OpenSearch

Log sources

SDK, CloudWatch Logs Agent, CloudWatch Unified Agent
Elastic Beanstalk: collection of logs from application
ECS: collection from containers
AWS Lambda: collection from function logs
VPC Flow Logs: VPC specific logs
API Gateway
CloudTrail: based on filter
Route53: Log DNS queries

Log Insights

Search and analyze log data stored in CloudWatch Logs

S3 Export

Log data can take up to 12 hours to become available for export
The API call is CreateExportTask
The export is not real-time; for real-time delivery use Log Subscriptions instead

Log Subscriptions

Get real-time log events from CloudWatch Logs for processing and analysis
Send to Kinesis Data Streams, Kinesis Data Firehose, or Lambda
Subscription Filter: filter which log events are delivered to the destination
Can do cross-account subscription

CloudWatch Agents

To collect logs from EC2 instances or on-premise servers

Log Agents

Older version
Can only collect logs

Unified Agents

Can collect logs and also the instance metrics (eg: CPU, RAM, Disk info, etc)

CloudWatch Alarms

Alarms are used to trigger notifications for any metric

Alarm States

OK
Insufficient Data
In Alarm

Alarm Target Actions

EC2 instances (stop, terminate, reboot, etc)
EC2 Auto Scaling
Amazon SNS

Composite Alarm

Combines the states of multiple other alarms into a single alarm
AND and OR conditions
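
A composite alarm's AND/OR evaluation can be sketched as follows. This is a simplified model with a hypothetical function name: it uses the state strings `OK`, `ALARM`, and `INSUFFICIENT_DATA`, treats anything other than `ALARM` as not triggered, and supports only a single operator rather than nested expressions.

```python
from typing import List


def composite_alarm_state(child_states: List[str], operator: str) -> str:
    """Combine child alarm states with AND or OR into one composite state."""
    if not child_states:
        raise ValueError("at least one child alarm is required")
    in_alarm = [state == "ALARM" for state in child_states]
    if operator == "AND":
        triggered = all(in_alarm)
    elif operator == "OR":
        triggered = any(in_alarm)
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return "ALARM" if triggered else "OK"
```

Using AND across, say, a CPU alarm and an IOPS alarm is a common way to cut alarm noise, since the composite only fires when both conditions hold at once.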

EC2 Recovery

CloudWatch alarm can trigger the recovery of the Amazon EC2 instance, in case the instance fails.
The instance, however, should only be configured with an Amazon EBS volume
Recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata

CloudWatch Insights

CloudWatch Container Insights
CloudWatch Lambda Insights
CloudWatch Contributor Insights
CloudWatch Application Insights

CloudWatch Container Insights

ECS, EKS, Kubernetes on EC2, Fargate, needs agent for Kubernetes

CloudWatch Lambda Insights

Detailed metrics to troubleshoot serverless applications

CloudWatch Contributor Insights

Find “Top-N” Contributors through CloudWatch Logs

CloudWatch Application Insights

Automatic dashboard to troubleshoot your application and related AWS services

CloudTrail

Provides governance, compliance and audit for your AWS Account
Can be integrated with EventBridge to trigger AWS services based on CloudTrail events
Cloudtrail log files are encrypted by default

CloudTrail Events

Management Events
Data Events
CloudTrail Insights Events

Management Events

Operations that are performed on resources in your AWS account
By default, trails are configured to log management events.

Data Events

Granular data object activities like Amazon S3 object-level activity, AWS Lambda function execution activity

CloudTrail Insights Events

Analyze anomalies in write events to detect unusual patterns

Events retention

Events are stored for 90 days in CloudTrail
To keep events beyond this period, log them to S3 and use Athena

AWS Config

Helps with auditing and recording compliance of your AWS resources
Helps record configurations and changes over time
AWS Config is a per-region service
Can be aggregated across regions and accounts

Config Rules

Can use AWS managed config rules
Can make custom config rules
no free tier, $0.003 per configuration item recorded per region, $0.001 per config rule evaluation per region

Config Resource

View compliance of a resource over time
View configuration of a resource over time
View CloudTrail API calls of a resource over time

Remediation

Automate remediation of non-compliant resources using SSM Automation Documents
Use AWS-Managed Automation Documents or create custom Automation Documents
Can set Remediation Retries if the resource is still non-compliant after auto-remediation

Notification

Use EventBridge to trigger notifications when AWS resources are non-compliant
Ability to send configuration changes and compliance state notifications to SNS (all events – use SNS Filtering or filter at client-side)

AWS Trusted Advisor

optimize costs, increase performance, improve security and resilience, and operate at scale in the cloud
recommends actions to remediate any deviations from best practices
can do service quota checks by writing an AWS Lambda function that refreshes the AWS Trusted Advisor Service Limits checks and set it to run every 24 hours

AWS X-ray

X-Ray collects data about the requests and responses, tracks latency, identifies performance bottlenecks, and detects errors, helping developers and operations teams understand how their applications behave in real-time

Service Map

X-Ray generates a service map that visualizes the relationships and interactions between the services in your application. This map highlights performance bottlenecks, latency issues, and error rates.

Disaster Recovery

RPO and RTO

Recovery Point Objective: Time between the last backup point and the disaster (how much data loss is acceptable)
Recovery Time Objective: Time between the disaster and the system being recovered (how much downtime is acceptable)
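
Both objectives are time spans measured from the moment of the disaster, one looking backwards and one forwards. A small sketch with hypothetical function names:

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Span of data lost: compare this against the RPO."""
    return disaster - last_backup


def downtime(disaster: datetime, recovered: datetime) -> timedelta:
    """Span the system was unavailable: compare this against the RTO."""
    return recovered - disaster


def meets_objectives(last_backup: datetime, disaster: datetime, recovered: datetime,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True when both the data loss and the downtime stay within their objectives."""
    return (data_loss_window(last_backup, disaster) <= rpo
            and downtime(disaster, recovered) <= rto)
```

For a backup at 00:00, a disaster at 06:00, and recovery at 08:00, the data loss window is 6 hours and the downtime is 2 hours.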

DR Strategies

Backup and Restore
Pilot Light
Warm Standby
Hot Site / Multi Site Approach

Backup and Restore

Cheapest
High RPO, High RTO

Pilot Light

A most-minimal version of the app is always running in the cloud

Warm Standby

A scaled-down version of the full system is always up and running

Hot Site/ Multi Site

Full Production Scale is running both on AWS and On Premise

AWS Database Migration Service (DMS)

Can migrate databases both heterogeneously and homogeneously from different sources to targets (eg: from on-premise Oracle to AWS Aurora)
Must create an EC2 instance to perform the replication tasks
If the source and target DBs use different DB engines (eg: Oracle and PostgreSQL), the Schema Conversion Tool (SCT) must be used
AWS DMS supports multi-AZ deployment
In addition to databases, s3 and kinesis can also be the source or target
full load and change data capture (CDC) replication task can be used to migrate and also track the on-going data changes

RDS and Aurora DB Migration

MySQL
PostgreSQL

MySQL

RDS to Aurora:
- DB Snapshots from RDS MySQL restored as MySQL Aurora DB
- Create an Aurora Read Replica from your RDS MySQL, and when the replication lag is 0, promote it as its own DB cluster
External to Aurora:
- Backup onto s3 and import from s3 to Aurora
- Use mysqldump utility to directly migrate into Aurora
- Can also use DMS

PostgreSQL

RDS to Aurora:
- DB Snapshots from RDS PostgreSQL restored as PostgreSQL Aurora DB
- Create an Aurora Read Replica from your RDS PostgreSQL, and when the replication lag is 0, promote it as its own DB cluster
External to Aurora:
- Create a backup, put it in Amazon S3 and import it using the aws_s3 Aurora extension
- Can also use DMS

AWS Backup

Centrally manage and automate backups across AWS services
Supports cross-region backups
Supports cross-account backups

Supported Services

Amazon EC2 / Amazon EBS
Amazon S3
Amazon RDS (all DBs engines) / Amazon Aurora / Amazon DynamoDB
Amazon DocumentDB / Amazon Neptune
Amazon EFS / Amazon FSx (Lustre & Windows File Server)
AWS Storage Gateway (Volume Gateway)

Features

PITR for supported services
On-demand and scheduled backups
Tag based backup policies
Backup Plans
Backup Vault Lock

Backup Plans

Can configure:
- Backup frequency
- Backup window
- Transition to cold storage
- Retention period

Backup Vault Lock

WORM (Write Once Read Many)
Even the root user cannot delete backups inside the locked Vault

AWS ADS and MGN

Application Discovery Service (ADS)
Application Migration Service (MGN)

ADS

Plan migration projects by gathering information about on-premises data centers like server utilization data and dependency mapping
Resulting data can be viewed within AWS Migration Hub

Agentless Discovery

Uses AWS Agentless Discovery Connector
Discover VM inventory, configuration, and performance history such as CPU, memory, and disk usage

Agent-based Discovery

Uses AWS Application Discovery Agent
System configuration, system performance, running processes, and details of the network connections between systems

MGN

The “AWS evolution” of CloudEndure Migration, replacing AWS Server Migration Service (SMS)
Lift-and-shift (rehost) solution
Converts physical, virtual, and cloud-based servers to run natively on AWS
Migrate data by installing AWS Replication Agent on source servers

Compute

EC2

Storage

EBS
EFS
EC2 Instance Store

EBS

bound to specific AZs
by default, root volume is set to delete on termination
Only gp2/gp3 and io1/io2 can be used as boot volumes
EBS volumes support live configuration changes while in production which means that you can modify the volume type, volume size, and IOPS capacity without service interruptions

EBS Volume Types

gp2 (SSD)
gp3 (SSD)
io1 (SSD)
io2 block express (SSD)
st1 (HDD)
sc1 (HDD)

gp2

1 GiB - 16TiB
can burst IOPS to 3,000
Size of the volume and IOPS are linked
max IOPS is 16,000
at 3 IOPS per GiB, the 16,000 IOPS maximum is reached at 5,334 GiB
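
The link between gp2 volume size and IOPS is a clamped linear function. The sketch below assumes 3 IOPS per GiB and the 16,000 cap from these notes, plus a floor of 100 IOPS for very small volumes, which is not stated above; the function name is hypothetical.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume: 3 per GiB, floored at 100, capped at 16,000."""
    return min(16_000, max(100, 3 * size_gib))
```

A 1,000 GiB volume gets 3,000 IOPS, and the cap is first reached at 5,334 GiB because 3 x 5,334 = 16,002.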

gp3

1 GiB - 16TiB
Baseline of 3,000 IOPS and throughput of 125 MiB/s
Can increase IOPS up to 16,000 and throughput up to 1000 MiB/s independently

io1

4 GiB - 16TiB
Max IOPS: 64,000 for Nitro EC2 instances & 32,000 for other
Can increase IOPS independently from storage size

io2 Block Express

4 GiB - 64 TiB
Sub-millisecond latency
Max IOPS: 256,000 with an IOPS:GiB ratio of 1,000:1
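
The 1,000:1 ratio means the IOPS ceiling depends on volume size until the 256,000 cap is reached. A sketch of that calculation, with a hypothetical function name:

```python
def io2_block_express_max_iops(size_gib: int) -> int:
    """Maximum provisionable IOPS: 1,000 per GiB, capped at 256,000."""
    return min(256_000, 1_000 * size_gib)
```

So a 4 GiB volume can be provisioned with at most 4,000 IOPS, and the full 256,000 IOPS needs a volume of at least 256 GiB.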

Snapshots

snapshots can be copied across Regions, and a volume can be restored from a snapshot in any AZ of its Region
snapshots can be moved to snapshot archives which is 75% cheaper but can take 24 to 72 hrs to restore
snapshots can be moved to recycle bins and retention period can be set from 1 day to 1 year
fast snapshot restore: Force full initialization of snapshot to have no latency on the first use
snapshots can be created automatically using Amazon Data Lifecycle Manager (DLM)
The EBS volume can be used while the snapshot is in progress

EBS Encryption

Copying an unencrypted snapshot allows encryption
Snapshots of encrypted volumes are encrypted

Encrypt an Unencrypted EBS Volume

Create an EBS snapshot of the volume
Encrypt the EBS snapshot ( using copy )
Create new EBS volume from the snapshot ( the volume will also be encrypted )

Copying encrypted snapshots across regions

Take snapshot of the encrypted volume
Copy the snapshot and encrypt using key B in region B
Restore the volume

Copying encrypted snapshots cross accounts

Create snapshot encrypted with own KMS key
Attach KMS key policy to authorize cross account decrypt access
Share encrypted snapshot
Encrypt the snapshot using KMS key B in account B
Restore the volume

EBS Multi Attach

only io1/io2 volume types can support multi attach
one volume can be attached to multiple instances within same AZ
up to 16 instances at the same time

EFS

network file system (NFS) that can be mounted on many EC2 instances
EFS can be attached to EC2 instances in multiple AZs
have to use security group to control access to EFS
can only be used with linux based AMIs
pay per use, no capacity planning

Performance Modes

General purpose
Max I/O

Throughput Modes

Bursting
Provisioned
Elastic

Bursting

scales with storage
burst up to 100MiB/s

Provisioned

set the throughput regardless of storage size

Elastic

automatically scales throughput up or down based on the workloads
Up to 3GiB/s for reads and 1GiB/s for writes

Storage Tiers

Standard
IA
Archive

Storage Life Cycle

The longest transition period that can be configured in a storage lifecycle policy is 365 days

Availability Modes

standard (Multi-AZ)
one zone (Single-AZ)

EFS One Zone IA

IA storage tier with one zone availability mode

Instance Store

closely attached to EC2 instance
better I/O than EBS
data is lost when the instance is stopped or terminated (ephemeral storage)

RAID 0 vs RAID 1

EBS and Instance Store volumes support RAID 0 configuration

RAID 0

Data are spread across multiple EBS or Instance store volumes and all volumes act as single storage
Increased throughput

RAID 1

Data are duplicated in all the EBS and Instance store volumes
For data redundancy

Instance Types

General Purpose (M, T)
Compute optimized (C)
Memory optimized (R)
Accelerated (G, P)
Storage optimized (I)

Compute Optimized (C)

Batch processing
HPC
Media transcoding
Scientific modeling
Dedicated gaming servers

Memory Optimized (R)

High performance databases
Cache stores
In memory BIs
In memory big data processing

Storage Optimized (I)

High performance OLTP
For high sequential I/O

Tenancy

default
dedicated
host

default

shared tenancy

dedicated

dedicated tenancy (eg: dedicated instances)

host

dedicated host

Security Group

Control inbound and outbound traffic of the instance
VPC bound
Can attach to multiple instances
Only contains ‘Allow’ rules
Can reference by IP or by other SGs
Inbound traffic is blocked by default
Outbound traffic is allowed by default

Purchasing Options

On-demand Instances
Reserved Instances
Savings Plans
Spot Instances
Dedicated Hosts
Dedicated Instances
Capacity Reservation

On-demand Instances

Pay per second, with a 1-minute minimum charge

Reserved Instances

Reserved for 1 or 3 years
Payments: upfront, no upfront, partial upfront
Convertible reserved instance: can change instance attributes

Savings Plans

Commit to a certain amount of usage ($/hr)
Commitment term of 1 or 3 years
EC2 Instance Savings Plans are locked to an instance family and region; Compute Savings Plans are not
Usage beyond the Savings Plan commitment is charged at the on-demand price

Spot Instances

Can get up to 90% discount
Can lose the instance when the current spot price rises above the max price you are willing to pay
have 2 mins grace period at termination time
Cancelling a spot request does not terminate the instances
First cancel the request and then terminate the instances
Spot fleets: spot instances + optional on-demand instances
Spot fleet allocation strategies:
- lowestPrice
- diversified
- capacityOptimized
- priceCapacityOptimized

Dedicated Host

most expensive option
book entire server
visibility into the sockets and physical cores of the server (useful for per-socket or per-core software licenses)
can do instance placement
options:
- on demand
- reserved

Dedicated Instances

instances run on hardware dedicated to your account
cannot do instance placement

Capacity Reservation

Pay whether use the instances or not within reserved period
Capacity Reservations enable you to reserve compute capacity for your EC2 instances in a specific AZ for any duration (can also be in hourly duration)

Elastic IP

Can attach to one instance at a time
Can only have 5 Elastic IPs per Region per account (can ask AWS to increase)

Placement Groups

Cluster
Spread
Partition

Cluster

Cluster instances into a low latency group within a single AZ
It is recommended that you launch the number of instances that you need in the placement group in a single launch request
use the same instance type for all instances in the placement group
If you try to add more instances to the placement group later, or if you try to launch more than one instance type in the placement group, you increase your chances of getting an insufficient capacity error
Need to re-launch the cluster when insufficient capacity error occurs

Spread

Spread instances across distinct hardware, optionally across AZs
Only 7 instances per group per AZ

Partition

Many instances can share a partition (a rack of hardware) and partitions are distributed across AZs
Only 7 partitions per AZ

Elastic Network Interface (ENI)

One instance can have multiple ENIs attached with one primary private IPv4 and many secondary private IPv4s
ENIs are bound to specific AZs
Public IPv4 is assigned to an ENI according to ip assign rule of the subnet that the ENI belongs to
One elastic IP address per one private IP

EC2 Instance Stages

Stop
Terminate
Hibernate

Stop

Data on non-root EBS volume are preserved
All data on the attached instance-store devices will be lost
Underlying host can be changed when restarted
Elastic IP and ENIs are still attached

Terminate

If the EBS volume is set to be destroyed, all the data are lost

Hibernate

RAM contents are saved to the root EBS volume, and the instance resumes from the saved state when started again
Instance ram size must be less than 150GB
Root volume must be EBS and encrypted
An instance cannot be hibernated for more than 60 days
It is not possible to enable or disable hibernation for an instance after it has been launched; Have to configure at launch time

AMI

AMIs can be accessed using:
- AWS public AMIs
- Custom made AMIs
- AMIs found/sold on AWS marketplace
AMIs can be used to copy instances across AZs, Regions and Accounts
AMI includes one or more snapshots, so if AMI is copied, snapshots are copied along with it
Copying an AMI backed by an encrypted snapshot cannot result in an unencrypted target snapshot

EC2 Enhanced Networking

Elastic Network Adapter (ENA)
Elastic Fabric Adapter (EFA)

ENA

up to 100 Gbps
can support windows instances

EFA

Improved ENA for HPC
only works for Linux

Automation and Orchestration

AWS Batch
AWS ParallelCluster

AWS Batch

Managed service that helps you efficiently run batch processing jobs at scale
AWS Batch handles the provisioning, scaling, and management of compute resources required for batch jobs

AWS ParallelCluster

Open-source cluster management tool provided by AWS that simplifies the deployment, configuration, and management of high-performance computing (HPC) clusters on the AWS Cloud

There is a vCPU-based On-Demand Instance limit per region

EC2 Billing

Pending: will not be billed
Running: will be billed
Stopping: will not be billed
Terminated: will not be billed
Stopping (to hibernate): will be billed
Terminated (reserved instance): will be billed

AWS Outposts

Fully managed service that extends AWS infrastructure, services, APIs, and tools to your on-premises data center or edge location
Brings AWS infrastructure (hardware and software) to your physical data center or on-premises environment
Supports core AWS services like Amazon EC2, ECS/EKS, RDS, S3, and EBS locally

AWS Wavelength

Brings AWS compute and storage services to the edge of telecommunications (telco) 5G networks, enabling developers to build applications that require ultra-low latency for end users and devices
AWS Wavelength extends AWS infrastructure into Wavelength Zones, which are zones within telco provider data centers connected to 5G networks
Applications deployed in these zones process data close to users, reducing the latency introduced by routing to traditional AWS regions

Access Control

IAM

IAM users can be grouped into IAM groups
Permission policies can be attached to IAM groups, or
attached directly to users as inline policies
Least privilege permission
One user can belong to multiple different groups, thus can have multiple permission policies
Groups can only contain users (cannot contain other groups)
Admin can set password policy for IAM users
AWS cloudshell is not available in every region
AWS services can do actions on behalf of user by being assigned IAM roles which include one or more IAM policies
Access is allowed only if explicit “Allow” permission is defined
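The "explicit Allow" rule can be sketched as a tiny policy evaluator. This is a simplified illustration, not the real IAM engine: it only handles `Effect`, `Action` and `Resource` with `*` wildcards, matches case-sensitively, and ignores conditions, principals, SCPs and boundaries. The function name and the sample bucket are made up.

```python
from fnmatch import fnmatchcase


def is_allowed(statements: list, action: str, resource: str) -> bool:
    """Explicit Deny wins, then explicit Allow; everything else is implicitly denied."""
    allowed = False
    for stmt in statements:
        if not any(fnmatchcase(action, pattern) for pattern in stmt["Action"]):
            continue  # statement does not cover this action
        if not any(fnmatchcase(resource, pattern) for pattern in stmt["Resource"]):
            continue  # statement does not cover this resource
        if stmt["Effect"] == "Deny":
            return False  # an explicit Deny always overrides any Allow
        if stmt["Effect"] == "Allow":
            allowed = True
    return allowed


policy = [
    {"Effect": "Allow", "Action": ["s3:Get*"], "Resource": ["arn:aws:s3:::notes/*"]},
    {"Effect": "Deny", "Action": ["s3:*"], "Resource": ["arn:aws:s3:::notes/secret/*"]},
]

print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::notes/a.txt"))
print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::notes/secret/x.txt"))
print(is_allowed(policy, "s3:PutObject", "arn:aws:s3:::notes/a.txt"))
```

The third call is denied even though no statement mentions `s3:PutObject`: with no matching Allow, the default is an implicit deny.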

MFA Options

Authenticator apps
Universal 2nd Factor (U2F)

MFA Options Security Key

Hardware key fob MFA device
Hardware key fob MFA device for AWS GovCloud

IAM security tools

Can generate IAM security credentials report of IAM users (account level)
IAM Access Advisor (user level)

AWS Organizations

Allows to manage multiple AWS accounts
The main account is the management account
Other accounts are member accounts
Member accounts can only be part of one organization

Organization Units (OUs)

Accounts in the organization are organized into OUs
OUs can be nested

Service Control Policy (SCP)

IAM policies applied to OU or Accounts to restrict Users and Roles
They do not apply to the management account (full admin power)
They do not affect the service-linked roles

Resource-based Policy vs IAM Roles

Some services provide resource-based policy but some only IAM role
Cross-account resource access can be done either by account A assuming role in account B or by defining resource-based policy for the resource in account B
Trust policy is also a type of resource-based policy

AWS Services with Resource-based Policy

Lambda
SNS
SQS
S3
API Gateway
KMS

AWS Services with IAM Roles

Kinesis streams
ECS tasks
…

IAM Permission Boundaries

Advanced feature to use a managed policy to set the maximum permissions an IAM entity can get
IAM Permission Boundaries are supported for users and roles only (not groups)
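One way to picture a permission boundary is as a set intersection: an action is effective only if both the identity policy and the boundary allow it. A minimal sketch, assuming permissions are modelled as plain sets of action names (real policies also carry resources and conditions); the function name is hypothetical.

```python
def effective_permissions(identity_allows: set, boundary_allows: set, scp_allows=None) -> set:
    """Return the actions that survive the boundary (and, optionally, an SCP)."""
    effective = identity_allows & boundary_allows
    if scp_allows is not None:
        # An SCP on the account or OU narrows the result further.
        effective &= scp_allows
    return effective


identity = {"s3:GetObject", "ec2:RunInstances", "iam:CreateUser"}
boundary = {"s3:GetObject", "s3:PutObject", "ec2:RunInstances"}

print(sorted(effective_permissions(identity, boundary)))
```

The boundary never grants anything by itself: `s3:PutObject` is inside the boundary but absent from the identity policy, so it is not effective.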

IAM Identity Center

One login (single sign-on) for all AWS accounts in AWS Organizations, business applications, and third-party applications (e.g., Salesforce, Office 365, etc.)
Users and groups in Identity Center are assigned permission sets, which define what they can access in specific accounts or OUs
Can manage users and groups directly within AWS Identity Center or integrate with external identity providers like Microsoft Active Directory, Okta, or Azure AD

AWS Control Tower

Easy way to set up and govern a secure and compliant multi-account AWS environment based on best practices
AWS Control Tower uses AWS Organizations to create accounts

Preventive Guardrail

using SCPs (e.g., Restrict Regions across all your accounts)

Detective Guardrail

using AWS Config (e.g., identify untagged resources)

AWS Resource Access Manager (RAM)

To easily and securely share your resources with other AWS accounts or within your AWS Organization

AWS ActiveDirectory (AD)

AWS Managed Microsoft AD
AD Connector
Simple AD

AWS Managed Microsoft AD

Create your own AD in AWS to manage users
Establish “trust” connections with your on-premises AD

AD Connector

Proxy for on-premise AD

Simple AD

AWS managed
Cannot be joined with on-prem ADs

AWS Federated Access

Federated Access in AWS refers to the ability to grant users from external identity providers (IdPs) access to AWS resources without having to create and manage AWS-specific IAM (Identity and Access Management) users for each individual

Types

Federation with IAM Identity Center
Federation with IAM
Federation with Amazon Cognito identity pools

Federation with IAM Identity Center

Users in IAM Identity Center are granted short-term credentials to your AWS resources
IAM Identity Center supports identity federation with SAML (Security Assertion Markup Language) 2.0 to provide federated single sign-on access for users who are authorized to use applications within the AWS access portal
Users can then single sign-on into services that support SAML, including the AWS Management Console and third-party applications, such as Microsoft 365, SAP Concur, and Salesforce

Federation with IAM Role

For single, standalone AWS account
User Logs In to IdP
IdP Sends Authentication Token to AWS
AWS Grants Temporary Credentials through STS
User Accesses AWS Services

CDN

Cloudfront

Cloudfront is a CDN service that caches the cloud contents at POPs (216 currently)
Cloudfront origin can be:
- S3
- EC2
- ALB
- any HTTP endpoint
Cloudfront can do geo restriction to allow or block users from specific countries using allowlist and blocklist
Consider using CloudFront (PUT/POST) in front of S3 instead of S3 Transfer Acceleration when the objects are smaller than 1GB
Can use field level encryption to protect sensitive data for specific content
Can route to multiple origins based on the content type
Can use an origin group with primary and secondary origins to configure for high-availability and failover
Can generate Signed URL and Signed cookies

Global Accelerator

2 anycast IPs are created
anycast IPs send the traffic to the edge locations and edge locations send the traffic to the application endpoint
Uses internal AWS network
Can be used to distribute a portion of traffic to a particular deployment using endpoint weights
Good for gaming, IoT or voice over IP services

Cloudfront vs Global Accelerator

Cloudfront caches the contents at the edge location and serve the content from the edge location
global accelerator uses TCP or UDP to route the traffics through the edge location to the application
global accelerator doesn’t have cache service like cloudfront
both have DDoS protection using AWS shield

Storage

S3

max size of an object is 5TB
if an object is more than 5GB, have to use multi-part upload
blocking public access setting can be set at account level

Versioning

if versioning is enabled for a bucket, previous versions of the object are preserved when overwritten
if an object is deleted, it is not truly deleted but marked with the delete marker and then previous versions can be restored by deleting the delete marker
Once versioning is enabled for a bucket, it cannot be disabled, can only be suspended

Replication

replication is done by creating replication rule at the source s3 bucket
both source and destination bucket have to enable bucket versioning
only new objects are replicated
have to use S3 Batch Replication to replicate existing objects and objects that failed replication
can replicate buckets in different regions

Storage Classes

standard
standard IA
- good for once a month access
one-zone IA
- good for once a month access
glacier instant retrieval
- millisec retrieval
- good for data accessed once a quarter
- min storage duration of 90 days
glacier flexible retrieval
- expedited (1-5 mins), standard (3-5 hrs), bulk (5-12 hrs)
- min storage duration of 90 days
glacier deep archive
- standard (12 hrs), bulk (48 hrs)
- min storage duration of 180 days
intelligent tiering
- frequent access
- infrequent access: objects not accessed for 30 days
- archive instant access: objects not accessed for 90 days
- archive access (optional): configurable from 90 to 700+ days
- deep archive access (optional): configurable from 180 to 700+ days

Provisioned Capacity (for Glacier Flexible Expedited Retrieval)

ensures that your retrieval capacity for expedited retrievals is available when you need it
each unit of capacity ensures that at least three expedited retrievals can be performed every five minutes and provides up to 150 MB/s of retrieval throughput

Lifecycle Rules / Lifecycle Policies

Transition rule: to move objects from one class to another
Expiration rule: to delete expired objects
Object level rules
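A transition rule and an expiration rule can be combined in one lifecycle configuration. The sketch below builds the configuration as a Python dict in the shape boto3's `put_bucket_lifecycle_configuration` expects for its `LifecycleConfiguration` argument; the rule ID, the `logs/` prefix and the day counts are hypothetical values chosen for illustration.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},      # hypothetical prefix; rule applies to matching objects
            # Transition rule: move objects to cheaper classes as they age.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Expiration rule: delete objects one year after creation.
            "Expiration": {"Days": 365},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

Nothing is sent to AWS here; the dict would be passed along with a bucket name when the configuration is actually applied.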

Requester Pays

requester of the object pays for the request and data transfer costs
requester has to be authenticated with an AWS account (anonymous access is not allowed)
After a bucket is configured to be a Requester Pays bucket, requesters must include x-amz-request-payer in their API request header, for DELETE, GET, HEAD, POST, and PUT requests, or as a parameter in a REST request to show that they understand that they will be charged for the request and the data download

Event Notifications

send messages/events to SNS, SQS (only standard queue) or Lambda function when an object action is triggered (eg: ObjectCreated:Put, ObjectCreated:Post, …)
receiving services have to be configured with a resource-based policy (SNS topic policy, SQS queue policy, Lambda resource policy) that allows s3 to send event notifications to them

Performance

each s3 prefix can achieve 3500 put/copy/post/delete requests/sec and 5500 get/head requests/sec
if objects are distributed across 4 prefixes, user can have 22000 get/head requests/sec and 14000 put/copy/post/delete requests/sec
how to further optimize s3 performance:
- multi-part upload
- s3 transfer acceleration
- s3 byte range fetches
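The per-prefix request limits scale linearly with the number of prefixes, which the following sketch works out. The function name is made up for illustration, and it assumes requests are spread evenly across prefixes.

```python
WRITE_RPS_PER_PREFIX = 3_500   # put/copy/post/delete requests per second
READ_RPS_PER_PREFIX = 5_500    # get/head requests per second


def s3_request_capacity(prefix_count: int) -> dict:
    """Aggregate request rate across `prefix_count` evenly loaded prefixes."""
    return {
        "read_rps": prefix_count * READ_RPS_PER_PREFIX,
        "write_rps": prefix_count * WRITE_RPS_PER_PREFIX,
    }


print(s3_request_capacity(4))
```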

Batch Operations

to perform bulk operations on existing s3 objects with a single request
to get the list of objects:
- use s3 inventory
- filter using s3 select
- and use s3 batch operation to do processings

Encryption

Server side encryption (SSE)
SSE-S3: encrypt with aws managed key
SSE-KMS: encrypt with KMS key
SSE-C: encrypt with customer provided key
Client side encryption (CSE)

CORS

Needs to be configured on the bucket when a web page served from one origin requests objects from a bucket on a different origin

MFA Delete

Only root account can enable/disable MFA delete of a S3 bucket

Access Logs

To capture detailed records of requests made to the S3 bucket
Provide insights into who accessed the bucket, from where, and how they interacted with the objects

Presigned URLs

Time-limited URL that grants temporary access to an S3 object

Glacier Vault Lock

write once read many model
glacier vault lock has policy and that policy cannot be changed after set once
if an object is moved to glacier vault, it cannot be deleted anymore

S3 Object Lock

write once read many model
bucket versioning must be enabled
block an object version deletion for a period of time

Retention Modes

compliance - no one can delete the object or change the retention policy
governance - users with special permissions (e.g. admins) can delete the object or change the retention policy
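The difference between the two retention modes can be sketched as a small decision function. This is an illustrative model only: the function and parameter names are made up, `has_bypass_permission` stands in for holding the `s3:BypassGovernanceRetention` permission, and legal holds are not modelled.

```python
def can_delete_version(mode: str, retention_expired: bool, has_bypass_permission: bool = False) -> bool:
    """Decide whether a locked object version may be deleted."""
    if retention_expired:
        return True  # once the retention period is over, the lock no longer applies
    if mode == "GOVERNANCE":
        return has_bypass_permission  # only specially authorized users may override
    if mode == "COMPLIANCE":
        return False  # nobody, including the root user, can override
    raise ValueError(f"unknown retention mode: {mode}")


print(can_delete_version("GOVERNANCE", retention_expired=False, has_bypass_permission=True))
print(can_delete_version("COMPLIANCE", retention_expired=False, has_bypass_permission=True))
```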

Legal Hold

protect the object indefinitely
independent from retention period
legal hold can be placed and removed on an object by using s3:PutObjectLegalHold IAM permission

S3 Access Points

each access point is attached to a single bucket
s3 access points can have their own DNS names
can be internet origin or vpc origin
each access point can have its own policy
so the bucket policy can stay simple

S3 Objects Lambda Access Points

Object Lambda access points let users receive a modified version of an s3 object: the access point invokes a lambda function, which reads the original object and modifies it before it is returned to the caller

AWS Snow Family

snowcone and snowball edge are devices used for offline data migration
order the snowcone or snowball edge devices from AWS, load the devices with data, send back the devices to AWS and AWS will transfer the data from devices to s3 buckets
snowcone can handle 8TB hdd - 14TB ssd, migration size up to terabytes
snowball edge can handle 80TB - 210TB, migration size up to petabytes
snowball edge supports storage clustering
can do edge computing on snow devices by running lambda functions or ec2 instances at the edge
snowcone has 2 CPUs and 4GB of RAM
snowball edge, on the other hand, comes in compute-optimized and storage-optimized variants
snowball cannot transfer the data directly to s3 glacier
snowmobile is used to move petabytes to exabytes of data, transfer data with container-sized trucks

AWS FSx

fully-managed high performance file systems on AWS

Types

FSx for Lustre
FSx for Windows file server
FSx for NetApp ONTAP
FSx for openZFS

AWS Storage Gateway

Bridge between on-premises data and cloud data

Types

s3 file gateway
FSx file gateway
Volume gateway (cached or stored)
Tape gateway

Volume Gateway Cached Mode

Only subset of data is stored in on-premise volume gateway

Volume Gateway Stored Mode

Full and redundant data is stored in on-premise volume gateway

AWS Transfer Family

A fully-managed service for file transfers into and out of Amazon S3 or Amazon EFS using FTP-family protocols

Supported Protocols

AWS Transfer for FTP (File Transfer Protocol)
AWS Transfer for FTPS (File Transfer Protocol over SSL)
AWS Transfer for SFTP (Secure File Transfer Protocol)

AWS DataSync

Move large amounts of data between storage locations (transfer tasks can be scheduled):
On-premise/Other clouds to AWS
AWS to AWS
Only AWS data transfer service that can directly transfer the data to S3 Glacier

Supported Storage Services

S3
S3 Glacier
EFS
FSx

Application Integration/Messaging

SQS

Producer/Consumer Model

Standard Queue

Unlimited throughput, unlimited number of messages in queue
Default retention of messages: 4 days, maximum of 14 days
Low latency (<10 ms on publish and receive)
Limitation of 256KB per message sent
Can have duplicate messages
Can have out of order messages
Default visibility timeout of 30 sec
Cannot set priority value to each message

FIFO Queue

Limited throughput: 300 msg/s without batching, 3000 msg/s with
Exactly-once send capability (by removing duplicates)
Messages are processed in order by the consumer
Use deduplication ID and message group ID to ensure exactly-once capability
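Deduplication in a FIFO queue can be pictured as a lookup of recently seen deduplication IDs. The class below is a toy in-memory model, not the SQS implementation; the class name is hypothetical, and the 5-minute deduplication interval is an assumption not stated in the notes above.

```python
DEDUP_WINDOW_SECONDS = 5 * 60  # assumed deduplication interval


class FifoDedup:
    """Remember deduplication IDs and reject repeats inside the window."""

    def __init__(self) -> None:
        self._first_seen = {}  # deduplication ID -> time it was accepted

    def accept(self, dedup_id: str, now: float) -> bool:
        first = self._first_seen.get(dedup_id)
        if first is not None and now - first < DEDUP_WINDOW_SECONDS:
            return False  # duplicate within the window: silently dropped
        self._first_seen[dedup_id] = now
        return True


queue = FifoDedup()
print(queue.accept("order-42", now=0))    # first send is accepted
print(queue.accept("order-42", now=10))   # retry 10 s later is dropped
print(queue.accept("order-42", now=300))  # same ID after the window is accepted again
```

The message group ID plays a different role: it decides which messages must stay in order relative to each other, and is not modelled here.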

Encryption

In-flight encryption using HTTPS API
At-rest encryption using KMS keys
Client-side encryption if the client wants to perform encryption/decryption itself

Access Policy

Similar to s3 bucket policy to control the access to the queue

Long Polling

When a consumer requests messages from the queue, it can optionally “wait” for messages to arrive if there are none in the queue
The wait time can be between 1 sec to 20 sec (20 sec preferable)
Can configure by setting ReceiveMessageWaitTimeSeconds to a number greater than zero

Dead Letter Queues

Dead-letter queues can be used by other queues (source queues) as a target for messages that can’t be processed (consumed) successfully

Delay Queue

Delay queues let you postpone the delivery of new messages to a queue for several seconds
The default (minimum) delay for a queue is 0 sec
The maximum is 15 minutes

SNS

Pub/Sub Model

Topics

Publisher pushes events to a topic and each subscriber to the topic will get all the events
Up to 12,500,000 subscriptions per topic
100,000 topics limit

FIFO SNS

Similar features as SQS FIFO
Can have SQS Standard and FIFO queues as subscribers
same throughput as SQS FIFO

Encryption

In-flight encryption using HTTPS API
At-rest encryption using KMS keys
Client-side encryption if the client wants to perform encryption/decryption itself

Access Policy

Similar to s3 bucket policy to control the access to the topic

Message Filtering

JSON policy used to filter messages sent to SNS topic’s subscriptions
If a subscription doesn’t have a filter policy, it receives every message
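The filtering rule can be sketched as a simple attribute match. This is a simplified sketch that only covers exact string matching on message attributes (real filter policies also support prefixes, numeric ranges and more); the function name and the sample attributes are made up.

```python
def subscription_receives(filter_policy, attributes: dict) -> bool:
    """Return True if a message with these attributes passes the filter policy."""
    if not filter_policy:
        return True  # no filter policy: the subscription receives every message
    # Every key in the policy must be present with one of the listed values.
    return all(attributes.get(key) in allowed for key, allowed in filter_policy.items())


policy = {"event_type": ["order_placed", "order_cancelled"], "region": ["eu"]}

print(subscription_receives(policy, {"event_type": "order_placed", "region": "eu"}))
print(subscription_receives(policy, {"event_type": "order_placed", "region": "us"}))
print(subscription_receives(None, {"event_type": "anything"}))
```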

Fan-out (SNS+SQS)

Push once in SNS, receive in all SQS queues that are subscribers
Cross-Region Delivery: works with SQS Queues in other regions

Kinesis

Producer/Consumer Model

Kinesis Data Streams

Streaming service for ingest at scale
data contain partition key and data blob
data with same partition keys always go into same shard
Once data is inserted in Kinesis, it can’t be deleted (immutability)
Ability to reprocess (replay) data
Retention between 1 day to 365 days, default of 1 day
In provisioned mode, shards do not scale automatically and have to be provisioned in advance
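Why the same partition key always lands in the same shard can be sketched with a hash. Kinesis hashes the partition key with MD5 into a 128-bit number and picks the shard whose hash-key range contains it; the sketch below assumes the shards split that range evenly, which need not hold for a real stream, and the function name is hypothetical.

```python
import hashlib


def shard_for(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")   # 128-bit integer
    range_size = 2**128 // shard_count         # width of each shard's hash-key range
    return min(hash_key // range_size, shard_count - 1)


for key in ("user-1", "user-2", "user-1"):
    print(key, "->", shard_for(key, shard_count=4))
```

Because the mapping is deterministic, a very popular partition key concentrates all of its records on one shard (a "hot shard").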

Capacity Modes

Provisioned Mode
On-demand Mode

Provisioned Mode

choose the number of shards provisioned, scale manually or using API
Each shard gets 1MB/s in (or 1000 records per second)
Each shard gets 2MB/s out (classic or enhanced fan-out consumer)
Pay per shard provisioned per hour
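The per-shard limits above translate into a quick sizing calculation for provisioned mode. A minimal sketch with a made-up function name; it assumes shared (non-enhanced) consumers, so the 2 MB/s read limit per shard is divided among all readers.

```python
import math

INGEST_MB_PER_SHARD = 1         # MB/s written per shard
INGEST_RECORDS_PER_SHARD = 1000 # records/s written per shard
EGRESS_MB_PER_SHARD = 2         # MB/s read per shard (shared by all consumers)


def shards_needed(ingest_mb_s: float, records_s: float, egress_mb_s: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(ingest_mb_s / INGEST_MB_PER_SHARD),
        math.ceil(records_s / INGEST_RECORDS_PER_SHARD),
        math.ceil(egress_mb_s / EGRESS_MB_PER_SHARD),
    )


print(shards_needed(ingest_mb_s=5, records_s=2000, egress_mb_s=6))
```

Whichever limit is hit first drives the count: many small records can require more shards than the byte rate alone suggests.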

On-demand Mode

Default capacity provisioned (4 MB/s in or 4000 records per second)
Scales automatically based on observed throughput peak during the last 30 days
Pay per stream per hour & data in/out per GB

Security

In-flight encryption using HTTPS API
At-rest encryption using KMS keys
Client-side encryption if the client wants to perform encryption/decryption itself
VPC Endpoints available for Kinesis to access from within the VPC

Enhanced Fan-out

Standard: 2MB/s per shard (shared between multiple consumers)
Enhanced fan-out: 2MB/s per shard per consumer

Kinesis Data Firehose

Load streaming data into S3 / Redshift / OpenSearch / 3rd party / custom HTTP
Fully Managed Service, no administration, automatic scaling, serverless
Pay for data going through Firehose
Near real-time
Supports custom data transformations using AWS Lambda
Doesn’t guarantee the order of message delivery and processing

Kinesis Data Analytics

Real-time analytics on Kinesis Data Streams & Firehose using SQL
Add reference data from Amazon S3 to enrich streaming data
Fully managed, no servers to provision
Automatic scaling

Kinesis Video Streams

EventBridge

Trigger AWS services based on events sent by other AWS services or 3rd party integrations
Can archive and replay events for debugging purposes

Trigger Types

Schedule
Event Patterns

Schedule

Cron jobs (scheduled scripts)

Event Patterns

Event rules to react to a service doing something
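How an event pattern selects events can be sketched as a recursive match. This is a simplified illustration: it handles only exact-value matching (each pattern field lists the accepted values) and nested objects, not the richer operators EventBridge supports. The function name and the instance ID are made up.

```python
def matches_pattern(pattern: dict, event: dict) -> bool:
    """Return True if every field in the pattern is satisfied by the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the nested event object.
            if not isinstance(actual, dict) or not matches_pattern(expected, actual):
                return False
        elif actual not in expected:
            return False  # value is not one of the accepted values
    return True


pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

print(matches_pattern(pattern, event))
```

Fields present in the event but absent from the pattern (such as `instance-id`) are ignored, so a pattern only needs to name what it cares about.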

Event Buses

Default event bus (AWS services)
Partner event bus (3rd parties)
Custom event bus

Schema Registry

The Schema Registry allows you to generate code for your application, that will know in advance how data is structured in the event bus
Schema can be versioned

Resource-based Policy

Manage permissions for a specific Event Bus
Allow/deny events from another AWS account or AWS region
Aggregate all events from your AWS Organization in a single AWS account or AWS region

Amazon MQ

Managed message broker service (RabbitMQ, ActiveMQ) for migrating on-premises applications that use open protocols such as: MQTT, AMQP, STOMP, Openwire, WSS

Amazon Simple Workflow Service (SWF)

Amazon SWF is a web service that makes it easy to coordinate work across distributed application components

AWS AppFlow

To transfer and integrate data between AWS services and external SaaS platforms
Keeping SaaS data synchronized with AWS resources

AWS AppSync

A managed service for building real-time GraphQL APIs to power data-driven applications
Simplifies building GraphQL APIs for querying, mutating, and subscribing to data
Allows combining multiple data sources (e.g., DynamoDB, RDS, Lambda) into a single unified API

Machine Learning

Rekognition

for computer vision (CV)
labeling
content moderation
Face Detection and Analysis (gender, age range, emotions…)
Face Search and Verification
Celebrity Recognition
Pathing (ex: for sports game analysis)

Amazon Transcribe

Speech to text

Features

Automatically remove Personally Identifiable Information (PII) using Redaction
Automatic Language Identification for multi-lingual audio

Amazon Polly

Text to speech

Features

Lexicon upload for acronyms and stylized words
Speech customization with Speech Synthesis Markup Language (SSML)

Amazon Translate

Language translation

Amazon Lex

Chatbots
Call center bots
Natural Language Understanding to recognize the intent of text, callers

Amazon Connect

Cloud contact center

Amazon Comprehend

Fully managed NLP service

Amazon Comprehend Medical

Uses NLP to detect Protected Health Information (PHI)

Amazon SageMaker

Fully managed service for developers / data scientists to label data, build and deploy ML models

Amazon Forecast

For time-series forecasting

Amazon Kendra

Fully managed document search service powered by Machine Learning
Sources can be text, pdf, HTML, PowerPoint, MS Word, databases

Amazon Personalize

Recommendation system service

Amazon Textract

For optical character recognition (OCR) and information extraction (IE)

Security

Encryption

In-flight encryption
Server-side encryption
Client-side encryption

In-flight encryption

Data is encrypted before sending and decrypted after receiving
TLS certificate is used in HTTPS

Server-side encryption

Data is encrypted after receiving by server and decrypted before sending to the client

Client-side encryption

Data is encrypted by the client and never decrypted by the server

KMS

Fully integrated with IAM for authorization
Able to audit KMS Key usage using CloudTrail
KMS Key Encryption also available through API calls (SDK, CLI)
Have to pay for API call to KMS ($0.03 / 10,000 calls)
If a KMS key is scheduled for deletion, it stays in ‘pending deletion’ state for 7–30 days (default of 30 days), during which the deletion can be cancelled

Asymmetric vs Symmetric Keys

Symmetric Keys (AES-256)
Asymmetric Keys (RSA & ECC key pair)

Symmetric Keys

Single key for both encryption and decryption
AWS services integrated with KMS use symmetric keys
Never get access to the KMS Key unencrypted (must call KMS API to use)

Asymmetric Keys

Public (Encrypt) and Private Key (Decrypt) pair
The public key is downloadable, but the Private Key can’t be accessed unencrypted

KMS Key Types

AWS Owned Keys (free): SSE-S3, SSE-SQS, SSE-DDB (default key)
AWS Managed Keys (free): (aws/service-name, example: aws/rds or aws/ebs)
Customer managed keys created in KMS: $1 / month
Customer managed keys imported: $1 / month

Automatic Key Rotation

AWS-managed KMS Key: automatic every 1 year
Customer-managed KMS Key: automatic (must be enabled) or on-demand
Imported KMS Key: only manual rotation possible using alias

Key Policies

Control access to KMS keys, “similar” to S3 bucket policies

Default Key Policy

Created if you don’t provide a specific KMS Key Policy
Gives the root user (i.e. the whole AWS account) complete access to the key

Custom Key Policy

Define users, roles that can access the KMS key
Define who can administer the key

Multi-region Keys

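A multi-region key (MRK) has a primary key in one region and replica keys, with the same key ID and key material, in other regions
Data encrypted in one region can be decrypted in another region without re-encrypting or making cross-region API calls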
For the use cases of global client-side encryptions like global dynamodb client-side encryption, global aurora client-side encryption

Replicating encrypted S3 objects

For objects encrypted with SSE-KMS:
- Specify which KMS Key to encrypt the objects within the target bucket
- Adapt the KMS Key Policy for the target key
- An IAM Role with kms:Decrypt for the source KMS Key and kms:Encrypt for the target KMS Key
- You might get KMS throttling errors, in which case you can ask for a Service Quotas increase

AWS CloudHSM

Fully managed service that provides customers with dedicated hardware security modules to securely generate and use encryption keys
AWS CloudHSM is a fully managed service, meaning AWS takes care of hardware maintenance, updates, and availability
Customer retains full control over the cryptographic key management and security configurations

AWS Systems Manager (SSM) Parameter Store

Secure storage for configuration and secrets
Optional Encryption using KMS
Parameters can be stored in hierarchies

Tiers

Standard
Advanced

Parameter Policies

Allow to assign a TTL to a parameter (expiration date) to force updating or deleting sensitive data such as passwords

AWS Secrets Manager

Secure storage of secrets
Capability to force rotation of secrets every X days
Automate generation of secrets on rotation (uses Lambda)
Integration with database services like RDS, Aurora, Redshift, DocumentDB
Secrets are encrypted using KMS

Multi-region secrets

Replicate Secrets across multiple AWS Regions
Secrets Manager keeps read replicas in sync with the primary Secret
Ability to promote a read replica Secret to a standalone Secret

AWS Certificate Manager

Easily provision, manage, and deploy TLS Certificates
Supports both public and private TLS certificates
Free of charge for public TLS certificates
Can generate certificates too
Certificates generated with ACM are automatically renewed

Integrations

ELB
Cloudfront distributions
APIs on API Gateway
Cannot use from EC2

API Gateway

Edge-optimized
Regional
Private (cannot use ACM)

Edge-optimized

ACM is integrated with the CloudFront distribution
The TLS Certificate must be in the same region as CloudFront (us-east-1)

Regional

The TLS Certificate must be imported on API Gateway, in the same region as the API Stage

Web Application Firewall (WAF)

Protects your web applications from common web exploits (Layer 7)

Integrations

ALB
API Gateway
Cloudfront
AppSync GraphQL API
Cognito User Pool

Web Access Control List (Web ACL)

IP Set: up to 10,000 IP addresses
HTTP headers, HTTP body, or URI strings: protects from common attacks such as SQL injection and Cross-Site Scripting (XSS)
Size constraints
geo-match (block countries)
Rate-based rules (to count occurrences of events) – for DDoS protection
Web ACLs are Regional, except for CloudFront (global)
A rule group is a reusable set of rules that can be added to a web ACL

AWS Shield

Protect from DDoS Attacks

Modes

Standard
Advanced

Standard

Free service that is activated for every AWS customer
Provides protection from attacks such as SYN/UDP Floods, Reflection attacks and other layer3/4 attacks

Advanced

Optional DDoS mitigation service
$3,000 per month per organization
24/7 access to AWS DDoS Response Team (DRT)
Shield Advanced automatic application layer DDoS mitigation automatically creates, evaluates and deploys AWS WAF rules to mitigate layer 7 attacks

Supported Services

EC2
ELB
CloudFront
Global Accelerator
Route 53
Elastic IP

AWS Network Firewall

Detail in VPC section

AWS Firewall Manager

Manage firewall rules in all accounts of an AWS Organization
Rules are applied to new resources as they are created (good for compliance) across all and future accounts in your Organization

Security Policies

common set of security rules
WAF rules (ALB, API Gateways, CloudFront)
AWS Shield Advanced (ALB, CLB, NLB, Elastic IP, CloudFront)
Security Groups for EC2, ALB and ENI resources in VPC
AWS Network Firewall (VPC Level)
Route 53 Resolver DNS Firewall
Policies are created at the region level

AWS GuardDuty

Managed threat detection service
Analyze threat from input data like CloudTrail events, VPC flow logs, etc
Notify the findings through EventBridge

Foundational Data Sources

CloudTrail Events Logs
VPC Flow Logs
DNS Logs

Other Data Sources

S3 data event logs
EKS audit logs
Lambda network activity logs
RDS login activity logs
EBS volume data

AWS Inspector

Automated Security Assessments for:
- EC2
- Container Images pushed to Amazon ECR
- Lambda Functions
Reporting & integration with AWS Security Hub
Send findings to Amazon EventBridge

EC2

Leverages the AWS Systems Manager (SSM) agent
Analyze against unintended network accessibility
Analyze the running OS against known vulnerabilities

Container Images pushed to Amazon ECR

Assessment of Container Images as they are pushed

Lambda Functions

Identifies software vulnerabilities in function code and package dependencies
Assessment of functions as they are deployed

AWS Macie

Find sensitive Personally Identifiable Information (PII) in data stored on S3

AWS Artifact

To view, assess and manage the security reports as well as other AWS compliance-related information

AWS Security Hub

Security service that provides a comprehensive view of your security posture across AWS accounts
Security Hub collects and aggregates security findings from multiple AWS services such as Amazon GuardDuty, Amazon Macie, Amazon Inspector, and AWS Config, as well as from third-party security solutions

AWS Security Token Service (STS)

Service that you can use to create and provide trusted users with temporary security credentials that can control access to your AWS resources
Temporary security credentials work almost identically to the long-term access key credentials that your IAM users can use

VPC

Default VPC

Default VPC has Internet connectivity through internet gateway and all EC2 instances inside it have public IPv4 addresses

Own VPC

Can create max 5 per region (soft limit)
Max CIDR per VPC is 5

CIDR size

Min: /28 (16 IP addresses)
Max: /16 (65536 IP addresses)
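The address counts above follow directly from the prefix length and can be checked with Python's standard ipaddress module. The 10.0.0.0 base address and the valid_vpc_prefix helper are only illustrations.

```python
import ipaddress

smallest = ipaddress.ip_network("10.0.0.0/28")
largest = ipaddress.ip_network("10.0.0.0/16")
print(smallest.num_addresses)  # 16
print(largest.num_addresses)   # 65536


def valid_vpc_prefix(cidr: str) -> bool:
    """Check that a CIDR block is between /16 and /28 in size."""
    return 16 <= ipaddress.ip_network(cidr).prefixlen <= 28
```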

Allowed CIDR ranges (private)

10.0.0.0 – 10.255.255.255 (10.0.0.0/8)
172.16.0.0 – 172.31.255.255 (172.16.0.0/12)
192.168.0.0 – 192.168.255.255 (192.168.0.0/16)

Subnets

AWS reserves 5 IP addresses (first 4 & last 1) in each subnet
x.x.x.0 – Network Address
x.x.x.1 – reserved by AWS for the VPC router
x.x.x.2 – reserved by AWS for mapping to Amazon-provided DNS
x.x.x.3 – reserved by AWS for future use
x.x.x.255 – Network Broadcast Address. AWS does not support broadcast in a VPC, therefore the address is reserved
Each subnet maps to single AZ
Every subnet created is automatically associated with the main route table for the VPC.
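A short sketch of the reserved-address rule, using the standard ipaddress module. The subnet CIDRs are examples and the function names are hypothetical.

```python
import ipaddress


def reserved_addresses(cidr: str) -> list:
    """Return the 5 addresses AWS reserves in a subnet: the first 4 and the last 1."""
    net = ipaddress.ip_network(cidr)
    first = net.network_address
    return [str(a) for a in (first, first + 1, first + 2, first + 3, net.broadcast_address)]


def usable_count(cidr: str) -> int:
    """Number of addresses left for resources after the 5 reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - 5


print(reserved_addresses("10.0.1.0/24"))
print(usable_count("10.0.1.0/24"))  # 251
print(usable_count("10.0.0.0/28"))  # 11
```

This is why a /28 subnet, the smallest allowed, only leaves 11 addresses for instances.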

IPv6-only Subnet

Can only support Nitro instances

Internet Gateway

Allows resources (e.g. EC2 instances) in a VPC to connect to the Internet
It scales horizontally and is highly available and redundant
Must be created separately from a VPC and then attached to it
Subnet route tables must be configured to route the traffic to internet gateway to access the internet
Subnet becomes public subnet when it is connected to and routed through an internet gateway
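The routing idea can be modelled with a small route table lookup. This is a simplified sketch: the route tables are plain dicts, the gateway ID is made up, and the most specific matching route is chosen, which is how VPC route tables resolve overlapping routes in general.

```python
import ipaddress


def lookup_target(route_table: dict, destination_ip: str):
    """Return the target of the most specific route matching destination_ip, or None."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def is_public(route_table: dict) -> bool:
    """Simplification: a subnet counts as public if its default route points to an IGW."""
    return str(route_table.get("0.0.0.0/0", "")).startswith("igw-")


# Hypothetical route tables for a VPC with CIDR 10.0.0.0/16
public_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-0abc"}
private_rt = {"10.0.0.0/16": "local"}

print(lookup_target(public_rt, "8.8.8.8"))
print(is_public(public_rt), is_public(private_rt))
```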

Bastion Host

BH is an instance in a public subnet which has access to other instances in the private subnet
Used to ssh into private instances via the BH
SG of the BH has to allow port 22 from the internet and SG of private instances must allow ssh from the SG of the bastion host

NAT Instance

An instance in the public subnet through which the private instances can access the internet
Must have Elastic IP attached to it
Must disable EC2 setting: Source / destination Check
An instance can act as a NAT instance when launched from a NAT AMI
Route tables of private subnets must be configured to route traffic from private subnets to the NAT Instance

NAT Instance SG rules

Inbound:
- Allow HTTP / HTTPS traffic coming from Private Subnets
- Allow SSH from source network (access is provided through Internet Gateway)
Outbound:
- Allow HTTP / HTTPS traffic to the Internet

NAT Gateway

AWS-managed NAT instance
Higher bandwidth, high availability, no administration
Pay per hour for usage and bandwidth
NAT GW is AZ-bound
Uses an Elastic IP
Can’t be used by EC2 instance in the same subnet (only from other subnets)
Private Subnet ⇒ NATGW ⇒ IGW
5 Gbps of bandwidth with automatic scaling up to 100 Gbps

SGs and NACLs

SGs

Operates at instance level
Stateful (always allow return traffic)
Only support ‘Allow’ rules
Evaluate all the rules before deciding to allow
Newly created SG will ‘Deny’ every inbound traffic and ‘Allow’ every outbound traffic
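A minimal model of inbound security group evaluation, assuming rules are reduced to a port and a source CIDR (real rules also carry protocol, port ranges, and SG references). All names and addresses are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    port: int
    source_cidr: str


def sg_allows_inbound(rules, port: int, source_ip: str) -> bool:
    """Traffic is allowed if ANY rule matches; there are no deny rules to check."""
    ip = ipaddress.ip_address(source_ip)
    return any(
        rule.port == port and ip in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


new_sg = []  # a newly created SG has no inbound rules, so everything inbound is denied
web_sg = [AllowRule(22, "203.0.113.0/24"), AllowRule(443, "0.0.0.0/0")]

print(sg_allows_inbound(new_sg, 443, "198.51.100.7"))
print(sg_allows_inbound(web_sg, 443, "198.51.100.7"))
print(sg_allows_inbound(web_sg, 22, "198.51.100.7"))
```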

NACLs

Operates at subnet level
Stateless
Supports both ‘Allow’ and ‘Deny’ rules
One NACL per subnet, new subnets are assigned the Default NACL
NACLs and subnets are decoupled and NACLs live in VPC
Default NACL is “allow all”
Newly created NACLs will deny everything (inbound or outbound)
NACL have to be configured to allow inbound and outbound ephemeral ports since it is stateless

NACL Rules

Rules have a number (1-32766), higher precedence with a lower number
First rule match will drive the decision
The last rule is an asterisk (*) and denies a request in case of no rule match
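The rule ordering above can be sketched as a first-match evaluation. This is a simplified model (source CIDR and port range only); the rule numbers and addresses are made up for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int
    action: str  # "allow" or "deny"
    cidr: str
    port_from: int
    port_to: int


def nacl_decision(rules, source_ip: str, port: int) -> str:
    """Evaluate rules in ascending number order; the first match decides."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if ip in ipaddress.ip_network(rule.cidr) and rule.port_from <= port <= rule.port_to:
            return rule.action
    return "deny"  # the final * rule: no match means deny


rules = [
    NaclRule(200, "allow", "0.0.0.0/0", 443, 443),
    NaclRule(100, "deny", "198.51.100.0/24", 443, 443),
]

print(nacl_decision(rules, "198.51.100.9", 443))
print(nacl_decision(rules, "203.0.113.9", 443))
print(nacl_decision(rules, "203.0.113.9", 22))
```

Rule 100 is evaluated before rule 200, so the blocked range is denied even though a broader allow rule exists.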

VPC Peering

Privately connect two VPCs using AWS network
Peer VPCs must not have overlapping CIDRs
VPC Peering connection is NOT transitive
Route tables of subnets in both VPC have to be updated to route the traffic to other VPC through peer connection
Can create VPC Peering connection between VPCs in different AWS accounts/regions
Can reference a security group in a peered VPC (cross accounts but same region)
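Two of the peering rules above, no overlapping CIDRs and no transitivity, can be checked with a few lines of Python. The VPC names and CIDRs are hypothetical, and peerings are modelled as a plain set of pairs.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Two VPCs can be peered only if their CIDR blocks do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def reachable(peerings: set, src: str, dst: str) -> bool:
    """Only a direct peering connection counts; peering is not transitive."""
    return (src, dst) in peerings or (dst, src) in peerings


peerings = {("vpc-a", "vpc-b"), ("vpc-b", "vpc-c")}

print(can_peer("10.0.0.0/16", "10.1.0.0/16"))
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))
print(reachable(peerings, "vpc-a", "vpc-c"))
```

For vpc-a to reach vpc-c, a third peering connection between them would be needed.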

VPC Endpoints

VPC Endpoints (powered by AWS PrivateLink) allow you to connect to AWS services using a private network instead of using the public Internet
Remove the need of IGW, NATGW, … to access AWS Services

Types

Interface Endpoint
Gateway Endpoint

Interface Endpoint

Provisions an ENI (private IP address) as an entry point (must attach a Security Group)
Supports most AWS services
$ per hour + $ per GB of data processed
Can be used to connect to another VPC
Uses AWS PrivateLink to connect the endpoint to services

Gateway Endpoint

Provisions a gateway and must be used as a target in a route table (does not use security groups)
Free
Supports S3 and DynamoDB
If S3 or DynamoDB is not in the same region as the subnet, Gateway Endpoint cannot be used since Gateway Endpoint is a regional service (use NAT gateway or Interface Endpoint instead)
can attach an endpoint policy that controls access to the service to which you are connecting
does not use AWS PrivateLink
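As an illustration of an endpoint policy, the sketch below builds a read-only policy document for a single S3 bucket. The bucket name and the helper function are hypothetical; the document uses the standard IAM policy JSON layout.

```python
import json


def s3_read_only_endpoint_policy(bucket_name: str) -> dict:
    """Endpoint policy that only allows reading from one bucket through the endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


policy = s3_read_only_endpoint_policy("example-bucket")
print(json.dumps(policy, indent=2))
```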

Flow Logs

Capture information about IP traffic going to and from your network interfaces
Can query VPC flow logs using Athena on S3 or CloudWatch Logs Insights

Flow Logs data can go into:

S3
Cloudwatch logs
Kinesis Data Firehose

Site-to-site VPN Connection

To connect VPC with on-prem servers through private VPN connection over public network
Site-to-site VPN connection can be used as a backup connection to Dx connection

Need 2 things:

Virtual Private Gateway (VGW)
Customer Gateway (CGW)

VGW

VPN concentrator on the AWS side of the VPN connection
VGW is created and attached to the VPC from which you want to create the Site-to-Site VPN connection
Need to enable Route Propagation for the VGW in the route table that is associated with the subnets in the VPC

CGW

Software application or physical device on customer side of the VPN connection
Need public Internet-routable IP address for the Customer Gateway device
If CGW is private, need NAT device to enable public routing

VPN Cloudhub

Provide secure communication between multiple sites, if you have multiple VPN connections
To set it up, connect multiple VPN connections on the same VGW, setup dynamic routing and configure route tables

Direct Connect (Dx)

Provides a dedicated private connection from a remote network to your VPC
Dedicated connection must be setup between the data center and AWS Direct Connect locations
Need to setup a VGW at VPC side
Lead times are often longer than 1 month to establish a new connection

Connection Flows

Private VPC Connection
Public Resources Connection

Private Connection Flow

VGW ⇒ Dx Connector in Dx locations ⇒ Customer router in Dx locations ⇒ Customer router in customer network

Public Connection Flow

Public AWS resources (like s3) ⇒ Dx Connector in Dx locations ⇒ Customer router in Dx locations ⇒ Customer router in customer network

Direct Connect Gateway

If you want to setup a Direct Connect to one or more VPC in many different regions (same account), you must use a Direct Connect Gateway
Dx connection connects to Direct Connect Gateway and Direct Connect Gateway connects to multiple VGWs

Connection Types

Dedicated Connections
Hosted Connections

Dedicated Connections

1 Gbps, 10 Gbps and 100 Gbps capacity
Physical ethernet port dedicated to a customer
Request made to AWS first, then completed by “AWS Direct Connect Partners”

Hosted Connections

50 Mbps, 500 Mbps, up to 10 Gbps
Connection requests are made via “AWS Direct Connect Partners”
Capacity can be added or removed on demand
1, 2, 5, 10 Gbps available at select AWS Direct Connect Partners

Encryption

Data in transit is not encrypted but is private
AWS Direct Connect + VPN provides an IPsec-encrypted private connection

Resiliency

High resiliency
Max resiliency

High resiliency

One connection at multiple Dx locations

Max resiliency

Maximum resilience is achieved by separate connections terminating on separate devices in more than one location.

Transit Gateway

Transit Gateway sits in the middle to connect multiple VPCs transitively and can also connect to Dx Gateway and Site-to-site VPN connections
Regional resource
Share cross-account using Resource Access Manager (RAM)
You can peer Transit Gateways across regions
Route Tables: limit which VPC can talk with other VPC
Supports IP Multicast
Can peer multiple transit gateways in multiple regions

Site-to-site VPN ECMP (Equal Cost Multiple Paths)

Routing strategy that allows forwarding packets over multiple best paths
Use case: create multiple Site-to-Site VPN connections to increase the bandwidth of your connection to AWS
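A rough sketch of the ECMP idea: each flow is hashed onto one of several equal-cost paths, so a single flow stays on one tunnel while many flows spread across all of them. The hash function, tunnel names, and addresses are illustrative assumptions; real routers use their own hashing schemes.

```python
import hashlib


def pick_path(flow: tuple, paths: list) -> str:
    """Map a flow (src IP, dst IP, src port, dst port, protocol) to one path."""
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(paths)
    return paths[index]


# Hypothetical: two VPN tunnels with equal cost
tunnels = ["vpn-tunnel-1", "vpn-tunnel-2"]
flows = [("192.168.1.10", "10.0.0.5", 40000 + i, 443, "tcp") for i in range(200)]
used = {pick_path(flow, tunnels) for flow in flows}
print(sorted(used))
```

Because the choice depends only on the flow tuple, packets of the same connection always take the same tunnel, which avoids reordering.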

VPC Traffic Mirroring

Capture and mirror the traffic to send the mirrored traffic into own security appliances to analyze, monitor or troubleshoot
Source and Target can be in the same VPC or different VPCs (VPC Peering)

Egress-only Internet Gateway

Used for IPv6 only
Similar to a NAT Gateway but for IPv6
Must update the Route Tables
Allows instances in your VPC to make outbound connections over IPv6 while preventing the internet from initiating an IPv6 connection to your instances

AWS Network Firewall

Protect entire VPC
From Layer 3 to Layer 7 protection
Internally uses AWS Gateway Load Balancer
Rules can be centrally managed cross-account by AWS Firewall Manager to apply to many VPCs
Can send logs of rule matches to Amazon S3, CloudWatch Logs, Kinesis Data Firehose

Protect directions

VPC to VPC traffic
Outbound to internet
Inbound from internet
To/from Direct Connect & Site-to-Site VPN

Fine-grained Controls

IP & port - example: 10,000s of IPs filtering
Protocol – example: block the SMB protocol for outbound communications
Stateful domain list rule groups: only allow outbound traffic to *.mycorp.com or third-party software repo
General pattern matching using regex
etc

Cost

Cost Explorer

Visualize, understand, and manage AWS costs and usage over time
Create custom reports that analyze cost and usage data
Monthly, hourly, resource level granularity
Forecast usage up to 12 months ahead based on previous usage
Have API support with pagination

Cost Anomaly Detection

Continuously monitor cost and usage using ML to detect unusual spends
Monitor AWS services, member accounts, cost allocation tags, or cost categories
Sends the anomaly detection report with root-cause analysis
Get notified with individual alerts or daily/weekly summary (using SNS)
