AWS (SAA-C03)

Scaling

ELB

ALB

  • Works at application layer (layer 7)
  • ALB target groups can be:
    • EC2 instances
    • ECS tasks
    • Lambda functions
    • Private IP addresses
  • An ALB has listeners with specific protocols, and each listener can route traffic to different target groups using listener rules
  • Health checks are done at the target group level using HTTP or HTTPS
  • cross zone load balancing is enabled by default
  • Cannot attach elastic IP to ALB
  • An internet-facing ALB must be placed in public subnets (an internal ALB can use private subnets)
  • Also supports gRPC protocol
  • supports Weighted Target Groups routing
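A minimal sketch of what weighted target group routing means for traffic distribution. The group names `blue` and `green` are hypothetical, and the function only models the expected split; it does not call any AWS API.

```python
def traffic_shares(weights):
    """Return the expected fraction of requests sent to each target group."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one target group needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical blue/green rollout: 90% to the old version, 10% to the new one.
shares = traffic_shares({"blue": 90, "green": 10})
```

Shifting the weights step by step (90/10, 50/50, 0/100) is a common way to roll out a new version gradually.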

NLB

  • Works at transport layer (layer 4)
  • extreme performance (can handle millions of requests per second)
  • TCP and UDP protocols
  • has one static IP per AZ which can also be elastic IP
  • NLB target groups can be:
    • EC2 instances
    • Private IP addresses
    • ALBs
  • health check can be done via TCP, HTTP, HTTPS protocols
  • cross zone load balancing is disabled by default

GWLB

  • Works at network layer (layer 3)
  • Routes traffic to 3rd party virtual appliances (for example, for security analysis) before it reaches the servers
  • Uses the GENEVE protocol on port 6081
  • GWLB target groups can be:
    • EC2 instances
    • Private IP addresses
  • Cross zone load balancing is disabled by default

Cross-zone Load Balancing

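  • Distributes traffic evenly across all registered targets in all enabled Availability Zones (not across regions)
  • Without it, each load balancer node distributes traffic only to the targets in its own AZ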
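A small sketch of the effect on per-target load. It assumes every enabled Availability Zone receives an equal share of client traffic, and the AZ names are hypothetical.

```python
def per_target_share(targets_per_az, cross_zone):
    """Fraction of total traffic received by one target in each AZ.

    Assumes the load balancer node in each AZ gets an equal share of
    incoming client traffic. AZs without targets are ignored.
    """
    azs = {az: n for az, n in targets_per_az.items() if n > 0}
    if cross_zone:
        # Every target is treated equally regardless of its AZ.
        total_targets = sum(azs.values())
        return {az: 1 / total_targets for az in azs}
    # Each AZ keeps its own share and splits it among its local targets.
    return {az: 1 / len(azs) / n for az, n in azs.items()}


with_cz = per_target_share({"az-a": 2, "az-b": 8}, cross_zone=True)
without_cz = per_target_share({"az-a": 2, "az-b": 8}, cross_zone=False)
```

With 2 targets in one AZ and 8 in the other, each target gets 10% with cross-zone enabled. With it disabled, the 2 targets get 25% each and the 8 targets get 6.25% each.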

ELBs have security groups too

ELBs are region bound

Sticky Sessions

  • Ensure the same client is always routed to the same instance
  • Supported by CLB, ALB and NLB
  • ALB uses cookies whose expiration date can be controlled

Cookies

  • Application based cookies
    • custom cookies: defined by application and name cannot be AWSALB, AWSALBAPP or AWSALBTG
    • application cookies: defined by load balancer and name is AWSALBAPP
  • Load balancer generated cookies / Duration based cookies:
    • generated by load balancer
    • name is AWSALB

SSL/TLS

  • Server name indication (SNI) is the extension of TLS protocol that enables client to specify the domain name it wants to reach through a single server endpoint

Connection Draining / Deregistration Delay

  • time allowed for instances to finish in-flight requests before they are deregistered
  • new requests are not sent to the draining instance but instead routed to other healthy instances
  • can set between 0-3600 seconds (default is 300)
  • can be disabled by setting it to 0

ASG

  • ASG uses launch templates to manage ec2 instances
  • it scales using scaling policy
  • ASG can use cloudwatch alarms as triggers to scale the instances
  • EC2 instances can be put into standby state to temporarily remove them from ASG

Scaling Policies

  • Dynamic scaling
    • Target tracking policy
    • Simple/step scaling
  • Scheduled scaling
  • Predictive scaling

Launch template

  • Only a launch template can be used to provision capacity across multiple instance types using both On-Demand Instances and Spot Instances to achieve the desired scale, performance, and cost

Termination Policy in order

  • Based on instance allocation strategy
  • Oldest Launch Configuration
  • Oldest Launch Template
  • Next Billing Hour

Instance states

  • Pending
  • InService
  • Terminating
  • Terminated
  • Standby

Lifecycle Hooks

  • autoscaling:EC2_INSTANCE_LAUNCHING
  • autoscaling:EC2_INSTANCE_TERMINATING

autoscaling:EC2_INSTANCE_LAUNCHING

  • When Amazon EC2 Auto Scaling responds to a scale-out event, it launches one or more instances
  • These instances start in the Pending state
  • If you added an autoscaling:EC2_INSTANCE_LAUNCHING lifecycle hook to your Auto Scaling group, the instances move from the Pending state to the Pending:Wait state
  • After you complete the lifecycle action, the instances enter the Pending:Proceed state
  • When the instances are fully configured, they are attached to the Auto Scaling group and they enter the InService state

autoscaling:EC2_INSTANCE_TERMINATING

  • When Amazon EC2 Auto Scaling responds to a scale-in event, it terminates one or more instances
  • These instances are detached from the Auto Scaling group and enter the Terminating state
  • If you added an autoscaling:EC2_INSTANCE_TERMINATING lifecycle hook to your Auto Scaling group, the instances move from the Terminating state to the Terminating:Wait state
  • After you complete the lifecycle action, the instances enter the Terminating:Proceed state
  • When the instances are fully terminated, they enter the Terminated state
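The two state sequences above can be sketched as a small function. The event names `scale-out` and `scale-in` are labels chosen for this illustration, not API values.

```python
def instance_states(event, has_hook):
    """List the states an instance passes through for a scaling event."""
    if event == "scale-out":
        start, end = "Pending", "InService"
    elif event == "scale-in":
        start, end = "Terminating", "Terminated"
    else:
        raise ValueError(f"unknown event: {event}")
    states = [start]
    if has_hook:
        # A lifecycle hook inserts a Wait state and then a Proceed state.
        states += [f"{start}:Wait", f"{start}:Proceed"]
    states.append(end)
    return states


launch_path = instance_states("scale-out", has_hook=True)
terminate_path = instance_states("scale-in", has_hook=False)
```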

Cooldown period

  • ensures that the Auto Scaling group does not launch or terminate additional EC2 instances before the previous scaling activity takes effect
  • default is 300 seconds (5 minutes)
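A minimal sketch of the cooldown check, assuming the cooldown is measured from the end of the previous scaling activity. The function name is hypothetical.

```python
from datetime import datetime, timedelta

DEFAULT_COOLDOWN = timedelta(seconds=300)


def in_cooldown(last_activity_end, now, cooldown=DEFAULT_COOLDOWN):
    """True while a new scaling activity should still wait."""
    return now - last_activity_end < cooldown


t0 = datetime(2024, 1, 1, 12, 0, 0)
still_waiting = in_cooldown(t0, t0 + timedelta(seconds=120))
may_scale_again = not in_cooldown(t0, t0 + timedelta(seconds=300))
```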

Databases

DynamoDB

  • Serverless
  • Fully managed, highly available NoSQL database with replication across multiple AZs
  • Millions of requests per second, trillions of rows, 100s of TB of storage

RDS

  • RDS storage scales automatically within set maximum storage threshold
  • Automatically scales the storage if:
    • free storage is less than 10% of allocated storage
    • low storage lasts at least 5 mins
    • 6 hrs have passed since last modification
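The three conditions plus the maximum storage threshold can be combined into one check. This is a sketch of the rule as stated in these notes, with hypothetical parameter names; it is not how RDS implements it internally.

```python
def should_autoscale_storage(free_gib, allocated_gib, low_storage_minutes,
                             hours_since_last_modification, max_threshold_gib):
    """Return True when all storage autoscaling conditions hold."""
    low_on_space = free_gib < 0.10 * allocated_gib    # under 10% free
    sustained = low_storage_minutes >= 5               # low for at least 5 minutes
    rested = hours_since_last_modification >= 6        # 6 hours since last change
    below_cap = allocated_gib < max_threshold_gib      # room left under the maximum
    return low_on_space and sustained and rested and below_cap


grow = should_autoscale_storage(8, 100, 6, 7, 500)
too_soon = should_autoscale_storage(8, 100, 6, 2, 500)
```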

Read Replicas

  • up to 15 replicas
  • support within AZ, cross AZ or cross region
  • replication is ASYNC and can have some replication delay
  • each replica can be promoted to its own standalone DB
  • each replica has a different endpoint, so the application has to manage which endpoint it calls
  • for RDS, read replicas dont charge data transfer fees if within same region
  • Read replicas can also be used as disaster recovery although replication is ASYNC

Multi-AZ

  • RDS db can be replicated multi AZ for disaster recovery
  • same DNS endpoint for all multi-AZ replicas
  • automatic failover standby
  • can’t be used for read scaling because the multi-AZ replica is only a standby
  • replication is SYNC

RDS Custom

  • Managed Oracle and Microsoft SQL Server Database with OS and database customization
  • RDS: entire database and the OS to be managed by AWS
  • RDS Custom: full admin access to the underlying OS and the database
  • Can SSH into underlying EC2 instance

Backup

  • Auto backup
    • daily full backup
    • transaction logs are backed up every 5 mins
    • can restore to any point in time, from the oldest backup up to 5 mins ago
    • can set 1 to 35 days of retention, 0 to disable backup
  • Manual backup
    • take db snapshot
    • retention as long as user want
  • Can create backup and snapshots in multi-AZ
  • A stopped RDS db still incurs storage costs

Encrypting un-encrypted RDS database

  • Take a snapshot of the database
  • Copy it as an encrypted snapshot
  • Restore a database from the encrypted snapshot
  • Terminate the previous database

Enhanced Monitoring

  • Monitor the operating system of your DB instance in real time
  • When you want to see how different processes or threads use the CPU, Enhanced Monitoring metrics are useful

IAM DB Authentication

  • works with MySQL and PostgreSQL
  • An authentication token is a string of characters that you use instead of a password
  • it’s valid for 15 minutes before it expires

Ways to use SSL encryption

  • Force SSL
  • Encrypt from client side

Force SSL

  • Set the rds.force_ssl parameter to true to force connections to use SSL
  • The rds.force_ssl parameter is static, so after you change the value, you must reboot your DB instance for the change to take effect

Encrypt from client side

  • This sets up an SSL connection from a specific client computer, and you must do work on the client to encrypt connections
  • Must obtain certificates for the client computer, import certificates on the client computer, and then encrypt the connections from the client computer

RDS Proxy for RDS and Aurora

  • Serverless, autoscaling, highly available (multi-AZ)
  • RDS Proxy is never publicly accessible (must be accessed from VPC)

Aurora

  • proprietary AWS technology (not open source), compatible with MySQL and PostgreSQL
  • Aurora storage automatically grows in increments of 10GB, up to 128 TB
  • up to 15 replicas
  • sub 10ms replica lag
  • Aurora costs around 20% more than RDS
  • shared storage volume with up to 6 copies of the data across 3 AZs
  • self-healing with peer-to-peer replication
  • Master(read-write) + up to 15 read-only replicas
  • 1 write endpoint + 1 load balanced reader endpoint
  • support cross region replication
  • support read replica auto scaling

Custom Endpoint

  • can create custom endpoint from subset of read replicas
  • good for analytics or dev testing env

Aurora Serverless

  • Automated database instantiation and auto-scaling based on actual usage
  • pay per second
  • Cannot change from provisioned to serverless

Global Aurora

  • 1 primary read-write region
  • up to 5 secondary read-only regions
  • less than 1 second replication lag
  • up to 16 read replicas per each secondary region
  • Promoting another region (for disaster recovery) has an RTO of < 1 minute

DB Cloning

  • faster than snapshot-and-restore
  • initially, the cloned DB reads data from the same storage volume as the original DB
  • when data is added or updated, new storage is allocated and the changed data is copied to it
  • useful for staging db creation from the original prod db

Backup (Aurora)

  • Auto backup
    • 1 to 35 days (can’t be disabled)
  • Manual backup
    • take db snapshot
    • retention as long as user want

Read replicas failover priority

  • Watch the tier (smaller number, higher priority)
  • Watch the size (larger, the higher priority)
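The two rules can be sketched as a sort key. The replica records and the `vcpus` field (used here as a stand-in for instance size) are hypothetical.

```python
def pick_failover_target(replicas):
    """Pick the replica to promote: lowest tier first, then largest size."""
    best = min(replicas, key=lambda r: (r["tier"], -r["vcpus"]))
    return best["id"]


replicas = [
    {"id": "replica-1", "tier": 1, "vcpus": 8},
    {"id": "replica-2", "tier": 0, "vcpus": 2},
    {"id": "replica-3", "tier": 0, "vcpus": 4},
]
promoted = pick_failover_target(replicas)
```

Here `replica-3` wins: it shares the lowest tier with `replica-2` but is larger, while `replica-1` loses on tier despite being the largest.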

Aurora MySQL Native Function

  • Can create a native function or a stored procedure that invokes a Lambda function whenever a row in a table is modified in the database

Failover Scenarios

Single Instance

  • Aurora will attempt to create a new DB Instance in the same Availability Zone as the original instance
  • This replacement of the original instance is done on a best-effort basis and may not succeed, for example, if there is an issue that is broadly affecting the Availability Zone

Read Replica

  • Amazon Aurora flips the canonical name record (CNAME) for your DB Instance to point at the healthy replica, which in turn is promoted to become the new primary
  • Start-to-finish failover typically completes within 30 seconds

Aurora Serverless

  • Aurora will automatically recreate the DB instance in a different AZ

IAM DB Authentication

  • works with MySQL and PostgreSQL
  • An authentication token is a string of characters that you use instead of a password
  • it’s valid for 15 minutes before it expires

ElastiCache

  • to get managed Redis or Memcached
  • Redis: used for gaming leaderboards, application cache, geospatial data
  • Memcached: used for use cases like DB cache or user session store
  • Redis’s sorted set can be used for leaderboard ranking use cases
  • HIPAA-compatible
  • Have multi-AZ configuration
  • Can have up to 5 read replicas across multiple AZs

Neptune

  • Graph DB

DocumentDB

  • AWS service for MongoDB

KeySpaces

  • AWS service for Apache Cassandra

DNS

Route53

  • A highly available, scalable, fully managed and Authoritative DNS
  • The only AWS service which provides 100% availability SLA

Record Types

  • A - map to ipv4
  • AAAA - map to ipv6
  • CNAME - map to another domain name (can’t be root or top node namespace or zone apex)
  • Alias - can map root or top nodes to AWS resources (eg; alb endpoints) (extension of A or AAAA type)
  • NS - name servers for the hosted zones (for dns traffic routing)

Name Servers

  • Physical servers that resolve the DNS requests by looking at the records stored in hosted zones
  • NS record in a hosted zone route the DNS request traffic to name servers

Cost

  • $0.50 per month per hosted zone

Hosted Zones

  • Public
  • Private (within VPC)

Routing Policies

  • Simple
  • Weighted
  • Latency-based
  • Failover
  • Geolocation
  • Geoproximity
  • IP-based routing
  • Multi-value

Failover

  • active-active
  • active-passive

active-active

  • Both systems are running and serving traffic, and each can act as failover for the other

active-passive

  • Only one system is serving traffic while the other stays on standby until a failover occurs

s3 static website routing

  • To route s3 static website using Route53, name of the s3 bucket must be the same as domain name

Containerization

ECS

Launch Types

  • EC2
  • Fargate

EC2 Launch Type

  • Must provision & maintain the infrastructure (the EC2 instances)
  • Each EC2 Instance must run the ECS Agent to register in the ECS Cluster

Fargate Launch Type

  • No need to provision the infrastructure (no EC2 instances to manage)

IAM Roles

  • EC2 Instance Profile
  • ECS Task Role

Data Volumes

  • EBS volumes of each EC2 instance
  • Can use EFS
  • Fargate+EFS = Serverless

AWS Application Auto Scaling

  • Automatically increase/decrease the desired number of ECS tasks

Scaling Methods

  • Target Tracking
  • Step Scaling
  • Scheduled Scaling

Cluster Capacity Auto Scaling

  • Use ECS Cluster Capacity Provider to automatically provision and scale the infrastructure for your ECS tasks
  • Capacity Provider paired with an Auto Scaling Group

ECR

  • Store and manage Docker images on AWS
  • Fully integrated with ECS, backed by Amazon S3

EKS

  • EKS supports EC2 if you want to deploy worker nodes or Fargate to deploy serverless containers

ECS Anywhere and EKS Anywhere

  • Extends AWS ECS and EKS functionality to run containers on any infrastructure, including on-premises servers, edge devices, or virtual machines outside AWS
  • Allows organizations to use ECS and EKS as the orchestration layer for hybrid or multi-cloud deployments

AWS App Runner

  • Fully managed service designed to automatically deploy and scale web applications and APIs from source code or a container image, with minimal configuration
  • No infrastructure experience required, just need source code or container image
  • Automatic code building, deploying, scaling, highly available, load balancer, encryption

AWS ElasticBeanStalk

  • Platform-as-a-Service (PaaS) that makes it easy to deploy, manage, and scale web applications and services
  • Manages the infrastructure (compute, storage, networking) but still allows customization if needed
  • Provides real-time monitoring of application health, resource usage, and logs

Serverless

Services

  • Lambda
  • Dynamodb
  • Cognito
  • API Gateway
  • S3
  • SNS and SQS
  • Kinesis
  • Aurora Serverless
  • Step Functions
  • Fargate

Lambda

  • Pay per request and compute time
  • Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time per month
  • Outside of a VPC by default
  • If assigned a VPC and subnet, lambda will create ENI in the subnet/VPC
  • Can be invoked by using lambda function URL

Pricing

  • Pay per call
    • First 1,000,000 requests are free
    • $0.20 per 1 million requests thereafter ($0.0000002 per request)
  • Pay per duration
    • 400,000 GB-seconds of compute time per month for FREE
    • 400,000 seconds if function is 1GB RAM
    • 3,200,000 seconds if function is 128 MB RAM
    • After that, about $1.00 per 60,000 GB-seconds ($0.0000166667 per GB-second on x86)
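A rough sketch of the pricing arithmetic above. The free-tier figures and request price come from these notes. The per-GB-second rate is left as a parameter because it varies by architecture and region, and the value used in the example is an assumption.

```python
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000
PRICE_PER_MILLION_REQUESTS = 0.20


def free_tier_seconds(memory_mb):
    """Seconds of free execution per month at a given memory size."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


def monthly_cost(requests, avg_duration_s, memory_mb, price_per_gb_second):
    """Estimated monthly bill in USD after subtracting the free tier."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    request_cost = max(0, requests - FREE_REQUESTS) / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    duration_cost = max(0, gb_seconds - FREE_GB_SECONDS) * price_per_gb_second
    return request_cost + duration_cost


ASSUMED_RATE = 0.0000166667  # assumption: USD per GB-second
cost = monthly_cost(3_000_000, 1.0, 1024, ASSUMED_RATE)
```

`free_tier_seconds(1024)` gives 400,000 and `free_tier_seconds(128)` gives 3,200,000, matching the figures in the list above.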

Execution

  • Memory allocation: 128 MB – 10GB (1 MB increments)
  • Maximum execution time: 900 seconds (15 minutes)
  • Environment variables (4 KB)
  • Disk capacity in the “function container” (in /tmp): 512 MB to 10GB
  • Concurrent executions: 1000 per region (can be increased)

Deployment

  • Lambda function deployment size (compressed .zip): 50 MB
  • Size of uncompressed deployment (code + dependencies): 250 MB
  • Can use the /tmp directory to load other files at startup
  • Size of environment variables: 4 KB

Lambda SnapStart for Java

  • Lambda initializes the function at publish time
  • Takes a snapshot of memory and disk state of the initialized function
  • Snapshot is cached for low-latency access

Running Container Images

  • Container image must implement the Lambda Runtime API (AWS provides base images tailored for Lambda that already do)

API Gateway

Endpoint Types

  • Edge-optimized
  • Regional
  • Private

Edge-optimized

  • Requests are routed through the CloudFront Edge locations (improves latency)
  • The API Gateway still lives in only one region

Regional

  • For clients within the same region
  • Could manually combine with CloudFront (more control over the caching strategies and the distribution)

Private

  • Can only be accessed from own VPC using an interface VPC endpoint (ENI)
  • Have to use a resource policy to define access

User Authentication

  • IAM Roles (useful for internal applications)
  • Cognito (identity for external users – example mobile users)
  • Custom Authorizer (your own logic)
  • Custom Domain Name HTTPS security through integration with AWS Certificate Manager (ACM)

Supports API Caching and Request Throttling too

Step Functions

  • Build serverless visual workflow to orchestrate your Lambda functions

AWS Cognito

  • Give users an identity to interact with the web or mobile application on AWS

Cognito User Pool

  • Sign in functionality for app users
  • Create a serverless database of users for the web & mobile apps
  • Integrate with API Gateway & Application Load Balancer

Cognito Identity Pool (Federated Identity)

  • Provide AWS credentials to users so they can access AWS resources directly
  • Integrate with Cognito User Pools as an identity provider
  • Get identities for “users” so they obtain temporary AWS credentials

Data Analytics

Amazon Athena

  • Serverless query service to analyze data stored in Amazon S3
  • Supports CSV, JSON, ORC, Avro, and Parquet
  • $5.00 per TB of data scanned
  • Commonly used with Amazon Quicksight for reporting/dashboards

Federated Query

  • To run SQL queries across data stored in relational, non-relational, object, and custom data sources (AWS or on-premises)
  • Uses Data Source Connectors that run on AWS Lambda to run Federated Queries
  • Store the results back in Amazon S3

Performance Improvement

  • Use columnar data (Apache Parquet or ORC) for cost-savings
  • Compress data for smaller retrievals
  • Partition datasets in S3 for easy querying on virtual columns
  • Use larger files (> 128 MB) to minimize overhead
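A quick estimate of why scanning less data saves money, using the $5.00 per TB price above. It assumes 1 TB = 1024^4 bytes and, for the columnar example, equally sized columns with no compression. Both are simplifications.

```python
PRICE_PER_TB = 5.00
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabyte


def query_cost(bytes_scanned):
    """Estimated cost in USD of one query, based on data scanned."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB


full_scan = query_cost(BYTES_PER_TB)                 # row format: read everything
columnar_scan = query_cost(BYTES_PER_TB * 3 / 30)    # columnar: read 3 of 30 columns
```

Under these assumptions the full scan costs about $5.00 and the columnar scan about $0.50. Compression and partitioning shrink the scanned bytes further.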

RedShift

  • based on PostgreSQL but used for OLAP: online analytical processing (analytics and data warehousing)
  • 10x better performance than other data warehouses, scale to PBs of data
  • Columnar storage of data (instead of row based) & parallel query engine

Modes

  • Provisioned Cluster
  • Serverless Cluster

Provisioned Cluster

  • Choose instance types in advance
  • Can reserve instances for cost savings

Redshift Clusters

  • Leader Node
  • Compute Node

Leader Node

  • for query planning, results aggregation

Compute Node

  • for performing the queries, send results to leader

Snapshots and DR

  • Snapshots are point-in-time backups of a cluster, stored internally in S3
  • can restore a snapshot into a new cluster
  • Automated snapshots are taken every 8 hours, every 5 GB of data changes, or on a schedule
  • Set retention between 1 to 35 days
  • Can manually take snapshots too
  • Can enable cross-region snapshots

Data Loading into RedShift

  • with Kinesis Data Firehose
  • from S3 using the COPY command
    • without enhanced VPC routing (traffic goes over the internet)
    • with enhanced VPC routing (traffic stays within the VPC)
  • from an EC2 instance using the JDBC driver

RedShift Spectrum

  • to run query on data stored in s3 without loading the data

Amazon OpenSearch

  • Successor to ElasticSearch
  • common to use OpenSearch as a complement to another database as a database search API
  • Ingestion from Kinesis Data Firehose, AWS IoT, and CloudWatch Logs
  • Comes with OpenSearch Dashboards for visualization

Modes

  • Managed Cluster
  • Serverless Cluster

Amazon EMR

  • Amazon Elastic MapReduce
  • The clusters can be made of hundreds of EC2 instances with autoscaling and can be integrated with spot instances
  • EMR comes bundled with Apache Spark, HBase, Presto, Flink
  • EMR takes care of all the provisioning and configuration

Node Types

  • Master Node
  • Core Node
  • Task Node

Master Node

  • Manage the cluster, coordinate, manage health – long running

Core Node

  • Run tasks and store data – long running

Task Node

  • Just to run tasks – usually Spot

Purchasing Options

  • On demand
  • Reserved (min 1 yr)
  • Spot Instances

Modes

  • Long running cluster
  • Transient cluster

Amazon QuickSight

  • Serverless machine learning-powered BI service to create interactive dashboards
  • In-memory computation using SPICE engine if data is imported into QuickSight
  • Define Users and Groups (separate from IAM)

AWS Glue

  • managed ETL service

Glue Job Bookmarks

  • prevent re-processing old data

Glue Elastic Views

  • Combine and replicate data across multiple data stores using SQL
  • No custom code, Glue monitors for changes in the source data, serverless
  • Leverages a “virtual table” (materialized view)

Glue DataBrew

  • Prebuilt transformations

Glue Studio

  • GUI for ETL jobs

Glue Streaming ETL

  • for streaming data
  • built on Apache Spark Structured Streaming
  • compatible with Kinesis Data Streams, Kafka, MSK

AWS LakeFormation

  • To build data lake
  • Created data lakes are stored in s3
  • Built on top of AWS Glue
  • Can be used to consolidate data from multiple accounts into a single account as a central datalake

MSK (Amazon Managed Streaming for Kafka)

  • Alternative to Amazon Kinesis

MSK Serverless

  • Run Apache Kafka on MSK without managing the capacity
  • MSK automatically provisions resources and scales compute & storage

AWS Data Exchange

  • service that makes it easy to find, subscribe to, and use third-party data in the AWS cloud

AWS Data Pipeline

  • enables you to automate the movement, transformation, and processing of data across different AWS services and on-premises data sources
  • useful for creating complex data workflows that involve scheduling, dependency management, and data transformations

Monitoring

CloudWatch

CloudWatch Metrics

  • CloudWatch provides metrics for every service in AWS
  • Metrics belong to namespaces (eg: S3, ECS, EC2,…)
  • Dimension is an attribute of a metric (eg: instance id, environment, etc…)
  • Up to 30 dimensions per metric
  • Can create CloudWatch Custom Metrics

Metric Streams

  • Continually stream CloudWatch metrics to a destination of your choice, with near-real-time delivery and low latency (to Kinesis Data Firehose, 3rd party service providers)
  • Option to filter metrics to only stream a subset of them

Cloudwatch Logs

  • organized into log groups and log streams
  • Can define log expiration policies (never expire, 1 day to 10 years…)
  • Logs are encrypted by default
  • Can setup KMS-based encryption with your own keys

Can send logs to

  • Amazon S3 (exports)
  • Kinesis Data Streams
  • Kinesis Data Firehose
  • AWS Lambda
  • OpenSearch

Log sources

  • SDK, CloudWatch Logs Agent, CloudWatch Unified Agent
  • Elastic Beanstalk: collection of logs from application
  • ECS: collection from containers
  • AWS Lambda: collection from function logs
  • VPC Flow Logs: VPC specific logs
  • API Gateway
  • CloudTrail: based on filter
  • Route53: Log DNS queries

Log Insights

  • Search and analyze log data stored in CloudWatch Logs

S3 Export

  • Log data can take up to 12 hours to become available for export
  • The API call is CreateExportTask
  • Export is not real-time; use Log Subscriptions instead for real-time delivery

Log Subscriptions

  • Get real-time log events from CloudWatch Logs for processing and analysis
  • Send to Kinesis Data Streams, Kinesis Data Firehose, or Lambda
  • Subscription Filter: filter which log events are delivered to the destination
  • Can do cross-account subscription

CloudWatch Agents

  • To collect logs from EC2 instances or on-premise servers

Log Agents

  • Older version
  • Can only collect logs

Unified Agents

  • Can collect logs and also the instance metrics (eg: CPU, RAM, Disk info, etc)

CloudWatch Alarms

  • Alarms are used to trigger notifications for any metric

Alarm States

  • OK
  • Insufficient Data
  • In Alarm

Alarm Target Actions

  • EC2 instances (stop, terminate, reboot, etc)
  • EC2 Auto Scaling
  • Amazon SNS

Composite Alarm

  • Monitors the states of multiple other alarms and combines them into one alarm
  • AND and OR conditions
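A toy model of how a composite alarm combines child alarm states. The alarm names are hypothetical and the function only mimics AND/OR logic; it is not the CloudWatch rule syntax.

```python
def composite_state(child_states, mode):
    """Combine child alarm states with AND or OR logic."""
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    in_alarm = [state == "ALARM" for state in child_states.values()]
    combine = all if mode == "AND" else any
    return "ALARM" if combine(in_alarm) else "OK"


children = {"high-cpu": "ALARM", "high-iops": "OK"}
and_result = composite_state(children, "AND")
or_result = composite_state(children, "OR")
```

Using AND is a common way to reduce alarm noise: the composite fires only when every underlying condition is breached at once.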

EC2 Recovery

  • CloudWatch alarm can trigger the recovery of the Amazon EC2 instance, in case the instance fails.
  • The instance, however, should only be configured with an Amazon EBS volume
  • Recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata

CloudWatch Insights

  • CloudWatch Container Insights
  • CloudWatch Lambda Insights
  • CloudWatch Contributor Insights
  • CloudWatch Application Insights

CloudWatch Container Insights

  • ECS, EKS, Kubernetes on EC2, Fargate, needs agent for Kubernetes

CloudWatch Lambda Insights

  • Detailed metrics to troubleshoot serverless applications

CloudWatch Contributor Insights

  • Find “Top-N” Contributors through CloudWatch Logs

CloudWatch Application Insights

  • Automatic dashboard to troubleshoot your application and related AWS services

CloudTrail

  • Provides governance, compliance and audit for your AWS Account
  • Can be integrated with EventBridge to trigger AWS services based on CloudTrail events
  • Cloudtrail log files are encrypted by default

CloudTrail Events

  • Management Events
  • Data Events
  • CloudTrail Insights Events

Management Events

  • Operations that are performed on resources in your AWS account
  • By default, trails are configured to log management events.

Data Events

  • Granular data object activities like Amazon S3 object-level activity, AWS Lambda function execution activity

CloudTrail Insights Events

  • Analyze anomalies in write events to detect unusual patterns

Events retention

  • Events are stored for 90 days in CloudTrail
  • To keep events beyond this period, log them to S3 and use Athena

AWS Config

  • Helps with auditing and recording compliance of your AWS resources
  • Helps record configurations and changes over time
  • AWS Config is a per-region service
  • Can be aggregated across regions and accounts

Config Rules

  • Can use AWS managed config rules
  • Can make custom config rules
  • no free tier, $0.001 per config rule evaluation per region

Config Resource

  • View compliance of a resource over time
  • View configuration of a resource over time
  • View CloudTrail API calls of a resource over time

Remediation

  • Automate remediation of non-compliant resources using SSM Automation Documents
  • Use AWS-Managed Automation Documents or create custom Automation Documents
  • Can set Remediation Retries if the resource is still non-compliant after auto-remediation

Notification

  • Use EventBridge to trigger notifications when AWS resources are non-compliant
  • Ability to send configuration changes and compliance state notifications to SNS (all events – use SNS Filtering or filter at client-side)

AWS Trusted Advisor

  • optimize costs, increase performance, improve security and resilience, and operate at scale in the cloud
  • recommends actions to remediate any deviations from best practices
  • can do service quota checks by writing an AWS Lambda function that refreshes the AWS Trusted Advisor Service Limits checks and set it to run every 24 hours

AWS X-ray

  • X-Ray collects data about the requests and responses, tracks latency, identifies performance bottlenecks, and detects errors, helping developers and operations teams understand how their applications behave in real-time

Service Map

  • X-Ray generates a service map that visualizes the relationships and interactions between the services in your application. This map highlights performance bottlenecks, latency issues, and error rates.

Disaster Recovery

RPO and RTO

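  • Recovery Point Objective (RPO): maximum acceptable data loss, measured as the time between the last backup point and the disaster
  • Recovery Time Objective (RTO): maximum acceptable downtime, measured as the time between the disaster and the system being recovered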
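A short worked example of both measurements for a single incident. The timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup, disaster, service_restored):
    """Return (data loss window, downtime) for one incident."""
    data_loss = disaster - last_backup        # compare against the RPO
    downtime = service_restored - disaster    # compare against the RTO
    return data_loss, downtime


data_loss, downtime = recovery_metrics(
    last_backup=datetime(2024, 1, 1, 2, 0),
    disaster=datetime(2024, 1, 1, 11, 30),
    service_restored=datetime(2024, 1, 1, 13, 0),
)
```

In this made-up incident 9.5 hours of data are lost and the system is down for 1.5 hours. A DR strategy meets its objectives only if both values stay within the agreed RPO and RTO.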

DR Strategies

  • Backup and Restore
  • Pilot Light
  • Warm Standby
  • Hot Site / Multi Site Approach

Backup and Restore

  • Cheapest
  • High RPO, High RTO

Pilot Light

  • A most-minimal version of the app is always running in the cloud

Warm Standby

  • A scaled-down version of the full system is always up and running

Hot Site/ Multi Site

  • Full Production Scale is running both on AWS and On Premise

AWS Database Migration Service (DMS)

  • Can migrate databases both heterogeneously and homogeneously from different sources to targets (eg: from on-premise Oracle to AWS Aurora)
  • Must create an EC2 instance to perform the replication tasks
  • If the source and target db uses different db engines (eg: Oracle and Postgresql), Schema Conversion Tool (SCT) must be used
  • AWS DMS supports multi-AZ deployment
  • In addition to databases, s3 and kinesis can also be the source or target
  • full load and change data capture (CDC) replication task can be used to migrate and also track the on-going data changes

RDS and Aurora DB Migration

  • MySQL
  • PostgreSQL

MySQL

  • RDS to Aurora:
    • DB Snapshots from RDS MySQL restored as MySQL Aurora DB
    • Create an Aurora Read Replica from your RDS MySQL, and when the replication lag is 0, promote it as its own DB cluster
  • External to Aurora:
    • Backup onto s3 and import from s3 to Aurora
    • Use mysqldump utility to directly migrate into Aurora
    • Can also use DMS

PostgreSQL

  • RDS to Aurora:
    • DB Snapshots from RDS PostgreSQL restored as PostgreSQL Aurora DB
    • Create an Aurora Read Replica from your RDS PostgreSQL, and when the replication lag is 0, promote it as its own DB cluster
  • External to Aurora:
    • Create a backup, put it in Amazon S3 and import it using the aws_s3 Aurora extension
    • Can also use DMS

AWS Backup

  • Centrally manage and automate backups across AWS services
  • Supports cross-region backups
  • Supports cross-account backups

Supported Services

  • Amazon EC2 / Amazon EBS
  • Amazon S3
  • Amazon RDS (all DBs engines) / Amazon Aurora / Amazon DynamoDB
  • Amazon DocumentDB / Amazon Neptune
  • Amazon EFS / Amazon FSx (Lustre & Windows File Server)
  • AWS Storage Gateway (Volume Gateway)

Features

  • PITR for supported services
  • On-demand and scheduled backups
  • Tag based backup policies
  • Backup Plans
  • Backup Vault Lock

Backup Plans

  • Can configure:
    • Backup frequency
    • Backup window
    • Transition to cold storage
    • Retention period

Backup Vault Lock

  • WORM (Write Once Read Many)
  • Even the root user cannot delete backups inside the locked Vault

AWS ADS and MGN

  • Application Discovery Service (ADS)
  • Application Migration Service (MGN)

ADS

  • Plan migration projects by gathering information about on-premises data centers like server utilization data and dependency mapping
  • Resulting data can be viewed within AWS Migration Hub

Agentless Discovery

  • Uses AWS Agentless Discovery Connector
  • Discover VM inventory, configuration, and performance history such as CPU, memory, and disk usage

Agent-based Discovery

  • Uses AWS Application Discovery Agent
  • System configuration, system performance, running processes, and details of the network connections between systems

MGN

  • The “AWS evolution” of CloudEndure Migration, replacing AWS Server Migration Service (SMS)
  • Lift-and-shift (rehost) solution
  • Converts physical, virtual, and cloud-based servers to run natively on AWS
  • Migrate data by installing AWS Replication Agent on source servers

Compute

EC2

Storage

  • EBS
  • EFS
  • EC2 Instance Store

EBS

  • bound to specific AZs
  • by default, root volume is set to delete on termination
  • Only gp2/gp3 and io1/io2 can be used as boot volumes
  • EBS volumes support live configuration changes while in production which means that you can modify the volume type, volume size, and IOPS capacity without service interruptions

EBS Volume Types

  • gp2 (SSD)
  • gp3 (SSD)
  • io1 (SSD)
  • io2 block express (SSD)
  • st1 (HDD)
  • sc1 (HDD)

gp2

  • 1 GiB - 16TiB
  • can burst IOPS to 3,000
  • Size of the volume and IOPS are linked
  • max IOPS is 16,000
  • at 3 IOPS per GiB, the 16,000 IOPS maximum is reached at 5,334 GiB
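
The link between gp2 size and IOPS can be checked with a few lines of arithmetic. This is a minimal sketch, not an AWS API: the function name is made up, and the 100 IOPS floor for very small volumes is not in the notes above (it is taken from the gp2 documentation).

```python
import math


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 IOPS per GiB, floor of 100, cap of 16,000."""
    return min(16_000, max(100, 3 * size_gib))


# Smallest volume size (in GiB) at which the 16,000 IOPS cap is reached.
size_at_cap = math.ceil(16_000 / 3)
```

For example, `gp2_baseline_iops(1000)` gives 3,000 IOPS, and any volume of `size_at_cap` GiB or more sits at the 16,000 IOPS cap.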

gp3

  • 1 GiB - 16TiB
  • Baseline of 3,000 IOPS and throughput of 125 MiB/s
  • Can increase IOPS up to 16,000 and throughput up to 1000 MiB/s independently

io1

  • 4 GiB - 16TiB
  • Max IOPS: 64,000 for Nitro EC2 instances & 32,000 for other
  • Can increase IOPS independently from storage size

io2 Block Express

  • 4 GiB - 64 TiB
  • Sub-millisecond latency
  • Max IOPS: 256,000 with an IOPS:GiB ratio of 1,000:1
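
The 1,000:1 ratio means small volumes cannot be provisioned at the full 256,000 IOPS. A minimal sketch of that relationship (the function name is hypothetical, not part of any AWS SDK):

```python
def io2_max_provisioned_iops(size_gib: int) -> int:
    """Upper bound on provisioned IOPS for an io2 Block Express volume.

    Limited by the 1,000 IOPS per GiB ratio and by the 256,000 IOPS ceiling.
    """
    return min(256_000, 1_000 * size_gib)
```

Under these limits a 4 GiB volume tops out at 4,000 IOPS, and the full 256,000 IOPS becomes available from 256 GiB upward.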

Snapshots

  • snapshots can be copied across Regions and used to recreate the volume in a different AZ
  • snapshots can be moved to the snapshot archive tier, which is 75% cheaper but takes 24 to 72 hrs to restore
  • deleted snapshots can be kept in the Recycle Bin, with a retention period from 1 day to 1 year
  • fast snapshot restore: forces full initialization of the snapshot so there is no latency on first use
  • snapshots can be created automatically using Amazon Data Lifecycle Manager (DLM)
  • The EBS volume can be used while the snapshot is in progress

EBS Encryption

  • Copying an unencrypted snapshot allows encryption
  • Snapshots of encrypted volumes are encrypted

Encrypt an Unencrypted EBS Volume

  • Create an EBS snapshot of the volume
  • Encrypt the EBS snapshot ( using copy )
  • Create new EBS volume from the snapshot ( the volume will also be encrypted )
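
The three steps above can be sketched with boto3-style calls. This is an illustration under assumptions, not a hardened script: `encrypt_volume` is a hypothetical helper, `ec2` is expected to behave like `boto3.client("ec2", region_name=region)`, and error handling, tagging, and cleanup of the intermediate snapshots are left out.

```python
def encrypt_volume(ec2, volume_id: str, region: str, az: str, kms_key_id: str) -> str:
    """Return the ID of a new encrypted volume built from an unencrypted one."""
    waiter = ec2.get_waiter("snapshot_completed")

    # 1. Snapshot the unencrypted volume.
    snap = ec2.create_snapshot(VolumeId=volume_id, Description="pre-encryption snapshot")
    waiter.wait(SnapshotIds=[snap["SnapshotId"]])

    # 2. Copy the snapshot with encryption turned on.
    copy = ec2.copy_snapshot(
        SourceSnapshotId=snap["SnapshotId"],
        SourceRegion=region,
        Encrypted=True,
        KmsKeyId=kms_key_id,
    )
    waiter.wait(SnapshotIds=[copy["SnapshotId"]])

    # 3. Create a volume from the encrypted copy; the volume is encrypted too.
    vol = ec2.create_volume(SnapshotId=copy["SnapshotId"], AvailabilityZone=az)
    return vol["VolumeId"]
```

The new volume then has to be swapped in for the old one (detach, attach), which this sketch does not cover.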

Copying encrypted snapshots across regions

  • Take snapshot of the encrypted volume
  • Copy the snapshot and encrypt using key B in region B
  • Restore the volume

Copying encrypted snapshots cross accounts

  • Create snapshot encrypted with own KMS key
  • Attach KMS key policy to authorize cross account decrypt access
  • Share encrypted snapshot
  • Encrypt the snapshot using KMS key B in account B
  • Restore the volume

EBS Multi Attach

  • only io1/io2 volume types can support multi attach
  • one volume can be attached to multiple instances within same AZ
  • up to 16 instances at the same time

EFS

  • network file system (NFS) that can be mounted on many EC2 instances
  • EFS can be attached to EC2 instances in multiple AZs
  • have to use security group to control access to EFS
  • can only be used with linux based AMIs
  • pay per use, no capacity planning

Performance Modes

  • General purpose
  • Max I/O

Throughput Modes

  • Bursting
  • Provisioned
  • Elastic

Bursting

  • scales with storage
  • burst up to 100MiB/s

Provisioned

  • set the throughput regardless of storage size

Elastic

  • automatically scales throughput up or down based on the workloads
  • Up to 3GiB/s for reads and 1GiB/s for writes

Storage Tiers

  • Standard
  • IA
  • Archive

Storage Life Cycle

  • The longest transition period that can be configured in a storage lifecycle policy is 365 days

Availability Modes

  • standard (Multi-AZ)
  • one zone (Single-AZ)

EFS One Zone IA

  • IA storage tier with one zone availability mode

Instance Store

  • closely attached to EC2 instance
  • better I/O than EBS
  • data is lost when the instance is stopped or terminated (ephemeral storage)

RAID 0 vs RAID 1

  • Both EBS and Instance Store volumes can be combined in RAID 0 or RAID 1 configurations

RAID 0

  • Data are spread across multiple EBS or Instance store volumes and all volumes act as single storage
  • Increased throughput

RAID 1

  • Data are duplicated in all the EBS and Instance store volumes
  • For data redundancy

Instance Types

  • General Purpose (M, T)
  • Compute optimized (C)
  • Memory optimized (R)
  • Accelerated (G, P)
  • Storage optimized (I)

Compute Optimized (C)

  • Batch processing
  • HPC
  • Media transcoding
  • Scientific modeling
  • Dedicated gaming servers

Memory Optimized (R)

  • High performance databases
  • Cache stores
  • In-memory business intelligence (BI)
  • In memory big data processing

Storage Optimized (I)

  • High performance OLTP
  • For high sequential I/O

Tenancy

  • default
  • dedicated
  • host

default

  • shared tenancy

dedicated

  • dedicated tenancy (eg: dedicated instances)

host

  • dedicated host

Security Group

  • Controls inbound and outbound traffic of the instance
  • VPC bound
  • Can attach to multiple instances
  • Only contains ‘Allow’ rules
  • Can reference by IP or by other SGs
  • Inbound traffic is blocked by default
  • Outbound traffic is allowed by default

Purchasing Options

  • On-demand Instances
  • Reserved Instances
  • Savings Plans
  • Spot Instances
  • Dedicated Hosts
  • Dedicated Instances
  • Capacity Reservation

On-demand Instances

  • Billed per second, with a minimum of 60 seconds
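
A rough illustration of per-second billing with a one-minute minimum. The function name and the hourly rate are made up for the example, and it assumes an operating system that is billed per second; real rates depend on instance type, Region, and OS.

```python
def on_demand_cost(runtime_seconds: int, hourly_rate_usd: float) -> float:
    """Estimate the charge for one run, billed per second with a 60-second minimum."""
    billed_seconds = max(60, runtime_seconds)
    return billed_seconds * hourly_rate_usd / 3600


# A 10-second run is billed as 60 seconds; a 90-second run as 90 seconds.
short_run = on_demand_cost(10, hourly_rate_usd=0.36)
longer_run = on_demand_cost(90, hourly_rate_usd=0.36)
```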

Reserved Instances

  • Reserved for 1 or 3 years
  • Payments: upfront, no upfront, partial upfront
  • Convertible reserved instance: can change instance attributes

Savings Plans

  • Commit to a certain amount of usage ($/hr)
  • Commitment term of 1 or 3 years
  • EC2 Instance Savings Plans are locked to an instance family and Region; Compute Savings Plans are not
  • Usage beyond the commitment is charged at the On-Demand price

Spot Instances

  • Can get up to 90% discount
  • The instance can be reclaimed when the current Spot price rises above the maximum price you are willing to pay
  • have 2 mins grace period at termination time
  • Cancelling a spot request does not terminate the instances
  • First cancel the request and then terminate the instances
  • Spot fleets: spot instances + optional on-demand instances
  • Spot fleet allocation strategies:
    • lowestPrice
    • diversified
    • capacityOptimized
    • priceCapacityOptimized

Dedicated Host

  • most expensive option
  • book entire server
  • visibility into the physical server (sockets and cores), useful for bring-your-own-license (BYOL) software
  • can do instance placement
  • options:
    • on demand
    • reserved

Dedicated Instances

  • own hardware within account
  • cannot do instance placement

Capacity Reservation

  • You pay for the reserved capacity whether or not you run instances in it
  • Capacity Reservations enable you to reserve compute capacity for your EC2 instances in a specific AZ for any duration (can also be in hourly duration)

Elastic IP

  • Can attach to one instance at a time
  • Can only have 5 Elastic IPs per Region per account (can ask AWS to increase)

Placement Groups

  • Cluster
  • Spread
  • Partition

Cluster

  • Cluster instances into a low latency group within a single AZ
  • It is recommended that you launch the number of instances that you need in the placement group in a single launch request
  • use the same instance type for all instances in the placement group
  • If you try to add more instances to the placement group later, or if you try to launch more than one instance type in the placement group, you increase your chances of getting an insufficient capacity error
  • Need to re-launch the cluster when insufficient capacity error occurs

Spread

  • Spreads instances across distinct underlying hardware, optionally across AZs
  • Only 7 instances per group per AZ

Partition

  • Many instances can share a partition (a rack of hardware) and partitions are distributed across AZs
  • Only 7 partitions per AZ

Elastic Network Interface (ENI)

  • One instance can have multiple ENIs attached with one primary private IPv4 and many secondary private IPv4s
  • ENIs are bound to specific AZs
  • Public IPv4 is assigned to an ENI according to ip assign rule of the subnet that the ENI belongs to
  • One elastic IP address per one private IP

EC2 Instance Stages

  • Stop
  • Terminate
  • Hibernate

Stop

  • Data on EBS volumes (root and non-root) is preserved
  • All data on the attached instance-store devices will be lost
  • Underlying host can be changed when restarted
  • Elastic IP and ENIs are still attached

Terminate

  • If the EBS volume is set to be destroyed, all the data are lost

Hibernate

  • Data and states on RAM are saved on EBS and restart from the saved state
  • Instance ram size must be less than 150GB
  • Root volume must be EBS and encrypted
  • An instance cannot be hibernated for more than 60 days
  • It is not possible to enable or disable hibernation for an instance after it has been launched; Have to configure at launch time

AMI

  • AMIs can be accessed using:
    • AWS public AMIs
    • Custom made AMIs
    • AMIs found/sold on AWS marketplace
  • AMIs can be used to copy instances across AZs, Regions and Accounts
  • AMI includes one or more snapshots, so if AMI is copied, snapshots are copied along with it
  • Copying an AMI backed by an encrypted snapshot cannot result in an unencrypted target snapshot

EC2 Enhanced Networking

  • Elastic Network Adapter (ENA)
  • Elastic Fabric Adapter (EFA)

ENA

  • up to 100 Gbps
  • can support windows instances

EFA

  • Improved ENA for HPC
  • only works for Linux

Automation and Orchestration

  • AWS Batch
  • AWS ParallelCluster

AWS Batch

  • Managed service that helps you efficiently run batch processing jobs at scale
  • AWS Batch handles the provisioning, scaling, and management of compute resources required for batch jobs

AWS ParallelCluster

  • Open-source cluster management tool provided by AWS that simplifies the deployment, configuration, and management of high-performance computing (HPC) clusters on the AWS Cloud

There is vCPU-based On-Demand Instance limit per region

EC2 Billing

  • Pending: will not be billed
  • Running: will be billed
  • Stopping: will not be billed
  • Terminated: will not be billed
  • Stopping (to hibernate): will be billed
  • Terminated (reserved instance): will be billed

AWS Outposts

  • Fully managed service that extends AWS infrastructure, services, APIs, and tools to your on-premises data center or edge location
  • Brings AWS infrastructure (hardware and software) to your physical data center or on-premises environment
  • Supports core AWS services like Amazon EC2, ECS/EKS, RDS, S3, and EBS locally

AWS Wavelength

  • Brings AWS compute and storage services to the edge of telecommunications (telco) 5G networks, enabling developers to build applications that require ultra-low latency for end users and devices
  • AWS Wavelength extends AWS infrastructure into Wavelength Zones, which are zones within telco provider data centers connected to 5G networks
  • Applications deployed in these zones process data close to users, reducing the latency introduced by routing to traditional AWS regions

Access Control

IAM

  • IAM users can be grouped into IAM groups
  • Permission policies can be assigned to IAM groups (or)
  • Can be assigned directly to users by means of an inline policy
  • Least privilege permission
  • One user can belong to multiple different groups, thus can have multiple permission policies
  • Groups can only contain users (cannot contain other groups)
  • Admin can set password policy for IAM users
  • AWS cloudshell is not available in every region
  • AWS services can do actions on behalf of user by being assigned IAM roles which include one or more IAM policies
  • Access is allowed only if explicit “Allow” permission is defined
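
The "explicit Allow" rule is one part of how IAM evaluates a request: an explicit Deny always wins, otherwise an Allow is needed, and everything else is implicitly denied. The sketch below is a deliberately simplified model of that order. It is not the real IAM engine: it ignores conditions, principals, SCPs, and permission boundaries, treats action names as case-sensitive, and the statement layout (lists for `Action` and `Resource`) is an assumption made for the example.

```python
from fnmatch import fnmatchcase


def is_allowed(statements: list, action: str, resource: str) -> bool:
    """Simplified policy evaluation: explicit Deny > explicit Allow > implicit deny."""
    allowed = False  # implicit deny unless an Allow statement matches
    for stmt in statements:
        matches = any(fnmatchcase(action, p) for p in stmt["Action"]) and any(
            fnmatchcase(resource, p) for p in stmt["Resource"]
        )
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return False  # an explicit Deny overrides any Allow
        allowed = True
    return allowed


example_statements = [
    {"Effect": "Allow", "Action": ["s3:*"], "Resource": ["*"]},
    {
        "Effect": "Deny",
        "Action": ["s3:DeleteObject"],
        "Resource": ["arn:aws:s3:::prod-bucket/*"],
    },
]
```

With `example_statements`, reading an object in `prod-bucket` is allowed, deleting it is denied by the explicit Deny, and any EC2 action is denied implicitly because nothing allows it.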

MFA Options

  • Authenticator apps
  • Universal 2nd Factor (U2F)

MFA Options Security Key

  • Hardware key fob MFA device
  • Hardware key fob MFA device for AWS GovCloud

IAM security tools

  • IAM credential report: lists all IAM users and the status of their credentials (account level)
  • IAM Access Advisor: shows the service permissions granted to a user and when they were last used (user level)

AWS Organizations

  • Allows to manage multiple AWS accounts
  • The main account is the management account
  • Other accounts are member accounts
  • Member accounts can only be part of one organization

Organization Units (OUs)

  • Accounts in the organization are organized into OUs
  • OUs can be nested

Service Control Policy (SCP)

  • IAM policies applied to OU or Accounts to restrict Users and Roles
  • They do not apply to the management account (full admin power)
  • They do not affect the service-linked roles

Resource-based Policy vs IAM Roles

  • Some services provide resource-based policy but some only IAM role
  • Cross-account resource access can be done either by account A assuming role in account B or by defining resource-based policy for the resource in account B
  • Trust policy is also a type of resource-based policy

AWS Services with Resource-based Policy

  • Lambda
  • SNS
  • SQS
  • S3
  • API Gateway
  • KMS

AWS Services with IAM Roles

  • Kinesis streams
  • ECS tasks

IAM Permission Boundaries

  • Advanced feature to use a managed policy to set the maximum permissions an IAM entity can get
  • IAM Permission Boundaries are supported for users and roles only (not groups)

IAM Identity Center

  • One login (single sign-on) for all AWS accounts in AWS Organizations, business applications, and third-party applications (e.g., Salesforce, Office 365, etc.)
  • Users and groups in IAM Identity Center are assigned permission sets, which define what they can access in the member accounts (or OUs) of the organization
  • Can manage users and groups directly within AWS Identity Center or integrate with external identity providers like Microsoft Active Directory, Okta, or Azure AD

AWS Control Tower

  • Easy way to set up and govern a secure and compliant multi-account AWS environment based on best practices
  • AWS Control Tower uses AWS Organizations to create accounts

Preventive Guardrail

  • using SCPs (e.g., Restrict Regions across all your accounts)
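
As an illustration of a Region-restricting SCP, the policy document below denies requests outside an approved list of Regions by using the `aws:RequestedRegion` condition key. It is a sketch: the Region list and the set of exempted global services are assumptions chosen for the example, and a real policy would need a fuller exemption list.

```python
import json

ALLOWED_REGIONS = ["eu-west-1", "eu-central-1"]  # hypothetical choice

region_restriction_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideAllowedRegions",
            "Effect": "Deny",
            # Global services are exempted so that they keep working.
            "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:RequestedRegion": ALLOWED_REGIONS}
            },
        }
    ],
}

# JSON text that could be pasted into the policy editor.
scp_json = json.dumps(region_restriction_scp, indent=2)
```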

Detective Guardrail

  • using AWS Config (e.g., identify untagged resources)

AWS Resource Access Manager (RAM)

  • To easily and securely share your resources with other AWS accounts (inside or outside your organization)

AWS Directory Service (Active Directory, AD)

  • AWS Managed Microsoft AD
  • AD Connector
  • Simple AD

AWS Managed Microsoft AD

  • Create your own AD in AWS to manage users
  • Establish “trust” connections with your on-premises AD

AD Connector

  • Proxy for on-premise AD

Simple AD

  • AWS managed
  • Cannot be joined with on-prem ADs

AWS Federated Access

  • Federated Access in AWS refers to the ability to grant users from external identity providers (IdPs) access to AWS resources without having to create and manage AWS-specific IAM (Identity and Access Management) users for each individual

Types

  • Federation with IAM Identity Center
  • Federation with IAM
  • Federation with Amazon Cognito identity pools

Federation with IAM Identity Center

  • Users in IAM Identity Center are granted short-term credentials to your AWS resources
  • IAM Identity Center supports identity federation with SAML (Security Assertion Markup Language) 2.0 to provide federated single sign-on access for users who are authorized to use applications within the AWS access portal
  • Users can then single sign-on into services that support SAML, including the AWS Management Console and third-party applications, such as Microsoft 365, SAP Concur, and Salesforce

Federation with IAM Role

  • For single, standalone AWS account
  • User Logs In to IdP
  • IdP Sends Authentication Token to AWS
  • AWS Grants Temporary Credentials through STS
  • User Accesses AWS Services

CDN

CloudFront

  • CloudFront is a CDN service that caches content at edge locations (points of presence, 216 at the time of writing)
  • CloudFront origin can be:
    • S3
    • EC2
    • ALB
    • any HTTP endpoint
  • CloudFront can apply geo restriction to allow or block users from specific countries using an allowlist or a blocklist
  • For uploads smaller than 1 GB, CloudFront in front of S3 is usually preferred over S3 Transfer Acceleration
  • Can use field level encryption to protect sensitive data for specific content
  • Can route to multiple origins based on the content type
  • Can use an origin group with primary and secondary origins to configure for high-availability and failover
  • Can generate Signed URL and Signed cookies

Global Accelerator

  • 2 anycast IPs are created
  • anycast IPs send the traffic to the edge locations and edge locations send the traffic to the application endpoint
  • Uses internal AWS network
  • Can be used to distribute a portion of traffic to a particular deployment using endpoint weights
  • Good for gaming, IoT or voice over IP services

CloudFront vs Global Accelerator

  • CloudFront caches content at the edge location and serves it from there
  • Global Accelerator routes TCP or UDP traffic through the edge location to the application
  • Global Accelerator does not cache content the way CloudFront does
  • both have DDoS protection using AWS Shield

Storage

S3

  • max size of an object is 5TB
  • if an object is more than 5GB, have to use multi-part upload
  • blocking public access setting can be set at account level

Versioning

  • if versioning is enabled for a bucket, previous versions of the object are preserved when overwritten
  • if an object is deleted, it is not truly deleted but marked with the delete marker and then previous versions can be restored by deleting the delete marker
  • Once versioning is enabled for a bucket, it cannot be disabled, can only be suspended

Replication

  • replication is done by creating replication rule at the source s3 bucket
  • bucket versioning must be enabled on both the source and the destination bucket
  • only new objects are replicated
  • use S3 Batch Replication to replicate existing objects and objects that failed replication
  • buckets can be replicated across Regions (CRR) or within the same Region (SRR)

Storage Classes

  • standard
  • standard IA
    • good for once a month access
  • one-zone IA
    • good for once a month access
  • glacier instant retrieval
    • millisec retrieval
    • good for data accessed once a quarter
    • min storage duration of 90 days
  • glacier flexible retrieval
    • expedited (1-5 mins), standard (3-5 hrs), bulk (5-12 hrs)
    • min storage duration of 90 days
  • glacier deep archive
    • standard (12 hrs), bulk (48 hrs)
    • min storage duration of 180 days
  • intelligent tiering
    • frequent access
    • infrequent access: objects not accessed for 30 days
    • archive instant access: objects not accessed for 90 days
    • archive access (optional): configurable from 90 to 700+ days
    • deep archive access (optional): configurable from 180 to 700+ days

Provisioned Capacity (for Glacier Flexible Expedited Retrieval)

  • ensures that your retrieval capacity for expedited retrievals is available when you need it
  • unit of capacity provides that at least three expedited retrievals can be performed every five minutes and provides up to 150 MB/s of retrieval throughput

Lifecycle Rules / Lifecycle Policies

  • Transition rule: to move objects from one class to another
  • Expiration rule: to delete expired objects
  • Rules apply at the object level and can be scoped to a prefix or to object tags

Requester Pays

  • the requester of the object pays for the request and the data transfer costs
  • the requester must be authenticated with an AWS account (cannot be anonymous)
  • After a bucket is configured to be a Requester Pays bucket, requesters must include x-amz-request-payer in their API request header, for DELETE, GET, HEAD, POST, and PUT requests, or as a parameter in a REST request to show that they understand that they will be charged for the request and the data download

Event Notifications

  • send messages/events to SNS, SQS (only standard queue) or Lambda function when an object action is triggered (eg: ObjectCreated:Put, ObjectCreated:Post, …)
  • the receiving service needs a resource-based policy (SNS topic policy, SQS queue policy, or Lambda resource policy) that allows S3 to send it event notifications

Performance

  • each s3 prefix can achieve 3500 put/copy/post/delete requests/sec and 5500 get/head requests/sec
  • if objects are distributed across 4 prefixes, the bucket can sustain 22,000 get/head requests/sec and 14,000 put/copy/post/delete requests/sec
  • how to further optimize s3 performance:
    • multi-part upload
    • s3 transfer acceleration
    • s3 byte range fetches
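
The per-prefix limits scale linearly with the number of prefixes, which is where the 22,000 and 14,000 figures come from. A small sketch of that arithmetic (the function and constant names are made up for the example):

```python
WRITE_RPS_PER_PREFIX = 3_500  # put/copy/post/delete requests per second
READ_RPS_PER_PREFIX = 5_500   # get/head requests per second


def s3_request_ceiling(prefix_count: int) -> dict:
    """Aggregate request-rate ceiling when load is spread evenly over prefixes."""
    return {
        "write_rps": WRITE_RPS_PER_PREFIX * prefix_count,
        "read_rps": READ_RPS_PER_PREFIX * prefix_count,
    }


four_prefixes = s3_request_ceiling(4)
```

The ceiling is only reached if requests are actually distributed evenly across the prefixes.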

Batch Operations

  • to perform bulk operations on existing s3 objects with a single request
  • to get the list of objects:
    • use s3 inventory
    • filter using s3 select
    • and use s3 batch operation to do processings

Encryption

  • Server side encryption (SSE)
  • SSE-S3: encrypt with keys owned and managed by S3
  • SSE-KMS: encrypt with KMS key
  • SSE-C: encrypt with customer provided key
  • Client side encryption (CSE)

CORS

  • Needs to be configured on the bucket when a web page served from another origin requests its objects from a browser

MFA Delete

  • Only root account can enable/disable MFA delete of a S3 bucket

Access Logs

  • To capture detailed records of requests made to the S3 bucket
  • Provide insights into who accessed the bucket, from where, and how they interacted with the objects

Presigned URLs

  • Time-limited URL that grants temporary access to an S3 object

Glacier Vault Lock

  • write once read many model
  • glacier vault lock has policy and that policy cannot be changed after set once
  • if an object is moved to glacier vault, it cannot be deleted anymore

S3 Object Lock

  • write once read many model
  • bucket versioning must be enabled
  • block an object version deletion for a period of time

Retention Modes

  • compliance - no one can delete the object or change the retention policy
  • governance - only users with special permissions (e.g., admins) can delete the object or change the retention policy
  • legal hold:
    • protects the object indefinitely
    • independent of the retention period
    • can be placed and removed by using the s3:PutObjectLegalHold IAM permission

S3 Access Points

  • each access point is attached to a single bucket
  • s3 access points have their own DNS names
  • can be internet origin or vpc origin
  • can have a policy of its own
  • so the bucket policy can stay simple

S3 Object Lambda Access Points

  • An Object Lambda access point invokes a Lambda function that fetches the original s3 object and modifies it before it is returned to the caller

AWS Snow Family

  • snowcone and snowball edge are devices used for offline data migration
  • order the snowcone or snowball edge devices from AWS, load the devices with data, send back the devices to AWS and AWS will transfer the data from devices to s3 buckets
  • snowcone can handle 8TB hdd - 14TB ssd, migration size up to terabytes
  • snowball edge can handle 80TB - 210TB, migration size up to petabytes
  • snowball edge supports storage clustering
  • can do edge computing on snow devices by running lambda functions or ec2 instances at the edge
  • snowcone has 2 CPUs and 4 GB of RAM
  • snowball edge comes in compute-optimized and storage-optimized variants
  • snowball cannot transfer the data directly to s3 glacier
  • snowmobile is used to move petabytes to exabytes of data, transfer data with container-sized trucks

AWS FSx

  • fully-managed high performance file systems on AWS

Types

  • FSx for Lustre
  • FSx for Windows file server
  • FSx for NetApp ONTAP
  • FSx for openZFS

AWS Storage Gateway

  • Bridge between on-premises data and cloud data

Types

  • s3 file gateway
  • FSx file gateway
  • Volume gateway (cached or stored)
  • Tape gateway

Volume Gateway Cached Mode

  • Only subset of data is stored in on-premise volume gateway

Volume Gateway Stored Mode

  • Full and redundant data is stored in on-premise volume gateway

AWS Transfer Family

  • A fully-managed service for file transfers into and out of Amazon S3 or Amazon EFS using the FTP protocol

Supported Protocols

  • AWS Transfer for FTP (File Transfer Protocol)
  • AWS Transfer for FTPS (File Transfer Protocol over SSL)
  • AWS Transfer for SFTP (SSH File Transfer Protocol)

AWS DataSync

  • Move large amounts of data to and from AWS storage services (transfers run as tasks, which can be scheduled)
  • On-premises/other clouds to AWS (requires a DataSync agent)
  • AWS to AWS (no agent needed)
  • Only AWS data transfer service that can directly transfer the data to S3 Glacier

Supported Storage Services

  • S3
  • S3 Glacier
  • EFS
  • FSx

Application Integration/Messaging

SQS

  • Producer/Consumer Model

Standard Queue

  • Unlimited throughput, unlimited number of messages in queue
  • Default retention of messages: 4 days, maximum of 14 days
  • Low latency (<10 ms on publish and receive)
  • Limitation of 256KB per message sent
  • Can have duplicate messages
  • Can have out of order messages
  • Default visibility timeout of 30 sec
  • Cannot set priority value to each message

FIFO Queue

  • Limited throughput: 300 msg/s without batching, 3000 msg/s with
  • Exactly-once send capability (by removing duplicates)
  • Messages are processed in order by the consumer
  • Use deduplication ID and message group ID to ensure exactly-once capability
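
To make the deduplication idea concrete, here is a toy model of how a FIFO queue could drop repeats of the same deduplication ID. It is a simulation, not the SQS implementation: the class is hypothetical, time is passed in explicitly, and the 5-minute deduplication window is taken from the SQS documentation rather than from the notes above.

```python
DEDUP_WINDOW_SECONDS = 5 * 60  # SQS FIFO deduplication interval


class FifoDeduplicator:
    """Accepts a message only if its deduplication ID was not seen in the window."""

    def __init__(self) -> None:
        self._accepted_at = {}  # deduplication ID -> time it was last accepted

    def accept(self, dedup_id: str, now: float) -> bool:
        last = self._accepted_at.get(dedup_id)
        if last is not None and now - last < DEDUP_WINDOW_SECONDS:
            return False  # duplicate inside the window: dropped
        self._accepted_at[dedup_id] = now
        return True
```

A producer that retries a send with the same deduplication ID a few seconds later is therefore acknowledged, but the message is not enqueued a second time.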

Encryption

  • In-flight encryption using HTTPS API
  • At-rest encryption using KMS keys
  • Client-side encryption if the client wants to perform encryption/decryption itself

Access Policy

  • Similar to s3 bucket policy to control the access to the queue

Long Polling

  • When a consumer requests messages from the queue, it can optionally “wait” for messages to arrive if there are none in the queue
  • The wait time can be between 1 sec to 20 sec (20 sec preferable)
  • Can configure by setting ReceiveMessageWaitTimeSeconds to a number greater than zero

Dead Letter Queues

  • Dead-letter queues can be used by other queues (source queues) as a target for messages that can’t be processed (consumed) successfully

Delay Queue

  • Delay queues let you postpone the delivery of new messages to a queue for several seconds
  • The default (minimum) delay for a queue is 0 sec
  • The maximum is 15 minutes

SNS

  • Pub/Sub Model

Topics

  • Publisher pushes events to a topic and each subscriber to the topic will get all the events
  • Up to 12,500,000 subscriptions per topic
  • 100,000 topics limit

FIFO SNS

  • Similar features as SQS FIFO
  • Can have SQS Standard and FIFO queues as subscribers
  • same throughput as SQS FIFO

Encryption

  • In-flight encryption using HTTPS API
  • At-rest encryption using KMS keys
  • Client-side encryption if the client wants to perform encryption/decryption itself

Access Policy

  • Similar to s3 bucket policy to control the access to the topic

Message Filtering

  • JSON policy used to filter messages sent to SNS topic’s subscriptions
  • If a subscription doesn’t have a filter policy, it receives every message
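
A simplified model of filter-policy matching may help. It covers only exact string matching: every attribute named in the policy must be present on the message (AND), and its value must be one of the listed values (OR). Real SNS filter policies also support prefix, numeric, and other operators, which this hypothetical helper leaves out.

```python
from typing import Optional


def matches_filter_policy(policy: Optional[dict], attributes: dict) -> bool:
    """Return True if a message with these attributes would be delivered."""
    if policy is None:
        return True  # no filter policy: the subscription receives every message
    return all(
        attributes.get(name) in allowed_values
        for name, allowed_values in policy.items()
    )


# Hypothetical policy: only order events coming from two EU stores.
order_policy = {"event_type": ["order_placed"], "store": ["eu-1", "eu-2"]}
```

For instance, a message with attributes `{"event_type": "order_placed", "store": "eu-1"}` matches `order_policy`, while one with `"store": "us-1"` does not.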

Fan-out (SNS+SQS)

  • Push once in SNS, receive in all SQS queues that are subscribers
  • Cross-Region Delivery: works with SQS Queues in other regions

Kinesis

  • Producer/Consumer Model

Kinesis Data Streams

  • Streaming service for ingest at scale
  • data contain partition key and data blob
  • data with same partition keys always go into same shard
  • Once data is inserted in Kinesis, it can’t be deleted (immutability)
  • Ability to reprocess (replay) data
  • Retention between 1 day to 365 days, default of 1 day
  • In provisioned mode, capacity does not autoscale; shards have to be provisioned in advance
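
Records with the same partition key land on the same shard because Kinesis hashes the key with MD5 into a 128-bit integer and each shard owns a range of that hash space. The sketch below assumes the hash space is split into equal contiguous ranges, one per shard; in a real stream the ranges are whatever the shards were created or resharded with, so the function is only illustrative.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Index of the shard a partition key maps to, assuming equal hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # Scale the 128-bit hash down to a shard index in [0, shard_count - 1].
    return hash_key * shard_count // HASH_SPACE
```

Calling `shard_for_key("device-42", 4)` repeatedly always returns the same index, which is why a single hot partition key can overload one shard.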

Capacity Modes

  • Provisioned Mode
  • On-demand Mode

Provisioned Mode

  • choose the number of shards provisioned, scale manually or using API
  • Each shard gets 1MB/s in (or 1000 records per second)
  • Each shard gets 2MB/s out (classic or enhanced fan-out consumer)
  • Pay per shard provisioned per hour
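
The per-shard limits above translate directly into a sizing rule: the stream needs enough shards to cover whichever of the three limits is hit first. A minimal sketch with a hypothetical function name, assuming a standard (shared) consumer for the outbound figure:

```python
import math


def shards_needed(ingest_mb_per_s: float, records_per_s: float, egress_mb_per_s: float) -> int:
    """Minimum shard count for a provisioned-mode stream."""
    return max(
        1,
        math.ceil(ingest_mb_per_s / 1.0),  # 1 MB/s in per shard
        math.ceil(records_per_s / 1000),   # 1,000 records/s in per shard
        math.ceil(egress_mb_per_s / 2.0),  # 2 MB/s out per shard
    )
```

For example, 5 MB/s of ingest at 2,000 records/s with 6 MB/s of reads needs 5 shards, because ingest bandwidth is the binding limit.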

On-demand Mode

  • Default capacity provisioned (4 MB/s in or 4000 records per second)
  • Scales automatically based on observed throughput peak during the last 30 days
  • Pay per stream per hour & data in/out per GB

Security

  • In-flight encryption using HTTPS API
  • At-rest encryption using KMS keys
  • Client-side encryption if the client wants to perform encryption/decryption itself
  • VPC Endpoints available for Kinesis to access from within the VPC

Enhanced Fan-out

  • Standard: 2MB/s per shard (shared between multiple consumers)
  • Enhanced fan-out: 2MB/s per shard per consumer

Kinesis Data Firehose

  • Load streaming data into S3 / Redshift / OpenSearch / 3rd party / custom HTTP
  • Fully Managed Service, no administration, automatic scaling, serverless
  • Pay for data going through Firehose
  • Near real-time
  • Supports custom data transformations using AWS Lambda
  • Doesn’t guarantee the order of message delivery and processing

Kinesis Data Analytics

  • Real-time analytics on Kinesis Data Streams & Firehose using SQL
  • Add reference data from Amazon S3 to enrich streaming data
  • Fully managed, no servers to provision
  • Automatic scaling

Kinesis Video Streams

EventBridge

  • Trigger AWS services based on events sent by other AWS services or 3rd party integrations
  • Can archive and replay events for debugging purposes

Trigger Types

  • Schedule
  • Event Patterns

Schedule

  • Cron jobs (scheduled scripts)

Event Patterns

  • Event rules to react to a service doing something

Event Buses

  • Default event bus (AWS services)
  • Partner event bus (3rd parties)
  • Custom event bus

Schema Registry

  • The Schema Registry allows you to generate code for your application, that will know in advance how data is structured in the event bus
  • Schema can be versioned

Resource-based Policy

  • Manage permissions for a specific Event Bus
  • Allow/deny events from another AWS account or AWS region
  • Aggregate all events from your AWS Organization in a single AWS account or AWS region

Amazon MQ

  • Managed message broker service (RabbitMQ, ActiveMQ) for migrating applications that use open protocols such as: MQTT, AMQP, STOMP, Openwire, WSS

Amazon Simple Workflow Service (SWF)

  • Amazon SWF is a web service that makes it easy to coordinate work across distributed application components

AWS AppFlow

  • To transfer and integrate data between AWS services and external SaaS platforms
  • Keeping SaaS data synchronized with AWS resources

AWS AppSync

  • A managed service for building real-time GraphQL APIs to power data-driven applications
  • Simplifies building GraphQL APIs for querying, mutating, and subscribing to data
  • Allows combining multiple data sources (e.g., DynamoDB, RDS, Lambda) into a single unified API

Machine Learning

Rekognition

  • for computer vision (CV)
  • labeling
  • content moderation
  • Face Detection and Analysis (gender, age range, emotions…)
  • Face Search and Verification
  • Celebrity Recognition
  • Pathing (ex: for sports game analysis)

Amazon Transcribe

  • Speech to text

Features

  • Automatically remove Personally Identifiable Information (PII) using Redaction
  • Automatic Language Identification for multi-lingual audio

Amazon Polly

  • Text to speech

Features

  • Lexicon upload for acronyms and stylized words
  • Speech customization with Speech Synthesis Markup Language (SSML)

Amazon Translate

  • Language translation

Amazon Lex

  • Chatbots
  • Call center bots
  • Natural Language Understanding to recognize the intent of text, callers

Amazon Connect

  • Cloud contact center

Amazon Comprehend

  • Fully managed NLP service

Amazon Comprehend Medical

  • Uses NLP to detect Protected Health Information (PHI)

Amazon SageMaker

  • Fully managed service for developers / data scientists to label data, build and deploy ML models

Amazon Forecast

  • For time-series forecasting

Amazon Kendra

  • Fully managed document search service powered by Machine Learning
  • Sources can be text, pdf, HTML, PowerPoint, MS Word, databases

Amazon Personalize

  • Recommendation system service

Amazon Textract

  • For optical character recognition (OCR) and information extraction (IE)

Security

Encryption

  • In-flight encryption
  • Server-side encryption
  • Client-side encryption

In-flight encryption

  • Data is encrypted before sending and decrypted after receiving
  • TLS certificate is used in HTTPS

Server-side encryption

  • Data is encrypted after receiving by server and decrypted before sending to the client

Client-side encryption

  • Data is encrypted by the client and never decrypted by the server

KMS

  • Fully integrated with IAM for authorization
  • Able to audit KMS Key usage using CloudTrail
  • KMS Key Encryption also available through API calls (SDK, CLI)
  • Have to pay for API call to KMS ($0.03 / 10,000 calls)
  • A deleted KMS key stays in the ‘pending deletion’ state for 7–30 days (default 30 days), during which the deletion can be cancelled

Asymmetric vs Symmetric Keys

  • Symmetric Keys (AES-256)
  • Asymmetric Keys (RSA & ECC key pair)

Symmetric Keys

  • Single key for both encryption and decryption
  • AWS services integrated with KMS use symmetric keys
  • Never get access to the KMS Key unencrypted (must call KMS API to use)

Asymmetric Keys

  • Public (Encrypt) and Private Key (Decrypt) pair
  • The public key is downloadable, but the Private Key can’t be accessed unencrypted

KMS Key Types

  • AWS Owned Keys (free): SSE-S3, SSE-SQS, SSE-DDB (default key)
  • AWS Managed Keys (free): (aws/service-name, example: aws/rds or aws/ebs)
  • Customer managed keys created in KMS: $1 / month
  • Customer managed keys imported: $1 / month

Automatic Key Rotation

  • AWS-managed KMS Key: automatic every 1 year
  • Customer-managed KMS Key: automatic (must be enabled) or on-demand
  • Imported KMS Key: only manual rotation possible using alias

Key Policies

  • Control access to KMS keys, “similar” to S3 bucket policies

Default Key Policy

  • Created if you don’t provide a specific KMS Key Policy
  • Gives the AWS account (root principal) complete access to the key, which lets IAM policies grant access to it
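
As a rough illustration of the default key policy described above, the sketch below builds a policy document of that shape as a Python dictionary. The account ID 111122223333 is a placeholder and the helper name default_key_policy is hypothetical; treat the output as an approximation of the policy KMS generates, not an authoritative copy.

```python
import json


def default_key_policy(account_id: str) -> dict:
    """Build a policy shaped like the KMS default key policy (illustrative only)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Enable IAM User Permissions",
                "Effect": "Allow",
                # The account root principal, not a specific user or role.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "kms:*",
                "Resource": "*",
            }
        ],
    }


policy = default_key_policy("111122223333")  # placeholder account ID
print(json.dumps(policy, indent=2))
```

A custom key policy would typically replace or extend this statement with narrower principals and actions.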

Custom Key Policy

  • Define users, roles that can access the KMS key
  • Define who can administer the key

Multi-region Keys

  • A multi-region key (MRK) has a primary key in one region and replica keys in other regions that share the same key ID and key material
  • Data encrypted in one region can be decrypted in another region without re-encrypting it
  • Use cases: global client-side encryption, such as client-side encryption with DynamoDB Global Tables or Aurora Global Database

Replicating encrypted S3 objects

  • For objects encrypted with SSE-KMS:
    • Specify which KMS Key to encrypt the objects within the target bucket
    • Adapt the KMS Key Policy for the target key
    • An IAM Role with kms:Decrypt for the source KMS Key and kms:Encrypt for the target KMS Key
    • You might get KMS throttling errors, in which case you can ask for a Service Quotas increase
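
The KMS part of the replication role's permissions can be sketched as two IAM policy statements. This is a minimal sketch under stated assumptions: the key ARNs are placeholders, the helper name is hypothetical, and the S3 permissions a real replication role also needs are omitted.

```python
import json


def replication_kms_statements(source_key_arn: str, target_key_arn: str) -> list:
    """Return the KMS statements for an S3 replication role (illustrative only)."""
    return [
        {
            "Sid": "DecryptSourceObjects",
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": [source_key_arn],
        },
        {
            "Sid": "EncryptReplicas",
            "Effect": "Allow",
            "Action": ["kms:Encrypt"],
            "Resource": [target_key_arn],
        },
    ]


# Placeholder ARNs, not real keys.
statements = replication_kms_statements(
    "arn:aws:kms:us-east-1:111122223333:key/SOURCE-KEY-ID",
    "arn:aws:kms:eu-west-1:111122223333:key/TARGET-KEY-ID",
)
print(json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2))
```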

AWS CloudHSM

  • Provides customers with dedicated hardware security modules (HSMs) to securely generate and use encryption keys
  • AWS takes care of hardware maintenance, updates, and availability
  • The customer retains full control over cryptographic key management and security configurations

AWS Systems Manager (SSM) Parameter Store

  • Secure storage for configuration and secrets
  • Optional Encryption using KMS
  • Parameters can be stored in hierarchies

Tiers

  • Standard
  • Advanced

Parameter Policies

  • Allow assigning a TTL (expiration date) to a parameter to force updating or deleting sensitive data such as passwords

AWS Secrets Manager

  • Secure storage of secrets
  • Capability to force rotation of secrets every X days
  • Automate generation of secrets on rotation (uses Lambda)
  • Integration with database services like RDS, Aurora, Redshift, DocumentDB
  • Secrets are encrypted using KMS

Multi-region secrets

  • Replicate Secrets across multiple AWS Regions
  • Secrets Manager keeps read replicas in sync with the primary Secret
  • Ability to promote a read replica Secret to a standalone Secret

AWS Certificate Manager

  • Easily provision, manage, and deploy TLS Certificates
  • Supports both public and private TLS certificates
  • Free of charge for public TLS certificates
  • Can generate certificates too
  • Certificates generated with ACM are automatically renewed

Integrations

  • ELB
  • CloudFront distributions
  • APIs on API Gateway
  • Cannot use from EC2

API Gateway

  • Edge-optimized
  • Regional
  • Private (cannot use ACM)

Edge-optimized

  • ACM is integrated with the CloudFront distribution
  • The TLS Certificate must be in the same region as CloudFront (us-east-1)

Regional

  • The TLS Certificate must be imported on API Gateway, in the same region as the API Stage

Web Application Firewall (WAF)

  • Protects your web applications from common web exploits (Layer 7)

Integrations

  • ALB
  • API Gateway
  • CloudFront
  • AppSync GraphQL API
  • Cognito User Pool

Web Access Control List (Web ACL)

  • IP Set: up to 10,000 IP addresses
  • Filter on HTTP headers, HTTP body, or URI strings to protect from common attacks such as SQL injection and Cross-Site Scripting (XSS)
  • Size constraints
  • geo-match (block countries)
  • Rate-based rules (to count occurrences of events) – for DDoS protection
  • Web ACLs are Regional, except for CloudFront where they are global
  • A rule group is a reusable set of rules that can be added to a web ACL
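
To make the idea behind rate-based rules concrete, here is a minimal sketch of counting requests per source IP over a trailing time window. It is not how AWS WAF is implemented; the class name, the limit of 3 requests, and the 300-second window are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Count requests per IP in a trailing window and flag IPs over the limit."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip: str, now: float) -> bool:
        timestamps = self.hits[ip]
        # Drop timestamps that have fallen out of the trailing window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        # Every request is counted, including ones that end up blocked.
        timestamps.append(now)
        return len(timestamps) <= self.limit


limiter = RateLimiter(limit=3, window_seconds=300)
for t in (0, 1, 2, 3):
    print(t, limiter.allow("198.51.100.7", now=t))
```

In this run the first three requests are allowed and the fourth is blocked, because four requests fall inside the same window.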

AWS Shield

  • Protect from DDoS Attacks

Modes

  • Standard
  • Advanced

Standard

  • Free service that is activated for every AWS customer
  • Provides protection from attacks such as SYN/UDP floods, reflection attacks and other layer 3/4 attacks

Advanced

  • Optional DDoS mitigation service
  • $3,000 per month per organization
  • 24/7 access to the AWS Shield Response Team (SRT), formerly the DDoS Response Team (DRT)
  • Automatic application layer DDoS mitigation: Shield Advanced creates, evaluates and deploys AWS WAF rules to mitigate layer 7 attacks

Supported Services

  • EC2
  • ELB
  • CloudFront
  • Global Accelerator
  • Route 53
  • Elastic IP

AWS Network Firewall

  • Details in the VPC section

AWS Firewall Manager

  • Manage firewall rules in all accounts of an AWS Organization
  • Rules are applied to new resources as they are created (good for compliance) across all existing and future accounts in your Organization

Security Policies

  • common set of security rules
  • WAF rules (ALB, API Gateways, CloudFront)
  • AWS Shield Advanced (ALB, CLB, NLB, Elastic IP, CloudFront)
  • Security Groups for EC2, ALB and ENI resources in VPC
  • AWS Network Firewall (VPC Level)
  • Route 53 Resolver DNS Firewall
  • Policies are created at the region level

AWS GuardDuty

  • Managed threat detection service
  • Analyzes input data such as CloudTrail events, VPC flow logs, etc. to detect threats
  • Notify the findings through EventBridge

Foundational Data Sources

  • CloudTrail Events Logs
  • VPC Flow Logs
  • DNS Logs

Other Data Sources

  • S3 data event logs
  • EKS audit logs
  • Lambda network activity logs
  • RDS login activity logs
  • EBS volume data

AWS Inspector

  • Automated Security Assessments for:
    • EC2
    • Container Images pushed to Amazon ECR
    • Lambda Functions
  • Reporting & integration with AWS Security Hub
  • Send findings to Amazon EventBridge

EC2

  • Leverages the AWS Systems Manager (SSM) agent
  • Analyze against unintended network accessibility
  • Analyze the running OS against known vulnerabilities

Container Images pushed to Amazon ECR

  • Assessment of Container Images as they are pushed

Lambda Functions

  • Identifies software vulnerabilities in function code and package dependencies
  • Assessment of functions as they are deployed

AWS Macie

  • Find sensitive Personally Identifiable Information (PII) in data stored on S3

AWS Artifact

  • To view, assess and manage the security reports as well as other AWS compliance-related information

AWS Security Hub

  • Security service that provides a comprehensive view of your security posture across AWS accounts
  • Security Hub collects and aggregates security findings from multiple AWS services such as Amazon GuardDuty, Amazon Macie, Amazon Inspector, and AWS Config, as well as from third-party security solutions

AWS Security Token Service (STS)

  • Service that you can use to create and provide trusted users with temporary security credentials that can control access to your AWS resources
  • Temporary security credentials work almost identically to the long-term access key credentials that your IAM users can use

VPC

Default VPC

  • Default VPC has Internet connectivity through internet gateway and all EC2 instances inside it have public IPv4 addresses

Own VPC

  • Max 5 VPCs per region (soft limit)
  • Max 5 CIDR blocks per VPC

CIDR size

  • Min: /28 (16 IP addresses)
  • Max: /16 (65536 IP addresses)

Allowed CIDR ranges (private)

  • 10.0.0.0 – 10.255.255.255 (10.0.0.0/8)
  • 172.16.0.0 – 172.31.255.255 (172.16.0.0/12)
  • 192.168.0.0 – 192.168.255.255 (192.168.0.0/16)
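
The size limits and private ranges above can be checked with Python's standard ipaddress module. This is a minimal sketch that only covers the private IPv4 ranges listed in these notes; the function name is hypothetical.

```python
import ipaddress

# The three private IPv4 blocks listed above.
PRIVATE_BLOCKS = [
    ipaddress.IPv4Network("10.0.0.0/8"),
    ipaddress.IPv4Network("172.16.0.0/12"),
    ipaddress.IPv4Network("192.168.0.0/16"),
]


def is_valid_private_vpc_cidr(cidr: str) -> bool:
    """Check that a CIDR is between /16 and /28 and inside a private block."""
    net = ipaddress.IPv4Network(cidr)  # raises ValueError on malformed input
    size_ok = 16 <= net.prefixlen <= 28
    private_ok = any(net.subnet_of(block) for block in PRIVATE_BLOCKS)
    return size_ok and private_ok


print(is_valid_private_vpc_cidr("10.0.0.0/16"))     # True
print(is_valid_private_vpc_cidr("10.0.0.0/8"))      # False: larger than /16
print(is_valid_private_vpc_cidr("192.168.1.0/29"))  # False: smaller than /28
```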

Subnets

  • AWS reserves 5 IP addresses (first 4 & last 1) in each subnet
  • x.x.x.0 – Network Address
  • x.x.x.1 – reserved by AWS for the VPC router
  • x.x.x.2 – reserved by AWS for mapping to Amazon-provided DNS
  • x.x.x.3 – reserved by AWS for future use
  • x.x.x.255 – Network Broadcast Address. AWS does not support broadcast in a VPC, therefore the address is reserved
  • Each subnet maps to single AZ
  • Every subnet created is automatically associated with the main route table for the VPC.
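
The effect of the five reserved addresses on usable capacity can be worked out with the standard ipaddress module. A minimal sketch follows; the function names are hypothetical.

```python
import ipaddress


def reserved_addresses(cidr: str) -> list:
    """Return the five addresses AWS reserves in a subnet: first four and last."""
    net = ipaddress.IPv4Network(cidr)
    first_four = [net.network_address + i for i in range(4)]
    return first_four + [net.broadcast_address]


def usable_ip_count(cidr: str) -> int:
    """Total addresses in the subnet minus the five reserved by AWS."""
    return ipaddress.IPv4Network(cidr).num_addresses - 5


print([str(ip) for ip in reserved_addresses("10.0.0.0/24")])
print(usable_ip_count("10.0.0.0/24"))  # 256 - 5 = 251
print(usable_ip_count("10.0.0.0/28"))  # 16 - 5 = 11
```

So a requirement of 29 usable addresses needs a /26 (64 - 5 = 59), because a /27 only offers 32 - 5 = 27.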

IPv6-only Subnet

  • Can only support Nitro instances

Internet Gateway

  • Allows resources (e.g. EC2 instances) in a VPC to connect to the Internet
  • It scales horizontally and is highly available and redundant
  • Must be created separately from a VPC and then attached to it
  • Subnet route tables must be configured to route the traffic to internet gateway to access the internet
  • Subnet becomes public subnet when it is connected to and routed through an internet gateway

Bastion Host

  • BH is an instance in a public subnet that has access to other instances in the private subnet
  • Lets you SSH into private instances via the BH
  • SG of the BH has to allow port 22 from the internet and SG of the private instances must allow SSH from the SG of the bastion host

NAT Instance

  • An instance in the public subnet through which the private instances can access the internet
  • Must have Elastic IP attached to it
  • Must disable EC2 setting: Source / destination Check
  • An instance can be NAT instance by configuring using NAT AMIs
  • Route tables of private subnets must be configured to route traffic from private subnets to the NAT Instance

NAT Instance SG rules

  • Inbound:
    • Allow HTTP / HTTPS traffic coming from Private Subnets
    • Allow SSH from source network (access is provided through Internet Gateway)
  • Outbound:
    • Allow HTTP / HTTPS traffic to the Internet

NAT Gateway

  • AWS-managed NAT instance
  • Higher bandwidth, high availability, no administration
  • Pay per hour for usage and bandwidth
  • NAT GW is AZ-bound
  • Uses an Elastic IP
  • Can’t be used by EC2 instance in the same subnet (only from other subnets)
  • Private Subnet → NATGW → IGW
  • 5 Gbps of bandwidth with automatic scaling up to 100 Gbps

SGs and NACLs

SGs

  • Operates at instance level
  • Stateful (always allow return traffic)
  • Only support ‘Allow’ rules
  • Evaluate all the rules before deciding to allow
  • Newly created SG will ‘Deny’ every inbound traffic and ‘Allow’ every outbound traffic

NACLs

  • Operates at subnet level
  • Stateless
  • Supports both ‘Allow’ and ‘Deny’ rules
  • One NACL per subnet, new subnets are assigned the Default NACL
  • NACLs and subnets are decoupled and NACLs live in VPC
  • Default NACL is “allow all”
  • Newly created NACLs will deny everything (inbound or outbound)
  • NACLs have to be configured to allow inbound and outbound ephemeral ports since they are stateless

NACL Rules

  • Rules have a number (1-32766), higher precedence with a lower number
  • First rule match will drive the decision
  • The last rule is an asterisk (*) and denies a request in case of no rule match
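
The evaluation order described above (lowest rule number first, first match wins, implicit deny at the end) can be sketched as follows. This is a simplified model that only matches on source CIDR and port; real NACL rules also consider protocol and direction, and the class and function names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class NaclRule:
    number: int
    cidr: str
    port_from: int
    port_to: int
    action: str  # "allow" or "deny"


def evaluate(rules: list, source_ip: str, port: int) -> str:
    """Return the action of the lowest-numbered matching rule, else deny."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = ip in ipaddress.ip_network(rule.cidr)
        in_ports = rule.port_from <= port <= rule.port_to
        if in_cidr and in_ports:
            return rule.action  # first match drives the decision
    return "deny"  # the final asterisk (*) rule


rules = [
    NaclRule(200, "0.0.0.0/0", 443, 443, "allow"),
    NaclRule(100, "203.0.113.0/24", 443, 443, "deny"),
]
print(evaluate(rules, "203.0.113.7", 443))   # deny: rule 100 matches before rule 200
print(evaluate(rules, "198.51.100.1", 443))  # allow: only rule 200 matches
print(evaluate(rules, "198.51.100.1", 22))   # deny: no rule matches
```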

VPC Peering

  • Privately connect two VPCs using AWS network
  • Peer VPCs must not have overlapping CIDRs
  • VPC Peering connection is NOT transitive
  • Route tables of subnets in both VPC have to be updated to route the traffic to other VPC through peer connection
  • Can create VPC Peering connection between VPCs in different AWS accounts/regions
  • Can reference a security group in a peered VPC (cross accounts but same region)
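
The non-overlapping CIDR requirement can be checked ahead of time with the standard ipaddress module. A minimal sketch, assuming each VPC has a single CIDR block (real VPCs can have several, in which case every pair would need checking); the function name is hypothetical.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Two VPCs can only be peered if their CIDR blocks do not overlap."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # True: disjoint ranges
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # False: second range is inside the first
```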

VPC Endpoints

  • VPC Endpoints (powered by AWS PrivateLink) allow you to connect to AWS services using a private network instead of the public Internet
  • Remove the need for an IGW, NATGW, … to access AWS Services

Types

  • Interface Endpoint
  • Gateway Endpoint

Interface Endpoint

  • Provisions an ENI (private IP address) as an entry point (must attach a Security Group)
  • Supports most AWS services
  • Billed per hour and per GB of data processed
  • Can be used to connect to another VPC
  • Uses AWS PrivateLink to connect the endpoint to services

Gateway Endpoint

  • Provisions a gateway and must be used as a target in a route table (does not use security groups)
  • Free
  • Supports S3 and DynamoDB
  • If S3 or DynamoDB is not in the same region as the subnet, Gateway Endpoint cannot be used since Gateway Endpoint is a regional service (use NAT gateway or Interface Endpoint instead)
  • can attach an endpoint policy that controls access to the service to which you are connecting
  • does not use AWS PrivateLink

Flow Logs

  • Capture information about IP traffic going to and from your network interfaces
  • Can query VPC flow logs using Athena on S3 or CloudWatch Logs Insights

Flow Logs data can go into:

  • S3
  • CloudWatch Logs
  • Kinesis Data Firehose

Site-to-site VPN Connection

  • To connect VPC with on-prem servers through private VPN connection over public network
  • Site-to-site VPN connection can be used as a backup connection to Dx connection

Need 2 things:

  • Virtual Private Gateway (VGW)
  • Customer Gateway (CGW)

VGW

  • VPN concentrator on the AWS side of the VPN connection
  • VGW is created and attached to the VPC from which you want to create the Site-to-Site VPN connection
  • Need to enable Route Propagation for the VGW in the route table that is associated with the subnets in the VPC

CGW

  • Software application or physical device on customer side of the VPN connection
  • Need public Internet-routable IP address for the Customer Gateway device
  • If CGW is private, need NAT device to enable public routing

VPN Cloudhub

  • Provide secure communication between multiple sites, if you have multiple VPN connections
  • To set it up, connect multiple VPN connections on the same VGW, setup dynamic routing and configure route tables

Direct Connect (Dx)

  • Provides a dedicated private connection from a remote network to your VPC
  • Dedicated connection must be setup between the data center and AWS Direct Connect locations
  • Need to setup a VGW at VPC side
  • Lead times are often longer than 1 month to establish a new connection

Connection Flows

  • Private VPC Connection
  • Public Resources Connection

Private Connection Flow

  • VGW → Dx Connector in Dx locations → Customer router in Dx locations → Customer router in customer network

Public Connection Flow

  • Public AWS resources (like S3) → Dx Connector in Dx locations → Customer router in Dx locations → Customer router in customer network

Direct Connect Gateway

  • If you want to setup a Direct Connect to one or more VPC in many different regions (same account), you must use a Direct Connect Gateway
  • Dx connection connects to Direct Connect Gateway and Direct Connect Gateway connects to multiple VGWs

Connection Types

  • Dedicated Connections
  • Hosted Connections

Dedicated Connections

  • 1 Gbps, 10 Gbps and 100 Gbps capacity
  • Physical ethernet port dedicated to a customer
  • Request made to AWS first, then completed by “AWS Direct Connect Partners”

Hosted Connections

  • 50 Mbps, 500 Mbps, up to 10 Gbps
  • Connection requests are made via “AWS Direct Connect Partners”
  • Capacity can be added or removed on demand
  • 1, 2, 5, 10 Gbps available at select AWS Direct Connect Partners

Encryption

  • Data in transit is not encrypted but is private
  • AWS Direct Connect + VPN provides an IPsec-encrypted private connection

Resiliency

  • High resiliency
  • Max resiliency

High resiliency

  • One connection at multiple Dx locations

Max resiliency

  • Maximum resilience is achieved by separate connections terminating on separate devices in more than one location.

Transit Gateway

  • Transit Gateway sits in the middle to connect multiple VPCs transitively and can also connect to Dx Gateway and Site-to-site VPN connections
  • Regional resource
  • Share cross-account using Resource Access Manager (RAM)
  • Transit Gateways can be peered across regions
  • Route Tables: limit which VPCs can talk to each other
  • Supports IP Multicast

Site-to-site VPN ECMP (Equal Cost Multiple Paths)

  • Routing strategy that forwards packets over multiple best paths
  • Use case: create multiple Site-to-Site VPN connections to increase the bandwidth of your connection to AWS
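
One common way to spread traffic over equal-cost paths is to hash each flow's 5-tuple, so packets of the same flow stay on one path while different flows spread out. The sketch below illustrates that idea only; it is not how AWS or any particular router implements ECMP, and the tunnel names and function name are hypothetical.

```python
import hashlib


def pick_path(flow: tuple, paths: list) -> str:
    """Map a flow (src_ip, dst_ip, protocol, src_port, dst_port) to one path."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(paths)
    return paths[index]


# Hypothetical equal-cost VPN tunnels to the same destination.
tunnels = ["vpn-tunnel-1", "vpn-tunnel-2", "vpn-tunnel-3", "vpn-tunnel-4"]

flow = ("10.0.1.15", "192.168.10.20", "tcp", 50123, 443)
print(pick_path(flow, tunnels))
print(pick_path(flow, tunnels))  # same flow, same tunnel
```

Because a single flow always lands on one path, the bandwidth gain shows up across many concurrent flows rather than for one connection.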

VPC Traffic Mirroring

  • Capture and mirror the traffic to send the mirrored traffic into own security appliances to analyze, monitor or troubleshoot
  • Source and Target can be in the same VPC or different VPCs (VPC Peering)

Egress-only Internet Gateway

  • Used for IPv6 only
  • Similar to a NAT Gateway but for IPv6
  • Must update the Route Tables
  • Allows instances in your VPC to make outbound connections over IPv6 while preventing the internet from initiating an IPv6 connection to your instances

AWS Network Firewall

  • Protect entire VPC
  • From Layer 3 to Layer 7 protection
  • Internally uses AWS Gateway Load Balancer
  • Rules can be centrally managed cross-account by AWS Firewall Manager to apply to many VPCs
  • Can send logs of rule matches to Amazon S3, CloudWatch Logs, Kinesis Data Firehose

Protect directions

  • VPC to VPC traffic
  • Outbound to internet
  • Inbound from internet
  • To/from Direct Connect & Site-to-Site VPN

Fine-grained Controls

  • IP & port - example: 10,000s of IPs filtering
  • Protocol – example: block the SMB protocol for outbound communications
  • Stateful domain list rule groups: only allow outbound traffic to *.mycorp.com or third-party software repo
  • General pattern matching using regex
  • etc

Cost

Cost Explorer

  • Visualize, understand, and manage AWS costs and usage over time
  • Create custom reports that analyze cost and usage data
  • Monthly, hourly, resource level granularity
  • Forecast usage up to 12 months based on previous usage
  • Has API support with pagination

Cost Anomaly Detection

  • Continuously monitor cost and usage using ML to detect unusual spends
  • Monitor AWS services, member accounts, cost allocation tags, or cost categories
  • Sends the anomaly detection report with root-cause analysis
  • Get notified with individual alerts or daily/weekly summary (using SNS)