Summary

🟠 Amazon EC2

🟢 Amazon EC2 – Instance Types

General Purpose

Balanced compute, memory, and networking. Suitable for diverse workloads such as web servers or code repositories.

Compute Optimized

High-performance processors for compute-heavy workloads: batch processing, media transcoding, HPC, scientific modeling, ML, gaming servers.

Memory Optimized

Designed for processing large datasets in memory: high-performance databases, in-memory caches, BI analytics, real-time big data processing.

Storage Optimized

For storage-intensive workloads requiring high sequential read/write on local storage: OLTP systems, databases, data warehouses, distributed file systems.

🟢 Amazon EC2 – Purchasing Options

On-Demand Instances

  • Pay per use:
    • Linux/Windows → per second (after first minute)
    • Other OS → per hour
  • No upfront payment, no long-term commitment.
  • Highest cost but maximum flexibility.
  • Best for short-term, unpredictable workloads.

Reserved Instances (RI) – 1 or 3 years

  • Up to 72% discount vs On-Demand.
  • Reserve specific instance attributes (type, region, tenancy, OS).
  • Payment options: No Upfront, Partial Upfront, All Upfront.
  • Scope: Regional (billing benefit) or Zonal (billing + capacity).
  • Best for steady-state workloads (e.g., databases).
  • Can be bought/sold in RI Marketplace.
Convertible RI:
  • Flexible changes (type, family, OS, scope, tenancy).
  • Up to 66% discount.

Spot Instances

  • Up to 90% discount vs On-Demand.
  • Can be interrupted anytime if Spot price exceeds your max bid.
  • Best for fault-tolerant, flexible workloads: batch jobs, data analysis, image processing, distributed systems.
  • Not suitable for critical or persistent workloads.

Dedicated Hosts

  • Physical server fully dedicated to you.
  • Meet compliance needs or run BYOL software tied to hardware.
  • Most expensive option.
  • Purchase On-Demand or Reserved (1–3 years).

Dedicated Instances

  • Instances on hardware dedicated to you (but may share hardware with your other instances).
  • No placement control; may change after stop/start.

Capacity Reservations

  • Reserve On-Demand capacity in a specific AZ for any duration.
  • No time commitment, but pay On-Demand rates regardless of usage.
  • Guarantees availability; can combine with RI/Savings Plans for discounts.

🟢 Amazon EC2 – Spot Fleets

A Spot Fleet is a collection of Spot Instances (optionally mixed with On-Demand Instances) managed as a single group to meet a target capacity under defined price constraints.

🟢 Amazon EC2 – Placement Groups

Strategies:
  • Cluster: All instances close together in one AZ for low latency & high throughput.
  • Spread: Instances placed on distinct hardware to minimize correlated failures (max 7 per AZ).
  • Partition: Instances grouped into partitions across racks for isolation at scale.

Cluster Placement Group

  • Pros: Extremely low latency, up to 10 Gbps throughput (with Enhanced Networking).
  • Cons: Single AZ failure impacts all instances.
  • Use cases: Big Data processing, HPC, tightly coupled workloads.

Spread Placement Group

  • Pros: Instances on separate physical hardware, can span AZs, high availability.
  • Cons: Limit of 7 instances per AZ.
  • Use cases: Critical workloads needing fault isolation.

Partition Placement Group

  • Limits: Up to 7 partitions per AZ, can span multiple AZs, hundreds of instances.
  • Behavior: Partitions have distinct racks; partition failure affects only that partition. Partition info exposed via instance metadata.
  • Use cases: Distributed systems (HDFS, Cassandra, Kafka).
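
A minimal boto3 sketch of creating placement groups and launching into one — the group names, AMI ID, and region are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption

# Cluster strategy: pack instances close together for low latency
ec2.create_placement_group(GroupName="hpc-cluster-pg", Strategy="cluster")

# Partition strategy: spread instances across 3 rack-isolated partitions
ec2.create_placement_group(
    GroupName="kafka-partition-pg", Strategy="partition", PartitionCount=3
)

# Launch an instance into the cluster placement group (AMI ID is hypothetical)
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="c5n.18xlarge",
    MinCount=1,
    MaxCount=1,
    Placement={"GroupName": "hpc-cluster-pg"},
)
```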

🟢 Amazon EC2 – Hibernate

Stop vs Terminate:
  • Stop: root EBS volume preserved, instance can be restarted.
  • Terminate: root EBS (if set to delete) is lost, instance removed.
Hibernate:
  • Preserves in-memory (RAM) state to the root EBS volume.
  • On restart, OS and applications resume instantly without rebooting → avoids cold start.
  • Root EBS must be encrypted.
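
A rough boto3 sketch of launching a hibernation-enabled instance (AMI and instance IDs are placeholders; the instance type and AMI must support hibernation, and the root volume must be encrypted as noted above):

```python
import boto3

ec2 = boto3.client("ec2")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical, must support hibernation
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    HibernationOptions={"Configured": True},
    BlockDeviceMappings=[{
        "DeviceName": "/dev/xvda",
        # root volume must be encrypted and large enough to hold RAM contents
        "Ebs": {"VolumeSize": 30, "VolumeType": "gp3", "Encrypted": True},
    }],
)

# Later: hibernate instead of a plain stop
ec2.stop_instances(InstanceIds=["i-0123456789abcdef0"], Hibernate=True)
```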

🟢 Amazon EC2 – AMI

An AMI (Amazon Machine Image) is a preconfigured template for launching EC2 instances.
  • Includes OS, software, configurations, and settings.
  • Reduces boot/config time by having software pre-installed.
  • AMIs are region-specific, but can be copied to other regions.
Sources:
  • Public AMI – provided by AWS.
  • Custom AMI – created and maintained by you.
  • Marketplace AMI – provided by third parties, sometimes paid.

AMI Creation Process

  1. Launch and customize an EC2 instance.
  2. Stop the instance for data consistency.
  3. Create an AMI (also generates EBS snapshots).
  4. Launch new instances from the AMI as needed.
Notes:
  • You can copy an Amazon Machine Image (AMI) across AWS Regions
  • You can share an Amazon Machine Image (AMI) with another AWS account
  • Copying an Amazon Machine Image (AMI) backed by an encrypted snapshot cannot result in an unencrypted target snapshot
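
The three notes above map directly to EC2 API calls. A minimal boto3 sketch (instance ID, account ID, and names are placeholders; CopyImage is called in the destination region):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# 1) Create an AMI from a (preferably stopped) instance
ami_id = ec2.create_image(
    InstanceId="i-0123456789abcdef0",   # hypothetical instance
    Name="webserver-golden-v1",
    Description="Hardened web server image",
)["ImageId"]

# 2) Share the AMI with another AWS account
ec2.modify_image_attribute(
    ImageId=ami_id,
    LaunchPermission={"Add": [{"UserId": "123456789012"}]},
)

# 3) Copy the AMI to another region (run CopyImage *in the destination region*)
ec2_eu = boto3.client("ec2", region_name="eu-west-1")
ec2_eu.copy_image(
    SourceImageId=ami_id,
    SourceRegion="us-east-1",
    Name="webserver-golden-v1",
)
```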

🟢 Amazon EC2 – Instance Store

High-performance local storage physically attached to the host running the EC2 instance.
Characteristics:
  • Higher I/O performance than EBS (no network latency).
  • Ephemeral: data is lost if the instance is stopped, terminated, or the host fails.
  • Suitable for temporary data such as buffers, caches, or scratch space.
  • No durability — backups and replication are your responsibility.

🟢 EC2 Instance Connect

EC2 Instance Connect provides a secure and easy way to connect to your EC2 instances using SSH without the need to manage long-lived SSH keys. It works by pushing a temporary, one-time-use public key to the instance for the duration of the connection, allowing access through the AWS Management Console, CLI, or API. This improves security, simplifies key management, and integrates with IAM for fine-grained access control.
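
A minimal sketch of the key-push step via the API (instance ID, OS user, AZ, and key material are placeholders):

```python
import boto3

eic = boto3.client("ec2-instance-connect")

# Push a short-lived public key (valid for about 60 seconds), then SSH in
# with the matching private key before it expires.
eic.send_ssh_public_key(
    InstanceId="i-0123456789abcdef0",
    InstanceOSUser="ec2-user",
    SSHPublicKey="ssh-ed25519 AAAAC3Nza... user@laptop",  # hypothetical key
    AvailabilityZone="us-east-1a",
)
```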

🟠 Amazon EBS Volumes

Network-attached block storage for EC2, persists independently from instance lifecycle.
  • One-to-one attachment (single EC2 per volume, per AZ)
  • AZ-bound → migrate via snapshot
  • Hot-swappable → detach/attach within same AZ
  • Provisioned capacity → pay for size + provisioned IOPS
  • Network drive → adds latency

🟢 Amazon EBS – Delete on Termination

Controls whether an EBS volume is deleted when the associated EC2 instance is terminated.
Default behavior:
  • Root volume: enabled → deleted on termination
  • Additional volumes: disabled → retained on termination

🟢 Amazon EBS - Snapshots

Point-in-time backup of a volume (can stay attached; detach for consistency).
  • Copy across AZs/Regions
  • Restore in different AZ/Region

Features

  • Archive → up to 75% cheaper, restore 24–72h
  • Recycle Bin → retention (1d–1y) for recovery
  • Fast Snapshot Restore → zero-latency on first access (extra cost)
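
A minimal boto3 sketch of taking a snapshot and copying it across regions (volume ID and regions are placeholders; CopySnapshot runs in the destination region):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Point-in-time snapshot of a volume (volume ID is hypothetical)
snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="Pre-upgrade backup",
)

# Copy the snapshot to another region for DR / migration
ec2_west = boto3.client("ec2", region_name="us-west-2")
ec2_west.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId=snap["SnapshotId"],
    Description="DR copy",
)
```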

🟢 Amazon EBS - Volume Types

6 types (SSD vs HDD) → defined by size, IOPS, throughput
Only gp2/gp3 + io1/io2 usable as boot volumes.

General Purpose SSD (gp2 / gp3)

  • Workloads: boot, VDI, dev/test, general apps
  • gp3: 3k IOPS & 125 MB/s baseline, up to 16k IOPS & 1,000 MB/s, IOPS/throughput set independently
  • gp2: 3 IOPS/GB (max 16k), burst 3k IOPS, performance tied to size

Provisioned IOPS SSD (io1 / io2)

  • Workloads: critical DBs, latency-sensitive apps
  • io1: 4 GiB–16 TiB, up to 64k IOPS (Nitro), IOPS independent from size
  • io2 Block Express: 4 GiB–64 TiB, sub-ms latency, up to 256k IOPS, ratio 1,000:1, supports Multi-Attach

HDD (st1 / sc1)

  • Not bootable – for throughput/cost optimization
  • st1 (Throughput Optimized): Big Data, DW, logs, up to 500 MB/s & 500 IOPS
  • sc1 (Cold): archival, infrequent data, up to 250 MB/s & 250 IOPS
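
To make the gp3 "independent IOPS/throughput" point concrete, a minimal boto3 sketch (AZ and values are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# gp3 lets you provision IOPS and throughput independently of size
ec2.create_volume(
    AvailabilityZone="us-east-1a",   # EBS volumes are AZ-bound
    Size=100,                        # GiB
    VolumeType="gp3",
    Iops=6000,                       # above the 3,000 IOPS baseline
    Throughput=250,                  # MB/s, above the 125 MB/s baseline
)
```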

🟢 Amazon EBS – Multi-Attach (io1 / io2)

EBS Multi-Attach allows a single io1 or io2 volume to be mounted on multiple EC2 instances in the same Availability Zone. All attached instances have full read and write access to the volume.

🟢 Amazon EBS – Encryption

When an EBS volume is encrypted:
  • Data at rest is encrypted on the volume.
  • Data in transit between the instance and the volume is encrypted.
  • All snapshots created from the volume are encrypted.
  • All volumes created from encrypted snapshots are also encrypted.
Key points:
  • Encryption and decryption are transparent to the user (no code changes required).
  • Negligible performance impact.
  • Uses AWS KMS with AES-256 encryption keys.
  • You can encrypt an unencrypted snapshot when making a copy.
  • Snapshots of encrypted volumes are always encrypted.
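
A minimal sketch of the "encrypt an unencrypted snapshot when making a copy" pattern (snapshot ID and key are placeholders):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Copying an unencrypted snapshot with Encrypted=True produces an encrypted
# copy; any volume restored from it is encrypted too.
ec2.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId="snap-0123456789abcdef0",  # hypothetical unencrypted snap
    Encrypted=True,
    KmsKeyId="alias/aws/ebs",   # default EBS key; a CMK ARN also works
    Description="Encrypted copy",
)
```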

🟠 Amazon EFS

Amazon EFS (Elastic File System)
  • Fully managed storage service that can be mounted on multiple EC2 instances across multiple AZs.
  • Highly available, scalable, and more expensive (~3× gp2 cost).
  • Pay-per-use with automatic scaling, no capacity planning required.
  • POSIX-compliant, works like a standard Linux file system API.
  • Uses NFS (v4.1) protocol
Key points:
  • Accessible only from Linux-based AMIs (not Windows).
  • Encryption at rest with AWS KMS.
  • Access controlled via Security Groups.
  • Multi-AZ design makes it ideal for shared, concurrent access.

🟢 Amazon EFS – Performance & Storage Classes

Scale & Capacity
  • Supports thousands of concurrent NFS clients and petabyte-scale storage.
  • Throughput up to 10 GB/s with automatic scaling.
Performance Modes (set at creation)
  • General Purpose (default) – Low latency, best for web servers, CMS, and latency-sensitive apps.
  • Max I/O – Higher latency but optimized for maximum throughput and parallelism; suitable for big data and media processing.
Throughput Modes
  • Bursting – Baseline of 50 MiB/s per TiB of storage, with bursts up to 100 MiB/s.
  • Provisioned – Fixed throughput regardless of storage size, e.g., 1 GiB/s for a 1 TiB file system.
  • Elastic – Scales automatically with usage; up to 3 GiB/s reads and 1 GiB/s writes, ideal for unpredictable workloads.
Storage Classes (managed via lifecycle policies)
  • Standard – Multi-AZ, high availability for frequently accessed data.
  • Infrequent Access (EFS-IA) – Lower cost storage, higher retrieval cost.
  • Archive – For rarely accessed files; ~50% cheaper than EFS-IA.
  • One Zone – Stored in a single AZ, lower cost, suitable for dev/test; can be paired with IA as EFS One Zone-IA.
Cost Optimization
  • Lifecycle policies can automatically transition data between storage tiers based on last access (e.g., after 60 days).
  • Tiering can yield over 90% cost savings for infrequently accessed data.

🟢 Amazon EFS – Mount Targets

  • An EFS mount target provides an entry point into an EFS file system from a VPC subnet
  • Each mount target has a private IP and a security group
  • EC2 must reach a mount target in the same AZ
  • Best practice: create one per AZ for HA & low latency
  • Instances connect via DNS name, which resolves to the mount target IP in the same AZ
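
A minimal boto3 sketch of the per-AZ mount-target setup (file system, subnet, and SG IDs are placeholders; repeat per AZ for HA):

```python
import boto3

efs = boto3.client("efs")

# One mount target per AZ (via a subnet in that AZ) is the usual HA setup
efs.create_mount_target(
    FileSystemId="fs-0123456789abcdef0",
    SubnetId="subnet-0123456789abcdef0",      # subnet in the target AZ
    SecurityGroups=["sg-0123456789abcdef0"],  # must allow NFS (TCP 2049)
)
```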

🟠 Amazon S3

🟢 Amazon S3 – Access Control

User-based policies (IAM)
  • IAM policies: define S3 actions for users, groups, or roles
  • Attached to identities, evaluated at request time
Resource-based policies
  • Bucket policies: JSON, applied at bucket level, allow cross-account access
  • ACLs: object/bucket-level, legacy, often disabled
Evaluation logic
Access allowed if:
  • IAM policy OR resource policy permits
  • AND no explicit Deny
Bucket Policies – Structure & Use Cases
JSON with Resource, Effect, Action, Principal
  • Public access (Principal: "*") for static websites
  • Enforce encryption on upload
  • Cross-account data sharing
Access Scenarios
  • Public access → bucket policy s3:GetObject for all
  • IAM user → IAM policy with allowed actions/resources
  • EC2 instance → IAM role attached to EC2, no static creds
  • Cross-account → bucket policy with external AWS account in Principal

🟢 Amazon S3 – Versioning

  • Enabled at the bucket level
  • Each overwrite creates a new object with unique Version ID
  • Pre-versioning objects get Version ID "null"
  • Suspending stops new versions but keeps existing
Benefits:
  • Protects from accidental overwrite/delete
  • Allows rollback to previous versions
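
Enabling it is a single bucket-level call; a minimal boto3 sketch (bucket name is a placeholder):

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning on an existing bucket; use "Suspended" to pause it
# (MFA Delete additionally requires the MFA parameter and root credentials)
s3.put_bucket_versioning(
    Bucket="my-example-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)
```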

🟢 Amazon S3 – Replication

  • Versioning required on source and destination
  • Cross-Region Replication (CRR): replicate to a different region
  • Same-Region Replication (SRR): replicate within the same region
  • Works across accounts
  • Asynchronous, needs IAM permissions
Use cases:
  • CRR: compliance, disaster recovery, low-latency access in other regions
  • SRR: log aggregation, sync prod → test
Key notes:
  • Applies only to new objects (existing → use Batch Replication)
  • Can replicate delete markers (not versioned deletes)
  • Not chained → replication stops at target bucket

🟢 Amazon S3 – Storage Classes

All provide 99.999999999% durability. Objects can move between classes manually or via Lifecycle rules.
General Purpose
  • Standard – 99.99% availability, multi-AZ, low latency & high throughput. For frequently accessed data.
Infrequent Access (IA)
  • Standard-IA – 99.9% availability, multi-AZ, lower cost, retrieval fee. For infrequently accessed data.
  • One Zone-IA – 99.5% availability, single AZ, cheaper but data lost if AZ fails. For recreatable/secondary backups.
Archival
  • Glacier Instant Retrieval – ms access, 90-day min. storage.
  • Glacier Flexible Retrieval – Expedited (1–5 min), Standard (3–5 h), Bulk (5–12 h). 90-day min. storage.
  • Glacier Deep Archive – Lowest cost, Standard (12 h) or Bulk (48 h). 180-day min. storage.
Adaptive
  • Intelligent-Tiering – Auto-moves data between frequent, infrequent, archive instant, archive, and deep archive tiers. No retrieval fees, small monitoring fee.
Transition Rules
  • Standard → Standard-IA / One Zone-IA: 30 days min.
  • Any → Glacier (all types): 30 days min.
  • Glacier Flexible Retrieval: 90 days min.
  • Glacier Deep Archive: 180 days min.
  • Intelligent-Tiering: no minimum, only monitoring fee

🟢 Amazon S3 – Lifecycle Rules

Automate transition and expiration of objects based on age or tags.
Transition actions
  • Move to Standard-IA after X days
  • Archive to Glacier after longer retention
Expiration actions
  • Delete objects after set time (e.g., 365 days)
  • Remove old versions if versioning enabled
  • Delete incomplete multipart uploads
Scope
  • Rules apply to prefixes or tags
Allowed transitions
  • Standard → IA / Intelligent-Tiering / One-Zone IA / Glacier Instant / Glacier Flexible / Glacier Deep Archive
  • Standard-IA → Intelligent-Tiering / One-Zone IA / Glacier Instant / Glacier Flexible / Glacier Deep Archive
  • Intelligent-Tiering → One-Zone IA / Glacier Instant / Glacier Flexible / Glacier Deep Archive
  • One-Zone IA → Glacier Instant / Glacier Flexible / Glacier Deep Archive
  • Glacier Instant → Glacier Flexible / Glacier Deep Archive
  • Glacier Flexible → Glacier Deep Archive
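
A minimal boto3 sketch combining the transition, expiration, and multipart-cleanup actions above (bucket name, prefix, and day counts are placeholders):

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",                       # placeholder
    LifecycleConfiguration={"Rules": [{
        "ID": "archive-logs",
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},                # rule scoped to a prefix
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": 365},                  # delete after one year
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
    }]},
)
```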

🟢 Amazon S3 – Storage Class Analysis

  • Analyzes usage for Standard and Standard-IA (not One Zone-IA or Glacier)
  • Reports updated daily; first results in 24–48h
  • Output in CSV with date, storage class, object age ranges
Use case:
  • Optimize lifecycle rules for cost savings

🟢 Amazon S3 – Requester Pays

  • Default: bucket owner pays for storage, requests, and data transfer
  • Requester Pays: requester pays for requests + downloads, owner pays only for storage
Key points
  • Requester must be authenticated (no anonymous access)
  • Useful for large dataset sharing, research, public data, cross-account scenarios

🟢 Amazon S3 – Event Notifications

  • Sends notifications on object events (e.g., s3:ObjectCreated, s3:ObjectRemoved, s3:ObjectRestore, replication changes)
  • Supports prefix/suffix filtering (e.g., .jpg)
  • Multiple notifications per bucket
  • Delivery usually in seconds
  • Use cases: trigger Lambda, send to SQS/SNS
Permissions
  • Target service must allow s3.amazonaws.com via resource-based policy (aws:SourceArn = bucket)
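
A minimal boto3 sketch wiring ObjectCreated events for `.jpg` keys to an SQS queue (bucket name and queue ARN are placeholders; the queue policy must already trust s3.amazonaws.com as noted above):

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="my-example-bucket",   # placeholder
    NotificationConfiguration={"QueueConfigurations": [{
        "QueueArn": "arn:aws:sqs:us-east-1:123456789012:image-events",
        "Events": ["s3:ObjectCreated:*"],
        "Filter": {"Key": {"FilterRules": [
            {"Name": "suffix", "Value": ".jpg"},   # only .jpg objects
        ]}},
    }]},
)
```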

🟢 Amazon S3 – EventBridge Integration

  • S3 can send all events to EventBridge
  • Advanced JSON filtering (metadata, size, name, etc.)
  • Route to 18+ AWS services (Step Functions, Kinesis, Lambda, SNS, SQS, etc.)
  • Supports archiving, replay, reliable delivery

🟢 Amazon S3 – Multi-Part Upload

  • Improves performance and resilience for large uploads
  • Recommended for files > 100 MB, required for > 5 GB
  • Splits object into parts, uploads in parallel, S3 reassembles
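
With boto3, multipart behavior is driven by a transfer config rather than manual part management; a minimal sketch (file, bucket, and thresholds are placeholders):

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Files above the threshold are split into parts and uploaded in parallel
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # use multipart above 100 MB
    multipart_chunksize=16 * 1024 * 1024,   # 16 MB parts
    max_concurrency=8,                      # parallel part uploads
)
s3.upload_file("backup.tar.gz", "my-example-bucket",
               "backups/backup.tar.gz", Config=config)
```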

🟢 Amazon S3 – Transfer Acceleration

  • Speeds up long-distance uploads via AWS edge locations
  • Data routed over AWS internal network to bucket
  • Works with multi-part upload
  • Best for geographically distant clients

🟢 Amazon S3 – Object Encryption

Four encryption methods:
Server-Side Encryption (SSE)
  • SSE-S3: AES-256, keys managed by S3, default for new objects (AES256 header)
  • SSE-KMS: Keys in KMS, audit + granular perms, API calls (GenerateDataKey, Decrypt), quota impact (aws:kms header)
  • SSE-C: Customer-managed keys, sent in HTTPS headers, AWS never stores them. Losing key = losing object
Client-Side Encryption
  • Done on client before upload/after download
  • Customer manages keys, AWS stores only ciphertext
  • AWS never sees key or plaintext
Default Encryption vs Bucket Policies
  • Default encryption: auto-encrypts new objects (SSE-S3/SSE-KMS)
  • Bucket policy: can enforce encryption by denying noncompliant PUT requests
    • Example: Deny if s3:x-amz-server-side-encryption ≠ aws:kms
    • Example: Deny if SSE-C header missing
Key point:
  • Default encryption = passive (encrypts objects after the request is accepted)
  • Bucket policy = proactive (evaluated first, rejects noncompliant requests)
  • With SSE-S3, each object has a unique encryption key

🟢 Amazon S3 – Encryption in Transit (SSL/TLS)

  • Uses SSL/TLS for encryption in flight
  • Endpoints: HTTP (unencrypted, not recommended) / HTTPS (encrypted, default)
  • Required for SSE-C, recommended for all workloads
Forcing HTTPS (aws:SecureTransport)
  • Bucket policy can deny non-HTTPS requests
  • Example: Deny s3:GetObject if aws:SecureTransport = false
  • Behavior: HTTP denied, HTTPS allowed
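
A minimal boto3 sketch combining both enforcement patterns — denying unencrypted PUTs (previous section) and denying non-HTTPS requests (bucket name is a placeholder):

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"   # placeholder

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # reject PUTs that don't request SSE-KMS
            "Sid": "DenyUnencryptedPuts",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            "Condition": {"StringNotEquals": {
                "s3:x-amz-server-side-encryption": "aws:kms"
            }},
        },
        {   # reject any request not made over HTTPS
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}",
                         f"arn:aws:s3:::{bucket}/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
    ],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```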

🟢 Amazon S3 – CORS (Cross-Origin Resource Sharing)

  • Browser mechanism controlling requests between different origins (protocol + host + port)
  • Same origin: always allowed
  • Different origin: needs CORS headers (e.g., Access-Control-Allow-Origin)
Preflight requests
  • For methods like PUT/DELETE or custom headers
  • Browser sends OPTIONS, server must reply with allowed origins, methods, headers
In S3
  • Bucket must have a CORS config allowing the origin
  • Can allow a specific origin or all (*)
  • Common when one bucket hosts a site and another serves assets
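
A minimal boto3 sketch of a bucket CORS config for the "site on one origin, assets in another bucket" case (bucket and origin are placeholders):

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_cors(
    Bucket="my-assets-bucket",   # placeholder
    CORSConfiguration={"CORSRules": [{
        "AllowedOrigins": ["https://www.example.com"],  # or ["*"] for all
        "AllowedMethods": ["GET", "PUT"],
        "AllowedHeaders": ["*"],
        "MaxAgeSeconds": 3000,   # cache the preflight (OPTIONS) response
    }]},
)
```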

🟢 Amazon S3 – MFA Delete

  • Adds MFA requirement for sensitive versioning operations
MFA required for
  • Permanently deleting an object version
  • Suspending bucket versioning
MFA not required for
  • Enabling versioning
  • Listing deleted versions
Key points
  • Works only if versioning is enabled
  • Can be toggled only by root account (bucket owner)

🟢 Amazon S3 – Pre-Signed URLs

  • Provide temporary access to private objects without changing bucket policies
  • Generated via Console, CLI, or SDKs
Expiration
  • Console: 1–720 min (max 12h)
  • CLI: --expires-in (seconds; default 1h, max 7d)
Permissions
  • Inherit from IAM identity generating the URL
  • Valid for specific ops (GET, PUT, etc.)
Use cases
  • Temporary download access to content
  • Allow user uploads without long-term perms
How it works
  1. Owner generates pre-signed URL
  2. Shares with recipient
  3. Recipient accesses object until expiration, no extra auth needed
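
A minimal boto3 sketch of step 1 (bucket and key are placeholders; the URL carries the permissions of the identity that signs it):

```python
import boto3

s3 = boto3.client("s3")

# Temporary GET access; swap "get_object" for "put_object" to allow uploads
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-example-bucket", "Key": "reports/q3.pdf"},
    ExpiresIn=3600,   # seconds (1 hour)
)
print(url)  # share this link; it stops working after expiration
```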

🟢 Amazon S3 – Glacier Vault Lock

  • WORM model (Write Once, Read Many) for Glacier vaults
  • Vault Lock policy → once locked, cannot be changed or deleted
  • Applies to all future operations
Use cases
  • Regulatory compliance
  • Long-term data retention
  • Prevent deletion or modification of archives

🟢 Amazon S3 – Object Lock

  • WORM model at object version level
  • Prevents delete/overwrite for a set time
  • Requires versioning
Retention modes
  • Compliance: cannot be bypassed, even by root; retention cannot be shortened
  • Governance: blocks most users; privileged users can override
Retention period
  • Fixed duration, can be extended, never shortened
Legal Hold
  • Protects object indefinitely, regardless of retention
  • Managed via s3:PutObjectLegalHold permission
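
A minimal boto3 sketch of applying retention and a legal hold to one object version (bucket, key, and date are placeholders; the bucket must have Object Lock enabled, which requires versioning):

```python
import datetime
import boto3

s3 = boto3.client("s3")
bucket, key = "my-locked-bucket", "invoices/2024-01.pdf"  # placeholders

# Governance-mode retention: blocks delete/overwrite until the date below,
# unless a privileged user bypasses it
s3.put_object_retention(
    Bucket=bucket,
    Key=key,
    Retention={
        "Mode": "GOVERNANCE",
        "RetainUntilDate": datetime.datetime(2026, 1, 1),
    },
)

# Legal hold: indefinite protection, independent of the retention period
s3.put_object_legal_hold(
    Bucket=bucket, Key=key, LegalHold={"Status": "ON"}
)
```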

🟠 Amazon FSx

  • Fully managed service to launch/run third-party high-performance file systems on AWS
  • Integrates with AWS storage, compute, and security
  • No need for manual provisioning, patching, or backups

🟢 Amazon FSx for Windows File Server

  • Fully managed native Windows file system on AWS
  • Supports SMB protocol + NTFS
  • Integrates with Active Directory (auth + access control)
  • Features: ACLs, user quotas, DFS Namespaces
  • Can be mounted on Linux EC2 for cross-platform access
Performance & Storage
  • Scales to tens of GB/s, millions of IOPS, hundreds of PB
  • SSD → low-latency apps (DBs, media, analytics)
  • HDD → general use (home dirs, CMS, shares)
Connectivity & Availability
  • Accessible from on-prem via VPN / Direct Connect
  • Multi-AZ for HA
  • Daily automatic backups to S3 for durability & recovery

🟢 Amazon FSx for Lustre

  • Fully managed parallel distributed file system for HPC
  • Name = Linux + Cluster
  • Optimized for massive throughput + low latency
Use cases
  • ML training pipelines
  • HPC simulations
  • Video rendering/processing
  • Financial modeling, EDA
Performance & Storage
  • Scales to hundreds of GB/s, millions of IOPS, sub-ms latency
  • SSD → low-latency, random/small files
  • HDD → high-throughput, large/sequential files
S3 integration
  • Mount S3 as file system
  • Write computation results back to S3
  • Accessible from on-prem via VPN / Direct Connect
Deployment options
  • Scratch → temporary, non-replicated, high burst (200 MB/s per TiB), short-term workloads
  • Persistent → long-term, replicated in-AZ, auto recovery, sensitive workloads

🟢 Amazon FSx for NetApp ONTAP

  • Fully managed NetApp ONTAP file systems on AWS
  • Supports NFS, SMB, iSCSI
  • Seamless migration from on-prem ONTAP/NAS to AWS
Compatibility
  • Linux, Windows, macOS
  • VMware Cloud on AWS
  • Amazon Workspaces, AppStream 2.0
  • Amazon EC2, ECS, EKS
Key features
  • Elastic storage (auto grow/shrink)
  • Snapshots, replication, compression, deduplication
  • Point-in-time cloning for testing
  • Multi-protocol access (NFS + SMB + iSCSI)

🟢 Amazon FSx for OpenZFS

  • Fully managed OpenZFS file systems on AWS
  • Supports NFS v3, v4, v4.1, v4.2
  • Simplifies migration of ZFS-based workloads
Compatibility
  • Linux, Windows, macOS
  • VMware Cloud on AWS
  • Amazon Workspaces, AppStream 2.0
  • Amazon EC2, ECS, EKS
Performance & Features
  • Up to 1M IOPS with <0.5 ms latency
  • Snapshots, compression, low-cost storage
  • Instant point-in-time cloning for testing

🟠 AWS Snowball

AWS Snowball is a secure, rugged, portable appliance for moving large-scale data into/out of AWS or for edge processing when network transfer is impractical.
  • Supports petabyte-scale migration
  • Runs EC2 instances and Lambda functions locally
  • All data encrypted with KMS-managed keys
  • Supports AWS DataSync for automation
  • Ideal where internet is limited, costly, or unstable
Device types
  • Storage Optimized → 210 TB SSD, bulk transfer, storage-heavy workloads
  • Compute Optimized → 28 TB SSD, compute-heavy edge workloads
Notes:
  • Snowball cannot write directly to Glacier → Solution: Import to S3, then apply Lifecycle Policy → Glacier / Deep Archive

🟠 AWS Storage Gateway

  • Hybrid cloud storage service bridging on-prem apps with AWS
  • Provides local caching for low-latency access while storing data in AWS

🟢 AWS Storage Gateway – S3 File Gateway

  • Provides NFS/SMB access to objects in Amazon S3
  • Locally caches most recently used data for low-latency access
  • Stores data in S3 Standard, Standard-IA, One Zone-IA, Intelligent-Tiering
  • Can transition to Glacier tiers via Lifecycle Policies
  • Access managed with IAM roles per File Gateway

🟢 AWS Storage Gateway – FSx File Gateway

  • Provides native access to Amazon FSx for Windows File Server from on-prem/edge
  • Uses local cache for frequently accessed data, ensuring low-latency performance
  • Fully supports Windows-native features:
    • SMB protocol
    • NTFS file system
    • Active Directory (AD) integration
    • ACLs and file attributes
  • Ideal for group file shares, home directories, and Windows workloads needing hybrid access

🟢 AWS Storage Gateway – Volume Gateway

  • Provides block storage via iSCSI, with data in Amazon S3 and backups as EBS snapshots
  • Snapshots can restore volumes on-prem or in AWS
Volume types
  • Cached volumes → frequently used data cached locally, full dataset in S3
  • Stored volumes → full dataset on-prem, async backups to S3

🟢 AWS Storage Gateway – Tape Gateway

  • Keeps tape-based backup workflows while using AWS storage
  • Provides a Virtual Tape Library (VTL) via iSCSI, works with most backup software
  • Virtual tapes stored in S3, can be archived to Glacier for long-term retention
Architecture
  • On-prem Backup Server → Tape Gateway over iSCSI (emulates Media Changer + Tape Drive)
  • Data sent securely over HTTPS to AWS
  • Recent tapes in S3, archived tapes in Glacier for cost savings

🟠 AWS Transfer Family

  • Fully managed file transfer service for Amazon S3 and Amazon EFS
  • Supports multi-AZ, highly available, scalable
  • Pricing: per endpoint/hour + per-GB transfer
Supported protocols
  • SFTP
  • FTPS
  • FTP (VPC only)
Authentication options
  • Built-in user management
  • External IdPs: Active Directory, LDAP, Cognito, Okta, custom
Use cases
  • File sharing with partners/clients
  • CRM/ERP integration
  • Hosting/distributing datasets
Architecture flow
  1. User connects via FTP/SFTP/FTPS (optionally through Route 53)
  2. Transfer Family authenticates (internal or external IdP)
  3. User mapped to S3 bucket or EFS file system
  4. IAM roles enforce access control
Enables secure, high-performance transfers without managing infrastructure

🟠 AWS DataSync

  • Fully managed service to transfer large datasets between on-prem, AWS, or other clouds
  • On-prem/other cloud → AWS → supports NFS, SMB, HDFS, S3 API (requires DataSync Agent)
  • AWS → AWS → transfers between storage services without agent
Supported targets
  • Amazon S3 (all classes, incl. Glacier/Deep Archive)
  • Amazon EFS
  • Amazon FSx (Windows, Lustre, ONTAP, OpenZFS)
Key features
  • Scheduled transfers (hourly/daily/weekly)
  • Preserves permissions + metadata (POSIX, SMB ACLs)
  • Up to 10 Gbps per agent task (configurable)
  • TLS encryption in transit
On-Prem → AWS Architecture
  1. NFS/SMB server connects to DataSync Agent (on-prem or Snowcone)
  2. Agent → secure comms with DataSync service
  3. Data → S3, EFS, or FSx
AWS → AWS Transfers
  • S3 ↔ S3 / S3 ↔ EFS / S3 ↔ FSx / EFS ↔ FSx
  • No agent required
  • Maintains metadata + attributes

🟠 Amazon ELB

  • Elastic Load Balancer (ELB) automatically distributes incoming traffic across multiple targets (EC2, containers, IPs) in one or more AZs
  • Routes requests only to healthy instances based on load, availability, and routing rules
  • Provides high availability, fault tolerance, and scalability
High-level flow
Users → ELB DNS name → Load balancer → Targets (EC2, ECS, Lambda, IPs)

🟢 Load Balancer Security Groups

Traffic flow
  • Clients → LB (HTTP 80 / HTTPS 443) from anywhere
  • LB → EC2 (private subnets)
  • EC2 accepts traffic only from LB’s Security Group, never from internet
Security group setup
  • Load Balancer SG → allow HTTP (80) + HTTPS (443) from 0.0.0.0/0
  • Application SG (EC2) → allow HTTP (80) only from LB SG
Key points
  • LB can be internet-facing or internal
  • EC2 instances remain private, not directly exposed
  • Health checks originate from the LB and are covered by the same SG rules

🟢 Application Load Balancer (ALB)

  • Operates at Layer 7 (HTTP/HTTPS)
  • Routes traffic to target groups: multiple apps, containers, or services
  • Supports HTTP/2 and WebSockets
  • Can handle redirects (e.g., HTTP → HTTPS)
  • Ideal for microservices & ECS containers (supports dynamic port mapping)
Advanced routing
  • Path-based: /users → Users TG, /search → Search TG
  • Host-based: app.example.com vs api.example.com
  • Query string / headers: e.g., ?platform=mobile vs desktop
  • Targets: EC2, ECS tasks, Lambda, Private IPs (hybrid)
  • Health checks at target group level
  • Supports routing across AWS + on-prem
Networking & hostname
  • ALB DNS name: xxx.region.elb.amazonaws.com
  • Original client IP/port/proto via headers:
    • X-Forwarded-For → client IP
    • X-Forwarded-Port → client port
    • X-Forwarded-Proto → protocol

🟢 Network Load Balancer (NLB)

  • Operates at Layer 4 (TCP/UDP)
  • Handles millions of req/sec with ultra-low latency
  • Best for performance-critical apps or TCP/UDP-based services
Key features
  • Provides 1 static IP per AZ (Elastic IPs optional for static addressing/whitelisting)
  • Supports cross-zone load balancing (optional)
  • Integrates with EC2, ECS, PrivateLink
  • Not in Free Tier
Routing & target groups
  • Routes traffic on TCP/UDP rules
  • Targets: EC2 instances, Private IPs, ALBs (for chaining)
  • Health checks: TCP, HTTP, HTTPS
  • Note: if target = EC2 instance, NLB routes only to primary private IP of eth0
Use cases
  • Low-latency financial apps
  • Real-time multiplayer gaming
  • Hybrid connectivity (on-prem via Private IPs)
  • Clients requiring static IP

🟢 Gateway Load Balancer (GWLB)

  • Deploy & scale 3rd-party network appliances (firewalls, IDPS, DPI)
  • Operates at Layer 3, processes IP packets
  • Combines transparent gateway + load balancer
  • Uses GENEVE protocol (port 6081) for encapsulation
Architecture
  • Route table → GWLB → target group of appliances → app destination
  • Works with Transit Gateway / VPC Peering for hybrid setups
Target groups
  • EC2 instances (by ID)
  • Private IPs (AWS or on-prem)
  • Supports hybrid security appliances

🟢 Sticky Sessions (Session Affinity)

  • Routes the same client to the same backend instance
  • Supported by CLB, ALB, NLB
  • CLB/ALB use cookies (configurable expiration)
  • Use case: keep session data without re-login/state loss
  • Trade-off: may cause uneven traffic

🟢 Cross-Zone Load Balancing

When enabled
  • Each LB node distributes traffic across all targets in all AZs
  • Instance count differences don’t matter → load is even
  • Example: 10 instances (2 in AZ1, 8 in AZ2) → 100 requests → 10 each
When disabled
  • Each LB node sends traffic only to targets in its own AZ
  • Can cause imbalances if AZs differ in instance count
  • Example: 2 in AZ1, 8 in AZ2 → 100 requests → 25 each in AZ1, 6.25 each in AZ2
Service behavior
  • ALB → enabled by default, can disable per TG, no inter-AZ cost
  • NLB/GWLB → disabled by default, enabling adds inter-AZ data transfer cost
  • CLB → disabled by default, no inter-AZ cost when enabled

🟢 SSL/TLS

  • Encrypts traffic in transit between client and LB
  • SSL (legacy) vs TLS (modern standard)
  • Certificates issued by CAs (Comodo, Digicert, Let’s Encrypt, etc.)
  • Certificates expire and must be renewed
Certificates on Load Balancers
  • Use X.509 server certificates (via ACM or manual upload)
  • For HTTPS listeners:
    • Require a default cert
    • Support multiple certs (multi-domain)
    • Use SNI for hostname-based cert selection
    • Security policy defines allowed SSL/TLS versions
  • Example: Client → HTTPS to LB (encrypted) → HTTP to backend (inside VPC)
Server Name Indication (SNI)
  • Enables multiple SSL/TLS certs on one LB/listener
  • Client provides hostname in handshake → LB selects matching cert (or default)
  • Supported by ALB, NLB, CloudFront
  • Not supported by CLB
Service behavior
  • CLB → 1 cert only; multiple hostnames = multiple CLBs
  • ALB → multiple listeners + certs, SNI support
  • NLB → same as ALB, multiple listeners + certs via SNI

🟠 Auto Scaling Group (ASG)

  • Manages a fleet of EC2 instances, scaling in/out automatically based on demand
  • Maintains min, desired, max instance counts
  • Replaces unhealthy instances automatically
  • Can integrate with ELB for auto register/deregister
  • ASG itself is free (pay only for EC2 + resources)
Key capacity settings
  • Minimum → lowest number of instances always running
  • Desired → target number of instances
  • Maximum → upper limit of instances
Benefits
  • Elasticity → matches compute to demand
  • Fault tolerance → replaces failed instances
  • Cost optimization → scales in to save cost

🟢 Auto Scaling Groups – Scaling Policies

Dynamic scaling
  • Target Tracking → keep metric at target (e.g., CPU ~40%)
  • Simple / Step Scaling → triggered by CloudWatch alarms (e.g., CPU >70% → +2 instances; CPU <30% → –1 instance)
Scheduled scaling
  • Predefined actions for predictable patterns (e.g., min capacity = 10 every Friday at 5 PM)
Predictive scaling
  • Uses historical data to forecast demand
  • Scales out/in proactively before load changes
Good metrics to scale on
  • CPUUtilization → CPU-bound workloads
  • RequestCountPerTarget → stabilize req/instance (web apps with LB)
  • Avg Network In/Out → network-heavy workloads (streaming, transfers)
  • Custom metrics → app-specific (memory, queue depth, etc.)
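
A minimal boto3 sketch of a target tracking policy keeping average CPU near 40% (ASG name is a placeholder):

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",   # placeholder ASG
    PolicyName="cpu-target-40",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 40.0,   # ASG scales out/in to hold ~40% average CPU
    },
)
```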

🟢 Auto Scaling Group – Instance Termination Policies

When an ASG scales in, it chooses which instances to terminate based on policies (in order):
  1. AZ balancing → terminate in AZ with most instances
  2. On-Demand vs Spot → terminate Spot first
  3. Launch config/template → keep newest, terminate older
  4. Billing hour → choose instances closest to next billing hour (legacy model)
  5. Random selection → if multiple candidates remain

🟠 Amazon RDS

  • RDS = Relational Database Service, fully managed SQL database service
  • Supported engines: PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, Aurora
  • AWS manages provisioning, patching, backups, scaling, and high availability

🟢 Amazon RDS – Read Replicas for Read Scalability

  • Up to 15 read replicas
  • Can be same AZ, cross-AZ, or cross-region
  • Asynchronous replication → eventually consistent
  • Replicas can be promoted to standalone DBs
  • Apps must use separate connection strings for replicas
  • Read-only (SELECT), no writes
Architecture
  • Primary DB handles writes
  • Replicas handle reads
  • Replication runs asynchronously
Use cases
  • Scale read-heavy apps
  • Analytics/reporting without impacting prod
  • Geo-distributed reads
Network cost
  • Same region (incl. cross-AZ) → free
  • Cross-region → charged
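
A minimal boto3 sketch of creating a same-region, cross-AZ replica (identifiers and class are placeholders; apps must point reads at the replica's own endpoint):

```python
import boto3

rds = boto3.client("rds")

rds.create_db_instance_read_replica(
    DBInstanceIdentifier="mydb-replica-1",
    SourceDBInstanceIdentifier="mydb",      # placeholder primary instance
    DBInstanceClass="db.r5.large",
    AvailabilityZone="us-east-1b",          # cross-AZ, same region → free
)
```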

🟢 Amazon RDS – Multi-AZ

  • Synchronous replication between primary and standby in different AZs
  • Single DNS endpoint → automatic failover to standby
  • Improves availability against AZ, network, instance, storage failures
  • Failover automatic, no app changes needed
  • Not for read scaling (standby = no reads)
  • Read Replicas can also be Multi-AZ for DR + scalability
Architecture
  • App connects via one DNS
  • Primary handles reads/writes
  • Standby kept in sync
  • On failure → DNS shifts to standby
Upgrade Single-AZ → Multi-AZ
  1. AWS takes snapshot of DB
  2. Restores snapshot in another AZ as standby
  3. Sets up synchronous replication

🟢 Amazon RDS Custom

  • Managed Oracle & SQL Server with full OS + DB customization
  • AWS manages provisioning, backups, scaling
  • You get full admin access
Allows
  • OS & DB config changes
  • Install patches/drivers
  • Enable native DB features
  • Direct access via SSH/SSM
  • Must disable Automation Mode to customize (snapshot recommended)
RDS vs RDS Custom
  • RDS → fully managed, no OS/DB access
  • RDS Custom → managed + full control of OS & DB

🟢 RDS Backups

Automated backups
  • Daily full backup during backup window
  • Transaction logs every 5 min → enables point-in-time restore (PITR)
  • Retention: 1–35 days (0 = disabled)
  • Stored in S3, managed by AWS
Manual snapshots
  • User-created anytime
  • Persist indefinitely until deleted
  • Restore DB to snapshot time
  • Useful for long-term archival or pre-deployment backups
Cost considerations
  • Stopped instances still incur storage + backup costs
  • For long-term shutdown → take manual snapshot + delete instance
  • Manual snapshots billed for size + retention duration
Extra
  • RDS supports Cross-Region Automated Backups

🟢 RDS & Aurora Security

At-rest encryption
  • Uses KMS for DB + replicas
  • Must be enabled at DB creation
  • Unencrypted primary → replicas also unencrypted
  • To encrypt: take snapshot → restore as encrypted
In-flight encryption
  • TLS by default
  • Clients use AWS TLS root certs for secure connections
IAM Database Authentication
  • Connect with IAM roles/tokens, no static creds
Security Groups
  • Control inbound/outbound network access
OS-level access
  • No SSH for RDS/Aurora
  • Only RDS Custom allows OS access
Audit logging
  • Can stream logs to CloudWatch Logs for monitoring + retention

🟢 Amazon RDS Proxy

  • Fully managed DB proxy for RDS & Aurora
  • Pools/shares connections → reduces CPU/memory load, avoids timeouts
  • Serverless, auto-scaling, multi-AZ HA
  • Failover acceleration → up to 66% faster than direct DB connections
Supported engines
  • RDS: MySQL, PostgreSQL, MariaDB, SQL Server
  • Aurora: MySQL, PostgreSQL
Security
  • Enforces IAM DB authentication
  • Stores creds in AWS Secrets Manager
  • Private only → accessible within a VPC
Compatibility
  • Typically needs no app code changes to integrate

🟠 Amazon Aurora

  • AWS proprietary relational database (not open source)
  • Compatible with MySQL & PostgreSQL (same drivers/tools)
Performance
  • ~5× faster than MySQL on RDS
  • ~3× faster than PostgreSQL on RDS
Storage
  • Auto-scales in 10 GB increments, up to 128 TB
Read replicas
  • Up to 15 replicas, <10 ms lag
High availability
  • Built-in HA with instant failover
Cost
  • ~20% higher than RDS, but more resource-efficient

🟢 Amazon Aurora – High Availability, Read Scaling & Cluster Architecture

  • Cluster design
    • 1 writer (all writes) + up to 15 read replicas (reads)
    • All share the same distributed storage volume → no replication lag
  • Storage layer
    • 6 copies of data across 3 AZs (2 copies per AZ)
    • Writes need 4/6, reads need 3/6 → durable & fault-tolerant
    • Data striped across volumes for performance
    • Self-healing with peer-to-peer replication
    • Auto-expands in 10 GB increments up to 128 TB
  • Endpoints
    • Writer Endpoint → always points to primary
    • Reader Endpoint → load-balances across replicas
    • Aurora can auto-scale replicas up/down based on CPU, connections, latency
  • Failover
    • Automatic failover to a replica in <30s
    • Maintains availability without application changes
  • Global support
    • Cross-region replication for DR and low-latency access in global apps

🟢 Aurora Serverless

  • Auto-provisioning & scaling
    • Aurora automatically provisions, scales, and pauses compute capacity.
    • Ideal for infrequent, intermittent, or unpredictable workloads.
    • Removes need for manual capacity planning.
    • Billed per second of active usage → cost-efficient for variable loads.
  • How it works
    • Clients connect through an Aurora-managed Proxy Fleet.
    • Proxy routes traffic to compute nodes that start/stop on demand.
    • Compute layer scales in/out based on connections, CPU, load.
    • All nodes share the distributed storage volume, always online.
  • Benefits
    • Seamless scaling without downtime.
    • Pay-per-use model.
    • High availability: storage is persistent even if compute is paused.
    • Fully managed: Aurora handles patching, backups, and failover.

🟢 Aurora Global Database

  • Aurora Cross-Region Replicas
    • Easy setup for DR and offloading read traffic.
    • Asynchronous replication, simple to configure.
  • Aurora Global Database (Preferred)
    • 1 Primary Region → full read/write.
    • Up to 5 Secondary Regions → read-only, <1s lag.
    • Each secondary supports 16 replicas (up to 80 global replicas).
    • Disaster Recovery → promote secondary to primary, RTO <1 min.
  • How it works
    • Primary (e.g., us-east-1) handles writes and local reads.
    • Secondary (e.g., eu-west-1) gets near real-time replication.
    • Local reads in secondary = low latency for global users.
    • Replication via Aurora storage-based sync → sub-second lag, faster than binlog.
  • Benefits
    • Global performance – reduced latency worldwide.
    • Fast failover – regional promotion for outages.
    • High scalability – up to 80 total replicas.
    • Low replication lag – typically <1s.

🟢 Babelfish for Aurora PostgreSQL

Overview

Babelfish extends Aurora PostgreSQL to understand SQL Server’s T-SQL dialect and wire protocol, enabling existing SQL Server applications to run on Aurora PostgreSQL with minimal or no code changes.

How It Works

  • Accepts T-SQL queries and processes them directly.
  • Supports SQL Server drivers (ODBC, JDBC, .NET) with no client changes.
  • Translates T-SQL to PostgreSQL internally.
  • Runs both T-SQL and native PL/pgSQL in the same DB.

Migration Support

  • Works with AWS SCT for schema conversion.
  • Use AWS DMS to replicate SQL Server data.
  • Enables phased migrations (SQL Server + PostgreSQL apps on same cluster).

Benefits

  • Reduced migration effort – no major code rewrites.
  • Dual dialect support – T-SQL + PL/pgSQL side by side.
  • Lower licensing costs – migrate away from SQL Server.
  • Fast adoption – SQL Server teams can use Aurora quickly.

Example Flow

  1. SQL Server apps → send T-SQL queries.
  2. Babelfish → translates & runs inside Aurora PostgreSQL.
  3. PostgreSQL apps → connect natively with PL/pgSQL.
  4. Both app types share the same Aurora cluster.

🟢 Aurora Backups

Automated Backups

  • Always enabled with 1–35 days retention (cannot be disabled).
  • Stored in Amazon S3, fully managed by Aurora.
  • Provide continuous backup with transaction logs for point-in-time recovery (PITR) to any second within retention.
  • No performance impact — backups are taken from Aurora’s distributed storage layer.

Manual DB Snapshots

  • Created on-demand by the user.
  • Retained indefinitely until manually deleted.
  • Can be restored to a new cluster or instance anytime.
  • Ideal for long-term archival, cloning environments, or safety before deployments.

🟢 RDS & Aurora – Restore Options

  • Restoring from automated backups or manual snapshots always creates a new DB instance or cluster (never overwrites the original).

RDS MySQL – Restore from Amazon S3

  • Export a backup of your on-premises/self-managed MySQL database.
  • Upload the backup to Amazon S3.
  • Use the RDS restore from S3 feature to create a new RDS MySQL instance with that data.
  • Supports MySQL dump and other compatible backup formats.

Aurora MySQL – Restore from Amazon S3

  • Backup your on-premises MySQL database using Percona XtraBackup (required).
  • Upload the backup file to Amazon S3.
  • Use Aurora’s restore from S3 to create a new Aurora MySQL cluster populated with the imported data.

🟠 Amazon ElastiCache

  • Managed in-memory caching service supporting Redis and Memcached.
  • Delivers high performance and low latency data access.
  • Offloads read-heavy workloads from databases, improving scalability and response times.
  • Enables stateless applications by storing session data in the cache.
  • AWS manages:
    • OS maintenance and patching
    • Performance tuning
    • Setup and configuration
    • Monitoring and failure recovery
    • Backups (Redis only)
  • Typically requires application code changes to integrate caching effectively.
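
To illustrate the kind of code change involved, a minimal cache-aside sketch with the redis-py client (the endpoint, TTL, and `fetch_user_from_db` helper are hypothetical):

```python
import json
import redis  # redis-py client

# Cluster endpoint is a placeholder; TLS assumed for in-transit encryption
cache = redis.Redis(host="my-cluster.xxxxxx.use1.cache.amazonaws.com",
                    port=6379, ssl=True)

def fetch_user_from_db(user_id: str) -> dict:
    # hypothetical stand-in for a real database query
    return {"id": user_id, "name": "example"}

def get_user(user_id: str) -> dict:
    """Cache-aside: read from ElastiCache first, fall back to the DB."""
    cached = cache.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)           # cache hit
    user = fetch_user_from_db(user_id)      # cache miss → hit the DB
    cache.setex(f"user:{user_id}", 300, json.dumps(user))  # 5-min TTL
    return user
```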

🟢 ElastiCache – Redis vs Memcached

Redis

  • High availability with Multi-AZ and automatic failover
  • Read replicas for horizontal read scaling
  • Durability with AOF (Append-Only File) persistence
  • Backup & restore supported
  • Rich data structures: Strings, Hashes, Lists, Sets, Sorted Sets, etc.
  • Built-in geospatial data support with specialized commands

Memcached

  • Multi-node sharding for horizontal scaling
  • No replication and no built-in HA
  • Ephemeral data only (non-persistent, in-memory)
  • No native backup/restore
  • Multi-threaded for efficient multi-core usage

🟢 ElastiCache – Cache Security

  • IAM authentication (Redis only) – controls AWS API actions (create/modify/delete), not direct client access

Redis

  • Redis AUTH: password/token required for clients
  • Security groups: enforce network-level access
  • SSL/TLS encryption: secure in-transit connections

Memcached

  • SASL authentication: optional, for client access

🟠 Amazon Route 53

  • Highly available, scalable, fully managed, and authoritative DNS service
    • Authoritative → lets you create, update, and manage your own DNS records
  • Works as both a DNS service and a Domain Name Registrar
  • Supports health checks to monitor endpoint availability & performance
  • The only AWS service with a 100% availability SLA
  • Name comes from port 53, standard DNS port

🟢 Route 53 – Records

  • Define how to route traffic for a domain or subdomain.
Each record contains:
  • Name → Domain or subdomain (e.g., example.com)
  • Type → DNS record type (A, AAAA, etc.)
  • Value → IP address or hostname the record points to
  • Routing Policy → How Route 53 responds to queries
  • TTL → Duration the record is cached by DNS resolvers

Common DNS Record Types in Route 53

  • A → Maps hostname to IPv4 address
  • AAAA → Maps hostname to IPv6 address
  • CNAME → Maps hostname to another hostname
    • Target must resolve via A or AAAA record
    • Not allowed at Zone Apex (e.g., example.com), only for subdomains like www.example.com
  • NS → Lists the name servers for the hosted zone, controls DNS routing
Other supported types: CAA, DS, MX, NAPTR, PTR, SOA, TXT, SPF, SRV

🟢 Route 53 – CNAME vs Alias

CNAME

  • Maps a hostname to any other hostname (can be outside AWS)
  • Cannot be used at the zone apex (mydomain.com)
  • Only valid for non-root subdomains (app.mydomain.com)

Alias

  • Maps a hostname to AWS resources (ELB, CloudFront, S3 static site, API Gateway, etc.)
  • Works for both root (mydomain.com) and non-root (app.mydomain.com) domains
  • Free in Route 53 (no extra DNS query cost)
  • Supports Route 53 native health checks
Exam tip: For mapping a root domain directly to an ELB or CloudFront → use Alias, not CNAME

🟢 Route 53 – Alias Records

  • AWS-specific DNS extension that maps a hostname to an AWS resource (e.g., ALB, CloudFront, API Gateway)
  • Can be used at the Zone Apex (example.com), unlike CNAME
  • Always of type A (IPv4) or AAAA (IPv6)
  • TTL is managed by AWS, not manually configurable
  • Auto-updates if the AWS resource’s IP changes

Valid Targets

  • Elastic Load Balancers
  • CloudFront distributions
  • API Gateway
  • Elastic Beanstalk environments
  • S3 static website endpoints
  • VPC interface endpoints
  • AWS Global Accelerator
  • Another Route 53 record in the same hosted zone

Restrictions

  • Not allowed for EC2 public DNS names
Exam tip: To map a root domain to an AWS resource, always use an Alias record

🟢 Route 53 – Routing Policies

Routing policies define how Route 53 answers DNS queries.
They do not route the traffic themselves; instead, they control which IP or endpoint is returned to the client.

Simple Routing

  • Returns a single resource, or multiple IPs/values without traffic control logic
  • With multiple records of the same name, Route 53 returns them all, and the client randomly picks one
  • When using an Alias in Simple Routing, you can point to only one AWS resource
  • Health checks are not supported
  • Example:
    • foo.example.com → A 11.22.33.44
    • foo.example.com → A 11.22.33.44, A 55.66.77.88
  • When to use: Basic DNS resolution with no need for failover or traffic distribution

Weighted Routing

  • Distributes traffic between records based on assigned weights
  • Formula: % Traffic = (Record Weight) / (Sum of All Weights)
  • Weights don’t need to total 100
  • Records must have same name and type
  • Supports health checks
  • Use cases: gradual rollouts, A/B testing, balancing environments
  • Special case: Weight 0 → record excluded unless all weights are 0
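
A minimal boto3 sketch of an 80/20 weighted split, as used for a canary rollout (hosted zone ID, hostname, and IPs are placeholders):

```python
import boto3

route53 = boto3.client("route53")

def weighted_record(identifier: str, ip: str, weight: int) -> dict:
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "www.example.com",
            "Type": "A",
            "SetIdentifier": identifier,   # distinguishes same-name records
            "Weight": weight,
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="Z0123456789ABCDEFGHIJ",   # placeholder zone
    ChangeBatch={"Changes": [
        weighted_record("stable", "203.0.113.10", 80),
        weighted_record("canary", "203.0.113.20", 20),
    ]},
)
```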

Latency-based Routing

  • Routes to the resource with lowest network latency from the user’s location
  • AWS measures latency between regions and edge locations
  • May choose a non-closest region if latency is better
  • Supports health checks
  • When to use: globally distributed apps prioritizing performance

Failover Routing (Active–Passive)

  • Implements disaster recovery with a primary and secondary resource
  • Route 53 returns the primary if it’s healthy, otherwise the secondary
  • Health checks are mandatory
  • When to use: mission-critical services with DR needs

Geolocation Routing

  • Routes based on the geographic location of the user
  • Can define continent, country, or US state
  • Most specific match wins (state > country > continent)
  • Best practice: define a Default record
  • Use cases: localized content, compliance restrictions, regional load distribution
  • Supports health checks

Geoproximity Routing (requires Traffic Flow)

  • Routes based on geographic proximity between users and resources
  • Can apply a bias:
    • Positive → enlarge region, send more traffic
    • Negative → shrink region, send less traffic
  • Works with AWS and non-AWS resources
  • Use cases: migrations, balancing load dynamically

IP-based Routing

  • Routes based on client IP ranges (CIDR collections)
  • Useful for ISP-based or corporate routing, cost/latency optimization
  • Example:
    • 203.0.113.0/24 → 1.2.3.4
    • 200.5.4.0/24 → 5.6.7.8

Multi-Value Routing

  • Returns multiple healthy records for the same DNS name
  • Supports up to 8 records per query
  • Works with health checks
  • Not a replacement for load balancer → only DNS-level distribution
  • Example: www.example.com → multiple A records with health checks

🟢 Amazon Route 53 Resolver – Inbound & Outbound Endpoints

The Route 53 Resolver extends DNS resolution between AWS and on-premises networks in hybrid setups using Direct Connect or Site-to-Site VPN.

Inbound Endpoint

  • Allows on-premises systems to resolve private DNS records in an AWS VPC
  • Flow: on-prem → AWS DNS
  • Deployed inside a VPC with ENIs in subnets
  • On-prem DNS servers forward queries to the inbound endpoint
  • Example: db.internal.example.com resolved from AWS private hosted zone

Outbound Endpoint

  • Allows AWS resources to resolve on-premises DNS names
  • Flow: AWS → on-prem DNS
  • Configured in a VPC with forwarding rules (domain-based)
  • Queries matching rules are sent to on-prem DNS servers via VPN/DX
  • Example: EC2 resolves corp.local using on-prem DNS

Key Points

  • Endpoints are regional, but rules can be shared across VPCs using Resource Access Manager (RAM)
  • Support only IPv4 (no IPv6 for endpoints)
  • Secured by VPC Security Groups attached to endpoint ENIs
  • Pricing based on endpoint ENI/hour + queries

Typical Architecture

  • Hybrid DNS setup:
    • Inbound endpoint for on-prem → AWS queries
    • Outbound endpoint for AWS → on-prem queries
  • Works with private hosted zones and conditional forwarding rules
  • Common in environments with Direct Connect or VPN for secure DNS traffic

🟠 Amazon CloudFront

  • Content Delivery Network (CDN) service from AWS
  • Distributes content globally to reduce latency and improve end-user experience
  • Improves read performance by caching content at AWS edge locations closer to users
  • Provides global scale with 216+ Points of Presence (PoPs), including edge locations and regional edge caches
  • Enhances availability and resilience through distributed architecture
  • Offers built-in DDoS protection with integration to AWS Shield and AWS WAF

How It Works

  1. User request is routed to the nearest edge location based on latency
  2. Cache hit → served immediately from the edge
  3. Cache miss → CloudFront fetches from origin (S3, ALB, EC2, HTTP server), caches, then serves

Edge Network Structure

  • Edge Locations: small, globally distributed servers delivering content
  • Multiple Edge Locations: concentrated servers for high-demand regions
  • Regional Edge Caches: larger caches between edge and origin, reduce origin fetches

Key Benefits

  • Reduced Latency: nearest edge delivers content
  • Scalability: absorbs traffic spikes without stressing the origin
  • Security: integrates with Shield + WAF
  • Cost Optimization: reduces load on origin servers
  • Customizable Caching: fine-grained TTL and invalidations
  • HTTPS Everywhere: supports TLS with free ACM certificates

Advanced Features

  • Multiple Origins per Distribution: serve static (S3) + dynamic (ALB/EC2) content
  • Origin Failover: primary/secondary origins for HA; automatic failover on errors
  • Field-Level Encryption: encrypt specific sensitive fields at the edge before forwarding
  • ACM Certificates in us-east-1: required for HTTPS on custom domains
  • Origin Access Control (OAC): secure method for CloudFront to read/write to S3 (replaces OAI)

🟢 CloudFront – Origins

CloudFront needs an origin — the source location where the original content resides. Each distribution can have one or more origins, and the choice of origin type depends on the application architecture and security requirements.

S3 Bucket Origin

  • Best for static content (images, CSS, JS, files)
  • Cached at edges → lower latency and reduced S3 load
  • Supports Origin Access Control (OAC) to block direct public access
  • Can also handle uploads through CloudFront (via PUT/POST)
  • Options:
    • S3 REST API endpoint → recommended with OAC
    • S3 Website endpoint → only if static website hosting features are needed (redirects, error pages → treated as Custom Origin)

How it works

  • Users access content via public CloudFront URLs
  • Requests routed to nearest edge location (e.g., Los Angeles, São Paulo, Mumbai, Melbourne)
  • Edge → Origin communication travels privately over AWS backbone, not public internet

Security

  • Direct S3 access blocked
  • Access allowed only through CloudFront using:
    • Origin Access Control (OAC) – preferred modern method
    • Bucket policies restricting access to CloudFront identity
  • Guarantees content is only retrievable via CloudFront securely

VPC Origin

  • For apps in a VPC, often private subnets
  • Origin types:
    • ALB
    • NLB
    • EC2 instances (via public/Elastic IP)
  • Usually behind a load balancer for HA + scaling
  • Often combined with:
    • AWS Global Accelerator (fixed IP, low-latency routing)
    • PrivateLink/secure networking for restricted access

Custom Origin (HTTP)

  • Any public HTTP/HTTPS backend
  • Examples:
    • S3 Website endpoint (requires static hosting enabled)
    • On-premises servers exposed publicly
    • Third-party hosting services
  • Requires:
    • Origin Protocol Policy (HTTP only, HTTPS only, match viewer)
    • Custom headers if authentication is needed

🟢 CloudFront – Geo Restriction

Restricts access to a CloudFront distribution based on the user’s country.

Modes

  • Allowlist – Only specified countries can access the content
  • Blocklist – Specified countries are denied access

How It Works

  • Country determined using a third-party Geo-IP database
  • Requests from restricted countries return HTTP 403 (Forbidden)

Common Use Cases

  • Enforcing copyright/licensing restrictions
  • Meeting regulatory compliance for content delivery

🟠 AWS Global Accelerator

Key Features

  • Routes traffic through the AWS global network instead of the public internet, reducing latency and improving availability
  • Automatically assigns two Anycast IP addresses for your application
  • Provides consistent performance regardless of client location

How It Works

  1. Clients worldwide (e.g., in America, Europe, Australia) connect to the Anycast IPs
  2. Traffic is routed to the nearest AWS Edge Location
  3. From the edge location, traffic travels over the AWS private backbone directly to your application (e.g., a Public ALB in India)

Benefits

  • Anycast routing ensures optimal path selection
  • Reduces hops and latency by avoiding the public internet
  • Delivers fast, reliable global access through AWS infrastructure
  • Works with both public and private application endpoints

Supported Resources

  • Elastic IP addresses
  • EC2 instances
  • Application Load Balancers (ALB)
  • Network Load Balancers (NLB)

Consistent Performance

  • Routes traffic intelligently to the lowest latency endpoint
  • Supports fast regional failover in under one minute
  • Avoids caching issues since the IP addresses remain static

Health Checks

  • Actively monitors the health of registered endpoints
  • Automatically fails over traffic if a target becomes unhealthy
  • Enables quick disaster recovery scenarios

Security

  • Only two external IPs need to be whitelisted for client access
  • Built-in DDoS protection through AWS Shield

Unicast IP

  • Each server has its own unique IP address
  • Clients must connect to the specific IP for the desired server
  • Example:
    • Server A → 12.34.56.78
    • Server B → 98.76.54.32

Anycast IP

  • Multiple servers share the same IP address
  • Clients are automatically routed to the closest or best-performing server based on network topology
  • Example:
    • Servers A, B, and C all respond to 12.34.56.78
    • A client is routed to the nearest server, minimizing latency

Summary

  • Unicast → One IP per server
  • Anycast → One IP shared across multiple servers, with routing based on proximity or performance

🟠 Amazon SQS

Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables decoupling between application components.
Producers send messages to a queue, and consumers retrieve and process them asynchronously. This architecture allows each component to scale independently and improves overall fault tolerance.

🟢 Amazon SQS – Queue Types

Standard Queue

  • Oldest and most widely used SQS type
  • Unlimited throughput and number of messages
  • Low latency: typically < 10 ms for publishing and receiving
  • Message size limit: up to 256 KB
  • Retention: 4 days by default, configurable up to 14 days
  • Delivery semantics:
    • At-least-once delivery → duplicates possible
    • Best-effort ordering → messages may arrive out of order

FIFO Queue

  • Must have a .fifo suffix in the queue name
  • Ensures First-In-First-Out delivery order
  • Exactly-once processing (no duplicates)
  • Ordering is maintained using a Message Group ID (required for each message)
  • Deduplication ID prevents the same message from being enqueued more than once within the deduplication interval
  • Throughput:
    • Up to 300 messages/sec without batching
    • Up to 3,000 messages/sec with batching
  • Ideal for order-sensitive workloads such as financial transactions or sequential workflows
You cannot convert an existing Standard queue into a FIFO queue. You must create a new FIFO queue and migrate producers/consumers to it.
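
A minimal boto3 sketch of the FIFO mechanics above — MessageGroupId for ordering, MessageDeduplicationId for exactly-once enqueueing (queue name and payloads are placeholders):

```python
import boto3

sqs = boto3.client("sqs")

# FIFO queue names must end in .fifo
queue_url = sqs.create_queue(
    QueueName="orders.fifo",
    Attributes={"FifoQueue": "true", "ContentBasedDeduplication": "false"},
)["QueueUrl"]

# Producer: same MessageGroupId → strict ordering within that group
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='{"order_id": 42, "status": "created"}',
    MessageGroupId="customer-1001",
    MessageDeduplicationId="order-42-created",  # blocks duplicates in window
)

# Consumer: long-poll, process, then delete
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                           WaitTimeSeconds=10)
for msg in resp.get("Messages", []):
    print(msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```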