High Availability & Scalability in AWS

Scalability vs High Availability

Scalability: the ability of a system to handle increased load by adding or resizing resources.

Vertical scalability: increase the size of a single instance (scale up/down).
Horizontal scalability: increase the number of instances (scale out/in).

High Availability (HA): the ability of a system to remain operational despite failures (e.g., AZ outage), usually by running resources in multiple Availability Zones.

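Scalability and HA are related but not the same: a system can be scalable without being highly available, and vice versa. For example, a single instance that can be resized is scalable but remains a single point of failure, while two fixed-size instances in different AZs are highly available but cannot absorb extra load.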

Vertical Scalability

Increase the capacity of a single instance (more CPU, RAM, network throughput).

Common for non-distributed systems like databases (RDS, ElastiCache).

Limited by hardware constraints: you cannot scale beyond the largest instance size available.

Example: t2.micro → t2.large.
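
The resize above can be sketched with boto3. This is a minimal sketch, not a definitive procedure: it assumes an EBS-backed instance and an EC2 client created elsewhere (for example with boto3.client("ec2")); the function name resize_instance and the instance ID in the usage comment are hypothetical. The instance has to be stopped before its type can be changed, so vertical scaling of a single instance implies a short outage.

```python
def resize_instance(ec2, instance_id, new_type):
    """Vertically scale one EBS-backed instance by changing its type.

    `ec2` is assumed to be a boto3 EC2 client passed in by the caller.
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Switch to the new size, e.g. "t2.large".
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Bring the instance back up on the new hardware profile.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Usage (placeholder instance ID, requires AWS credentials):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "t2.large")
```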

Horizontal Scalability

Add more instances to handle load.

Common for distributed systems such as web applications on Amazon EC2.

Usually paired with a load balancer that spreads traffic across the instances.

Easier to implement in AWS thanks to Auto Scaling Groups (ASG).
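
To make the scale out/in decision concrete, here is a simplified proportional model of how a group size could be chosen from an average metric. It is an illustration only and not the algorithm AWS uses internally; the function name desired_capacity and all numbers are assumptions.

```python
import math


def desired_capacity(current, metric, target, min_size, max_size):
    """Estimate how many instances keep the average metric near the target.

    current: number of running instances
    metric:  observed average per instance (e.g. CPU %)
    target:  value we want each instance to sit at
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    # Total load is roughly current * metric; spread it so that each
    # instance carries about `target`.
    wanted = math.ceil(current * metric / target)
    # Stay inside the configured minimum and maximum group size.
    return max(min_size, min(max_size, wanted))


print(desired_capacity(4, 80, 50, 2, 10))   # 7  (scale out)
print(desired_capacity(4, 20, 50, 2, 10))   # 2  (scale in)
print(desired_capacity(4, 200, 50, 2, 10))  # 10 (capped at max_size)
```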

High Availability

Goal: survive the failure of a data center (an AZ) with little or no downtime.

Requires running resources in at least 2 AZs.

Passive HA: standby resources that take over during failure (e.g., RDS Multi-AZ).

Active HA: multiple active resources handling traffic simultaneously (e.g., multi-AZ EC2 + Load Balancer).
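
The "at least 2 AZs" rule can be checked with a small capacity calculation. The sketch below is hypothetical (the helper survives_az_loss and the AZ names are illustrative) and assumes every instance is interchangeable: it asks whether enough instances remain after the worst single-AZ outage.

```python
def survives_az_loss(instances_by_az, required):
    """Return True if losing any single AZ still leaves `required` instances."""
    if len(instances_by_az) < 2:
        # Everything in one AZ: that AZ failing takes the whole system down.
        return False
    total = sum(instances_by_az.values())
    # Worst case is losing the AZ that hosts the most instances.
    worst_loss = max(instances_by_az.values())
    return total - worst_loss >= required


# Scaled out, but not highly available: all instances share one AZ.
print(survives_az_loss({"us-east-1a": 4}, 2))                   # False
# Evenly spread across two AZs.
print(survives_az_loss({"us-east-1a": 2, "us-east-1b": 2}, 2))  # True
# Two AZs, but unbalanced: losing us-east-1a leaves only 1 instance.
print(survives_az_loss({"us-east-1a": 3, "us-east-1b": 1}, 2))  # False
```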

High Availability & Scalability with EC2

Vertical Scaling: change the instance type to a larger or smaller size (the instance must be stopped to do this).

Horizontal Scaling:

Use Auto Scaling Groups to add/remove EC2 instances based on metrics (CPU, requests, etc.).
Combine with an Elastic Load Balancer (ELB) to distribute traffic across the instances.

Multi-AZ HA: deploy instances across AZs within an ASG, fronted by a multi-AZ ELB.
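
A multi-AZ ASG behind a load balancer might be described as follows. This is a sketch under stated assumptions: the helper names, group name, launch template name, subnet IDs, and target group ARN are all placeholders, and each subnet is assumed to live in a different AZ. The helpers only build the request dictionaries; the commented lines show where the boto3 calls create_auto_scaling_group and put_scaling_policy would consume them.

```python
def build_asg_request(name, launch_template, subnet_ids, target_group_arn,
                      min_size=2, max_size=6):
    """Build keyword arguments for the Auto Scaling create_auto_scaling_group call."""
    if len(subnet_ids) < 2:
        raise ValueError("use subnets in at least 2 AZs for high availability")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": launch_template,
                           "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": min_size,
        # Assumption: one subnet per AZ, so instances spread across AZs.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        # Register instances with the load balancer's target group.
        "TargetGroupARNs": [target_group_arn],
        # Replace instances that fail the load balancer's health checks.
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 300,
    }


def build_cpu_policy(asg_name, target_cpu=50.0):
    """Build keyword arguments for a target tracking policy on average CPU."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,
        },
    }


asg = build_asg_request(
    "web-asg", "web-template",
    ["subnet-aaa111", "subnet-bbb222"],           # placeholder subnet IDs
    "arn:aws:elasticloadbalancing:placeholder",   # placeholder ARN
)
policy = build_cpu_policy("web-asg")

# With AWS credentials configured, the requests could be sent like this:
#   import boto3
#   client = boto3.client("autoscaling")
#   client.create_auto_scaling_group(**asg)
#   client.put_scaling_policy(**policy)
```

Keeping the minimum size at 2 or more means the group can still serve traffic from one AZ while replacement instances launch after an AZ failure.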