- Dynamically adjusts the desired number of running ECS tasks based on load or metrics.
- Powered by AWS Application Auto Scaling.
Key Metrics for Scaling
- ECS Service Average CPU Utilization
- ECS Service Average Memory Utilization
- ALB Request Count per Target (from Application Load Balancer)
Scaling Methods
- Target Tracking Scaling: Keeps a CloudWatch metric at a specific target value.
- Step Scaling: Responds to CloudWatch alarms with scaling actions based on thresholds.
- Scheduled Scaling: Runs scaling actions at fixed times/dates (good for predictable traffic patterns).
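As a rough illustration of the target tracking method, the sketch below builds the keyword arguments that the Application Auto Scaling `put_scaling_policy` call expects for an ECS service. It makes no AWS call. The cluster name, service name, policy naming scheme, and cooldown values are hypothetical placeholders, not recommendations.

```python
from typing import Any, Dict

# Predefined metric types that Application Auto Scaling offers for ECS services.
PREDEFINED_METRICS = {
    "cpu": "ECSServiceAverageCPUUtilization",
    "memory": "ECSServiceAverageMemoryUtilization",
    # Note: the ALB metric also needs a ResourceLabel identifying the load
    # balancer and target group; that field is omitted in this sketch.
    "alb": "ALBRequestCountPerTarget",
}


def target_tracking_policy(cluster: str, service: str, metric: str,
                           target_value: float) -> Dict[str, Any]:
    """Build keyword arguments for put_scaling_policy (no AWS call is made)."""
    if metric not in PREDEFINED_METRICS:
        raise ValueError(f"unknown metric: {metric}")
    return {
        "PolicyName": f"{service}-{metric}-target-tracking",  # hypothetical naming scheme
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_value,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": PREDEFINED_METRICS[metric],
            },
            "ScaleOutCooldown": 60,   # seconds; illustrative value
            "ScaleInCooldown": 120,   # seconds; illustrative value
        },
    }


# Hypothetical cluster and service names, targeting 70% average CPU.
policy = target_tracking_policy("demo-cluster", "web", "cpu", 70.0)

# With AWS credentials configured, this could be applied roughly as:
#   import boto3
#   boto3.client("application-autoscaling").put_scaling_policy(**policy)
```

The service would first need to be registered as a scalable target (with minimum and maximum task counts) before a policy like this can be attached.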
Scope of Scaling
- ECS Service Auto Scaling: Adjusts task count.
- EC2 Auto Scaling: Adjusts container instance count (EC2 launch type).
- Fargate Auto Scaling: Only adjusts task count (no instance scaling needed).
EC2 Launch Type: Auto Scaling EC2 Instances
When running ECS with the EC2 launch type, scaling ECS tasks may require scaling the underlying EC2 instances.
Auto Scaling Group (ASG)
- Manages the number of EC2 instances in the cluster.
- Scales based on CloudWatch metrics such as CPU utilization.
- Ensures additional EC2 instances are launched when needed to run more tasks.
ECS Cluster Capacity Provider
- Integrates ECS with an ASG for automated infrastructure scaling.
- Monitors ECS task capacity needs (CPU, RAM).
- Automatically provisions EC2 instances when there's insufficient cluster capacity.
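To make the idea of "insufficient cluster capacity" concrete, here is a simplified estimate of how many instances a set of identical tasks would need, based on their CPU and memory reservations. This is only a back-of-the-envelope model, not the capacity provider's actual algorithm: it ignores placement strategies, port conflicts, and agent overhead, and the instance sizes used are hypothetical.

```python
import math


def instances_needed(task_count: int, task_cpu: int, task_mem: int,
                     instance_cpu: int, instance_mem: int) -> int:
    """Estimate the instances required to place identical tasks.

    CPU is in ECS CPU units (1024 = 1 vCPU); memory is in MiB.
    """
    # A task must fit on both dimensions, so the tighter one sets the limit.
    per_instance = min(instance_cpu // task_cpu, instance_mem // task_mem)
    if per_instance == 0:
        raise ValueError("a single task does not fit on one instance")
    return math.ceil(task_count / per_instance)


def capacity_shortfall(task_count: int, task_cpu: int, task_mem: int,
                       instance_cpu: int, instance_mem: int,
                       running_instances: int) -> int:
    """Return how many extra instances would have to be launched (0 if none)."""
    needed = instances_needed(task_count, task_cpu, task_mem,
                              instance_cpu, instance_mem)
    return max(0, needed - running_instances)


# Hypothetical numbers: 10 tasks of 512 CPU units / 1024 MiB each, on
# instances offering 2048 CPU units / 3800 MiB, with 2 instances running.
needed = instances_needed(10, 512, 1024, 2048, 3800)
extra = capacity_shortfall(10, 512, 1024, 2048, 3800, running_instances=2)
```

In this example memory is the limiting resource (3 tasks per instance versus 4 by CPU), so 4 instances are needed and 2 more would have to be launched.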
ECS Scaling: CPU Usage Example
Scenario: ECS Service Auto Scaling based on CPU utilization.
- CloudWatch Metric
  - ECS service reports Average CPU Utilization to CloudWatch.
- CloudWatch Alarm
  - Alarm triggers when CPU exceeds a set threshold (e.g., 70%).
- Service Auto Scaling Action
  - Policy adds new ECS tasks (e.g., from 2 to 3 tasks) to handle increased demand.
- Infrastructure Scaling (Optional)
  - If the EC2 launch type is used and the cluster lacks resources, the Capacity Provider and ASG launch more EC2 instances.
This combined scaling approach ensures both the application layer (tasks) and the infrastructure layer (instances) can respond to traffic spikes.
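The alarm-then-action flow in this scenario can be sketched as a small decision function. It is a local simulation of step-style scaling logic, not AWS code: the 70% scale-out threshold comes from the example above, while the 30% scale-in threshold, the step size, and the task limits are assumptions added for illustration.

```python
def next_desired_count(current: int, avg_cpu: float, *,
                       high: float = 70.0, low: float = 30.0,
                       step: int = 1, min_tasks: int = 1,
                       max_tasks: int = 10) -> int:
    """Return the desired task count after one alarm evaluation."""
    if avg_cpu > high:
        desired = current + step      # high-CPU alarm: scale out
    elif avg_cpu < low:
        desired = current - step      # low-CPU alarm (assumed): scale in
    else:
        desired = current             # within bounds: no change
    # Clamp to the service's configured minimum and maximum task counts.
    return max(min_tasks, min(max_tasks, desired))


# Two tasks running at 85% average CPU: the service scales out to three.
scaled = next_desired_count(2, 85.0)
```

The clamp at the end mirrors the minimum and maximum capacity set on the scalable target, so repeated alarms cannot push the service past its limits.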