Extend autoscaling

Last updated on November 7, 2023

Hosted Extend service autoscaling strategy

Hosted Extend services run in standardized Pods. Each Pod represents a hosted application with:

  • 1 CPU core
  • 350 MB memory

Hosted Extend services use horizontal scaling to match demand: the number of running Pods is adjusted up or down. The CPU utilization percentage of a Pod serves as the proxy for demand, where 100% Pod CPU utilization means that one full CPU core is in use. The Extend controller keeps the average CPU utilization across all Pods of a hosted Extend service close to 80% by using the following algorithm:

desiredPods = ceil[currentPods * ( currentAvgCPUUtilizationPct / 80% )]
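The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Extend controller's actual code: the names `desired_pods` and `TARGET_CPU_PCT` are hypothetical, and the sketch applies no minimum or maximum Pod count because this page does not describe any.

```python
import math

# Target average CPU utilization (percent), as stated in the text above.
TARGET_CPU_PCT = 80


def desired_pods(current_pods: int, current_avg_cpu_pct: float) -> int:
    """Apply desiredPods = ceil[currentPods * (currentAvgCPUUtilizationPct / 80%)].

    Hypothetical helper for illustration only. The multiplication is done
    before the division to keep floating-point error small.
    """
    return math.ceil(current_pods * current_avg_cpu_pct / TARGET_CPU_PCT)
```

Calling `desired_pods(2, 100)` gives 3 and `desired_pods(4, 20)` gives 1, which matches the two scenarios worked through on this page.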

For example, in a scale-out scenario, given that:

  • The current average CPU utilization of the Pods is 100% (the Pods are fully utilized).
  • There are 2 Pods currently running.

desiredPods = ceil[2 * ( 100% / 80% )]
desiredPods = 3

The desired number of Pods will be 3.

As another example, in a scale-in scenario, given that:

  • The current average CPU utilization of the Pods is 20% (the Pods are underutilized).
  • There are 4 Pods currently running.

desiredPods = ceil[4 * ( 20% / 80% )]
desiredPods = 1

The desired number of Pods will be 1.
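To see why the formula keeps the average near 80%, the sketch below projects the average utilization after scaling for both scenarios. It rests on an assumption that this page does not state: total CPU demand stays constant and spreads evenly across the new Pods. The function names are hypothetical and not part of Extend.

```python
import math

TARGET_CPU_PCT = 80  # target average utilization stated on this page


def desired_pods(current_pods, current_avg_cpu_pct):
    # Same formula as in the text: ceil[currentPods * (avgCPU / 80%)].
    return math.ceil(current_pods * current_avg_cpu_pct / TARGET_CPU_PCT)


def projected_avg_cpu_pct(current_pods, current_avg_cpu_pct):
    """Average CPU % after scaling.

    Assumption (not from the source): total demand is unchanged and is
    shared evenly by the new number of Pods.
    """
    new_pods = desired_pods(current_pods, current_avg_cpu_pct)
    if new_pods == 0:
        return 0.0  # no demand at all, so nothing to distribute
    total_demand = current_pods * current_avg_cpu_pct
    return total_demand / new_pods


# The scale-out and scale-in scenarios from this page.
for pods, cpu in [(2, 100), (4, 20)]:
    print(pods, cpu, desired_pods(pods, cpu),
          round(projected_avg_cpu_pct(pods, cpu), 1))
# prints:
# 2 100 3 66.7
# 4 20 1 80.0
```

Because the result is rounded up, the projected average under this assumption lands at or below the 80% target rather than above it.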