Azure – Scaling Applications – Part 1

Santosh Gaikwad

>> Related Articles      >> Azure – Scaling Applications – Part 2

One of the major benefits of using the cloud is scalability. With Azure autoscaling, you can scale up and out in ways you couldn't with your own hardware, limited only by what you are willing to pay. Just as importantly, you can scale down and in when you no longer need the resources, thereby saving money. This would not be possible if you bought on-premises servers sized to accommodate your peak load.

There are two main ways to scale resources:

  • Vertical: Scaling up and down
  • Horizontal: Scaling out and in

Scaling an application


Scaling up refers to increasing the compute power of the hosting nodes, i.e. increasing the capacity of the servers by adding memory, processing power, or drive space. Scaling down is the opposite: decreasing the capacity of a server. Scaling up has inherent constraints, because a physical machine can only support so much memory and disk.

Scaling out is a horizontal approach. Instead of trying to increase the compute power of existing nodes, scaling out brings in more hosting nodes to share the workload. There is no theoretical limit to how far you can scale out; you can add as many nodes as needed. This makes it possible to scale an application to a very high capacity that is often hard to achieve by scaling up. Scaling in is the opposite: decreasing the number of instances the application runs on.

Scaling out is the preferred scaling method for cloud applications.

An application can be scaled manually or automatically. Autoscaling is a way to automatically adjust the number of compute resources allocated to an application (up/down or out/in) based on its needs at any given time.
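For example, with a VM scale set you can scale manually with a single CLI command, or attach autoscale rules so the platform adjusts capacity for you. A sketch assuming the Azure CLI (az) is installed and logged in; the resource names (myGroup, myScaleSet, myAutoscale) are placeholders:

```shell
# Manual scaling: set a VM scale set to an explicit instance count.
az vmss scale \
  --resource-group myGroup \
  --name myScaleSet \
  --new-capacity 5

# Automatic scaling: create an autoscale setting with min/max bounds.
az monitor autoscale create \
  --resource-group myGroup \
  --resource myScaleSet \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name myAutoscale \
  --min-count 2 --max-count 10 --count 2

# Scale out by 2 instances when average CPU exceeds 70% over 5 minutes.
az monitor autoscale rule create \
  --resource-group myGroup \
  --autoscale-name myAutoscale \
  --condition "Percentage CPU > 70 avg 5m" \
  --scale out 2

# Scale back in by 2 instances when average CPU drops below 30%.
az monitor autoscale rule create \
  --resource-group myGroup \
  --autoscale-name myAutoscale \
  --condition "Percentage CPU < 30 avg 5m" \
  --scale in 2
```

Defining both an out rule and an in rule, as above, is what lets the application shed capacity (and cost) when the load subsides.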

Scale Up vs Scale Out

Scale Out                                            | Scale Up
---------------------------------------------------- | ----------------------------------------------------------------
Add more of the same servers                         | Add more resources to an existing server, e.g. cores, RAM, disk space
More difficult to scale an existing application      | Easier to scale an existing application
More cost effective for large-scale applications     | Limited by cost and physics
Likely to need infrastructure and application change |

Why auto scale applications?

  • Better fault tolerance. Auto Scaling can detect when an instance is unhealthy, terminate it, and launch an instance to replace it.
  • Better availability. Auto Scaling can help you ensure that your application always has the right amount of capacity to handle the current traffic demands.
  • Better cost management. Auto Scaling can dynamically increase and decrease capacity as needed. Because you pay for the instances you use, you save money by launching instances when they are actually needed and terminating them when they aren’t needed.

Key areas to consider for scaling applications

Scaling an application is a complex problem that does not have a "one size fits all" solution. Simply adding resources to a system or running more instances of a process doesn't guarantee that the performance of the system will improve. To scale your application correctly, there are a few key areas that will contribute to its success:

1. Understanding the application architecture and its weaknesses.
Is the application stateful or stateless?
What are all the components of the application?
Where are the bottlenecks in the application?
When load is applied, what will break first?

2.  Understanding the expected load and performance requirements.
Does the application need to serve one thousand users? Or one million?
Will traffic come from a single geographic location or globally?
Are there seasonal variations? Traffic peaks?
How fast should the app respond? 1 second? 1 millisecond?

3. Understanding and correctly leveraging the hosting platform.
What platform features should be leveraged to achieve the scale goals?

4. Consider Pipes and Filters Pattern
If the solution implements a long-running task, design this task to support both scaling out and scaling in.

5. Consider throttling the services
Autoscaling takes some time to provision hardware; during a sudden burst of workload, services might break before the new capacity comes online. See the Throttling Pattern.


Auto scale Azure solutions

Azure provides built-in autoscaling for the following compute options.

  • Virtual Machines support auto scaling through the use of VM Scale Sets, which are a way to manage a set of Azure virtual machines as a group.
  • Service Fabric supports auto-scaling through VM Scale Sets. Every node type in a Service Fabric cluster is set up as a separate VM scale set.
  • Azure App Service has built-in auto scaling. Auto scale settings apply to all of the apps within an App Service.
  • Azure Cloud Services has built-in auto scaling at the role level.
  • Azure Functions automatically allocates compute power when code is running, scaling out as necessary to handle load.
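As a sketch of the first option, a VM scale set can be created with a couple of CLI commands. All names, the region, and the image alias below are placeholders/assumptions (the `Ubuntu2204` alias requires a recent Azure CLI):

```shell
# Create a resource group, then a scale set of 3 identical Ubuntu VMs.
az group create --name myGroup --location westeurope

az vmss create \
  --resource-group myGroup \
  --name myScaleSet \
  --image Ubuntu2204 \
  --instance-count 3 \
  --vm-sku Standard_B2s \
  --admin-username azureuser \
  --generate-ssh-keys
```

Because the scale set manages the VMs as a group, autoscale rules can later be attached to it to grow or shrink the instance count automatically.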

Workload Distribution

When an application is scaled out, the workload needs to be distributed among the participating instances. In Azure, this is done with Azure Load Balancer, Traffic Manager, and Application Gateway.

Load Balancer
Applications are generally designed with a multi-tier architecture: a public-facing presentation tier, plus middle tiers and database tiers that are not directly accessible from the Internet. The presentation tier's workload is distributed among its instances by the Azure public-facing load balancer. For the internal tiers, Azure provides the Internal Load Balancer (ILB), which load-balances traffic among VMs residing in a cloud service or a regional virtual network.

End users access the presentation layer. The requests are distributed to the presentation layer VMs by Azure Load Balancer. Then, the presentation layer accesses the database servers through an internal load balancer.

Load Balancer
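The two-tier setup above can be sketched with the Azure CLI. The names are placeholders; note that giving the load balancer a VNet/subnet frontend (instead of a public IP) is what makes it internal:

```shell
# Public load balancer for the presentation tier.
az network lb create \
  --resource-group myGroup \
  --name myPublicLB \
  --sku Standard \
  --public-ip-address myPublicIP

# Internal load balancer for the data tier: the frontend gets a
# private address inside the given subnet, unreachable from the Internet.
az network lb create \
  --resource-group myGroup \
  --name myInternalLB \
  --sku Standard \
  --vnet-name myVnet \
  --subnet myBackendSubnet
```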


Azure Traffic Manager
The job of Azure Traffic Manager is to route traffic globally based on flexible policies, enabling an excellent user experience that aligns with how you’ve structured your application across the world. Traffic Manager works at the DNS level. It uses DNS responses to direct end-user traffic to globally distributed endpoints. Clients then connect to those endpoints directly.

Traffic Manager has several different policies:

  • Performance routing to send the requestor to the closest endpoint in terms of latency.
  • Priority routing to direct all traffic to an endpoint, with other endpoints as backup.
  • Weighted round-robin routing, which distributes traffic based on the weighting that is assigned to each endpoint.
Traffic Manager
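The policies above correspond to the profile's routing method. A sketch with the Azure CLI, using performance routing across two placeholder regional endpoints:

```shell
# Create a Traffic Manager profile with performance (lowest-latency) routing.
# Other routing methods include Priority and Weighted.
az network traffic-manager profile create \
  --resource-group myGroup \
  --name myProfile \
  --routing-method Performance \
  --unique-dns-name myapp-contoso

# Register two regional endpoints; DNS responses will direct each client
# to whichever endpoint is closest in terms of latency.
az network traffic-manager endpoint create \
  --resource-group myGroup \
  --profile-name myProfile \
  --name westEurope \
  --type externalEndpoints \
  --target myapp-we.azurewebsites.net \
  --endpoint-location "West Europe"

az network traffic-manager endpoint create \
  --resource-group myGroup \
  --profile-name myProfile \
  --name eastUS \
  --type externalEndpoints \
  --target myapp-eus.azurewebsites.net \
  --endpoint-location "East US"
```

Because Traffic Manager works at the DNS level, clients resolve myapp-contoso.trafficmanager.net and then connect to the chosen endpoint directly.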


Application Gateway
Microsoft Azure Application Gateway offers various layer 7 load-balancing capabilities for applications. It allows customers to optimize web-farm productivity by offloading CPU-intensive SSL termination to the gateway. It also provides other layer 7 routing capabilities, including round-robin distribution of incoming traffic, cookie-based session affinity, URL path-based routing, and the ability to host multiple websites behind a single Application Gateway. A web application firewall (WAF) is also provided as part of Application Gateway.

Application Gateway
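As a sketch, an Application Gateway with the WAF SKU and cookie-based session affinity might be provisioned like this. All names and backend addresses are placeholders, and only a minimal subset of the real parameters is shown:

```shell
# Create an Application Gateway (WAF_v2 SKU) in an existing VNet/subnet,
# fronting two backend servers. Recent CLI versions require a rule priority.
az network application-gateway create \
  --resource-group myGroup \
  --name myAppGateway \
  --sku WAF_v2 \
  --capacity 2 \
  --vnet-name myVnet \
  --subnet myGatewaySubnet \
  --public-ip-address myGatewayIP \
  --servers 10.0.1.4 10.0.1.5 \
  --priority 100

# Enable cookie-based session affinity on the default backend HTTP settings.
az network application-gateway http-settings update \
  --resource-group myGroup \
  --gateway-name myAppGateway \
  --name appGatewayBackendHttpSettings \
  --cookie-based-affinity Enabled
```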


