Client Side Load Balancing for Web 2.0 Applications

By Lei Zhu

Published on October 1, 2007

A web server handles HTTP (Hypertext Transfer Protocol) requests sent to it by web browsers. When you type in a URL, your computer first looks up the server that answers for that domain and then sends its request there. A busy site needs a cluster of web servers to handle requests and send responses back quickly. The technique for determining how to route requests to the cluster of web servers efficiently is called load balancing.

Load Balancing Web Applications

Load balancing increases the reliability of a web site by routing requests to other servers in the cluster when one of the servers is too busy or fails. There are many techniques for achieving load balancing, but generally they should meet the following requirements:

  1. Distribute loads among a cluster of application servers.
  2. Handle failover of an application server gracefully.
  3. Ensure the cluster of servers appears as a single server to the end user.

A popular yet simple approach to balancing web requests is called round-robin DNS. This approach involves creating multiple DNS entries in the DNS record for the domain. For example, let’s say we want to balance the load on a single domain, and we have two web servers, each with its own IP address. To create a round-robin DNS to distribute requests, we simply create one DNS entry for each address, all under the same domain name.

As the end user’s web browser sends a request for the domain’s DNS record, the entry that is returned first will alternate. Since your browser uses the first entry returned, the requests are distributed evenly among our two servers. Unfortunately, the key drawback to round-robin DNS is that it fails the second requirement mentioned above: if one of the two servers fails, the DNS server will continue to send requests to it, and half of your end users will experience downtime.
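The failure mode is easy to see in a small simulation. The sketch below is written in Python rather than as DNS configuration. The two addresses are placeholders from the reserved documentation range, and `make_resolver` is a hypothetical stand-in for a DNS server that rotates its answers; neither comes from the original setup.

```python
from itertools import cycle

# Placeholder addresses from the reserved documentation range (assumed for illustration).
SERVERS = ["192.0.2.1", "192.0.2.2"]


def make_resolver(addresses):
    """Mimic round-robin DNS: each lookup rotates which address is listed first."""
    rotation = cycle(range(len(addresses)))

    def resolve():
        start = next(rotation)
        return addresses[start:] + addresses[:start]

    return resolve


def simulate(requests, down):
    """Count outcomes when every browser uses the first address it is given."""
    resolve = make_resolver(SERVERS)
    outcome = {"ok": 0, "failed": 0}
    for _ in range(requests):
        first = resolve()[0]
        outcome["failed" if first in down else "ok"] += 1
    return outcome
```

With both servers up, `simulate(100, down=set())` reports 100 successful requests. With one server down, `simulate(100, down={"192.0.2.2"})` reports 50 successes and 50 failures, because the rotation keeps handing out the dead address.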

Another popular load balancing technique involves handling requests with a single server or router dedicated to load balancing. Dedicated hardware equipment and software-based solutions such as F5 BIG-IP and Linux Virtual Server Project are examples of this type of solution. The dedicated load balancer takes requests and distributes them among a cluster of web servers. The load balancer detects if a server has failed, and routes requests to other servers. Also, to ensure that there is no single point of failure, a backup dedicated load balancer is available to take over in case the primary one fails.

The downsides to this approach are:

  1. There is a limit to the number of requests the load balancer itself can handle. However, this problem can be resolved with the combination of round-robin DNS and dedicated load balancers.
  2. There is an extra cost related to operating a dedicated load balancer, which can run into tens of thousands of dollars. The backup load balancer generally does nothing other than wait for the primary to fail.

Client Side Load Balancing

There is an approach to load balancing modern web applications that does not require any load-balancing hardware, and handles failure of servers more gracefully than round-robin DNS. Before delving into the details, let us consider a desktop application that needs to connect to servers on the internet to retrieve data. If our theoretical desktop application generates more requests than a single remote server can handle, we will need a load balancing solution. Could we use the round-robin DNS and/or dedicated load balancer approach described above? Of course, but there are other approaches that are more robust and less costly.

Instead of letting the client know of only one server domain from which to retrieve data, we can provide a list of many server domains. The desktop client randomly selects a server and attempts to retrieve data. If the server is not available, or does not respond in a preset time period, the client can select another server until the data is retrieved. Unlike web applications, which store the client code (JavaScript code or Flash SWF) on the same server that provides data and resources, the desktop client is independent of the server and able to load balance servers from the client side to achieve scalability for the application.

[Figure: Sample of load balancing and scalability]

So, can we apply the same technique to web applications? Before we can answer this question, we need to establish what makes up a web application.

Web applications have inherently blurred the boundary between the client component and the server component of a typical application. Web applications written in PHP may often have the server code entwined with the client code. Even if an MVC (model-view-controller) pattern framework is applied, and good practice is followed by separating code that generates the presentation layer (HTML) from code used for backend logic, it is still the server that is generating and serving the presentation. In addition, resources such as images are served by the server as well—but with new web technologies, the boundaries have shifted. Many applications today only make AJAX or Flash remoting calls—in fact, there is a lot of similarity in the ways a standard desktop application and web applications make remote calls.

For the purposes of client-side load balancing, there are three main components to a modern web application:

  1. Client-side code: JavaScript and/or SWF (for Flash clients)
  2. Resources: images, CSS (Cascading Style Sheets), audio, video, and HTML documents
  3. Server-side code: backend logic that generates data requested by the client

It is easier to make the client code and resources highly available and scalable than to do so for the servers—serving non-dynamic content requires fewer server resources. In addition, it is possible to put the client code on a highly reliable distribution service such as Amazon Web Services’s S3 service. Once we have client code and resources served from a highly reliable location, let us take a look at how we can load balance server clusters.

Just like the desktop client above, we can embed our list of application servers into the client code. The web client contains a file called “servers.xml”, which has a list of available servers. The client tries to communicate (whether via AJAX or Flash) with every server in the list until it finds one that responds. Our client-side process is therefore:

  1. Load the “servers.xml” file, which is stored with the client code and other resources, and contains the list of available servers, e.g.:
    <servers>
      <server></server>
      <server></server>
      <server></server>
      <server></server>
    </servers>
  2. The client code randomly selects servers to call until one responds. All subsequent calls use that server.
  3. The client has a preset timeout for each call. If the call takes longer than the preset time, the client randomly selects another server until it finds one that responds, and uses that server for all subsequent calls.
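The three steps above can be sketched as follows. This is a minimal illustration in Python rather than the browser-side JavaScript or ActionScript a real client would use. The host names in `SERVERS_XML`, the `BalancedClient` class, and the `call(server, payload, timeout)` transport hook are all assumptions made for the example, not part of the original application.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical server list; the host names are placeholders.
SERVERS_XML = """
<servers>
  <server>s1.example.com</server>
  <server>s2.example.com</server>
  <server>s3.example.com</server>
</servers>
"""


class NoServerAvailable(Exception):
    """Raised when every server in the list has been tried without success."""


class BalancedClient:
    def __init__(self, servers_xml, call, timeout=5.0, rng=None):
        # Step 1: load the server list that ships with the client code.
        root = ET.fromstring(servers_xml)
        self.servers = [
            el.text.strip() for el in root.iter("server") if el.text and el.text.strip()
        ]
        # `call` is an assumed transport hook; it must raise an exception when
        # the server fails or does not answer within `timeout` seconds.
        self.call = call
        self.timeout = timeout
        self.rng = rng or random.Random()
        self.current = None  # server reused for subsequent calls once one responds

    def request(self, payload):
        # Steps 2 and 3: try the current server first, then the others in random order.
        candidates = [s for s in self.servers if s != self.current]
        self.rng.shuffle(candidates)
        if self.current is not None:
            candidates.insert(0, self.current)
        for server in candidates:
            try:
                response = self.call(server, payload, self.timeout)
            except Exception:
                continue  # failed or timed out: move on to another server
            self.current = server
            return response
        self.current = None
        raise NoServerAvailable("no server in the list responded")
```

Keeping the chosen server in `current` mirrors the "all subsequent calls use that server" rule, while the shuffle ensures a failed client does not always fall back to the same second choice.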

Making Cross-Domain Calls

If you’ve been working with AJAX for any length of time, you’re probably thinking, “This won’t work, because of cross-domain browser security,” so let’s address that.

For security reasons, web browsers and Flash clients will block calls to a different domain. If the client wants to talk to a server, the client code must be loaded from that server’s domain; requests from clients loaded from any other domain will be blocked. In order for the load-balancing scheme described above to work, the client code needs to be able to call services running on other sub-domains of the same parent domain.

For Flash clients, we can simply set the “crossdomain.xml” file to allow requests from any domain (*):

<cross-domain-policy>
  <allow-access-from domain="*"/>
</cross-domain-policy>

For AJAX-based client code, the restriction depends on the transport we use to make server calls. If we use the dynamic script tag method to transport calls, there is no security issue, because script tags are not subject to the cross-domain constraint. (However, it is generally a good idea to check the referrer header to make sure it is definitely your client that is making the server requests, for the sake of your site’s security.)
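A referrer check of this kind might look like the sketch below, written in Python for a server-side handler. The host name in `ALLOWED_CLIENT_HOSTS` and the function name `is_trusted_referrer` are placeholders chosen for the example. Note that HTTP spells the header name `Referer`.

```python
from urllib.parse import urlsplit

# Hypothetical host that serves the client code; replace with the real one.
ALLOWED_CLIENT_HOSTS = {"www.example.com"}


def is_trusted_referrer(referer_header, allowed_hosts=ALLOWED_CLIENT_HOSTS):
    """Return True when the Referer header names a host that serves our client code."""
    if not referer_header:
        return False  # header missing: treat the request as untrusted
    host = urlsplit(referer_header).hostname  # None when the value is not a URL
    return host is not None and host in allowed_hosts
```

Comparing the parsed host name, rather than searching the raw header for a substring, keeps look-alike hosts such as `www.example.com.evil.test` from passing the check.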

What if the application uses XMLHttpRequest? XHR strictly forbids client calls from a different domain to the server. Luckily, a workaround exists if the client and server have the same parent domain, as they do in our example. We can make all AJAX calls to the server using an iframe loaded from the server; since browsers allow communication between scripts in an iframe, it is possible to access data received from the server calls made in the iframe if the scripts are loaded from the same parent domain and both pages set document.domain to that parent domain. Problem solved.

Advantages of Client-Side Load Balancing

Now that we can make cross-domain calls, let us see how well our load-balancing technique meets the requirements outlined at the start of the article.

  1. Distribute loads among a cluster of application servers. Since the client randomly selects the server it connects to, the loads should be distributed evenly among the servers.
  2. Handle failover of an application server gracefully. The client has the ability to fail over to another server when the chosen server does not respond within a preset period of time. The application server connection seamlessly fails over to another server.
  3. Ensure the cluster of servers appears to the end user as a single server. In the example, the user simply points a browser to the site’s single address. The actual server used is transparent to the user.
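The first point can be checked with a quick simulation. The sketch below, in Python, assumes every client picks a server uniformly at random; the function names and the server labels used to call it are illustrative only.

```python
import random
from collections import Counter


def simulate_selection(servers, clients, seed=0):
    """Count how many of `clients` simulated visitors land on each server."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return Counter(rng.choice(servers) for _ in range(clients))


def shares(counts):
    """Convert raw counts into each server's fraction of the total load."""
    total = sum(counts.values())
    return {server: count / total for server, count in counts.items()}
```

For a large number of clients, each of four servers ends up with a share close to one quarter. With only a handful of clients the split can be noticeably uneven, so the balance is statistical rather than guaranteed.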

So what are the advantages of using client-side load balancing over server-side load balancing? The obvious advantage is that a special load-balancing device is unnecessary—there is no need to configure load-balancing hardware, or to make sure that the backup functions the same as the primary load balancer. If a server is unavailable, simply remove it from the “servers.xml” file.

Another advantage is that servers do not have to be housed in the same location; since the client is selecting servers instead of having a fixed load balancer redirecting traffic, the locations of the servers are unrestricted. Servers can be in multiple datacenters in case one datacenter is not accessible. If the application requires a database on the local network, the other datacenter can still be used as a backup in case your primary one is unavailable. Changing to another datacenter is as simple as making an update to the “servers.xml” file, instead of waiting for DNS changes to propagate.

Voxlite, a Client-Side Load Balanced Web Application

Voxlite, a web-2.0 application that lets users send video messages to one another with just a browser and a webcam, is an application that uses client-side load balancing to achieve high availability and scalability. In addition, Voxlite uses the Simple Storage Service (S3) and Elastic Compute Cloud (EC2) services from Amazon Web Services.

From very early on, the S3 service presented an attractive option for storing and serving the video messages, and EC2 was naturally designed to work with the S3 service. It provides an easy and cost-effective way for Voxlite to scale to support more users. EC2 instances can be allocated at any time by simply starting a virtual machine image; each EC2 instance costs ten cents per hour, or seventy-two dollars per month. But what makes EC2 even more attractive is that the computing resource is elastic: EC2 instances can be de-allocated when they are not being used. For example, if Voxlite gets more traffic during the day than in the evening, it is possible to allocate extra servers only during the day, and thus greatly increase the cost-effectiveness of the hosting solution. Unfortunately, one major drawback with EC2 is that it is not possible to architect a server-side load balancing solution that doesn’t have a single point of failure. Many web applications hosted on EC2 use a single EC2 instance with dynamic DNS to load-balance requests to a particular domain. If the instance that provides the load balancing fails, the whole system can become unavailable until the dynamic DNS maps the domain to another EC2 instance.
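The arithmetic behind those figures, and behind the savings from elasticity, can be worked through in a few lines of Python. The ten-cent hourly rate is the one quoted above; the 30-day month and the instance schedules are assumptions made for the example.

```python
HOURLY_RATE = 0.10  # dollars per instance-hour, the rate quoted in the article


def monthly_cost(instances_by_hour, days=30, rate=HOURLY_RATE):
    """Cost of a month in which `instances_by_hour[h]` instances run during hour h of each day."""
    if len(instances_by_hour) != 24:
        raise ValueError("expected one instance count per hour of the day")
    return round(sum(instances_by_hour) * days * rate, 2)


# One instance running around the clock.
always_on_single = [1] * 24

# Hypothetical schedules: four instances all day, versus four during a
# twelve-hour daytime peak and one for the rest of the day.
always_on_four = [4] * 24
elastic = [1] * 6 + [4] * 12 + [1] * 6
```

`monthly_cost(always_on_single)` gives 72.0, matching the seventy-two dollars quoted for a 30-day month. Under the assumed schedules, four instances running continuously cost 288.0, while the elastic schedule costs 180.0.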

Using the client-side load balancing technique described above, it is possible to have a load-balanced solution with EC2 servers that has no single point of failure. To build a cluster of EC2 instances supporting client-side load balancing, Voxlite’s client code and other web resources are stored on, and served from, S3. An EC2 image with the server code is created so that whenever an EC2 instance starts, it is properly configured and ready to handle client requests. Voxlite then uses a clever technique to make the client aware of the available servers.

Earlier, I described the use of a “servers.xml” file to let the client know the list of available servers—but, with the S3 service available, there is a better way. When accessing an S3 bucket (a bucket is a term used by S3 for storing a group of files; the idea is similar to file folders) without any keys, the S3 service will simply list all the keys matching the given prefix—so, in each of Voxlite’s EC2 instances, a cron job is created that runs periodically and registers the server as a cluster member by writing an empty file with the key servers/{AWS IP address} to a publicly readable S3 bucket.

For example, if I request the bucket’s listing URL, I get the following response:

<ListBucketResult>
  <Name>voxlite</Name>
  <Prefix>servers</Prefix>
  <Marker/>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>false</IsTruncated>
  <Contents>
    <Key>servers/</Key>
    <LastModified>2007-07-18T02:01:25.000Z</LastModified>
    <ETag>"d41d8cd98f00b204e9800998ecf8427e"</ETag>
    <Size>0</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
  <Contents>
    <Key>servers/</Key>
    <LastModified>2007-07-20T16:32:22.000Z</LastModified>
    <ETag>"d41d8cd98f00b204e9800998ecf8427e"</ETag>
    <Size>0</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>

In this example, there are two EC2 instances in the cluster, each registered under a key that carries its IP address.
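Turning such a listing into a server list takes only a little parsing. The Python sketch below is one way a client or script might do it; the addresses in `SAMPLE_LISTING` are placeholders, and the function names are hypothetical. It matches tags by local name because a real listing may carry an XML namespace that the sample above does not show.

```python
import xml.etree.ElementTree as ET

# Trimmed-down listing with placeholder addresses (assumed for illustration).
SAMPLE_LISTING = """<ListBucketResult>
  <Name>voxlite</Name>
  <Prefix>servers</Prefix>
  <Contents><Key>servers/192.0.2.10</Key><Size>0</Size></Contents>
  <Contents><Key>servers/192.0.2.11</Key><Size>0</Size></Contents>
</ListBucketResult>"""


def local_name(tag):
    """Strip an XML namespace, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def servers_from_listing(listing_xml, prefix="servers/"):
    """Return the address part of every key of the form servers/{address}."""
    root = ET.fromstring(listing_xml)
    servers = []
    for element in root.iter():
        if local_name(element.tag) != "Key" or not element.text:
            continue
        key = element.text.strip()
        if key.startswith(prefix) and len(key) > len(prefix):
            servers.append(key[len(prefix):])
    return servers
```

Calling `servers_from_listing(SAMPLE_LISTING)` returns the two placeholder addresses in listing order, ready to be handed to the random-selection logic.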

The logic for the cron job is:

  1. Load and parse the bucket listing.
  2. If the current running instance is not listed, write an empty file to the bucket with the key servers/{IP address of the EC2 instance}.
  3. Verify that the other servers listed in the bucket are running properly by testing the connection using the internal AWS IP address. If a connection cannot be established, remove the server key from the bucket.
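One pass of that job can be sketched in Python as below. To keep the example self-contained, a plain set of key names stands in for the S3 bucket and the health check is passed in as a function. Both are assumptions; a real job would list, write, and delete keys through the S3 API and open an actual connection to each peer.

```python
def register_and_prune(bucket_keys, my_address, is_reachable, prefix="servers/"):
    """Run one pass of the registration job and return the keys it removed.

    bucket_keys: a set of key names standing in for the S3 bucket (assumption).
    is_reachable: hypothetical health check, called with a peer's address.
    """
    my_key = prefix + my_address

    # Step 1: read the current listing of registered servers.
    listed = {key for key in bucket_keys if key.startswith(prefix) and key != prefix}

    # Step 2: register this instance if it is not listed yet.
    if my_key not in listed:
        bucket_keys.add(my_key)

    # Step 3: remove peers that cannot be reached.
    removed = []
    for key in sorted(listed - {my_key}):
        if not is_reachable(key[len(prefix):]):
            bucket_keys.discard(key)
            removed.append(key)
    return removed
```

Because every instance runs the same pass, a crashed server is removed by whichever peer notices first, and repeating the pass on a healthy cluster changes nothing.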

Once this cron job is a part of the EC2 image, each running instance is automatically registered as an available server in the cluster. The client code (AJAX or Flash) parses the list of keys in the bucket and extracts the external AWS host name, allowing it to then randomly select a server to connect to, as described above when using the “servers.xml” file. If an EC2 instance shuts down or happens to crash, the other instances in the cluster would automatically remove its key from the bucket—the bucket would be left with only available instances. In addition, the client can select another EC2 instance in the bucket if a request does not get a response in the preset time. If the web site is getting more traffic, simply start more EC2 instances. If the load decreases, simply shut down the extra instances. By using client-side load balancing with S3 and EC2, it is easy to build an elastic, scalable and robust web application.

Lei Zhu is a software developer living in New York City. One of his technical interests is to build highly reliable web applications using inexpensive infrastructure. He can be reached at