Digital Web Magazine


Client Side Load Balancing for Web 2.0 Applications: Comments

By Lei Zhu

October 1, 2007

Comments

tom

October 2, 2007 8:33 AM

What about the SEO implications of having multiple subdomains serving the same content?

Walker Hamilton

October 2, 2007 12:45 PM

IMO, the moments at which you need sub-domains serving “the same” content are not public. Some good examples would be Joyent’s Connector, where the domain is yourcompany.joyent.net, or 37signals’ Basecamp, where it’s yourcompany.projectpath(etc).com. This article is about “applications,” not “web sites,” so I feel it has applied load balancing in the correct manner: in these cases, the app is the same, but the content varies greatly and is private.

Generally, #3 in Zhu’s list is the manner in which one ensures search engine optimization for public websites while allowing for ongoing scaling of the site.

Mark Snape

October 2, 2007 1:21 PM

I fully appreciate the perspective of the article, but it shouldn’t be forgotten that hardware load balancers can provide additional features, such as directing traffic to the least utilised servers or to those currently returning the best response times.
The approach taken in the article is especially good in a hosting scenario where the servers, and specifically the network infrastructure, are provided by the hosting company and are not easily re-engineered. Distributing load across multiple simple webservers, possibly even in datacentres from different providers, is simplified with this technique.
One final point: it’s not just about load balancing or scalability. Having a method like this in place from the outset is good when a server needs to be taken out for planned maintenance.

Derrick

October 3, 2007 11:17 PM

Way to go. You broke all the caching proxies in the network path. This “approach” multiplexes names instead of resources, which is fail. Do it right next time.

Lei Zhu

October 4, 2007 12:57 AM

Mark – Thank you for the very nice comment. The article didn’t dig deep into topics such as directing traffic to the least utilized servers, but it is possible. There are a few ways:

1) In the server.xml file, you can attach a weight to each server (see the sketch after this list). For example, if one server is able to serve twice as many connections as the other, the servers would have weights of 2 and 1 respectively.

2) When the client initially contacts the server to make sure it is available, the server can return a load number, e.g. the number of current connections. If it is above a certain threshold, the client would try another server.

3) In the voxlite example, when an EC2 instance updates its status, it can also attach a load number to the key. The client can use the server load information when deciding which server to connect to.
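As an illustration of option 1, here is a minimal sketch of weighted random selection in JavaScript; the servers array and its weight field are hypothetical stand-ins for whatever a client would parse out of server.xml:

```javascript
// Hypothetical list parsed from server.xml; "weight" reflects relative capacity.
var servers = [
  { host: "web1.example.com", weight: 2 },
  { host: "web2.example.com", weight: 1 }
];

// Pick a server at random, with probability proportional to its weight,
// so that over many requests each server's share of connections
// matches its capacity.
function pickWeighted(servers) {
  var total = 0;
  for (var i = 0; i < servers.length; i++) {
    total += servers[i].weight;
  }
  var r = Math.random() * total;
  for (var j = 0; j < servers.length; j++) {
    r -= servers[j].weight;
    if (r < 0) {
      return servers[j];
    }
  }
  return servers[servers.length - 1]; // guard against floating-point rounding
}
```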

Subbu Allamaraju

October 4, 2007 9:24 AM

I appreciate your zeal to solve this problem from the client side, but unfortunately, you won’t be able to go far enough with this approach. The current infrastructure and solutions isolate the client from having to do all this work. Now imagine rogue clients trying to kill individual nodes, and imagine the pain of having to protect each node from hackers. Also, this client-side approach is impractical in almost all real-world scenarios, where the physical server is not necessarily fixed and can change based on demand, available resources, virtualization, and so on.

Anthony Cimino

October 5, 2007 5:44 AM

This reminds me of something Netscape did with either version 2 or 3 of their browser. Clicking on the animated Netscape logo generated a massive amount of traffic to their site. Since this was before the days of really robust routers, each click on the logo randomly pointed the user to www1 – www10.netscape.com.

Since the mid-’90s, routers and load-balancing technology have become a thousand times more robust. I think this is a clever idea, but it seems the burden of load balancing should really be offloaded from the client and put onto the datacenter.

John DeHope

October 5, 2007 7:18 AM

I really enjoyed this article. I had never thought of the fact that half the presentation code, and the complicated half at that, runs on the server! Of course it makes sense now that you mention it, but I had never considered it before. Have we really come that far since the mainframe days, when the mainframe composed a screen and pushed it down to the terminal for some minor client-side processing before receiving the whole screen back? We have not, as this is pretty much how a web app works today! Anyway, I am getting off the subject…

Caching at the domain level is a non-issue here. I’ve never worked on a web app where network caching of resources made any performance difference. There’s just not enough static content to matter. If I could snap my fingers and all the 16×16-pixel GIFs would load instantly on the client, would the user even notice?

And what’s this about “datacenters”? The whole point here is that there is no datacenter, right? The “datacenter” here is EC2!

An editorial note… The whole time I was reading this article, until the end, I kept thinking, “Yeah, but how is he going to deliver the server list to the client, and how will he keep that list updated?” At the end you explain it quite well, but up until then I was left wondering. A quick note that you’d explain it at the bottom would have tipped me off.

I’m still a little confused about how the server list is retrieved. Do all your servers in EC2 respond to the root domain, serving up the server list and the bootstrap client code? If so, how are they balanced? If you have 3 EC2 servers running and a user dials in the root URL, which server does he get? Once he gets the server list, it is clear that all subsequent Ajax calls can go to any of the active servers. But that initial list retrieval is still a little fuzzy to me. Perhaps I need to re-read the article.

Lei Zhu

October 5, 2007 1:59 PM

Thank you all for the responses to the article.

John,

The list of available servers is stored separately. In the case of Voxlite, the list of EC2 instances is stored as keys in an S3 bucket. E.g., if you go to http://s3.amazonaws.com/voxlite/?prefix=servers , there are 2 keys with server IP addresses. If there were 3 EC2 servers running, there would be 3 keys. The Ajax client selects a server randomly.
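A minimal sketch of that initial lookup, assuming the list is exposed to the client as a same-origin servers.xml file with one <server> element per host (the path and element names here are assumptions, not the article’s exact format):

```javascript
// Fetch the server list and hand one randomly chosen host to the callback.
function loadServerList(callback) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/servers.xml", true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      var nodes = xhr.responseXML.getElementsByTagName("server");
      var hosts = [];
      for (var i = 0; i < nodes.length; i++) {
        hosts.push(nodes[i].firstChild.nodeValue);
      }
      // Select one server at random, as described above.
      callback(hosts[Math.floor(Math.random() * hosts.length)]);
    }
  };
  xhr.send(null);
}
```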

To Everyone,

I agree that server-side load balancing is a robust and time-tested approach. I believe that many of the issues you brought up can be resolved over time. For example, what if the Ajax framework or Flash library supported client-side load balancing, and developers only had to make server calls? The initial load of the server list, selecting a server, and server fail-over could all be handled by a common framework.

Another criticism I hear concerns the approach of randomly selecting a server as the load balancing strategy. There are many strategies for load balancing; round robin is one of the most popular, but round robin clearly does not work for client-side load balancing. Selecting a server randomly may not result in an equal number of connections for each server, but over many requests it should not result in a significantly higher load on one server over the others. More advanced strategies, such as selecting the server with the least load, are also possible for client-side load balancing, as I mentioned in my previous comment.

Lei

commenter

October 6, 2007 10:53 AM

Way to go. You broke all the caching proxies in the network path. This “approach” multiplexes names instead of resources, which is fail. Do it right next time.

Derrick, what does this even mean?

“Which is fail”?! Is that even English?

Wanderer

October 7, 2007 7:02 AM

How do you handle bookmarks?

Reader

October 7, 2007 10:36 AM

I’ll have to agree with some of the other posters that the state of legitimate, dedicated load-balancing hardware is at a point that renders pseudo-load balancing on the client side useless. If your application is making too many requests for the server to handle in the first place, most likely a vast number of other problems are going on within your actual application. Anything from poorly written SQL to bad database table design to awful server-side code. Bloating your application with a heap of client-side code is just bad design to begin with. If your application is sound and you really have that many clients connecting at once, I think maybe it’s time to make an investment in your product and scale the hardware to a suitable level.

It is a clever idea, but I am sure it’s been thought of many times before. If you have taken the time to set up a plethora of application servers and invested that time and money, you’ll most likely add a hardware load-balancing tier to the equation without giving it a second thought.

John DeHope

October 8, 2007 5:52 AM

If you have taken the time to set up a plethora of application servers…

Isn’t that the whole point of the second half of his article? He hasn’t taken the time to set up a plethora of application servers. He just made a few calls to EC2, waited a few minutes, and boom, there were his servers.

If your application is making too many requests for the server to handle in the first place, most likely a vast number of other problems are going on within your actual application.

So you’re saying that any application that can’t run on a single box is poorly designed?

It is a clever idea, but I am sure it’s been thought of many times before.

What idea hasn’t been thought of before, in one way or another?

In the case of Voxlite, the list of EC2 instances is stored as keys in an S3 bucket. E.g., if you go to http://s3.amazonaws.com/voxlite/?prefix=servers…

How does a web app running from a different domain make Ajax calls to that URL? Or does S3 support a calling convention that returns JavaScript code blocks, suitable for use in an iframe Ajax call? This is where I am confused: how does the Ajax app call amazonaws.com?

Simon Jia

October 8, 2007 3:12 PM

This certainly is a pretty good idea, but IMO, there are some drawbacks, both from a performance perspective and a security perspective (especially when implemented as an Ajax application).

Security Concern:
JavaScript is served off a web server in plain text and often gets cached by browsers. In order to load balance on the client side, we have to maintain a list of active servers in the JavaScript somewhere. And yes, we can make a separate call at runtime to retrieve that list.
Basically, you have opened up the list for anyone to see how many servers you have handling the calls. Believe me, there are enough bad people out there who will use it for nasty purposes.

Performance Perspective:
The initial content (JavaScript, etc.) still needs to be served off a web server. What if that server dies? You still need some sort of DNS round robin or load balancer to serve the initial content.
Furthermore, making a request to each server to see which one is responding can slow down the whole application. I guess you can set a timer on the response to each test request, but if the volume of traffic builds up, all of the servers might take a lot of extra time to process the test requests. If all of the servers respond outside the timer range, you then have nowhere to go besides picking a server randomly. On top of that, if you have 10 servers in the list, that’s ten test requests made from the client side before it realizes it needs to pick a random one and just go with it (of course, you can set the threshold to a lower number).
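A sketch of the kind of timed test request described above. The /ping.gif path, the probe budget, and the use of an image load (one of the few cross-domain checks a browser of this era allows) are illustrative assumptions:

```javascript
// Probe a candidate server by loading a tiny image from it, with a timeout.
function probeServer(host, timeoutMs, callback) {
  var img = new Image();
  var done = false;
  var timer = setTimeout(function () {
    if (!done) { done = true; callback(false); } // treat a slow server as down
  }, timeoutMs);
  img.onload = function () {
    if (!done) { done = true; clearTimeout(timer); callback(true); }
  };
  img.onerror = function () {
    if (!done) { done = true; clearTimeout(timer); callback(false); }
  };
  // Cache-bust so the probe actually reaches the server.
  img.src = "http://" + host + "/ping.gif?" + new Date().getTime();
}
```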

The load of trying to figure out which server is faster and responding lives on either the client side or the server side. Personally, I’m still in favor of the server side, since it gives us enough control to add and remove servers on the fly if we need to, and the algorithm on the load balancer can be tweaked at any time to control the traffic distribution.

I think there are different trade-offs between traditional hardware load balancing and client-side load balancing. Both are good ideas, depending on what you are trying to implement.

Simon Jia

October 8, 2007 4:41 PM

If your application is making too many requests for the server to handle in the first place, most likely a vast number of other problems are going on within your actual application.

So you’re saying that any application that can’t run on a single box is poorly designed?

Don’t take it the wrong way, John. This is actually a common problem among many web applications. Some companies choose to toss hardware and money at the problem. If the program is running slow and it would take 200 hours’ worth of work to fix, it’s probably smarter just to toss in extra servers. But on the other hand, if the code is really nasty and quick fixes or ongoing enhancements can substantially boost performance, that seems to be the right way to go.
If you look closely at Zhu’s bio at the bottom of the post, it reads:
“One of his technical interests is to build highly reliable web applications using inexpensive infrastructure.”
The truth is, not every company has the budget to buy servers which they don’t utilize.

Jugs

October 10, 2007 6:22 AM

The approach you have suggested here works well:

1. If the web application only invokes stateless API method calls.
2. If session replication is properly implemented across the servers, so you don’t have to worry about hitting the same server twice to get the same state.

Of course, there are lots of advantages to using server-side load balancing. For instance, mod_jk / mod_proxy has the ability to route requests to the server with the lowest load, and the load balancer component updates load factors from the servers frequently, so the result is always excellent. But in this case, servers.xml would need to be updated with this information very often, and the client would have to download the file before each request to get the servers’ current load factors and make its decision accordingly.
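If servers.xml did carry a per-server load factor as suggested above, the client-side pick might look like this minimal sketch (the load field is a hypothetical extension of the file format):

```javascript
// Choose the entry with the smallest reported load factor.
function pickLeastLoaded(servers) {
  var best = servers[0];
  for (var i = 1; i < servers.length; i++) {
    if (servers[i].load < best.load) {
      best = servers[i];
    }
  }
  return best;
}
```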

Then another problem will be providing replication for servers.xml, and the list goes on.

Anyway, I believe this can be a GOOD start, and all of us can work out some logic to get around these problems.

Lei Zhu

October 10, 2007 1:06 PM

Jugs,

Many web 2.0 applications can store and cache more information on the client side instead of relying on sessions, but the technique I presented does not require the application to be stateless. Once the client finds a working server from the list of servers, it will be used until a call fails or times out. It is kind of like sticky sessions in mod_jk. When a call fails or times out, the client should re-download the servers.xml file to get the updated list of servers.
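A minimal sketch of that sticky-with-failover behavior; callServer and loadServerList are hypothetical helpers standing in for the application’s transport and list-fetching code:

```javascript
var currentServer = null; // the "sticky" server, kept until a call fails

function callWithFailover(path, onSuccess, retriesLeft) {
  if (retriesLeft === undefined) { retriesLeft = 2; }
  if (currentServer === null) {
    // No sticky server yet (or the last one failed): fetch a fresh list.
    loadServerList(function (host) {
      currentServer = host;
      callWithFailover(path, onSuccess, retriesLeft);
    });
    return;
  }
  callServer(currentServer, path, onSuccess, function () {
    // The call failed or timed out: drop the sticky server,
    // re-download the list, and retry against a fresh choice.
    currentServer = null;
    if (retriesLeft > 0) {
      callWithFailover(path, onSuccess, retriesLeft - 1);
    }
    // Otherwise give up silently in this sketch.
  });
}
```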

John,

The client actually does not make Ajax or Flash Remoting calls to the S3 service to get the list of servers in the Voxlite example. The client simply loads the XML file and parses it as a DOM object.

Jugs

October 11, 2007 5:45 AM

Lei,

I fully understand your intention, but to me, sticky sessions should be highly discouraged, as they never account for fail-overs. In your case, you are not literally doing load balancing; you are only using a set of servers in a round-robin fashion when one server fails to respond. This will not account for fail-overs, especially when sessions are not replicated across your servers.

To satisfy your need, you would only need an RRDNS setup with the list of servers that you have; when one of the servers fails to respond, query again with the same URL and you will get another server. But again, you will need sessions to be replicated.

For your purpose, you don’t need a script to choose the server for you; your application need not bother with different IPs and can instead use just one URL, and RRDNS will take care of it. Use a provider like EasyDNS.

John DeHope

October 11, 2007 6:46 AM

The client actually does not make Ajax or Flash Remoting calls to the S3 service to get the list of servers in the Voxlite example. The client simply loads the XML file and parses it as a DOM object.

Okay, so the server retrieves the list of servers from S3 and serves it from a URL known to the client? That makes sense. Thanks for explaining it to me.

Now can you talk a little bit about how the client is bootstrapped? In my mind, all you would need, at the bare minimum, would be a single HTML page with just enough JavaScript to retrieve the server list and begin a session. From there, all subsequent content and JavaScript code can be retrieved from the app servers, after a session has begun.

I would think you would only need a single bootstrap server, serving a single static HTML page and a single static JavaScript file, which could be allowed to be cached. This bootstrap code would hardly ever change.

Lei Zhu

October 16, 2007 12:02 PM

John,

You can certainly load a very small JS file that selects a server and then load the actual client code from that particular server, but I think it is better to load as much of the client code as possible up front. Web 2.0 applications can sometimes take 5-20 seconds to load initially; if the server you are communicating with crashes, you wouldn’t want to reload the client and have your end user wait another 5-20 seconds. In addition, your web 2.0 client will likely lose its state, and your end user will have to restart and lose their current work. For a Flash client, I recommend loading the client code initially. If you are using Ajax and XMLHttpRequest, then there are 2 parts:

1) Client code that makes server calls in an iframe.
2) Client code that interacts with the end user.

Part 1 has to be reloaded each time the client changes the server it is communicating with. Part 2 should be loaded initially, as the end user loads the app.
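A rough sketch of how part 1 might be swapped out when the client changes servers; the proxy.html page name and the hidden-iframe handling are assumptions, not the article’s exact mechanism:

```javascript
// Point a hidden iframe at the current server so that same-origin
// XMLHttpRequests can be issued from inside it (part 1 above).
function attachServerFrame(host) {
  var old = document.getElementById("server-frame");
  if (old) {
    old.parentNode.removeChild(old); // discard the previous server's frame
  }
  var frame = document.createElement("iframe");
  frame.id = "server-frame";
  frame.style.display = "none";
  // proxy.html would hold the small call-making script that must be
  // reloaded whenever the client switches servers.
  frame.src = "http://" + host + "/proxy.html";
  document.body.appendChild(frame);
}
```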

Lei

Steven B.

December 23, 2007 10:00 AM

I think the author dismisses round-robin DNS a bit too quickly. We here at the museum have been using dnsmadeeasy(.com) for a year now with fantastic results. It’s cheap, solid, and includes auto-failover if a particular server in the rotation dies. Bigger setups (with more than a few servers) might need something like UltraDNS. I’ve used hardware-based load balancers in the past, but they won’t help you if a core switch dies. The hosted DNS solution allows us to have servers in two completely different locations and fail over between them, without any additional engineering.

Wes

July 17, 2008 8:38 AM

So if I were to take this approach, my application would need to make an HTTP request and parse XML to figure out which servers are available before it can make the ACTUAL request for whatever was supposed to be done?

I’m sure users appreciate having to wait for 2 requests instead of one. Terrible.

