Powering the Web with HTTP

In developing Web applications, as in life, it is important to understand the basics. With that thought, let’s cover one of the most basic components of the Web: the hypertext transfer protocol, or HTTP.

HTTP

Using HTTP, a request is made by the browser (or another user agent) and sent to a server. The server then processes the request and sends back a response.

A request consists of a request line followed by any headers. These headers provide information about who you are, what types of content you can accept and other useful things. Web servers often capture this data in a log file that can be analyzed to track things like the number of visitors and what browsers they say they are using.

Here’s an example request:

GET / HTTP/1.1
Host: www.digital-web.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3

In this case, I’m requesting the root document from www.digital-web.com and telling the server that I’m currently using Firefox.

When the Web server receives this request, it processes it and sends back a response. A response consists of a status line followed by the response headers.

A sample response:

HTTP/1.1 200 OK
Date: Mon, 25 Apr 2005 04:25:17 GMT
Server: Apache/2.0.46 (Red Hat)
X-Powered-By: PHP/4.3.2
Content-Type: text/html; charset=iso-8859-1

The 200 OK in the status line indicates I’ve found the document I was looking for. The headers supply additional information such as the server date, what Web server version is running and what type of document I’m about to receive. Although I left it out, the remainder of the response would have included the actual HTML document.
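To make the structure of a response concrete, here is a minimal Python sketch that splits a raw response into its status line, headers and body. The parse_response function is my own illustrative helper, not part of any library, and the body is a stand-in for the HTML document I left out above.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The status line has the form: version, status code, reason phrase.
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; header values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


RAW = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 25 Apr 2005 04:25:17 GMT\r\n"
    "Server: Apache/2.0.46 (Red Hat)\r\n"
    "X-Powered-By: PHP/4.3.2\r\n"
    "Content-Type: text/html; charset=iso-8859-1\r\n"
    "\r\n"
    "<html>...</html>"  # placeholder body, not the real document
)

version, status, reason, headers, body = parse_response(RAW)
print(status, reason)
print(headers["content-type"])
```

Header names are lowercased here because HTTP treats them as case-insensitive. A real parser would also have to cope with repeated headers, which this sketch simply overwrites.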

Where do I find these headers?

Browsers handle the exchange of the request and response automatically for you, but it can be useful to capture this information while developing a Web application.

Say you have a Flash application that is communicating with the server. It’s sending data to the server but something just isn’t working right. You could review the request to see which page the Flash application was requesting and what data it was sending along in the request.

For Firefox, I recommend a handy tool called LiveHTTPHeaders (its IE equivalent is IEHTTPheaders), which allows you to view, filter and capture HTTP requests and responses sent within the browser. As in the Flash example above, it can even capture requests from Flash files embedded in the page.

Request Types

When making a request, there are various methods that can be used. In Web application development, you generally only work with a few of these methods: GET, POST and (sometimes) HEAD.

In coding forms, you have no doubt noticed that GET retrieves a page by attaching the form values onto the URL. HTML preprocessors such as PHP and ASP will automatically parse the query string into an array or collection.

POST behaves much like a GET request except that the form data is sent as a separate part of the request instead of as part of the URL. This allows for more complex or lengthy requests such as uploading files to the server. Like the query string, your Web development environment should handle the parsing of this data into an easier-to-use array or collection.
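Here is a rough Python sketch of the difference, using only the standard library. The form fields, the /search path and www.example.com are made up for illustration. The same encoded data ends up in the URL for GET and in the request body for POST.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

form = {"q": "http headers", "page": "2"}  # hypothetical form fields
encoded = urlencode(form)

# GET: the encoded form values ride along in the URL's query string.
get_request = (
    f"GET /search?{encoded} HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

# POST: the same data travels in the body, described by two extra headers.
# len(encoded) equals the byte count here because the data is plain ASCII.
post_request = (
    "POST /search HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded)}\r\n"
    "\r\n"
    f"{encoded}"
)

# What a server-side environment does for you: turn either form back into a
# collection of values.
from_get = parse_qs(urlsplit("/search?" + encoded).query)
from_post = parse_qs(post_request.split("\r\n\r\n", 1)[1])
print(from_get)
print(from_post)
```

Both dictionaries come out identical, which is why server-side code can usually treat GET and POST form data the same way once it has been parsed.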

When should you use POST and when should you use GET?

POST is best used when sending information that will change the state of something on the server. For example, say you want to add a new entry to your blog. The action and the data that go with it are a one-time event that will change the state of the database by adding a new record.

GET is best used to define a resource that you wish to be able to retrieve again at a later date, such as a search query. I’ve seen some sites handle the search via POST. The downside is that you can’t bookmark the results or send them to a friend. You would have to enter the search criteria again to view the results.

HEAD behaves exactly like GET except that the response will only include the headers. No content will be returned. Therefore, this is usually best used to determine basic information such as whether a particular resource exists or not. (Although, I’ve used XMLHttpRequest to retrieve information and have it send back a reply as a header instead of within the actual body of the content.)
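To see the difference between GET and HEAD, the following Python sketch starts a throwaway local server and requests the same page with both methods. The Handler class, the fetch helper and the page content are all invented for this example. It assumes the loopback interface is available.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"  # stand-in document


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)

    def do_HEAD(self):
        self._send_headers()  # same headers as GET, but no body follows

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(method, port):
    """Return (status, Content-Length header, body) for one request."""
    conn = HTTPConnection("127.0.0.1", port)
    conn.request(method, "/")
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Content-Length"), resp.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_result = fetch("GET", port)
head_result = fetch("HEAD", port)

server.shutdown()
server.server_close()

print(get_result)
print(head_result)
```

Both responses report the same status and Content-Length, but only the GET response carries the document. That makes HEAD a cheap way to check that a resource exists, or how large it is, without downloading it.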

You can test sending headers using a Telnet client by opening a connection to a server on port 80 (the default port for a Web server). To make a simple request, type GET / HTTP/1.1 and press Enter, type a Host header such as Host: www.digital-web.com and press Enter, then press Enter once more to send a blank line. HTTP/1.1 requires the Host header, and it matters in practice because many servers run multiple Web sites off of one IP address. You’ll get a response from the server similar to what you see above, then get disconnected.

Rarely, if ever, would you actually need to send a request manually using Telnet, but if your browser is misbehaving, it’s useful to have a way to access the raw information.
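If you don’t have a Telnet client handy, the same manual exchange can be approximated with a plain socket. This is only a sketch. The raw_get helper and the tiny Hello server are hypothetical names of my own, and the server exists just so the example runs without touching the network. Point raw_get at a real host and port 80 to try it against a live site.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Hello(BaseHTTPRequestHandler):
    """Stand-in local server so the sketch runs without network access."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def raw_get(host, port, path="/"):
    """Send a hand-written GET request and return the raw response text."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # ask the server to hang up when it is done
        "\r\n"                   # the blank line that ends the headers
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


server = HTTPServer(("127.0.0.1", 0), Hello)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
response = raw_get("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()

print(response)
```

What gets printed is the unprocessed response: status line, headers, a blank line, then the body. The loop ends when the server closes the connection, which is the scripted equivalent of Telnet dropping you back to the prompt.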

Stateless

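Why did you just get disconnected? Simply put: HTTP is stateless. In the simplest case, a connection is established to the server, the request is sent, the response is received, then the connection is broken. HTTP/1.1 can keep a connection open for several requests, but the server still treats each request independently and remembers nothing from the one before.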

This is great from a performance perspective but not so great when trying to remember who a user is from one request to the next.

Cookies

To get around this, cookies were invented. Cookies are small pieces of information in key/value pairs that are stored on the user’s machine. They get sent back and forth between the browser and the server via the headers. Now, when a request comes from a user, the server can send back a message and say, for example, “You are user 12.” Every request the user sends after that says, “I am user 12.”

There are actually two different kinds of cookies: temporary cookies and persistent cookies. A temporary cookie, also known as a session cookie, lasts only while the browser is open. Once the browser is closed, the cookie is lost for good. In both Firefox and IE, this usually means closing ALL windows or tabs. A persistent cookie, on the other hand, sticks around. When an expiry is specified in the Set-Cookie header, the browser stores the cookie on disk and keeps sending it on later visits to the site until the cookie’s expiry date has passed.
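As a rough illustration of the headers involved, here is a Python sketch using the standard http.cookies module. The cookie names, values and expiry date are invented, and the middle step only imitates what a browser does when it echoes cookies back.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie headers for a response.
outgoing = SimpleCookie()
outgoing["user"] = "12"    # no expiry: a temporary (session) cookie
outgoing["theme"] = "dark"
# An expiry makes the cookie persistent (hypothetical date).
outgoing["theme"]["expires"] = "Wed, 25 May 2005 04:25:17 GMT"

for morsel in outgoing.values():
    print(morsel.output())  # one Set-Cookie header line per cookie

# Browser side (imitated): later requests send the pairs back in a single
# Cookie header, without the expiry or any other attributes.
cookie_header = "; ".join(f"{m.key}={m.value}" for m in outgoing.values())
print("Cookie:", cookie_header)

# Server side again: parse the Cookie header to find out who is asking.
incoming = SimpleCookie()
incoming.load(cookie_header)
print("I am user", incoming["user"].value)
```

Note that the expiry only travels from server to browser. The browser decides whether a cookie is still valid and never tells the server when it expires.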

Web servers and browsers automate all of this and make HTTP largely transparent. Understanding how it works, however, can help you build more efficient and usable Web applications.

Resources

For a rather dry but informative look at HTTP, check out the W3C.