Powering the Web with HTTP
Published on June 28, 2005
In developing Web applications, as in life, it is important to understand the basics. With that thought, let’s cover one of the most basic components of the Web: the hypertext transfer protocol, or HTTP.
Using HTTP, a request is made by the browser (or other user agent) and sent to a server. The server then processes the request and sends back a response.
A request consists of a request line followed by any headers. These headers provide information about who you are, what types of content you can accept and other useful things. Web servers often capture this data in a log file that can be analyzed to track things like number of visitors and what browsers they say they are using.
Here’s an example request:
GET / HTTP/1.1
Host: www.digital-web.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3
In this case, I’m requesting the root document from www.digital-web.com and telling the server that I’m currently using Firefox.
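To make the layout of a request concrete, here is a minimal Python sketch that assembles one by hand: a request line, then one header per line, then a blank line. The build_request helper is a name invented for this illustration, and the Host header is included on the assumption that the request is sent as HTTP/1.1, which requires it.

```python
def build_request(method, path, headers):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines are separated by CRLF; an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_request("GET", "/", {
    "Host": "www.digital-web.com",
    "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) "
                  "Gecko/20050414 Firefox/1.0.3",
})
print(request)
```

The printed text is what actually travels over the connection; the browser simply builds it for you.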
When the Web server receives this request, it processes it and sends back a response. A response consists of a status line followed by the response headers.
A sample response:
HTTP/1.x 200 OK
Date: Mon, 25 Apr 2005 04:25:17 GMT
Server: Apache/2.0.46 (Red Hat)
Content-Type: text/html; charset=iso-8859-1
200 OK in the status line indicates I’ve found the document I was looking for. The headers supply additional information such as the server date, what Web server version is running and what type of document I’m about to receive. Although I left it out, the remainder of the response would have included the actual HTML document.
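As a rough illustration of how a response breaks down, the following Python sketch splits the head of a response into its status line and headers. The parse_response_head helper is hypothetical, and the sample text assumes the server reports a concrete version (HTTP/1.1) on the wire.

```python
def parse_response_head(raw):
    """Split a raw HTTP response into (version, code, reason, headers)."""
    # Everything before the first blank line is the status line plus headers.
    head = raw.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split on the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


sample = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 25 Apr 2005 04:25:17 GMT\r\n"
    "Server: Apache/2.0.46 (Red Hat)\r\n"
    "Content-Type: text/html; charset=iso-8859-1\r\n"
    "\r\n"
    "<html>...</html>"
)
version, code, reason, headers = parse_response_head(sample)
print(code, reason)
print(headers["content-type"])
```

Header names are lowercased here because HTTP treats them as case-insensitive.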
Where do I find these headers?
Browsers handle the exchange of the request and response automatically for you but it can be useful to capture this information during the development of a Web application.
Say you have a Flash application that is communicating with the server. It’s sending data to the server but something just isn’t working right. You could review the request to see which page the Flash application was requesting and what data it was sending along in the request.
For Firefox, I recommend a handy tool called LiveHTTPHeaders (its IE equivalent is IEHTTPheaders), which allows you to view, filter and capture HTTP requests and responses sent within the browser. As in the Flash example above, it can even capture requests from Flash files embedded in the page.
When making a request, there are various methods that can be used. In Web application development, you generally only work with a few of these methods: GET, POST and (sometimes) HEAD.
In coding forms, you have no doubt noticed that the method attribute can be set to GET or POST. GET retrieves a page by attaching the form values onto the URL as a query string. HTML preprocessors such as PHP and ASP will automatically parse the query string into an array or collection.
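Here is a small Python sketch of both halves of that process: attaching form values to a URL, then parsing the query string back into a collection. The URL and field names are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical form values, as a browser would collect them from a form.
form_values = {"q": "http headers", "page": "2"}

# A GET form submission appends the encoded values to the URL.
url = "http://www.example.com/search?" + urlencode(form_values)
print(url)

# On the server, the query string is parsed back into a collection.
params = parse_qs(urlsplit(url).query)
print(params)
```

Each value comes back as a list because a field name may legitimately appear more than once in a query string.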
POST behaves much like a GET request except that the form data is sent as a separate part of the request instead of as part of the URL. This allows for more complex or lengthy requests such as uploading files to the server. Like the query string, your Web development environment should handle the parsing of this data into an easier-to-use array or collection.
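To show where the form data ends up, this Python sketch builds a simple POST request by hand. The build_form_post helper, the path and the field names are all assumptions for the example. It covers only the plain URL-encoded form type; file uploads use a different encoding (multipart/form-data) that is not shown.

```python
from urllib.parse import parse_qs, urlencode


def build_form_post(path, host, form_values):
    """Build a raw POST request carrying URL-encoded form data in its body."""
    body = urlencode(form_values)
    head = "\r\n".join([
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        # Content-Length tells the server how many bytes of body follow.
        f"Content-Length: {len(body.encode('ascii'))}",
    ])
    # The body comes after the blank line that ends the headers.
    return head + "\r\n\r\n" + body


request = build_form_post(
    "/blog/add", "www.example.com", {"title": "Hello", "body": "First post"}
)
print(request)

# The server side parses the body the same way it would a query string.
head, _, body = request.partition("\r\n\r\n")
print(parse_qs(body))
```

Notice that the URL in the request line stays clean; the data rides along after the headers.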
When should you use POST and when should you use GET?
POST is best used when sending information that will change the state of something on the server. For example, say you want to add a new entry to your blog. The action and the data that go with it are a one-time event that changes the state of the database by adding a new record.
GET is best used to define a resource that you wish to be able to retrieve again at a later date, such as a search query. I’ve seen some sites handle the search via POST. The downside is that you can’t bookmark the results or send them to a friend. You would have to enter the search criteria again to view the results.
HEAD behaves exactly like GET except that the response will only include the headers. No content will be returned. It is therefore best used to determine basic information, such as whether a particular resource exists. (That said, I’ve used XMLHttpRequest to make a request and had the server send its reply in a header rather than in the body of the response.)
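The difference is easy to see side by side. The Python sketch below starts a throwaway server on the local machine and requests the same page with GET and with HEAD. DemoHandler, fetch and compare_get_and_head are names invented for this example, not part of any real application.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_head(self):
        # Status line and headers are identical for GET and HEAD.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_head()
        self.wfile.write(PAGE)

    def do_HEAD(self):
        # Headers only; no body is written.
        self._send_head()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(port, method):
    """Return (status, Content-Length header, body) for one request."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    response = conn.getresponse()
    body = response.read()
    result = (response.status, response.getheader("Content-Length"), body)
    conn.close()
    return result


def compare_get_and_head():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch(port, "GET"), fetch(port, "HEAD")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    get_result, head_result = compare_get_and_head()
    print("GET: ", get_result)
    print("HEAD:", head_result)
```

Both responses report the same status and Content-Length, but only the GET response carries the page itself.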
You can test sending headers using a Telnet client by opening a connection to a server on port 80 (the default port for a Web server). To make a simple request, type GET / HTTP/1.1 (note the space before the slash), then hit the Enter key twice. With HTTP/1.1 you should also enter a Host header, such as Host: www.digital-web.com, on its own line before the blank line: the protocol requires it, and many servers run multiple Web sites off of one IP address. You’ll get a response from the server similar to what you see above, then get disconnected.
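If you don’t have a Telnet client handy, the same experiment can be sketched in Python with a plain socket. The raw_get function is a hypothetical helper; it adds a Connection: close header so that the server hangs up as soon as the response has been sent.

```python
import socket


def raw_get(host, port=80, path="/"):
    """Send a hand-written GET request and return the raw response bytes."""
    # The Host header includes the port only when it is not the default.
    host_header = host if port == 80 else f"{host}:{port}"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"  # the blank line: the equivalent of hitting Enter twice
    )
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling raw_get("www.digital-web.com") should return the status line, the headers and the document in one block of bytes, much as Telnet would display them.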
Rarely, if ever, would you actually need to send a request manually using Telnet but if your browser is misbehaving, it’s useful to have a way to access the raw information.
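Why did you just get disconnected? Simply put: HTTP is stateless. Each request stands on its own, and the server keeps no memory of the ones that came before. In the simplest case, a connection is established to the server, the request is sent, the response is received, then the connection is closed.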
This is great from a performance perspective but not so great when trying to remember who a user is from one request to the next.
To get around this, cookies were invented. Cookies are small pieces of information in key/value pairs that are stored on the user’s machine. They get sent back and forth between the browser and the server via the headers. Now, when a request comes from a user, the server can send back a message and say, for example, “You are user 12.” Every request the user sends after that says, “I am user 12.”
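That exchange can be sketched with Python’s standard http.cookies module. The cookie name user and the value 12 simply mirror the example above; the browser’s part is simulated with a plain string.

```python
from http.cookies import SimpleCookie

# Server, first response: "You are user 12."
outgoing = SimpleCookie()
outgoing["user"] = "12"
set_cookie_header = outgoing.output()
print(set_cookie_header)

# Browser, every later request: "I am user 12."
# (Simulated here; a real browser sends this in its Cookie header.)
cookie_header_value = "user=12"

# Server, later request: read the cookie back out of the header.
incoming = SimpleCookie()
incoming.load(cookie_header_value)
print(incoming["user"].value)
```

The server sets the value once with Set-Cookie, and the browser repeats it on each request until the cookie goes away.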
There are actually two different kinds of cookies: temporary cookies and persistent cookies. A temporary cookie, also known as a session cookie or session variable, is kept only while the browser is open. Once the browser is closed, the cookie is lost for good. In both Firefox and IE, this usually means closing ALL windows or tabs. A persistent cookie, on the other hand, sticks around. When an expiry is specified in the Set-Cookie header, the browser will store the cookie on disk and send it back on later visits to the site until the cookie’s expiry date has passed.
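The only difference between the two kinds is in the header the server sends. A short Python sketch, with made-up cookie names and a 30-day lifetime chosen purely for illustration:

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Temporary (session) cookie: no expiry, so it lasts until the browser closes.
cookies["session_id"] = "abc123"

# Persistent cookie: an integer expiry is treated as seconds from now and
# written out as a date in the header.
cookies["remember_me"] = "1"
cookies["remember_me"]["expires"] = 60 * 60 * 24 * 30

session_header = cookies["session_id"].output()
persistent_header = cookies["remember_me"].output()
print(session_header)
print(persistent_header)
```

The second header carries an expires attribute with a date roughly 30 days out; the first has none, which is what makes it temporary.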
Web servers and browsers automate all of this, which makes HTTP largely transparent. Understanding how it works, however, can help you build more efficient and usable Web applications.
For a rather dry but informative look at HTTP, check out the W3C.
Jonathan Snook is a freelance web developer and consultant. When not working on one project or another, this proud father can be found spending time with his son and wife in beautiful Ottawa, Ontario, Canada.