Standards for distributed information architecture
Published on October 2, 2002
The articles in this month's iteration of Digital Web Magazine all focus on standards, and their importance to the present and future of the Web. In addition to markup standards like HTML and XML, and presentation standards like CSS, there are formats like SOAP and XML-RPC, which use existing Web standards as a basis for communication and transactions between Web sites.
However, there is currently no standard for allowing Web sites to share data with respect to their categorization, organization, and labeling. Creating standards for distributed information architecture would allow for easier and more effective combination of content, resources, and metadata across sites.
You say to-ma-to, I say to-meta-data
A few months ago, I purchased a new digital camera. My old digital camera served me well but was not capable of producing high-quality prints. The new camera has a much higher resolution and produces great photos, both on screen and on paper. I have always posted my pictures on my Web site, but, since I wanted prints of my pictures and don't have a color printer, I needed to upload my pictures to an online photo service. I chose Target Photo Center, which allows me to upload pictures to my account, order prints, and pick them up at my local Target store.
Now imagine this scenario. I have a gallery of digital pictures on my Web site, organized into several categories. One category features pictures from a recent visit to my parents' house, which I labeled "Visit Mom and Dad – Summer 2002." My father took his own pictures during my visit with his digital camera, and has these on his Web site under the heading "Jeff's visit, July 2002."
In addition to posting the pictures on my Web site, I upload several of them to my Target Photo Center account so I can order prints for myself. There isn't a Target store close to my parents' house, though, so my father opts to have his digital photos converted to prints by using ofoto.com or the Wal-Mart Photo Center.
Now, there are photos residing in five places: my Web site, my father's Web site, my section of Target Photo Center, my father's account at ofoto.com, and my father's account at Wal-Mart Photo Center. These photos need to be manually uploaded, copied, labeled, and organized.
The Web, our vague universe and network that is partially accessible by search engines and directories, sees these five locations as disparate and separate. In reality, we know that even though these five sets of pictures are in five different places, and potentially labeled under five different headings, they really aren't five different things. Though they appear different to a network of computers, we know that in fact they are all related.
What I have labeled "Visit Mom and Dad – Summer 2002" my father calls "Jeff's visit, July 2002." You say po-tay-to, I say po-tah-to; it's still the same vegetable. Or, perhaps more appropriately, I might label the tomato as a fruit, and you might call it a vegetable. Regardless of what category it's in, it's still a tomato, and we're both referring to the same piece of food.
I can purchase prints of any photo that I upload to my Target Photo Center account. I can even forward the URL to friends and relatives, and they can buy their own copies of the prints. That's a great way to do business, but it's not enough. I should be able to buy prints of the pictures on my own Web site through Target Photo Center without having to upload them to my Target account. I should be able to point my Target account towards my father's Web site, and even his ofoto.com and Wal-Mart accounts, and buy photos from all of them through my Target account.
It is not that difficult to imagine this happening. Sure, I would need to manually make the initial links between the five locations, but after that, no additional work should need to be done. If my father renames his picture collection to "July 2002 – Jeff comes to visit," my Target Photo Center account still should be able to access those pictures and understand that they're in the same collection. Similarly, if he adds more pictures to that folder, they should appear in my Target Photo Center account without my having to manually upload or import them.
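To make the idea concrete, here is a minimal Python sketch of linking by a stable identifier rather than by label. Everything in it is an assumption for illustration: the `Collection` and `LinkedAccount` classes, the `collection_id` values, and the file names are hypothetical, and no photo service is known to expose anything like this.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A photo collection as published by one site (hypothetical model)."""
    collection_id: str   # stable identifier, never shown to visitors
    label: str           # human-readable heading, free to change
    photos: list[str] = field(default_factory=list)


@dataclass
class LinkedAccount:
    """An account that follows remote collections by ID, not by label."""
    followed_ids: set[str] = field(default_factory=set)

    def visible_photos(self, published: list[Collection]) -> list[str]:
        # Links are resolved at read time, so renames and additions
        # show up without any re-upload or re-import.
        return [
            photo
            for collection in published
            if collection.collection_id in self.followed_ids
            for photo in collection.photos
        ]


dads_site = Collection("dad/2002-07-visit", "Jeff's visit, July 2002", ["dad-001.jpg"])
account = LinkedAccount(followed_ids={"dad/2002-07-visit"})  # the one-time manual link

dads_site.label = "July 2002 – Jeff comes to visit"  # my father renames the collection
dads_site.photos.append("dad-002.jpg")                # and adds a picture

photos_after_changes = account.visible_photos([dads_site])
```

Because the account holds only the identifier, neither the rename nor the new picture requires any action on my side.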
Not just rainbow connections
I define a connection between similar groupings of objects across multiple Web sites as a distributed information architecture. In the digital photo case, the information architecture encompasses photos that are distributed among several Web sites. However, as with any good information architecture, the fact that these pictures are housed in several different locations should be transparent. It is said that an information architecture (like graphic design) is only noticed when it is done poorly, and I would add that the same is true of the distributed nature of a distributed information architecture.
The basic idea of distributed information architecture with syndicated content is applicable to many purposes, commerce being a major one. Imagine, for example, that you are in a rock band that has CDs for sale on your Web site. There are probably thousands of rock bands just like yours, with Web sites that offer CDs for sale. Wouldn't it be great if I could set up an online CD store that didn't hold any inventory, didn't facilitate auctions, and didn't require you to format your information in a special way in order for it to be published on my Web site?
Using a standard for distributed information architecture, I could aggregate information about any band that provided it in the "standard" format, along with information about each CD, format it on my Web site, and allow people to buy the CDs directly from me. People would benefit by being able to purchase multiple albums from one trusted and centralized source rather than having to seek out each band individually and perform multiple transactions. When a sale occurs, I would take a small commission, send the transaction information to each band for fulfillment, and then credit the balance to the band's account when the order is sent.
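The bookkeeping for such a store can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 10% `COMMISSION_RATE`, the `split_order` function, and the band names and prices are all hypothetical, since the scenario above specifies only "a small commission."

```python
from collections import defaultdict
from decimal import Decimal

COMMISSION_RATE = Decimal("0.10")  # assumed value; the article only says "small"


def split_order(items):
    """Group one customer's order by band and compute each band's payout.

    `items` is a list of (band, price) pairs with Decimal prices.
    Returns {band: {"gross": ..., "commission": ..., "payout": ...}}.
    """
    gross = defaultdict(Decimal)  # Decimal() is zero, so sums start at 0
    for band, price in items:
        gross[band] += price

    result = {}
    for band, total in gross.items():
        commission = (total * COMMISSION_RATE).quantize(Decimal("0.01"))
        result[band] = {
            "gross": total,
            "commission": commission,
            "payout": total - commission,  # credited once the order ships
        }
    return result


order = [
    ("Band A", Decimal("12.00")),
    ("Band B", Decimal("10.00")),
    ("Band A", Decimal("8.00")),
]
payouts = split_order(order)
```

The customer makes a single purchase, while each band receives only the portion of the transaction that concerns it.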
This scenario would be similar to the ones put into practice by sites like half.com and amazon.com zShops that feature multiple sellers, except that individual retailers could provide information about their goods without having to enter the information multiple times on multiple sites. If there were many Web sites like mine, you could have your information in one standard format that would allow an unlimited number of sites to sell your goods. Plus, the choice of whether to show the individual suppliers or hide the aggregated and distributed backend would be up to whoever is running the site.
Let's go even further and say that you're an emo band. Well, on a large site like Tower Records, that would be filed under Rock. On the site of a smaller specialty store that sold only indie rock CDs, you would need more specific categorization and information. In one case, emo = rock, but in another, emo = emo. The metadata that you supply could be used as generally or as specifically as needed for each particular Web site. The architecture of the aggregating site could even be totally adaptable, with no preset categories – only facets that are based on the information collected from content and metadata providers.
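One way to picture "emo = rock" on one site and "emo = emo" on another is a small Python sketch in which the band publishes a genre path once and each site picks the most specific level it actually uses. The `genre_path` field, the `shelf_for` function, and the category sets are hypothetical, chosen only to illustrate the idea.

```python
# Metadata a band might publish once (hypothetical fields).
cd = {"title": "Example Album", "genre_path": ["Rock", "Indie Rock", "Emo"]}


def shelf_for(item, site_categories):
    """Return the most specific genre in the item's path that the site uses."""
    for genre in reversed(item["genre_path"]):  # most specific first
        if genre in site_categories:
            return genre
    return None  # the site has no category that fits this item


large_store = {"Rock", "Jazz", "Classical"}   # a general retailer
indie_store = {"Emo", "Punk", "Indie Rock"}   # a specialty shop

shelf_in_large_store = shelf_for(cd, large_store)
shelf_in_indie_store = shelf_for(cd, indie_store)
```

The same published metadata lands under Rock at the general retailer and under Emo at the specialty shop, with no extra work from the band.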
There are two important aspects to understand about such a system. First, it takes advantage of the distributed nature of the Web and is built on aggregation and syndication. Second, it is able to reference the connections between categories and allow the syndicator to provide different options, while also allowing an aggregating site or application to use its own labels and categorization.
For the most part, the first of these aspects is already possible with XML and similar accepted standards such as RSS. Similar systems based on other technologies, such as Movable Type's TrackBack, have also started to become more common. There is no acknowledged standard for the second aspect, though one has recently been proposed.
You want standards? I got yer standards...
The type of system I was envisioning was actually proposed earlier this year. eXchangeable Faceted Metadata Language, or XFML, was published on May 30, 2002. As the XFML homepage (xfml.org) states, "XFML is an open XML format for publishing and connecting faceted metadata between Web sites. ... XFML allows for easy creation of advanced, automatically generated navigation for your Web site. You can even automatically generate links to related topics on other Web sites. It also allows for merging of metadata between different Web sites."
I'll be the first to admit that, when XFML was announced, I didn't entirely grasp its point. Faceted metadata is not an easy idea to understand, especially when visualized across multiple sites, but it helps to understand how the idea could be used in practice. The photo sharing problems I faced would be an excellent application for XFML. Pictures could reside in several different locations, but through the sharing and combination of XFML documents, images from five different servers could all appear in my Target Photo Center account, and I could purchase prints of all of them, regardless of where they were stored.
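The merging step can be sketched in Python without committing to any particular file format. The sketch below deliberately does not use XFML syntax; the dictionary layout, the `merge_documents` function, the `same_as` links, the example.org site names, and the file names are all hypothetical stand-ins for whatever a real implementation would define.

```python
def merge_documents(documents, same_as):
    """Merge per-site metadata into one view keyed by a shared topic.

    documents: list of {"site": str, "topics": {label: [photo names]}}
    same_as:   {(site, label): shared_topic_id}, the manually made links
    """
    merged = {}
    for doc in documents:
        for label, photos in doc["topics"].items():
            topic_id = same_as.get((doc["site"], label))
            if topic_id is None:
                continue  # not linked to a shared topic, so leave it out
            entry = merged.setdefault(topic_id, {"labels": set(), "photos": []})
            entry["labels"].add(label)      # each site keeps its own heading
            entry["photos"].extend(photos)  # but the pictures are pooled
    return merged


documents = [
    {"site": "jeff.example.org",
     "topics": {"Visit Mom and Dad – Summer 2002": ["jeff-01.jpg", "jeff-02.jpg"]}},
    {"site": "dad.example.org",
     "topics": {"Jeff's visit, July 2002": ["dad-01.jpg"], "Garden": ["rose.jpg"]}},
]
same_as = {
    ("jeff.example.org", "Visit Mom and Dad – Summer 2002"): "visit-2002-07",
    ("dad.example.org", "Jeff's visit, July 2002"): "visit-2002-07",
}
merged = merge_documents(documents, same_as)
```

Both headings survive as labels on a single shared topic, and the unrelated "Garden" collection is left alone because nobody linked it.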
It's important to note that XFML is not yet a standard. Though it is based on the Topic Maps (XTM) standard, and thus on XML, it is still in development (v0.2 at the time of this writing, with v0.8 said to be forthcoming) as a proposed standard. XFML has many features that would facilitate the development of distributed information architectures. Authors can create and share taxonomies, merge XFML documents, manually define metadata and facets, and import XFML from other authors. The XFML Web site has a full list of features, benefits, and possible uses.
Though promising, XFML does have its limitations. Only parent-child relationships between pieces of content are possible, limiting the types of associations that could be made. It is not possible to have multiple languages (that's "real" languages such as English, French, and Spanish, not programming or markup languages) within an XFML document, as it is within XHTML. Most importantly, since it is still being developed, XFML must be created by hand and is not supported in any current software, or on any existing Web sites, for that matter. There is no guarantee that it will become a standard or even that it will be used in practice.
A standard for distributed information architecture is certainly needed. What SOAP and XML-RPC have done to allow Web services to span multiple domains, a similar standard could do to allow information architectures to span multiple Web sites and Web applications. XFML certainly has the potential to become that standard, but only time will tell. Regardless of the specific technology, the concepts and techniques for implementing distributed information architectures will remain the same, and information architects and Web developers will need to be prepared to build distributed information architectures on the appropriate standards if and when that time comes.
Jeff Lash is a User Experience Designer in the Health Sciences division of Elsevier. He is a co-founder and Advisory Board member of the Asilomar Institute for Information Architecture (AIfIA) and has also written articles and tutorials for Boxes and Arrows and WebWord. His personal website is jefflash.com.