Digital Web Magazine

Microformats Primer : Comments

By Garrett Dimon

November 14, 2005

Comments

Jules

November 15, 2005 6:28 AM

Although I like the hCard microformat, I am not sure I like the use of compact=“compact” in the XOXO format (anyone wonder why they named this microformat after “hugs and kisses”? :-)). I am going to have to ponder this some more, but my first thought was that the compact attribute was being misused as a means of hiding content (using CSS). The problem is that someone without a CSS-capable browser will see all of the content and therefore get something different from those with one. And what about screen readers that read hidden content?

michael mckee

November 15, 2005 8:25 AM

Huh? I thought I had some idea what microformats are: standardized XHTML formatting conventions for standard uses, e.g. RSS. Now I’m overwhelmed with buzzwords and quite confused.

grumpY!

November 15, 2005 8:40 AM

why are people wasting their time with this? microformats are the classic last gasp of a contrived technology movement, an answer to a question no one is asking, not even the 1% of the 1% of the 1% web2 obsessed tag/geocode/flickr maniacs.

Nick Finck

November 15, 2005 8:46 AM

I respectfully disagree, grumpY. While I do think not many within the industry are asking for microformats, I do know that a lot of clients are asking for them. Now, sure, this could be because of certain people hyping them. But the clients that have come to me asking for them had several real business cases as to why they need them. I’ve got my eye on microformats right now, but if more clients keep asking for them, they’re going to get a lot more of my attention.

Garrett Dimon

November 15, 2005 9:40 AM

Michael – I’m not sure what buzzwords you’re talking about. That is almost exactly what I said. :) The opening explanation is a little on the technical side, but I wouldn’t call them buzzwords. The rest of the article focuses on how down-to-earth Microformats actually are.

grumpY – I disagree as well. Microformats are about writing appropriate semantic markup for humans. If it can be used for machines, then that’s great, but it’s far from the sole purpose. It’s an answer to writing better, more readable code. What’s wrong with that?

Danny

November 15, 2005 9:45 AM

Nice piece. Microformats offer a very neat solution to a whole range of practical problems. I reckon they’re a good step towards moving from a Web of Documents to a more general Web of Data. The coolest bit being that they’re not inventing whole new formats for things that can already be done: simple descriptions, lists, hierarchies etc. XHTML can cover all of these nicely.

There’s one aspect that I believe deserves mention, in the context of using microformats to provide machine-readable data on the Web. An HTML document can (and should, IMHO) declare the microformats in use, providing URI(s) as the value of the profile attribute of the head element.

With profile references in place, it means that globally unambiguous data can be deterministically extracted from microformat documents. In other words, it’s no longer necessary to scrape HTML pages, you can parse them.
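To make the profile idea concrete, here is a minimal sketch of how a program might check which profiles a page declares before trying to extract anything. It uses only Python's standard library. The profile URI and the function name are hypothetical placeholders chosen for illustration, not official microformat profile addresses.

```python
from html.parser import HTMLParser

# Hypothetical profile URI, used only for illustration.
HCARD_PROFILE = "http://example.org/profiles/hcard"


class ProfileReader(HTMLParser):
    """Collect the URIs listed in the profile attribute of <head>."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            # The attribute value is a space-separated list of URIs.
            value = dict(attrs).get("profile") or ""
            self.profiles.extend(value.split())


def declared_profiles(html):
    """Return the profile URIs declared on the page's head element."""
    reader = ProfileReader()
    reader.feed(html)
    return reader.profiles


page = (
    '<html><head profile="http://example.org/profiles/hcard">'
    "<title>Contact</title></head><body></body></html>"
)

# A consumer would only run its hCard extraction when this check passes.
print(HCARD_PROFILE in declared_profiles(page))
```

A consumer that sees no matching profile can skip the page rather than guess at what its class names mean, which is roughly the difference between parsing and scraping described above.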

chris ward

November 16, 2005 7:29 AM

Well, if indeed microformats are presented as XML, we now lose the eXtensibility of XML.

Semantical markup is key, but this movement is not forward-compatible.

It looks like a buzzword that people are desperately trying to take a stake on, on the internet.

chris ward

November 16, 2005 7:43 AM

Okay, my bad…

So if RSS is a microformat, a set pseudo/industry/proven standard, then we could invent microformats for so many different objects.

How on earth would we be able to keep up with them all, or even build extensibility into them at our own pace?

The term ‘microformats’ is just synonymous with ‘best semantical practices’?

Garrett Dimon

November 16, 2005 8:11 AM

Chris – The key is that Microformats are built on standards that already exist and are in wide practice. Yes, you’re absolutely right, there are many different types of objects that could be described by their own unique Microformat, but the discussion to establish a specific standard is rigorous.

As for keeping up with them all, there’s a list of current Microformats as well as discussions and notes regarding future Microformats at Microformats.org.

XML’s strength, extensibility, is also its weakness. With that extensibility comes complexity. Additionally, if everyone is creating new tags on a whim, there would be very little consistency in markup from one site to the next. That means less semantic value for search engines and other spiders and more of a mess.

The discussion and process for establishing a Microformat is thorough, open, and driven by real needs. The discussion for a Microformat never begins with a hypothetical situation. It’s only after someone presents a real-world example on the discussion list that the conversation gets started, and that conversation can continue for weeks, if not months, before that Microformat is officially adopted.

Best of all, they’re so simple and practical, you can use them now without needing to learn anything new.

karl

November 16, 2005 8:47 AM

There is one misleading comment in the article which might disappoint people playing with microformats.

Garry, you said:

“Providing explicit metadata requires extra effort and has traditionally been extremely challenging or neglected altogether. With microformats, implicit metadata is provided by virtue of using the right markup. It won

Garrett Dimon

November 16, 2005 9:35 AM

Karl – You are partially correct. Some of the data that would be exposed would still need to be provided manually.

Yes, the data is made explicitly available, but not in the traditional ways; i.e., it’s about exposing existing data in an appropriate format instead of creating that data. Like I said in the article, it won’t replace the need for metadata, but it serves as a great supplement.

karl

November 16, 2005 10:11 AM

In the traditional ways, as you meant ;), there is a very small overlap between data in the text and data in header (I suppose it’s what you are talking about).

Author: Being in the header or in the page. No overhead. Just a question of tools and templates.
Email: Same. :)
Copyright: Same :)
Title: Same :)
Keywords: Typically they were in the header in the old days, and people have started to include them in the page itself. Unfortunately, in a non-vendor-neutral way. But basically the same effort.
Calendar: They have never been in the meta of the head section. So that’s additional work.
Address: They have never been in the meta of the head section. So that’s additional work.

Be careful, I’m not saying that microformats are bad. Quite the opposite: they will add more explicit information, but it will require a bit more work. That was the only point of my comment. Someone new to it might think: “Oh cool, I will have less work to do,” when it’s more like “I will have different work to do, and maybe a little bit of extra work.”

In the end, everyone will win because there are explicit metadata/data :) which means parsable, which means you can reinject them in many frameworks. :)

Garrett Dimon

November 16, 2005 10:35 AM

Karl – Agreed. :)

Brian Reindel

November 16, 2005 2:04 PM

“XML’s strength, extensibility, is also its weakness. With that extensibility comes complexity. Additionally, if everyone is creating new tags on a whim, there would be very little consistency in markup from one site to the next. That means less semantic value for search engines and other spiders and more of a mess.”

Very informative article, but I’m going to have to disagree with the quote above. Search engines like Google are not only categorically crawling for content (which theoretically is easier to find if code is semantically well-formed), but they are seeking to give it weight or relevance. Microformats cannot provide this kind of relevance, only structure. This is something XML already does quite well.

There are also the cross-cultural and business-sector variables to think about when determining a proper nomenclature for any microformat. Who determines what gets to be “standard” in my neck of the woods in my particular line of work? A slight dialect change from California to Texas all the way over to Amsterdam can cause havoc.

Predefining microformats I believe only works on an organizational level. In which case, I think they are a great idea. Think of a company like General Motors or Sony – they constantly develop Web applications that meet the needs of various third-parties doing business with them. Defining microformats for information that is likely to be shared across those applications and amongst those third parties can be extremely helpful. However, IMHO, I would say they should stay out of the mainstream.

chris

November 16, 2005 4:02 PM

I can see the need to reach the zen of semantical markup in mostly every situation, but I’d also like to point out that these are just recommendations.

RSS is pretty well-founded now, and considered a ‘standard’ microformat, so are there any other microformats on the rise to look out for?

Garrett Dimon

November 17, 2005 7:20 AM

Brian – What I meant by less semantically valuable is that if search engines have to understand 20 different dialects, they are far from being able to match them all up. XML and new DTDs mean more people making more “standards” until there are none, and the subtleties of creating a good XML schema are lost on most.

Not surprisingly, this is the same problem with tagging and folksonomies. Is it NYC, New York, New York City, The Big Apple, or NY? You’ll see tags for them all, but Flickr or any app doesn’t have the awareness to realize they are all the same thing. If everybody is creating their own tags (literally) through XML, then there’s less value than if search engines only have to interpret one standard set of tags and markup.

As for the cross-cultural and business sectors, all of the microformats I’ve seen are well insulated from those discrepancies. The group that’s working on defining them isn’t isolated to one region or culture. The discussions for these microformats happen over e-mail and a wiki where anyone from around the world can provide ideas and suggestions.

Chris – RSS isn’t a Microformat. It doesn’t rely on existing HTML tags and attributes. It’s really made up of several different XML schemas (RSS, RSS 2.0, Atom). It might be transparent to most common users, but developers of feed readers must support all of the different formats. The amount of flexibility in XML encourages fragmentation of what otherwise could be a very simple standard. As for additional Microformats, there’s a list of current and upcoming Microformats at the official web site.
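As a rough illustration of what supporting several feed formats means for a feed reader developer, the sketch below pulls entry titles out of either an RSS 2.0 or an Atom 1.0 document using Python's standard library. The function name and the sample feeds are made up for the example, and it deliberately ignores other feed flavors.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 elements live in this namespace; RSS 2.0 elements have none.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def entry_titles(feed_xml):
    """Return entry titles from an RSS 2.0 or Atom 1.0 feed (sketch only)."""
    root = ET.fromstring(feed_xml)
    if root.tag == "rss":
        # RSS 2.0: <rss><channel><item><title>
        return [item.findtext("title", default="") for item in root.iter("item")]
    if root.tag == ATOM_NS + "feed":
        # Atom 1.0: <feed><entry><title>, all namespaced
        return [
            entry.findtext(ATOM_NS + "title", default="")
            for entry in root.iter(ATOM_NS + "entry")
        ]
    raise ValueError("unrecognized feed format: " + root.tag)


rss_doc = (
    '<rss version="2.0"><channel><title>Site</title>'
    "<item><title>First post</title></item></channel></rss>"
)
atom_doc = (
    '<feed xmlns="http://www.w3.org/2005/Atom"><title>Site</title>'
    "<entry><title>First post</title></entry></feed>"
)

# The same information needs a separate code path per format.
print(entry_titles(rss_doc) == entry_titles(atom_doc))
```

Every additional format means another branch like these, which is the fragmentation cost being described.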

Joshua Porter

November 19, 2005 12:39 PM

“XML’s strength, extensibility, is also its weakness. With that extensibility comes complexity.”

The only reason why microformats are not overly complex is that they are controlled by several developers led by the folks at Technorati. Call it a peer review or what you want, but it’s humans that make them simple, not the fact that they’re dealing with XHTML or because people can’t just make up their own tags.

All complexities in any efforts to create a new XML format are nearly the same as any effort to create a new microformat.

I think that Dave Winer’s move of freezing RSS 2.0 and the great work being done by the Atom folks are testaments to that. They’re keeping complexity issues at a minimum, and creating XML formats that are enabling a lot of folks to do great things. Microformats are but a part of the overall semantic markup picture.

Remember, software has to be written to recognize microformats just as it would be for specialized XML formats like RSS. Otherwise, it’s just meaningless markup…you don’t just throw microformats into a browser and have additional functionality. You’ve got to recognize them somehow…and browsers don’t do that by default. This gets back to Danny’s comment above.
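For a sense of how much software that takes, here is a minimal sketch that recognizes one small piece of hCard: the text of any element with class "fn" nested inside an element with class "vcard". Those two class names come from hCard; everything else (the function name, the sample markup, the example.org address) is an assumption made for illustration, and a real parser would handle far more properties and messier HTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class HCardNames(HTMLParser):
    """Collect the text of class="fn" elements nested inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.stack = []      # class lists of the currently open elements
        self.names = []      # formatted names found so far
        self.buffer = None   # text pieces of the fn element being read
        self.fn_depth = 0    # stack depth at which that fn element opened

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        inside_vcard = any("vcard" in open_classes for open_classes in self.stack)
        self.stack.append(classes)
        if "fn" in classes and inside_vcard and self.buffer is None:
            self.buffer = []
            self.fn_depth = len(self.stack)

    def handle_endtag(self, tag):
        if tag in VOID or not self.stack:
            return
        if self.buffer is not None and len(self.stack) == self.fn_depth:
            # Closing the fn element: join its text and normalize whitespace.
            self.names.append(" ".join("".join(self.buffer).split()))
            self.buffer = None
        self.stack.pop()

    def handle_data(self, data):
        if self.buffer is not None:
            self.buffer.append(data)


def formatted_names(html):
    """Return the fn values of hCards in well-formed markup (sketch only)."""
    parser = HCardNames()
    parser.feed(html)
    return parser.names


page = (
    '<div class="vcard"><a class="url fn" href="http://example.org/">'
    'Jane <b>Q.</b> Example</a> <span class="org">Example Co.</span></div>'
    '<p class="fn">not in a card</p>'
)
print(formatted_names(page))
```

The sketch assumes well-formed markup with matching open and close tags. Without code along these lines, the class names are indeed just markup.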

Also, by saying that Flickr can’t resolve the similarity between The Big Apple and New York City you are making the incorrect assumption that they’re the same, and missing the point of folksonomies. Folksonomies work precisely because they’re not a centrally-controlled taxonomy. They work because of their ability to allow context, in the same way that the big apple I’m currently paying attention to is the Powerbook I’m typing on.

Garrett Dimon

November 21, 2005 6:27 AM

Joshua – You’re absolutely right that it isn’t easy to come up with Microformats. However, they are dramatically easier and simpler for most people to use because they are familiar with the elements involved. There are no new tags to learn or understand, and no need for a schema to be involved, etc.

As for the folksonomy issue, I fully recognize that you can’t make that assumption, and I do understand the point of folksonomies. However, when someone does mean for those to mean the same thing, you’ve now lost the connection. As with many things, it’s a double-edged sword.

Say for instance, the one important document I need isn’t tagged using the term I expect. I’ll never know that and there’s no way to predict it. As such, I’ll never find it. The tags can serve as a good supplement, but they won’t cover all of my bases.

Edward

November 21, 2005 10:29 PM

Just a little feedback from someone who had never heard of microformats or hCard. Just don’t mention hCard! One at a time, man! It makes it so much more confusing to have hCard in your explanation and code example. It’s a primer, so use something the majority understands.

Maybe have hCard linked on a footer as an additional or related reads.

I think this is an intro to a concept many web designers concerned with standards already know, only we might not have a name for it … Microformats? ... sigh … (another fancypants term) my mind is still rejecting that name =)

Joshua Porter

November 22, 2005 2:57 AM

“However, they (microformats) are dramatically easier and simpler for most people to use because they are familiar with the elements involved.”

Dramatically easier and simpler than what? RSS? The power of XML is that we can be more descriptive when necessary. So, if I was creating my own XML format, I might create a <paragraph> tag instead of using the <p> tag. This is not harder. In fact, it’s easier and more descriptive. Like Steve Krug says…“Don’t Make Me Think”. In general, we have the ability to create much richer and more obvious tags.

Also, I see it as bothersome that we have to stick to a tag set (HTML) that was invented for high-energy physics papers, where you could (presumably) rely on headers and paragraphs as the primary elements for every document. But the intervening 15 years have shown that the ways in which we use HTML far outstrip the use of the provided tag set. Does a header tag belong in a web application? I don’t think so. We need new elements that support our current applications. While it is admirable to re-use where necessary, it’s kind of like seeing everything as a nail when all you’ve got is a hammer.

Further, microformats use tags like the <address> tag, which have historically confused the heck out of people.

“However, when someone does mean for those to mean the same thing, you’ve now lost the connection. As with many things, it’s a double-edged sword.”

Agreed. It is a double-edged sword. I would argue, however, that this is like someone saying that observing the linking patterns of people could never provide enough value for us to find what we need. But Google does this each and every day. While in theory this is a lossy application of metadata, in practice it’s amazingly powerful and most importantly it’s “good enough”. And since most folksonomies allow people to add as many tags as they want, if they think it’s important to tag something “NYC” and “New York City”, they’re certainly free to do that. The problem is when we start making that assumption for them.

“Say for instance, the one important document I need isn’t tagged using the term I expect. I’ll never know that and there’s no way to predict it. As such, I’ll never find it. The tags can serve as a good supplement, but they won’t cover all of my bases.”

You’re missing stuff right now that you don’t know exists. I don’t see folksonomies as creating that problem. In fact, the bottom-up action of them tends to uncover items like this, not bury them. To borrow a phrase from Eric Raymond, the famous open source guy, “given enough eyeballs, all bugs are shallow”. This means that since people are the judges of quality, they’ll eventually float the important stuff to the top. It’s a leap of faith, I know, but it’s one we’ll all make sooner or later.

John

November 22, 2005 7:20 PM

I’m a little confused about something. One of the selling points of css and semantic code is the file size reductions. We’re taught, use lean-code, avoid classes when you can access the element via a parent element. Doesn’t the use of microformats sorta unleash classitis?

But then I got thinking. The HTML tag set is fairly limiting. One reason for microformats is to create a set of tags to describe data that standard HTML tags cannot. Should I be using classes (or IDs) to help make my code easier to understand for the next developer who comes along to work on my site, regardless of whether that class has style information associated with it in the stylesheet?

Douglas Clifton

November 23, 2005 7:49 PM

I wouldn’t go as far as using an unnecessary class on an element if you aren’t using it for presentation purposes. The basic idea behind microformats is to attach additional metadata to an element that otherwise might not have any, but to do so in a well-defined way per a specification. Although I have yet to see any evidence of this, the long-term goal of this idea is not so much for humans, but to allow the markup to be machine (software) readable. In other words a “semantic web” bot comes along and slurps up the contents of your (insert your favorite microformat spec here).
