Use formats instead of microformats
The Semantic Web continues to break new ground, and Web 3.0 seems to be a term that people associate with it. In the backwaters of semantics, microformats aim to develop standards to embed semantic information into XHTML. I can't help but think that's strange.
One of the principles of microformats is to "design for humans first, machines second". Still, almost all formats are about adding span tags and class or rel attributes to existing XHTML. Humans will never see those, or benefit from them, unless there's some kind of machine parsing done on top. Microformats were first built by people working with the blog search engine Technorati, one of the reasons being to make it easier for Technorati to aggregate data from those blogs. So machines it is.
Thing is, if you're going to give information to machines, why not use vCard instead of the equivalent microformat hCard? hCard is just a translation of vCard into XHTML. vCards open in your e-mail program, allowing you to save the contact information there, hCards don't open anywhere. vCards are also just as easy (probably easier) to crawl and parse as microformats.
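To make the "translation" concrete, here is a minimal sketch of reading an hCard out of a page and writing it back out as a vCard. The sample markup, the HCardParser class and the to_vcard helper are made up for illustration. Only three properties are handled, and nested markup inside a property is not. vCard 4.0 is assumed because it requires nothing beyond VERSION and FN.

```python
from html.parser import HTMLParser

# Hypothetical sample page fragment. The class names (vcard, fn, tel, email)
# are the hCard names for the corresponding vCard properties.
HCARD_HTML = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="tel">+1-555-0100</span>
  <a class="email" href="mailto:jane@example.com">jane@example.com</a>
</div>
"""

# hCard class name -> vCard property name (only the three used in this sketch).
PROPERTIES = {"fn": "FN", "tel": "TEL", "email": "EMAIL"}


class HCardParser(HTMLParser):
    """Collects the text of elements whose class is a known hCard property."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # property whose text we expect next

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in PROPERTIES:
                self._current = name

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        # Simplification: a property ends with its own element.
        self._current = None


def to_vcard(fields):
    """Serialise the collected fields as a vCard 4.0 string."""
    lines = ["BEGIN:VCARD", "VERSION:4.0"]
    for name, prop in PROPERTIES.items():
        if name in fields:
            lines.append(f"{prop}:{fields[name]}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"


parser = HCardParser()
parser.feed(HCARD_HTML)
print(to_vcard(parser.fields))
```

The same three values appear in both representations. The difference is only whether they sit inside the page's markup or in a file of their own.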
So what I'm saying is, could we please use real formats instead of microformats?
Update: This article was too fuzzy, so let me clarify: This discussion is about embedded formats vs. formats. The "vs." comes from the fact that lots of sites that implement microformats choose not to implement the corresponding format, which in some cases leads to people not being able to use the extra information.
Comments
By: Nick Dunn (#1)
Take hCard for example. If you want to provide a basic vCard, then it's likely your contact page already contains your name, telephone number and e-mail address. Why copy this information into a separate file just to expose it to machines? Don't Repeat Yourself, surely.
Similarly with hAtom (marking up blog-type content), this saves having to create separate functionality for creating syndication feeds for RSS1, RSS2.0, Atom, plain XML and so on. You have your data marked up once, and it can be parsed and converted as appropriate. This is where some microformat XSLT parsing would be very useful.
Technically speaking, with my contacts example, if the details were stored in a database then generating a vCard file from this data would alleviate the data-duplication issue. However not all sites run from a CMS and very often this information is buried in the HTML.
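As a rough sketch of that database case: given a contact record, a vCard file can be generated on request, so the data still lives in one place. The record layout (name, phone, email keys) and the function names below are assumptions made for illustration, not taken from any particular CMS. vCard 4.0 is assumed, where FN is the only required property besides VERSION.

```python
def escape_text(value: str) -> str:
    """Escape the characters that are special inside vCard text values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def contact_to_vcard(contact: dict) -> str:
    """Build a vCard 4.0 string from a hypothetical contact record."""
    lines = ["BEGIN:VCARD", "VERSION:4.0", f"FN:{escape_text(contact['name'])}"]
    if contact.get("phone"):
        lines.append(f"TEL:{escape_text(contact['phone'])}")
    if contact.get("email"):
        lines.append(f"EMAIL:{escape_text(contact['email'])}")
    lines.append("END:VCARD")
    # vCard lines are separated by CRLF.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical row as it might come back from a contacts table.
row = {"name": "Doe, Jane", "phone": "+1-555-0100", "email": "jane@example.com"}
print(contact_to_vcard(row))
```

The same record could feed the HTML template for the contact page, which is the sense in which neither output is written by hand.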
That's the simplified beauty of microformats.
By: Emil Stenström (#2)
vCards are very rarely made by hand, so it wouldn't be repeating oneself. Rendering machine data from a machine makes more sense to me.
By: Siegfried (#3)
Besides that, vCard is indeed a long-established format that should be used. And there is already a proposal for that. It is part of the RDFa proposals (see http://www.w3.org/TR/2006/WD-xhtml-rdfa-primer-20060516/ ). But RDFa is for XHTML, not for HTML. So, in short: the hCard microformat is for now, RDFa/vCard is for the (near?) future.
By: Emil Stenström (#4)
By: Siegfried (#5)
To repeat it: Embedding vCard in xhtml the correct way is via RDFa. But that is limited to xhtml. It is not possible for html. So embedding vCard the RDFa way is for the future, embedding hCard into html is for now. Both methods have their usage and their place. So there is no "versus". As long as you stick to html, you have to stick to hCard. As soon as you switch to xhtml you have the option to switch to RDFa/vCard.
In the title of this article there are two nonsense points. First is claiming some "versus" between microformats and vCard. Second is implying that microformats is no "real" format. It is a real format the same way RDFa is.
The remainder of the article is mostly correct. And i agree that vCard is the way to go - in the (near?) future.
By: Gert (#6)
By: Pelle (#7)
Sure - it needs spans or other tags, but the content would most of the time be printed on the page anyway, so adding a tag that makes the computer understand it doesn't take much effort.
And by the way - hCards open as vCards very easily. Check out Operator, Tails and the Microformats bookmarklets - they all export as vCards.
Real formats are good - they are better than microformats. But microformats are better than nothing and don't exclude real formats.
By: Emil Stenström (#8)
Development is about priorities, and I agree that adding both would be a good idea. Real-world examples contradict that, though: people who use microformats tend to refrain from making the data available in non-embedded formats.
@Gert & Pelle: Firefox extensions make them usable, but far too few people use them to motivate a developer to implement them. A big search engine like Technorati could of course crawl and search for them, but even then: crawling for non-embedded formats would be easier.
By: Siegfried (#9)
But a lot of water will flow down the Rhine before we reach that day. Just consider that the majority of the web designers of today still stick to the concepts and methods of HTML 3.2. Even if they use the HTML 4 doctype, they still code and design like back in the old days of HTML 3.2. So don't expect the adoption of semantic markup, in whatever standard, too soon. And if there is some minor adoption of the bare idea of web pages having not only a "look" but also semantics, the first step will be small. And microformats allow for the smaller first step.
By: Sarven Capadisli (#10)
or check out the principles, goals, criticism or faqs on the microformats Wiki: http://microformats.org/wiki
Hope that helps.
By: Jesse Skinner (#11)
By: Emil Stenström (#12)
@Sarven Capadisli: Instead of mindless self-promotion, tell me what I've misunderstood.
@Jesse Skinner: Well, perhaps DRY is a reason, could be. But wouldn't it make sense to store contact information some other way than as an HTML chunk in your database? And if you generate a vCard from that, there's very little reason to do hCard too.
By: Sarven Capadisli (#13)
If all the comments above are telling you that your views are incorrect because of misconceptions, perhaps it is time to learn what microformats are really about, or what they are really trying to solve at the end of the day. In any case, the fact that you don't like microformats (or perhaps feel that you are left behind and are trying to make up a case for it) shouldn't get in the way of educating yourself on them. Don't expect people to give you all the answers if you are not willing to put in the time to learn.
But for the sake of the argument, I can point out a few more things about your post that are wrong, on top of the above comments:
Microformats is not trying to replace existing formats. It is a way to use existing formats in (X)HTML. Whether you want to use a separate file to keep track of a vCard or not is totally up to you. Note that, since you are already writing HTML, you might as well use certain names in your HTML that match those in widely accepted standards. Because that way, you can pull off a vCard by having a single instance of the data. Are you huffing about names like "entry_title" being useless, and you'd rather go for "my_foo_title" instead? What's the problem? If you are going to use a name, you might as well use a standard name and have the advantage of parsers, among others, being able to understand your document. Maybe you don't want to do that. That's your call.
You are thinking code bloat? You need to learn about writing HTML on different environments and how they differ at the end of the day. Cases where you need to move code around or keep consistency and define things on a granular level. Above all, maintenance. microformats can lead you in that direction, if, of course, you choose to understand how to make use of it in your own work.
Technorati is not in charge of microformats by any means. Many formats have been developed after XFN and are still being researched and developed by an open community. I take slight offense to this because I contribute my share. Believe me, the discussions are a lot more complex than anything under this URL. The community carries out analysis and provides constructive feedback.
You've clearly missed the point of "design for humans first, machines second". The idea is to mark up "visible" data that we are already providing to humans so that the machines can also understand it. The tags and attributes that you speak of are a way to do that. It is like focusing on "social tagging" instead of meta-keywords. Can you guess why meta-keywords were dropped in favour of social tagging? microformats favors visible data as opposed to meta-data, keep that in mind.
microformats is not a new language, nor is it trying to revolutionise the way we work. It is a step in the right direction. It will not solve all our problems, but it will get us 80% there because it is pretty reasonable right now. microformats is not competing against RDF(a). They are meant to solve a similar (but quite different) problem in a different way. If you want to cover your "Semantic Web"ness, perhaps microformats is not for you. If you want a way for machines to understand your content on the existing Web, then microformats is for you. You also need to understand the state of the Web though. Don't expect to go from zero to "Semantic Web" overnight. microformats can help you bootstrap it though.
Consider this a freebie. Now go read about it instead of making uneducated claims because you are adding noise to the Web without doing proper research.
(This comment form is not very user friendly: the dimensions of the container are too small for comfortable writing. I am actually typing this out in a text editor and will paste it back.)
By: Siegfried (#14)
O.k., HTML, or, in the (hopefully near) future, XHTML, is _the_ web format. Instead of tons of proprietary and very different formats in tons of files, it is indeed better to have them embedded into a single file. The main advantage is that this could then be used by humans by simply reading it. It could be used even by humans who do not know of vCards and the like. You could just sit down in front of the monitor, take a pen and paper and write down that telephone number to call that person. You could try that with vCard, too. And if you know enough about computer data you will probably succeed. But the average noob is far better off with a visually appealing, nice web page.
On the other hand, embedded data can be extracted by programs to automatically enhance the usability and usefulness of that information. You could still write down the address information on paper and then add it to your e-mail client manually. But it adds to usability if the computer does it for you with a simple click.
Additionally, this information, let's say an address, is, if embedded into a web page, within a context. The address information is part of the complete information of that page. If you extract that address information, you get a naked address. For what purpose is this address? Why do you have it? The vCard data format provides no information about that. A web page does. This contextual information is completely lost when you extract this piece of data.
Of course you may _link_ to that standalone data from your web page. This has advantages and disadvantages. The context is not available ad hoc as when embedded. But you get immediate access to a format ready to use, without some program involved. To combine both, I have done both on my impressum page. I have the address information on the page marked up with hCard, and have a link to a pre-made vCard file. Often, it's not a question of this _or_ that; best practise would be to offer both.
BTW: Since I already offer my pages in HTML and XHTML, I'm currently working on switching the XHTML files to RDFa while still using microformats on the HTML versions :)
By: Emil Stenström (#15)
By: Emil Stenström (#16)
By: Emil Stenström (#17)
I see what you mean, and what you're getting at, but I still don't agree. One thing you're saying is that the future is one format. But webpages of today already consist of many different formats. We have HTML, CSS, JS, PNG, JPG, SWF, and so on. Each format has a certain specific thing it accomplishes and it does it well. With other file types, the user gets to decide what to do with them. If you click a vCard file you get a friendly Outlook "add contact" window. It works, for real users today.
Good point about context, I didn't think about that. As you say, it's a bit harder for machines to resolve things based on links, but they probably still will have to do that across different pages.
I agree that doing both, as you have on your impressum page, is the best way to do things right now. But if one of my clients gives me a couple of hours and asks me to make their contact information page more usable by ordinary people, I'd still pick vCard. First.
By: Sarven Capadisli (#18)
(Implementing all those corresponding "real" formats is not easy. Otherwise, we'd see it more often. Implementing microformats is simple and we let the scripts do that extra work for us. This is trying to solve a real-world problem and it is not meant to solve all problems either.)
By: Thuy (#19)
Am I misunderstanding this? Is there a way to display a vCard directly on the page? I thought the point of microformats was to get the best of both worlds: microformats can be manipulated by machines/programming while not requiring extra interaction from the user.
By: Emil Stenström (#20)
By: Siegfried (#21)
Just to be precise: I do not think that there will only be one format. This one format, html, will just be the one main container. Just as it is a container for formats like jpg, gif and png (and others). Embedding these formats is simple, well known and approved.
Now about embedding other formats. Or not embedding them but linking to them. These are the 2 options you have. If you embed it, you have to adapt it to what is possible in your container. The container is html. If you do that, you have the information immediately embedded into a web page's whole context. There is no need for any human to do anything additional to see this information and to recognize it within its context. The drawback is indeed that you need some extra computer functionality to get that information in its standalone format. This is why microformats per se are not very useful. They become useful if there are functions to extract and convert them.
If you do not embed them, but link to them, the advantage is that you get the information directly (well, mostly directly, you have to do an extra click) in the format usable in your programs. That is nice, and you do not need any script snippet to extract that for you. But the information is out of context. And there is an extra action needed from the human in front of the computer.
So both have their advantages and disadvantages. And I think we both agree that the best way would be to offer both, to combine the advantages of both while getting rid of the disadvantages. But then there is still no "versus" between the two methods.
And last point: I personally think that for the future we will have one basic format for all kind of meta data: RDF. This could be very well embedded into xhtml, and it could as well be a perfect standalone format. So with rdf you have a format combining the advantages of all we have today.
By: Emil Stenström (#22)
By: Siegfried (#23)
I'd prefer the embedded version. But this is indeed debatable.
By: Stephanie (#24)
True enough. However, they don't need to. Technorati provides a service through which HTML authors can generate vCards from hCards by adding a few parameters to a link. It's relatively simple (especially with an example to work from) for semi-trained authors to use.
Rather than an end to microformats, which are so easy to create, we need more format conversion tools like Technorati's service -- preferably open source scripts that can be run locally.
By: Emil Stenström (#25)
By: Stephanie (#26)
By: Your Internet Classroom | Coffee on the Keyboard (#27)