Why XHTML is a bad idea
XHTML is often mentioned when you talk about web standards of different kinds, and many believe that it's the future of the web. I'm of a different opinion and this article explains my reasoning.
- Some HTML history
- XML enters the scene
- "Let's make HTML work like XML"
- Problems with using XML on the web
- We can't expect beginners to use XHTML
Some HTML history
A long time ago, in 1990, the first pieces of HTML came into use. It was built specifically for scientific documents and contained nothing but structural elements. The idea was that the user should be the one deciding what a certain document looked like; after all, you read reports for their content, not for their looks. The target audience for HTML was pretty clear: scientists and other computer-literate people, programmers in some sense.
The web soon became mainstream. Everyone has now surfed the web and lots of people have their own websites. But far from everyone cares about code quality, and most websites contain serious code errors. Despite the errors, it's worth remembering that most sites "work", that is, they display the way their authors want. All thanks to the error handling of current browsers.
An angry community of programmers/webmasters has complained about bad code since the beginning and demanded that we force people to only produce valid code. I've been a part of this group in the past…

XML enters the scene
Around 1998 the specification for the XML language was released. XML is a language that makes it easy to construct your own languages. Think of it like HTML, but where you make up your own tag names, and where errors are not allowed. The programmers took it in as their new favourite and it spread quickly.
XML has very precisely defined error-handling (unlike HTML): when the parser finds something unexpected it just stops and displays an error. This means two things: it makes editing XML documents closer to "real programming" (if you make a small error it doesn't compile), and since you don't need code for error-handling the parsers become both faster and easier to write. As you can imagine the programmers felt at home.
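To make the difference concrete, here is a minimal sketch in Python (chosen only for illustration; it is not what browsers run). It feeds the same slightly broken markup to a strict XML parser and to a forgiving HTML tokenizer from the standard library. The sample markup and the helper names `parses_as_xml`, `TagCollector` and `tags_seen_as_html` are made up for this example, and Python's `html.parser` only stands in for the much more elaborate error recovery in real browsers.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical markup with one small mistake: the <b> element is never closed.
BROKEN = "<p>Hello <b>world</p>"


def parses_as_xml(markup):
    """Return True if the XML parser accepts the markup, False if it stops with an error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Forgiving HTML tokenizer that simply records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_seen_as_html(markup):
    """Feed the markup to the forgiving parser and return the start tags it found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

With these helpers, `parses_as_xml(BROKEN)` is false because the XML parser gives up at the mismatched tag, while `tags_seen_as_html(BROKEN)` still walks through the whole string and reports both elements.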
"Let's make HTML work like XML"
W3C was founded and the programmers from the angry HTML community had made an impression on them. They decided to do something about the lousy code people wrote and standardized a new language for the web. XHTML takes the tags from HTML but adapts the language so that it becomes compatible with XML. The result is a language that can (and should) be parsed with an XML parser.
So all is well then? No. As you poke around you'll soon notice that it's pretty damn hard to get XHTML parsed with an XML parser in current browsers. Let me explain: to decide what parser to use you need to send the correct MIME-type from your server. If you're using PHP you can do it like this:
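<?php
// Serve XHTML as XML only to browsers that say they accept it.
// The Accept header may be missing, so fall back to an empty string.
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
if ( stristr($accept, "application/xhtml+xml") ) {
    header("Content-Type: application/xhtml+xml");
}
else {
    // Everyone else gets the forgiving HTML parser.
    header("Content-Type: text/html");
}
?>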
That pretty much asks the browser if it can handle XML and, if it can, sends the XML MIME type "application/xhtml+xml". If it can't handle it (Internet Explorer 6 and 7 can't) it gets fed the old MIME type "text/html" (and errors are tolerated and corrected).
But that’s not the only change you need. In late 2004, Ian Hickson (Hixie) wrote Sending XHTML as text/html Considered Harmful (quite technical). As you read through it you see that you have to change a lot more than the MIME type if you want your XHTML to work as intended. Summary: doing it the right way is hard.

Problems with using XML on the web
So XHTML is hard to get parsed the intended way in current browsers. Instead most people using it decide (or don't know otherwise) to parse it as if it were HTML. But doesn't that defeat the biggest reason to use XHTML in the first place? The only big difference between HTML 4.01 and XHTML is that XHTML can be parsed as XML! As long as you parse your code as HTML there's no reason to use XHTML.
If we have a look at the specification of XHTML there's a little table displaying which versions of XHTML should be sent with which MIME type. You'll see that it's ok to send XHTML 1.0 as text/html (may). But looking forward to later versions you see that they "should not" be sent as text/html. So pretty soon, parsing XHTML as HTML will not be allowed if you want to follow the standards. That leaves parsing it as XML.
Imagine some kind of dynamic site that allows readers to add content to it. The comments of this site are a good example. If I used XHTML (and parsed it properly) and someone used invalid code in the comments the page would break. New users to the site would get a big and ugly error message with a line number and some code. Totally unacceptable. So I would have to find a way to parse the HTML in my comments and fix errors that could break the page. Ok.
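As a rough idea of what such a check could look like, here is a minimal sketch in Python using only the standard library. The function name `safe_comment_fragment` and the dummy `<div>` wrapper are assumptions made for this example. It only guards well-formedness: it does not make a comment safe to display, and a real site would need a proper sanitizer on top.

```python
import html
import xml.etree.ElementTree as ET


def safe_comment_fragment(comment):
    """Return the comment unchanged if it is well-formed XML, else an escaped copy."""
    try:
        # Wrap in a dummy root so plain text and several top-level elements are allowed.
        ET.fromstring("<div>" + comment + "</div>")
        return comment
    except ET.ParseError:
        # Escaping turns stray < and & into entities, so the page stays well-formed.
        return html.escape(comment)
```

A comment like `nice <em>post</em>` passes through untouched, while `nice <em>post` or `AT&T` comes back escaped instead of breaking the page.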
Next I copy-paste some text from a site I want to quote to you. When I publish the article I get a big ugly error message because the site I pasted used another character encoding and that breaks my XHTML. I research the problem and find that this could be fixed by parsing all text in my admin console and making sure it's valid UTF-8 before storing it in the database. Ok.
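The encoding check could be sketched roughly like this in Python. The function name `to_valid_utf8` is made up for this example, and the fallback to Windows-1252 is purely an assumption about where pasted text tends to come from; a real admin console would have to decide that for itself.

```python
def to_valid_utf8(raw, fallback_encoding="windows-1252"):
    """Decode bytes as UTF-8, fall back to another encoding, and return UTF-8 bytes."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: text that is not valid UTF-8 was copied from a Windows-1252 page.
        # Bytes that are undefined in the fallback encoding become U+FFFD.
        text = raw.decode(fallback_encoding, errors="replace")
    return text.encode("utf-8")
```

Text that is already valid UTF-8 is returned as it was; anything else is reinterpreted with the fallback encoding before it is stored.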
Next I download a bit of javascript and try using it on the site. Again people get an ugly error message in their face when they enter. It seems javascript is handled a lot more strictly in XHTML and inline javascript needs some strange CDATA characters at the beginning and end of the script element. Ok, so I fix that too.
And like that it continues: tiny differences in code make the page break over and over again, and bugs in the parsers I write make people able to break my site by posting content. I have a computer science education behind me, so I could probably fix the errors and keep patching the site until everything works. But do I want to? What's wrong with the current way of fixing errors when I notice them when validating? And what about all the non-programmers?
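Why the CDATA markers matter can be shown with a small Python sketch. The script text and the helper names `inline_script` and `is_well_formed` are invented for this example; the commented-out `//<![CDATA[` and `//]]>` wrapper is the commonly used pattern for inline scripts in XHTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline script; the < and && are what upset an XML parser.
SCRIPT = "if (a < b && c) { run(); }"


def inline_script(code, use_cdata):
    """Build a script element, optionally wrapping the code in a CDATA section."""
    if use_cdata:
        # The // keeps the CDATA markers hidden from the JavaScript engine.
        body = "//<![CDATA[\n" + code + "\n//]]>"
    else:
        body = code
    return '<script type="text/javascript">' + body + "</script>"


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False
```

Without the wrapper the XML parser rejects the element; with it the same code is accepted. Note that a script containing the sequence `]]>` would still break the page.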
We can't expect beginners to use XHTML
As you've read above it's damn hard to get XHTML right. Still the W3C pushes for XHTML as the new standard for the web. Despite how hard it will be for beginners to get things right. Despite the fact that error handling moves from the browsers to the parsers of each website. Despite the fact that XHTML has almost no backwards compatibility, so pretty much all sites on the entire web will have to update their code.
No, I won't parse things as XML on the sites I build. XHTML was a bad idea from the start and I'd rather go with the new version of HTML developed under the name Web Applications 1.0 (also known as HTML 5).
I hope this article explains why a lot of web development blogs (including Friendly Bit) use HTML. What language do you use?
Update: More reasons why you should use HTML
Comments
By: Rowan Lewis (#1)
For example, by simply using HTML Tidy on your output, you'll never see a single error.
But better than that is to use a script that is capable of correctly handling user input, creating the right entities and so on. WordPress doesn't do this by default.
The best thing about using XHTML for me: I can do whatever I like with it, I can parse it with an XML parser, or transform it with XSLTs.
It's really not that hard, all the tools you need have already been written and maintained for years.
By: Emil Stenström (#2)
Thanks for commenting.
By: Sarven Capadisli (#3)
We can expect beginners to use XHTML the same way we expect beginners to use any other language. Do we expect the Java language to be more forgiving? Why should XHTML be a special case?
The bigger problem is pushing a dynamic set of standards which is difficult to adapt to. The Web languages are perhaps evolving at a faster rate than the classic programming languages since this 'information highway' medium has a wide range of expectations for today's requirements.
By: draco (#4)
Anyway I agree that there's no true reason to use XHTML unless you plan to serve it with the proper MIME type and use extras like MathML and the likes, or else why restrict your site to 20% of the Internet users? And no point in content negotiation, really.
Besides, when XHTML 2.0 eventually appears, it's not backward-compatible. So imho HTML 4.01 (and microformats!) will suffice for now... But I've been wrong before. ;)
By: Coda Hale (#5)
I mean, you also can't trust your users to construct well-formed HTTP requests, which is why you require a browser instead of trying to walk them through opening a terminal, launching telnet, etc., and then getting mad when they forget to CGI encode their path strings.
XHTML 1.0 is fine, except for the mime type issue (thanks IE6!). XHTML 2.0, on the other hand, looks absolutely horrendous, and I don't plan on using it. Ever.
By: Emil Stenström (#6)
@csarven: You're comparing XHTML to Java, a language that's a lot harder to learn than HTML. Beginners learn HTML in a week, Java takes one full semester. I see your point but I really don't see any good reasons to force XHTML upon people that already are struggling with HTML. Keep it simple!
@draco: Exactly, you don't even need to be in quirks mode. Standards mode code with some minor errors displays fine.
@Coda Hale: on a personal basis: yes, I could go through the hassle and make it work on this site. Think bigger though, how do you expect to tell beginners about the CDATA characters that they have to put around their inline javascript? What about some PHP scripter that writes his first guestbook, how will you explain character set conversion to him? I really think XHTML is a bad idea for most sites.
By: Coda Hale (#7)
"XHTML is hard" is not a good reason to avoid XHTML.
By: Emil Stenström (#8)
By: Kalle (#9)
By: Jarvklo (#10)
Being "Anti XHTML" seems to be all the rage these days... :P
*sigh*
In my opinion HTML and XHTML are equally mature mainstream sister technologies and both will most likely co-exist on the web for a long time...
So what's the point of arguing one way or the other? There are things you can do in XHTML that you cannot do in HTML and vice versa...
Isn't it better to write comparisons that focus on when to use one over the other instead of ranting about one of them? (sorry Emil, I fail to see anything but a rant here)
If I had one wish tonight, it'd be that you allowed yourself to take a deep breath and calmly consider the following before you flame me for my opinions:
* XML is a good thing on and off the web - right?
* When you know and use XML, XHTML could be a logical choice for web pages given that you already know how to handle XML - right?
* If you don't plan to use XML (or if you're not willing to make the effort to learn it) - then simply don't use it and go for HTML instead!
As long as you make an informed decision, it doesn't matter what you use - at least as long as your pages are accessible to your focus group (i.e. validate and then some :) )
By: One Person (#11)
I mean it is up to the programmers to make a application / solution that makes XHTML easy to produce for non-programmers. CMS or Wysiwyg that works or such.
By: Nicole (#12)
By: Emil Stenström (#13)
Your third point is what I'm getting at. Do we need to use XML? The cases where people actually want to embed MathML or SVG on their sites are very few. XHTML is just a technical quirk for the 1% that needs that stuff. But W3C is not marketing it like that, they want it to replace HTML. I simply don't agree with that.
@One Person: I don't agree there. The only difference between the languages that will matter structurally to a beginner is that you have to add a "/" to empty tags in XHTML.
Also, why move XHTML generation to the websites when the browsers are already good at it?
@Nicole: most people reason like you do, just send it as text. But that defeats its sole purpose: what good is a language that is made to be parseable with an XML parser if you send it as HTML? It's the very purpose of the conversion!
By: Rowan Lewis (#14)
I'm ashamed of the current state of the web, HTML lets people think sites like MySpace are ok, and that their code is acceptable when in fact it's beyond useless.
The looseness of HTML was a big mistake, it has taken years of progress off the Internet. How? Because more than half the time browser developers spend working on their rendering engines is wasted looking for ways to support people who are clueless in regard to the standard.
Plus, it allowed every man and his dog to "code" a website, regardless of whether it followed the standards or not.
HTML has broken the web, XHTML(2) is here to fix it.
* Rowan hops off the high horse...
By: draco (#15)
XHTML is just a formulation of HTML into XML. No more, no less. I just don't consider having properly-closed tags "fixing the looseness".
I write my site in HTML 4.01 Strict and I don't consider the code pretty tight too.
Just my 2cents.
By: draco (#16)
By: Rowan Lewis (#17)
Also, because HTML allows you to exclude end tags, that's all right, what I meant by looseness is lack of semantics and lack of appreciation for validation.
By: Mark Donohue (#18)
By: Emil Stenström (#19)
Forcing strictness on those people is not the right way forward, education and voluntary validation is.
@Mark Donohue: AMEN!
By: Tony Marston (#20)
By: Emil Stenström (#21)
By: Neil Jacques (#22)
By: Martin Payne (#23)
I don’t think it’s unreasonable to expect beginners to learn XHTML, but I think what is more important is to make sure they write valid code, whether it be HTML or XHTML. The main problem seems to be that the people who are teaching web design don’t even know how to do it properly themselves, so their students have no chance. I’ve seen far too many publications which haphazardly mix HTML and XHTML syntax together, with nonsensical tags such as <P/>. From the experience of university lecturers, it seems they learned HTML in the mid 90s, and don’t realise how much things have changed since then…
By: Ben Millard (#24)
application/xml or application/xhtml+xml. Website developers the world over have already made their preference known: they like the friendly behaviour of text/html. :-)
The additional strictness of XHTML moves the burden of error correction to every author using markup. That would make it impossible for people without an unhealthy amount of markup knowledge to participate -- a point well made by this article.
It's much more practical for errors to be handled by a few well-funded browser manufacturers, especially since they already have sophisticated error handling.
HTML5 is defining error handling for HTML parsers, as well as for handling common MIME mis-labelling and suchlike. This will mean future websites can be highly interoperable even if their code isn't perfect.
The renaissance of HTML is good news for all those amateurs blogging and browsing the web.
By: Gamermk (#25)
By: Emil Stenström (#26)
@Gamermk: I'm not saying they should. Web standards is the ideal that we should all strive for, not something to force upon people.
By: Sarven Capadisli (#27)
The fundamental idea behind 'Web Standards' is to move away from relying on browsers to handle our markup but to pass a markup specification that is consistent, well-defined and used by all.
Therefore I believe not moving forward with XHTML is a step away from such Standards.
By: Emil Stenström (#28)
No matter what language you use you are still dependent on browsers.
By: Sarven Capadisli (#29)
This is a better Standard to set in my opinion.
By: Emil Stenström (#30)
By: david (#31)
By: Mark Reeder (#32)
First - You mention dynamic sites. If you're allowing users to add content, their content should *never* break your site, which is why just about every comment system filters out tags. If you're not, you're asking for serious trouble.
Second - with javascript, you shouldn't be using inline javascript at all if you want proper separation of content, display and behavior.
Finally - the argument has been made that it's actually easier to teach XHTML than HTML to a beginner because it's more structured and gives instant feedback if you do something wrong. How many beginners that you know actually validate their code or are even taught to do so?
What this all comes down to is that there are two very different approaches (get people doing things correctly from the beginning out of necessity - XHTML, or lowering the bar and letting stuff that's not quite right get through without complaint - HTML). Don't get me wrong, I don't think HTML is going to disappear, however I do think that within a few years that it will be considered second-tier/amateur in much the same way that table-based layouts are considered today.
By: Chad Calhoun (#33)
By: Just Me (#34)
Hmm...I think I might use that same argument to keep using tables for layout...who cares as long as it works! Not to mention how much easier it is for the beginners.
By: Emil Stenström (#35)
Second: No, you shouldn't be using inline Javascript. But lots of people install premade scripts of different kinds on their site and many of them will mean that the site breaks (if inline), or that the script stops working (if it uses document.write...)
Third: I've thought about that. I feel the positive feedback beginners get by seeing something work right away is more important than making them produce great code from the start. There's time for that later.
Thanks for commenting!
@Chad Calhoun: From my experience beginners rarely care about pixel precision or that it works cross browser. I think that's something that could come later.
Note that I am fully pro standards here, the HTML standard. I'd recommend anyone to write valid code, and I do it daily at work, I just don't think we should force validity upon beginners like XHTML does.
@Just Me: Note the difference between what a "web professional" is expected to do and what a beginner is expected to. My method of teaching people is starting out simple and giving them lots of positive feedback. Then working on the validity and standards bits. Following that track, yes, I think it's ok telling beginners how to use tables for layout.
By: Matt (#36)
Neither is HTML easier because it allows a novice page designer to see his/her creation more quickly despite errors. This behavior is the exact opposite of what I would say is best, which is to point out the errors right away instead of hiding invalid markup, which leads to more difficult compounded errors later that a novice user would inevitably find more frustrating.
But what's the point of this article if both HTML and XHTML are available to web designers and aren't going away soon? My choice of one or the other doesn't affect a novice designer one bit.
By: Friendly Bit » 9rules Network Official Blog (#37)
By: Emil Stenström (#38)
Exactly how many web site developers have the skill to write a textile or markdown parser for their site? A beginner has no chance of doing that.
You are talking of the best behaviour, I'm talking about the one that makes it easy for the beginner. Invalid markup is an OK first step, when they're online and are starting to learn more you and I will step in and continue the lecture.
Your choice surely affects beginners. An easy way to get a site online is to just copy and paste someone else's. If you're unsure about what to use you check to see what others do.
By: Johan (#39)
Lots of people are confused by this! But basically XML is a descriptive language that can describe records/data with self declared tags, I see a better future with XML eg AJAX uses XML?
By: Johan (#40)
By: Johan (#41)
By: Martin Payne (#42)
Neither are deprecated tags… ;o)
It tends to be slower due to the way in which most (all?) XHTML browsers currently work: they download the entire page, check it for well-formedness, and only then will they render it. HTML browsers, on the other hand, will start to render a page as soon as they receive data. Of course that doesn’t take into account the time it takes to download the associated images, CSS, scripts etc.
By: Michael Yevdokimov (#43)
Hello Emil,
The article is good, though I disagree with some major points. I agree that there is not enough "correct" support by different browsers. I would say even block models differ from browser to browser. But even all this does not mean that something should be very easy for the users. Well, we are not living in a perfect world. If you are not a technical man, you usually read manuals and learn how to use something, right? XHTML is not difficult at all. It has its own rules, which are good. Yes, it adds an extra layer of "difficulty". But that layer is important because in the end you will have well-structured code. You would again add your 5 cents to protect the user from all these "difficulties". I still think XHTML is a step forward which forces you to split structure from styles. I think HTML was a real mistake, but it was not possible to avoid it. All first versions of any product have their "limitations"; in reality they are not limitations themselves but rather the realization of what was in the developer's mind at that time. Time passes, we grow.
Does everyone build cars? Does everyone build airplanes? Does everyone program in PHP? No, not really. In this way, why should everyone be coding in HTML? :) If you are not a programmer, just use a special tool, like Dreamweaver or something cheaper, which would handle that coding for you. Otherwise, if you are going to grow as a developer, use the correct tools. Otherwise, pay money to a professional. It's a pity that <br /> is not a real standard for HTML; instead I should use a not-closed tag. Everything should have its order. Presentation is presentation, structure is structure. I don't have many doubts about XHTML. The only thing I hate is "buggy" browsers and using hacks for CSS.
Unfortunately, it happened that everyone who once tried HTML thinks they are pros since that moment and can do real business. In the end we get what we have now: billions of unstructured pages. Who cares? We, developers, do. The situation is very similar to the digital photography market. :)))) Somebody buys a digital camera, "pornography" begins. Yes, it all looks simpler now, but still you must keep your head and hands on the places where they are growing from. Otherwise, it's a mess.
Please excuse me for my big message. I think this is an endless theme. :)
By: Emil Stenström (#44)
You compare building websites with building airplanes. But the fact is that many are able to build a site right now, using HTML. Let's say that HTML takes a week to learn. What I fear is that XHTML doubles that learning time. Can you learn to build an airplane in a week?
Let's keep HTML and let's educate instead of force users to do the right thing. That's what I try to do with Friendly Bit.
By: Georg (#45)
"Relying on error-recovery is generally a bad idea, and doing so doesn't help anyone to learn anything about mark-up."
By: Emil Stenström (#46)
By: Adedeji Olowe (#47)
By: Georg (#48)
One of the ideas behind standards was that one should not have to hand-write or manually check and maintain pages/sites. The standards should in themselves create a base for tools as well as interpretation. This would have solved the problem for the majority of people, and left the road open for the minority to explore and create beyond what the tools could provide for at any given time.
Now we have come somewhat closer to "all browsers = same interpretation" since they are, at least on paper, supporting the same standards. It is even closer if we serve XHTML properly, as 'application/xhtml+xml'.
Reliable and standard-based tools that provide room for creativity are still pretty much a joke on the 'text/html' level, and completely non-existent on the 'application/xhtml+xml' level. Thus, the majority of people who don't know - and most often don't care - about standards as long as it works, can't get much, if any, help from tools.
The only help the majority of people get is **error-recovery** in browsers, while it should have been **error-corrections** in tools before their "creations" ever got near a browser.
If we want the quality to deteriorate even more, then we can just keep on telling the majority of people that applying standards is too hard so they shouldn't bother. That should keep the need for proper tools at FrontPage level for the foreseeable future, so no real improvements will be needed in that sector.
I don't think anyone in their right mind wants that, but this, and many other articles lately, are providing the ground for web design at such a low level of quality to be accepted as a de facto "standard".
Understand me right: I don't read your article in such a negative way, and I do in fact agree with much of what you have written. However, I don't think the majority of people will read and use such articles as anything but excuses for not bothering, and unless that is what you wrote it for then it certainly should have been angled differently and the title should have been changed.
By: Emil Stenström (#49)
Is using HTML strict setting the bar too low? I think not. Also, I totally see your point, I just don't agree.
By: Cpawl (#50)
You keep making the point that you are all for standards and using HTML strict is the way, yet you also make comments saying beginners should not have to worry about such things because it's "too hard" and they can learn them later.
Your paradox makes no sense. Would you encourage language students to say the words half right, because it makes them happy that they accomplished something, instead of correctly? As long as a math student feels good inside, would he be allowed to get his algebra only somewhat right... he can learn the rest later.
Anyone interested in web design, whether with the goal of a pro or as a novelist should know the rules, apply them (even if it's too hard for their little minds as you make them out to be) or else go do something else with their time. HTML alone allows too many errors, depends too much on someone else (in this case the browser) to keep up after you. The web is polluted with horrible code because beginners were encouraged to not really give a poop, were given tools that cared less too, and were taught by people who said, "Although I care you do not need to little first timer."
With your "but the poor beginners" argument, I ask why you even bother to use strict HTML? If it is really pointless in the end, if the browser can just simply fix it anyway, if this is what you encourage and teach, then why bother yourself? Just go ahead and forget to close some tags, use all caps, whatever man. This is the beauty of HTML remember?
By: Georg (#51)
So called "XHTML" that can only work when served as 'text/html' isn't a standard at all - it's a joke, and I think that is what you have problems with and are calling "...a bad idea". I have no problems with such a statement either, since you're just saying that serving non-standard and non-working markup is a bad idea.
However, since you're actually saying that "XHTML is a bad idea", and "...it’s damn hard to get XHTML right", then I do have a problem with what you're saying. Proper XHTML 1.0, that can be served either way, isn't a bad idea, and it certainly isn't harder to get right than any other markup standard - just slightly different.
The problem with "XHTML" is that the joke (that: if it is "valid" and is working when served as 'text/html', then it is XHTML) has been kept alive and maintained on too many high profile sites for too long. "Fix it with the right doctype" is one of the worst statements out here, as no doctype can "fix" anything. Triggering mode-switching in browsers isn't a real fix. It's the markup that should be fixed so it goes with the chosen doctype (read: standard), and the browsers should treat it accordingly.
Most, if not all, tools are crippled, and can't go beyond 'text/html', and a major player on the browser-side has ignored the existence of properly served XHTML (as they have ignored just about everything else) for too long.
Neither of the present shortcomings can be blamed on XHTML as a standard, so I have a problem with your article on that point.
Whether or not XHTML can, or will, ever be developed into something useful beyond 1.x, is an entirely different matter. Working in a vacuum within a vacuum, must be pretty frustrating.
By: Emil Stenström (#52)
Your example with the language students is really a good one. If you start correcting every little error someone makes when trying to speak a new language they will get bored and stop trying. That's basic pedagogy!
This is not about beginners being stupider than the rest of the world. Both you and I know that, we both were beginners once.
We are already seeing a web where the ones producing the most content are the people who know the most about technology. Every time we raise the bar for beginners to learn, we help that shift.
On web standards: We can't make the web better by forcing developers. The only thing that helps is education.
You may close your tags with HTML, and yes, I think that's better from a programming standpoint. Forcing me to use lowercase tags is just silly.
By: bruceyeah (#53)
I just wanted to point out a little history. When Tim Berners-Lee invented the web, he never intended 'beginners' to be writing HTML. Software was meant to handle all that and hide the tags from the end user. It's just an unfortunate accident of history (i.e. the explosive growth of Mosaic) that led to beginners writing their own HTML everywhere.
XHTML might be a little more difficult to learn, but there's a few things I wanted to point out about this:-
1) in an ideal world where web browsers are better, 'beginners' will never have to deal with writing XHTML themselves.
2) you are underestimating the value of a valid XML format. Valid XML means that the document can be read by machines as well as humans. It ensures that documents don't become messy and unparseable in future. Don't forget that the uses we find for web pages may not be obvious until well into the future. Also... how can you use DHTML and AJAX type features if you can't parse a valid DOM structure from a page?
By: Emil Stenström (#54)
1) I believe it's wonderful that a beginner can write <b>Some text</b> in a textfile, open it in their browser, and see bold text. It gives them the feeling that the web is easy to deal with and does not require them to buy expensive tools for the task.
2) I know the value of XML. But beginners don't need AJAX, SOAP, and DOM parsing. They just want to slap something together and show it to their friends. The next step is working upwards in the food chain and becoming a web professional.
By: carter (#55)
By: Martin Payne (#56)
I don’t really see why. For instance in C, “char *myString” and “char *MyString” are two completely different variables. So why shouldn’t XML consider “<foo:myElement>”, “<foo:MyElement>” and “<bar:MyElement>” as being different from each other?
By: Emil Stenström (#57)
By: La domo de karotoj » XHTML ne estas malbona ideo (#58)
By: Greger Lundström (#59)
But I don't see the contradiction between a harder, more strict language and letting beginners create their own sites. That's what tools that produce valid code are for. Language of the output doesn't matter.
My biggest concern is that while it's easy enough to get started with HTML, you get a lot of bad habits as browsers and standards are so forgiving. Bad habits die hard and as you progress these habits become a burden. Many semi-professionals produce a lot of invalid code for professional sites. They continue to do so because: it's easy, it works most of the time, their clients don't know. When the day comes and a client needs and demands valid code, they can't deliver. The step between beginner and pro becomes unnecessarily hard because you'll have to relearn to do things you've been doing for years. I've taken this road myself the past years and it would have been much easier to learn to do it right the first time than relearning as I go along.
By: Emil Stenström (#60)
I see your other concern and I've thought about it. There's a limit in how difficult you can make a language. If XHTML was even more strict, would you still recommend it for them? There's a line to be drawn somewhere, and I think that line should be drawn before XHTML.
By: Greger Lundström (#61)
By: Matt (#62)
By: Henrik Feldt (#63)
It's been a while. I came to visit your site because I've started my own web developing business, and I'm looking into people I had time to talk with before the international baccalaureate tests that I had this spring.
So what do I find, when entering your site? A big headline - "Why XHTML is a bad idea". Too bad, I think... The power of the dark forces has gotten hold of him... And he seemed to be such a nice guy and all!
Let me first say that I agree with what Mark said in the comments previously, and I do believe some other people have made some very strong points about it being easier to learn because of the structure of it. Then I'm not talking about XHTML 1.1 served with the correct mime type, but rather the structure of XHTML itself - the structure of XML, which is very logical in itself. How would it be, had we gone back to the books teaching:
That would suck, wouldn't it? XHTML is the structure the web needed, and there is no logical reason as to why one would simply turn it down. (especially since you may serve it as text/html)
MathML is a beautiful creation - being able to convey maths with ease to ordinary people. I wrote my Higher Level math essay in XHTML 1.1 because it enabled me to use CSS to style consistently and MathML to insert maths that would otherwise be really hard to harmonize with images.
Anyhow, I also would like to say that there ARE tools that produce good XHTML. For example XStandard which is excellent to use in content management systems.
http://www.xstandard.com/
Another point you seem to miss is that XHTML does not stand alone - it is designed with CSS and made dynamic with javascript. Hence it does indeed display very nicely in old browsers - they simply don't display all the goodies, nor should they - you mustn't become too conservative and say that "we have to allow for IE 5.0 (which was - 10 years ago?)"...
Overall I think the article is rather narrow-minded - you should focus on progression, not regression -- for example: you could explain the features of HTML 5, because I don't know them, you could explain the features of MathML instead of bad-mouthing it, you could explain the potential welfare-gains in society by using open-source image editing tools to provide SVG images for the web, instead of buying expensive proprietary software. You could focus on the tighter integration between XHTML, CSS, Javascript, Flash and Server-side - how it all can be used to make the web more fun and usable for even the tech-savvy.
henke
By: Henrik Feldt (#64)
I can't write code in here ...
By: Henrik Feldt (#65)
Sleep well.
By: Perrymc (#66)
Emil, thanks for Friendly Bit. It has given me considerable food for thought!
By: Emil Stenström (#67)
If you start learning HTML you also have the basics of XHTML. XHTML adds some stuff to be XML compliant and in my opinion you can add that afterwards.
Good hearing that you too like my readers :)
By: Max Pagels (#68)
Being a newbie to the world of valid (X)HTML, I have to say that I agree with most of what this article has to say. In the beginning, I found it very difficult to write valid markup, and as such I would definitely not recommend XHTML to beginners.
By: Emil Stenström (#69)
By: FROIDURE Nicolas (#70)
By: aRgus (#71)
I don't care about the anal retentive tag structure making my content more "hardcore". I don't care about the "XHTML STRICT!" banner on the bottom of my pages.
I do it for the warm and fuzzies I get of having a "well formed document" that can be predictably parsed by any software that is able to understand the doctype.
Sure, HTML can be viewed broken and I'm happy for that. I don't want people to have to know how to split an atom to post content that I want to view. As a user, as long as it looks good in my browser I don't care what they use.
Now as a developer, I am inclined to do some extra work on my part, to ensure uniformity in my code regardless of its context or application.
Someone may want to parse my work into another format, text reader, whatever. I publish content for others to use. I do what I can to ensure they can parse, view, format, whatever my content any way they wish.
Should everyone be FORCED into XHTML? No. That doesn't make XHTML itself a bad idea either. There is definitely something to be said for uniformity in code.
By: Emil Stenström (#72)
By: fan-tiger (#73)
Well, after I learned XHTML 1.0 I first tried to use it as is. The problem at the time was the lack of browser compatibility. So I had to learn how to get content displayed in older browsers.
And damn, there's still no standard that all of them comply with!
So in the end it was a good idea to get HTML parsed as XML, but the only version worth using is 1.0, and only if there's no other way of working with the data.
A few words on HTML:
The META statement for the version in use is still very tricky, because only transitional will get the content displayed well in most browsers.
I don't even know anyone who's using "strict", but I know a great many pages using different versions like 4.0 and 4.01 because of table formatting. So how do you explain this to a beginner?
By: Kamran (#74)
I agree with the idea of XHTML, i.e. to separate content and presentation into different pieces.
Of course, everything has its pluses and minuses, but when it comes to me, I feel XHTML has more pluses in the long run than the minuses.
By: Emil Stenström (#75)
By: Will (#76)
"XHTML is a stricter and cleaner version of HTML."
"XHTML is aimed to replace HTML"
"XHTML is almost identical to HTML 4.01"
So for those wanting to learn HTML, ideally what W3C schools should be pushing is the use of strict HTML instead of pushing people to use XHTML, as it's unnecessary if your website is only text/html.
From a beginners perspective this is quite confusing. I'll definitely look more into this with the links provided.
By: lamandriel (#77)
By: Emil Stenström (#78)
@lamandriel: There's a difference here. No one would ever use the code you describe. When I talk about learning HTML I mean learn it to do stuff you want to do, not some theoretical crap :)
By: Dutch (#79)
As I understood it, W3C advises using XML to exchange data in the future with word processors, databases, and spreadsheet applications. They expect application programmers to adapt these programs, which makes exchanging data easier.
Emil, what do you think about this W3C motivation?
Another question:
I already changed my website to XHTML 1.0, before reading your article, should I be worried now that relatively older browsers won't see the website?
By: Emil Stenström (#80)
By: Dutch (#81)
But I can't find an answer to my first question about W3C's motivation for using XHTML: easier data exchange in the near future.
I doubt new websitewriters have interest in exchanging data yet, but if youngsters learn writing XHTML with CSS correctly in an early stage, I can't really think what would be against this. It might depend on the education level, I guess?
I agree W3C shouldn't push XHTML especially not for beginners, like you said!
By: Jordan (#82)
One of my co-workers, not a fellow of the programmer mindset and accustomed to designing with tables in Dreamweaver, was trying to get to grips with using valid HTML and CSS to design websites. The process was painful and agonising to watch; sometimes, I wanted to scream after looking at the code he produced. (And then I remember that I used to write code like that.)
All this agony, and we hadn't discussed convincing him to use minimal <div> tags and semantic markup! I'm convinced that if, at that point, I tried to introduce him to the distinction between <acronym> vs. <abbr>, he would flip out completely, slaughtering the office in a bloody rage. Never mind explaining the problems with using <abbr> in IE. As for the debate over the <q> element… Even I wanted to defenestrate after wading through that one.
This man isn't clueless, either. He's been designing websites since before I even saw a webpage for the first time!
There is no way on earth that the vast majority of the online population have any place being asked to consider the complex rules governing character encodings and MIME types. At this level of technical detail there should be limited user involvement; common XHTML problems should be dealt with by an abstraction layer which can take care of it without the need for low-level supervision. We need smarter tools to lighten the cognitive load for non-technical users.
Using your examples, here's the sort of thing I mean:
Browsers should provide better built-in semantic editing tools, circumventing the need for Javascript/plugins. Blogging sites need built-in tools which can handle and deal with receiving malformed input. New coders should be strongly encouraged to use existing abstraction layers rather than hand-coding them.
Whatever CMS you are using to publish articles should already be able to deal with different character encodings; again, those who want to write their own CMS should be directed to use existing modules which take care of these complicated problems before they arise.
Javascript is a tricky one. For one thing, I think that all Javascript should be contained in external resources; this avoids problems with CDATA sections. However, we need to develop better procedures for packaging, documenting and integrating Javascript into webpages. I think that what we really need are IDEs which enforce modularisation of Javascript and external resources, clearly identifying dependencies between modules; then, when the site is launched or tested, the necessary files are "precompiled" into a distribution directory. This is similar to how Java programmers have been working for a while. In my opinion, this will be much easier when most browsers support newer versions of Javascript with an "import" directive.
So I agree that HTML is the best current solution at the moment for new web designers. However, once tools, technologies and applications have evolved which make producing and interacting with XHTML as easy and useful as it was always intended to be, I won't be surprised if HTML is swiftly abandoned in favour of its more powerful sibling.
One final note: a huge part of the responsibility here lies with browser vendors. When we can finally rely on valid markup and CSS being interpreted correctly, and uniform Javascript support across different platforms, I predict that progress will be made in leaps and bounds. Here's to IE 8!
By: Anders M.J. (#83)
The majority of users use IE and since this browser doesn't support XHTML, there's no reason to use this standard yet (and I don't use XML, so why code in XHTML). Might as well stick with good ol' HTML 4.01 then.
I had not coded in Strict before today, but I had never used any deprecated tags anyway, so the conversion wasn't that painful. :-)
/Anders
By: Siegfried (#84)
If you use html, you can't directly include it. You may include such things via object, but then: if the browser is capable of parsing xml to render an external svg object, why not parse xhtml directly as xml? It is the same parser.
Sure, using xhtml this way requires application/xhtml+xml to get those benefits. If you send it as text/html, you are in fact just sending malformed html which the browser's error handler fixes. So I agree it would not be useful to limit yourself to what html can do and send your files as text/html while still pretending to code xhtml. This _may_ be useful during a learning phase, but no more than that.
So as a conclusion: as long as you are well off with what html offers you, it should be sufficient to code and send as html. As soon as you plan to add more to your pages you should think of switching to xhtml. Using xhtml consistently breaks every limit. Your possibilities become virtually unlimited. _This_ is why xhtml is the future. html 5 just extends the current html limits, but it still has limits and always will. Many times html 5 will be absolutely sufficient. But as soon as you can get more, there will be an increasing demand for more, so you will have to switch to xhtml at some point.
And, BTW, it is still possible to automatically convert xhtml to html using xsl. It is nearly impossible to automatically convert erroneous html files to proper xhtml.
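The point above about application/xhtml+xml versus text/html is usually handled by looking at the Accept header of the request. The following is only a rough Python sketch of that idea, not the script used on this site: the function name choose_content_type is made up for illustration, and it deliberately ignores wildcards and does not compare q-values between types.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a MIME type for an XHTML page based on the request's Accept header.

    Hypothetical helper: returns "application/xhtml+xml" only when the client
    lists that type explicitly with a non-zero q-value, otherwise "text/html".
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if media_type != "application/xhtml+xml":
            continue
        quality = 1.0  # q defaults to 1 when omitted
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable q-value as "not accepted"
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"


# Example Accept headers (illustrative values, not captured from real browsers).
print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(choose_content_type("text/html, */*"))
```

A client that never mentions application/xhtml+xml falls through to text/html, which is the fallback discussed in the comments above.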
By: Emil Stenström (#85)
On a sidenote, HTML5 includes a variant that allows XHTML syntax, so they try to cater to that side too. Might be good to know.
By: Siegfried (#86)
The problem with the html efforts is that you always have to add new features to the monolithic block of html. With xml you get modularisation for free. So in fact this is the way to go.
You're right, those additional features are not much needed today. For most issues html 4.01 is just sufficient. So as long as you accept the limits you may stick to html. And since IE does not understand xml (not without an xsl stylesheet), today it is generally better to at least offer a parallel html version, if not just an html version alone. But the point will come where you have to switch. It is your personal decision when to switch. But some day it will be necessary. Old style html will then be just a markup language for the casual hobby web page producer. Professionals will need xml. In the future :)
By: William O. Yates (#87)
Better tools for common users could nullify this entire debate.
A well built composer should publish pages following all the standards. (HA!)
Some of the "DOM" tricks lost to HTML/XHTML/XML from SGML are environmental "hints" which would allow the composer software to "learn" which code base to use and what attribute switches to select.
I built such a composer for IBM's BookManager product back in 1992. It handled 1200 or so of the most commonly used tags in a research environment. Secretaries were using it with a couple hours introduction.
BUT, (sigh...), I still use my hammer and chisel to code my web pages by hand (html-401 strict of course :-).
By: Emil Stenström (#88)
By: Christian (#89)
You can learn tables-based layout in 3.6 days and CSS-based layouts in 3.8 days. What's there to worry about?
But if you learn things correctly there really is no difference in learning time.
And if you use XHTML from the beginning it'll be easier for you to work with your code later.
By: RomanAge (#90)
By: Emil Stenström (#91)
By: xhtml « Rofrol blog (#92)
By: Jim Westergren (#93)
Good article and good point.
On my latest major site that I am currently building I am using XHTML 1.0 strict - mostly to be prepared for the future.
I tried sending it as XML with a PHP script similar to the one you have here, but I disabled that after some testing. I can't have my site breaking with a parsing error because I forgot to close a p element or to escape an & in a URL. Can't take that risk.
Perhaps HTML 5 is the way to go. I have read some on that. But I guess it will take quite some years for that to be fully ready and supported.
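To make the risk described in this comment concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not the PHP script from the article; the helper name is_well_formed and the sample fragments are assumptions chosen only to show the two mistakes mentioned (an unclosed p and an unescaped & in a URL).

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if `markup` parses as XML, False on any well-formedness error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments, chosen only to show the two mistakes mentioned above.
unescaped_amp = '<p><a href="page?a=1&b=2">link</a></p>'
escaped_amp = '<p><a href="page?a=1&amp;b=2">link</a></p>'
unclosed_p = "<div><p>First paragraph</div>"

for name, fragment in [
    ("unescaped &", unescaped_amp),
    ("escaped &amp;", escaped_amp),
    ("unclosed p", unclosed_p),
]:
    print(f"{name}: well-formed = {is_well_formed(fragment)}")
```

A strict XML parser stops at the first such error, which is why a single slip can replace the whole page with an error message when it is served as XML.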
By: Emil Stenström (#94)
Since HTML5 is the future, it shouldn't matter if you use HTML 4 Strict, or XHTML 1.0 Strict, they are all incorporated into HTML5. I truly believe that's the best thing for the future of the web. It will take a couple of years, but browsers are already building offline storage and stuff like that. Also, big parts of the spec are just respecifications of how browsers have implemented stuff.
By: Jim Westergren (#95)
By: MWin (#96)
i read that article, and Emil does have some points...
but i want to add something to the "leave the error-handling to the browsers" topic:
One of the reasons why XHTML was introduced as a "cleaned up" HTML so-to-speak, is the increasing usage of internet on mobile devices, such as mobile phones, PDAs, that sort.
I know, hardware performance is constantly increasing, but so far most mobile devices just don't have the cpu for extended html-code-analysis and exception-handling; if they had it, they would be even slower than they already are...
But in the end (at least until HTML5 is published, but even then) it's the webdesigner's choice which technology to use; I suppose they all have their pros and cons...
Just my thoughts, have a nice day,
Mike
By: MWin (#97)
as for the looseness of html:
consider following, simplified, code:
Paragraph 1
Paragraph 2
Paragraph 3
now, these 3 paragraphs, are they siblings to each other, or descendants?
I'm one of them programmer types, so for me this in fact is an issue...
as for the above example, as far as i could find out, Firefox (3.5) considers them siblings (it automatically closes all non-closed paragraphs when a new one opens, i.e. paragraphs can't contain other paragraphs, just like the html4 dtd specifies), so does IE (7), netscape considers them descendants, and amaya wouldn't let me find out at all...
so, i guess thats one point on XHTMLs scoreboard...
hand
Mike
By: MWin (#98)
try this:
<body>
<p> Paragraph 1
<p> Paragraph 2
<p> Paragraph 3
</body>
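For comparison, here is a small Python sketch of how a strict XML parser treats markup modeled on the example above. The two sample strings are illustrative assumptions: with the paragraphs left open the parser rejects the document outright, and with explicit closing tags the three paragraphs are unambiguously siblings.

```python
import xml.etree.ElementTree as ET

# Markup modeled on the example above, with the <p> elements left open.
unclosed = "<body><p>Paragraph 1<p>Paragraph 2<p>Paragraph 3</body>"

# The same content written as well-formed XML (assumed XHTML-style closing tags).
closed = "<body><p>Paragraph 1</p><p>Paragraph 2</p><p>Paragraph 3</p></body>"

try:
    ET.fromstring(unclosed)
    unclosed_parsed = True
except ET.ParseError:
    # An XML parser does not guess where the paragraphs end; it rejects the input.
    unclosed_parsed = False

body = ET.fromstring(closed)
# Direct children of <body>: all three <p> elements sit at the same level.
child_tags = [child.tag for child in body]
# Number of child elements inside each <p>: zero means no nested paragraphs.
nested_counts = [len(child) for child in body]

print(unclosed_parsed, child_tags, nested_counts)
```

This only shows what an XML parser does; how a given browser's HTML parser recovers from the unclosed version is a separate question.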
By: Web Design: What DOCTYPE do I use? « Obvious In Hindsight (#99)
By: Web Design: What is XHTML? « Obvious In Hindsight (#100)
By: Steve Savage (#101)
I'm leaning towards just publishing HTML5 right now for a simple reason: if I decide to put advertisements on my site I will no longer have complete control of the markup, and I'd rather have the tag-soup parser kick in than have "XML Parsing Error: mismatched tag" show up instead of my content.
BUT, I emphasized the word publishing for a reason. I will probably use an XHTML editor behind the scenes for the content I write, so I can use XSLTs to convert my XHTML documents to other formats.
By: K (#102)
In hindsight, what should have happened in the early 2000's is, instead of pushing XHTML, authors and industry should have advocated and taught proper/recommended HTML4 & CSS coding.
In reality, that is what average web dev students were seeking back then. Most, like me, had hobby pages in the late 90's with no sense of direction. Focus turned to XHTML with total lack of understanding of XHTML's true purpose.
If I only understood back then, I would have never focused on XHTML - or focused on WAP but that's a different story - and continued enjoying HTML.
I am excited about HTML5 and eager to learn about it soon... and it might lessen my perfectionist anxieties. Maybe XHTML caused it? :)
By: Emil Stenström (#103)
I truly think XHTML was one of the big reasons HTML5 has started to catch on now. People were simply not OK with some parts of XHTML, and wanted others. HTML5 (and XHTML5) are excellent alternatives.