Wednesday 14 August 2013

Everything you know about XHTML is wrong.

Why are MIME types important? Why do I keep coming back to them? Three words: draconian error handling. Browsers have always been “forgiving” with HTML. If you create an HTML page but forget the </head> tag, browsers will display the page anyway. 

(Certain tags implicitly trigger the end of the <head> and the start of the <body>.) You are supposed to nest tags hierarchically — closing them in last-in-first-out order — but if you create markup like <b><i></b></i>, browsers will just deal with it (somehow) and move on without displaying an error message. 
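To make the "forgiving" behavior concrete, here is a minimal sketch using Python's standard-library html.parser module. It is only an illustration, not the algorithm any browser uses: HTMLParser is a lenient tokenizer that reports tags as it meets them, and the TagLogger class and tag_events function are names invented for this example.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records each start and end tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    """Feed markup to the lenient parser and return the tag events it saw."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Misnested tags: </b> closes before </i>, yet no exception is raised.
for event in tag_events("<b><i>text</b></i>"):
    print(event)
```

The parser reports the four tags in the order they appear and moves on. Nothing in the output tells the author that the nesting was wrong, which is the situation the paragraph above describes.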

As you might expect, the fact that “broken” HTML markup still worked in web browsers led authors to create broken HTML pages. A lot of broken pages. By some estimates, over 99% of HTML pages on the web today have at least one error in them. But because these errors don’t cause browsers to display visible error messages, nobody ever fixes them. 

The W3C saw this as a fundamental problem with the web, and they set out to correct it. XML, which became a W3C Recommendation in 1998, broke from the tradition of forgiving clients and mandated that all programs that consumed XML must treat so-called “well-formedness” errors as fatal.

This concept of failing on the first error became known as “draconian error handling,” after the Greek leader Draco who instituted the death penalty for relatively minor infractions of his laws. When the W3C reformulated HTML as an XML vocabulary, they mandated that all documents served with the new application/xhtml+xml MIME type would be subject to draconian error handling.

If there was even a single well-formedness error in your XHTML page — such as forgetting the </head> tag or improperly nesting start and end tags — web browsers would have no choice but to stop processing and display an error message to the end user. 
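Draconian error handling is easy to observe with any conforming XML parser. The sketch below uses Python's standard-library xml.etree.ElementTree as a stand-in for a browser's XML parser; the is_well_formed function and the sample pages are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False on a fatal error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # The parser stops at the first well-formedness error; no recovery.
        return False
    return True


good_page = "<html><head><title>t</title></head><body><p>hi<br /></p></body></html>"
missing_head_close = "<html><head><title>t</title><body><p>hi</p></body></html>"
misnested = "<p><b><i>text</b></i></p>"

for sample in (good_page, missing_head_close, misnested):
    print(is_well_formed(sample))
```

Only the first sample parses. The other two contain exactly the mistakes named above, a missing </head> tag and improperly nested tags, and the XML parser rejects each document outright instead of guessing what the author meant.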

This idea was not universally popular. With an estimated error rate of 99% on existing pages, the ever-present possibility of displaying errors to the end user, and the dearth of new features in XHTML 1.0 and 1.1 to justify the cost, web authors basically ignored application/xhtml+xml. But that doesn’t mean they ignored XHTML altogether. Oh, most definitely not. Appendix C of the XHTML 1.0 specification gave the web authors of the world a loophole: “Use something that looks kind of like XHTML syntax, but keep serving it with the text/html MIME type.” And that’s exactly what thousands of web developers did: they “upgraded” to XHTML syntax but kept serving it with a text/html MIME type. 

Even today, millions of web pages claim to be XHTML. They start with the XHTML doctype on the first line, use lowercase tag names, use quotes around attribute values, and add a trailing slash after empty elements like <br /> and <hr />. But only a tiny fraction of these pages are served with the application/xhtml+xml MIME type that would trigger XML’s draconian error handling. Any page served with a MIME type of text/html — regardless of doctype, syntax, or coding style — will be parsed using a “forgiving” HTML parser, silently ignoring any markup errors, and never alerting end users (or anyone else) even if the page is technically broken. 

XHTML 1.0 included this loophole, but XHTML 1.1 closed it, and the never-finalized XHTML 2.0 continued the tradition of requiring draconian error handling. And that’s why there are billions of pages that claim to be XHTML 1.0, and only a handful that claim to be XHTML 1.1 (or XHTML 2.0). So are you really using XHTML? Check your MIME type. (Actually, if you don’t know what MIME type you’re using, I can pretty much guarantee that you’re still using text/html.) Unless you’re serving your pages with a MIME type of application/xhtml+xml, your so-called “XHTML” is XML in name only. 
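Checking your MIME type comes down to reading the Content-Type header your server sends. The following is a rough sketch in Python, assuming you already have the header value in hand. The names media_type, parsing_mode, and fetch_parsing_mode are invented for this example, and the three-way split is a simplification: browsers also use an XML parser for other XML MIME types.

```python
from email.message import Message
from urllib.request import urlopen

XHTML_MIME = "application/xhtml+xml"


def media_type(content_type_header):
    """Return the bare, lowercased media type from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def parsing_mode(content_type_header):
    """Simplified guess at which parser a browser would pick for this header."""
    mime = media_type(content_type_header)
    if mime == XHTML_MIME:
        return "xml"    # draconian error handling applies
    if mime == "text/html":
        return "html"   # forgiving parser, whatever the doctype says
    return "other"


def fetch_parsing_mode(url):
    """Hypothetical helper: needs network access, so it is not called here."""
    with urlopen(url) as response:
        return parsing_mode(response.headers.get("Content-Type", ""))


print(parsing_mode("text/html; charset=utf-8"))
print(parsing_mode("application/xhtml+xml; charset=utf-8"))
```

Note that the doctype never enters into the decision. A page with an XHTML doctype that arrives as text/html lands in the "html" branch, which is the point of the paragraph above.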


Source: http://diveintohtml5.info/past.html