
He even mentions the biggest reason not to do so, which is that any typos in XHTML (when served as XML) will result in the yellow-and-red screen of death in Firefox.

The failure case for XML on even a single typo being the total destruction of your web app is unacceptable. This is why at Yahoo, we're pretty firm about using plain HTML, from yahoo.com on down.

Being a purist about markup is great and all, but the robustness of browsers when given accidentally broken markup is one of the great strengths of the web. Serving as XML makes your apps less stable.




any typos in XHTML (when served as XML) will result in the yellow-and-red screen of death in Firefox.

Or, say, using && in your javascript. Yes, that does it. I have the 30 minutes of wasted time a week ago to prove it.

Here's the "solution", just so you can see how elegant it is. You have to escape it like this:

  <script type="text/javascript">
  /* <![CDATA[ */
  if (var1 && var2) blah blah;
  /* ]]> */
  </script> 
Or just use AND I suppose, forgot if it's got some weird precedence difference or something.
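
For anyone who wants to reproduce the failure outside a browser, here is a rough sketch using Python's standard-library XML parser as a stand-in for the browser's XML parser. The names (`RAW`, `WRAPPED`, `ESCAPED`, `is_well_formed`) are made up for illustration, and a real browser reports the error differently, but the well-formedness rule being tripped is the same: a bare `&` must start an entity reference.

```python
import xml.etree.ElementTree as ET

# The script exactly as you'd write it for text/html: bare ampersands.
RAW = """<script type="text/javascript">
if (var1 && var2) blah blah;
</script>"""

# The same script wrapped in a CDATA section hidden inside JS comments.
WRAPPED = """<script type="text/javascript">
/* <![CDATA[ */
if (var1 && var2) blah blah;
/* ]]> */
</script>"""

# Alternative: escape each ampersand as an entity reference.
ESCAPED = """<script type="text/javascript">
if (var1 &amp;&amp; var2) blah blah;
</script>"""


def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


for name, markup in [("raw", RAW), ("wrapped", WRAPPED), ("escaped", ESCAPED)]:
    print(name, is_well_formed(markup))
```

The entity-escaped variant is also well-formed, but it only yields `&&` when the page really goes through an XML parser. An HTML parser treats script contents as raw text and leaves `&amp;&amp;` alone, which is why the comment-wrapped CDATA form is the one people use when the same file may be served either way.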


He doesn't mention the actual biggest reason not to do so, which is that Internet Explorer doesn't support XHTML at all.

You'll only experience the yellow screen of death if you've already gone to the trouble of serving the content as tag soup to IE and detecting Firefox to send it with a different MIME type.

In short that whole section made me think this guy was an uninformed crank, still desperately clinging to the dead end of XHTML and spreading disinformation in order to justify things to himself.


a single typo [is] the total destruction of your web app

Really? I mean, I know why my company's pages don't validate. I'm more than ready to execute a putsch to take care of that problem.

So why doesn't Yahoo, in its vastness of being, have the ability to clear pages through (automatable, standardized and fairly simplistic) validation? Is it a 5-bytes-on-the-wire savings problem like at Google? Or just PEBCAK as is par for the course?


Any web company should have a policy of making its pages validate to the extent that is possible. And we do.

In the real world -- when you have to do Stupid Web Tricks to make things work in IE6, or you have advertisers who can insert random awful FONT tags -- a fully validating page is an impossible dream, especially when (for the front page, for instance) that page is different for every single user in scores of countries and dozens of languages.

So we get as close as is practical, and rely on the wonderful resilience of the web to take care of the 3% of edge cases.


FONT tags in advertiser injection, I understand. We do some mangling to avoid that, but you're right. It doesn't kill everything.

I'm wondering what kind of stupid web tricks you're talking about that go against XHTML validation, though. Care to share a horror story?


PEBKAC, actually.

Though it can be (sort of) argued that regular HTML files that don't validate are just as destructive to user experience as XHTML ones. If your file doesn't validate, the browser has to discard its "pure" parsing mode and start again in quirks mode, which can really slow it down.

I don't have the figures to hand, but there have been some neat studies done that show that changing a website so that it validates appreciably increased conversions.


That's not true.

The browser doesn't validate the content and neither knows nor cares if it would. It does something called doctype sniffing to choose whether it goes into one of a few different modes based entirely on the declaration (or lack of the same) at the top of the document.

"However, it is important to realize that the Quirks mode vs. Standards mode is predominantly about CSS layout and parsing—not HTML parsing. Some people misleadingly refer to the Standards mode as “strict parsing mode”, which is misunderstood to imply that browsers enforced HTML syntax rules and that a browser could be used to assess the correctness of markup. This is not the case. The browsers do tag soup fix-ups even when the Standards mode layout is in effect. (In 2000 before Netscape 6 was released, Mozilla actually had parser modes that enforced HTML syntax rules. These modes were incompatible with existing Web content and were abandoned.)"

Quoted from (and far more details) here: http://hsivonen.iki.fi/doctype/
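
To make the point concrete, here is a deliberately simplified sketch of doctype sniffing. It is not any browser's actual algorithm (the real rules are a long table of public identifiers, covered in the article linked above); the function name `sniff_mode` and the handful of identifiers checked are my own choices for illustration. What matters is what the function never does: it never reads past the declaration, so it cannot know whether the rest of the page validates.

```python
import re

# Only a DOCTYPE at the very start of the document is considered.
# Assumption: real browsers tolerate more before the DOCTYPE than this does.
DOCTYPE_RE = re.compile(r"^\s*<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)

ALWAYS_ALMOST = (
    "-//w3c//dtd xhtml 1.0 transitional//",
    "-//w3c//dtd xhtml 1.0 frameset//",
)
ALMOST_IF_SYSTEM_ID = (
    "-//w3c//dtd html 4.01 transitional//",
    "-//w3c//dtd html 4.01 frameset//",
)


def sniff_mode(document):
    """Pick a rendering mode from the DOCTYPE alone (toy version)."""
    match = DOCTYPE_RE.match(document)
    if match is None:
        return "quirks"  # no declaration at the top
    quoted = re.findall(r'"([^"]*)"', match.group(1).lower())
    # Toy assumption: the first quoted string is the public identifier.
    public_id = quoted[0] if quoted else ""
    has_system_id = len(quoted) > 1
    if public_id.startswith(ALWAYS_ALMOST):
        return "almost standards"
    if public_id.startswith(ALMOST_IF_SYSTEM_ID):
        return "almost standards" if has_system_id else "quirks"
    return "standards"


valid = "<!DOCTYPE html>\n<p>fine</p>"
broken = "<!DOCTYPE html>\n<p><b>unclosed<i></p>"
print(sniff_mode(valid), sniff_mode(broken))
```

Both documents get the same mode, because the broken markup in the second one is never inspected.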


Actually there are two different things: the parser and the rendering mode. The choice of the former depends solely on the MIME type: no matter what doctype you use, if your page is served with the text/html MIME type it will be parsed by the HTML parser. And keep in mind that every slash in "/>" is treated as an invalid attribute by this parser, so serving your XHTML page as text/html may not be the best idea. BTW, do "view source" on such a page in Firefox 3 and you will see all the slashes in red.

And if you did happen to serve XHTML with the proper MIME type, your page could well take longer to load on Gecko <1.9: it did not support incremental rendering of XHTML, so your document had to be fully loaded before it could be rendered.

The HTML engine chooses the rendering mode (standards, almost standards, quirks) depending on the DOCTYPE (for the XHTML parser it is always standards mode). The HTML engine must always be ready to recover from markup errors (and HTML5 does a good job of describing what to do), whereas the XHTML parser can just die on the first error it encounters.
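
The difference in failure behaviour is easy to see with two standard-library Python parsers standing in for the two browser code paths. This is only an analogy: `html.parser` is a forgiving tokenizer, not an HTML5 tree builder, and the helper names (`parse_as_html`, `parse_as_xml`, `TextCollector`) are invented for this sketch.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# One typo: <b> is never closed before </p>.
BROKEN = "<div><p>Hello <b>world</p></div>"
FIXED = "<div><p>Hello <b>world</b></p></div>"


class TextCollector(HTMLParser):
    """Collects text content, ignoring how well the tags nest."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_as_html(markup):
    """Lenient path: always returns whatever text it could find."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


def parse_as_xml(markup):
    """Strict path: returns None as soon as the markup is not well-formed."""
    try:
        return "".join(ET.fromstring(markup).itertext())
    except ET.ParseError:
        return None


print(parse_as_html(BROKEN))  # the text survives the typo
print(parse_as_xml(BROKEN))   # nothing survives
print(parse_as_xml(FIXED))
```

The lenient path still gives the user the content; the strict path gives them nothing until the typo is fixed.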


Good point, but I'd bet that not displaying a page at all will decrease conversions even further.



