Hm – I'm not too happy to see most of the original HTML elements marked "not conforming" and "must not be used", thus preparing for browsers to eventually drop support. There are still lots of websites and valuable information stored and archived in this format. Back in the day, it was thought that basic HTML was a format to last. Who is going to update these documents to keep them conforming for future browsers? Or are we just dropping a decade of documentation? Is it worth it?
(Consider: apparently, MS Word docs or PDFs prove longer-lived than basic HTML documents! Who would have thought?)
I don't really know what process the W3C fork of our work uses for removal, if any, but you can learn more about how features get removed in the (WHATWG) HTML Standard per our working mode.
In short, I think it's important to distinguish between conformance and removal from browsers. Removal from browsers is a big deal and, as per those links, is only done when it's not going to break the web, or when the benefits are very high (e.g. security issues). Removal from being conformant just reflects the evolution of best practices. See also https://github.com/whatwg/html/blob/master/FAQ.md#how-are-de...
<blink> is no longer supported by browsers, but with a few CSS declarations it works just fine.
Removing support for presentational markup does not mean a loss of information. Browsers will still render tags they don't recognize, and re-applying the styling of those tags is often trivial. (I mention <blink> because it's one of the more difficult, but not terribly so.)
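For instance, a minimal sketch of a user stylesheet that restores it (the keyframe name is mine; unknown elements like blink still render as plain inline elements, so this is all it takes):

    blink {
      animation: blink-flash 1s step-end infinite; /* toggle once per second */
    }
    @keyframes blink-flash {
      50% { visibility: hidden; }
    }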
As the web evolved, our needs changed. Do we still need the <font> tag?
It's true that you can just use some CSS to make up for the lost HTML feature, but then again you could also rewrite the HTML part.
Forgive me if I'm wrong, but I'm fairly sure that what the OP is trying to say is that there are plenty of great websites out there which were developed a long time ago, and for which there is no maintainer to do any work on them. Thus, having HTML elements like this dropped would mean the content is, in a way, lost.
Thinking about it some more, users can probably add plugins to add this CSS automatically, or some browsers might even keep those features in, but still, there will be users who don't know this, I think, resulting in a bad experience.
The "plenty of great websites" were developed long time ago. Having a degree of visual consistency of layout across different browsers was not possible then according to standards.
The content will not be lost. The tags will still parse as valid elements, but the rendering may vary. This has always been expected, since legacy (pre-HTML5) elements never had uniform rendering and contained quirks.
Should the current/new standard have support for ambiguously rendered quirky elements? Is it even a standard then?
After HTML5 the end result will definitely be the same on most (if not all) layout engines. Standardization as a process requires non-conforming legacy features to be dropped.
I may not have expressed myself clearly, but I understood what OP was saying. Never overnight and post, kids.
I was thinking about something like Stylish or the user stylesheet I've been hearing about in Firefox (for their UI, IIRC, but still). Inject some global CSS on older/missing doctypes, and it's probably less than 200 total declarations to handle every older tag. I'd imagine <font> to be the hardest and/or longest, followed by <blink> and <marquee>.
It would be a small extension; something like the sketch below.
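As a rough sketch, approximating the classic default rendering (the values are guesses, not from any spec, and assume support really was dropped):

    center { display: block; text-align: center; }
    big    { font-size: larger; }
    strike { text-decoration: line-through; }
    font[size="2"] { font-size: small; } /* one rule per legacy size value */
    /* font color= and face= have no pure-CSS mapping, since attr() only
       works in 'content'; that's why <font> needs JS or many rules. */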
My other point I think I expressed clearly enough: the loss of presentational markup is not a loss of content in most cases. If the title is in Times New Roman instead of Arial, most of the time it'll just look worse. Unless the content is meta, the presentation is there to make things more pleasant to read.
Not sure I'm getting you… An extension to view old pages? This is the worst idea I've heard in a long time. The web is awesome partly because it's backwards compatible.
The <font> tag is still necessary because of HTML for emails and Microsoft’s disappointing decision back in 2007 to regress Outlook to using the MSO renderer/editor, which is worse than IE 5.0.
There are a ton of uses for old-style <font>, <b>, etc. tags. CSS is both more verbose and more abstract. When I do not care about the "reusability" of a given code fragment, give me presentational markup tags all day long.
Just as a nit, "font-size:10pt" is a lot worse than <font size="2"> from an accessibility perspective. <font size="2"> is "one notch below the user's default size". If the user sets a 20pt font, it's going to be larger than 10pt.
You could use "font-size: small" to get the size="2" behavior.
(Also, "font-family: Whatever", not "font-face: Whatever".)
Note that your CSS example is longer and requires combining two different syntaxes. Even for this one change, the difference in length and complexity is apparent.
> <span style="font-size: 10pt">Blah</span>
vs.
> <font size="2">Blah</font>
Which is easier to remember? Which is more obvious at a glance?
The former. The problem is just that you've learned what the latter means better than the former. I'm the opposite.
Plus, the former is an absolute value where the latter is not, as far as I've been able to tell. I know that the first span will always be 10pt font. I have no idea what "2" even means in this context.
That may be true, but regardless of whether pt is a best practice, the point remains that it is still a unit. IMO that puts it ahead of the alternative example.
> Back in the day, it was thought that basic HTML was a format to last.
HTML was never intended to be archival. Archival assumes a long-term relationship between format and user-agent, but those two things evolve independently.
> Who is going to update these documents in order to make them conforming to future browsers?
You don't update legacy documents stored in an archive. You find a conforming user-agent (appropriately old browser version) to consume them in their intended state.
> Is it worth it?
Yes, HTML is a versioned format. Improvements to the format are welcomed and necessary.
Except, as described later in this thread, WHATWG HTML, the spec that is actually implemented, is decidedly not versioned and changes from day to day. Keeping old user-agents around is likewise discouraged. (See e.g. Chrome's and Firefox's aggressive update and support policies.)
From what I understand, the WHATWG's policy regarding archival is "well yes the format is constantly changing but we'll try REALLY hard to not make too many breaking changes."
To my knowledge the WHATWG specs are the backbone of the W3C specs, but nobody follows the WHATWG specs. Browser vendors prefer to follow the W3C specs precisely because they are versioned and go through a slower and extremely thorough review process.
That’s not right. Browsers implement the WHATWG specs.
That said, validity changes don’t matter to the browser’s ability to render old pages. Changes to remove support for an element entirely are very rare.
I don't need to see a comment thread here to understand the process. I have been following this for 20 years, long before there was a WHATWG.
Additionally, WHATWG lost some credibility when they attempted to redefine the DOM and arbitrarily delete some node types. Granted, most of those types are legacy types not in use by anybody in a long time, except for the attribute node type. Browser vendors simply ignored this foolishness.
I’m not sure what you are referring to specifically, but WebKit aims to conform to WHATWG DOM and we check that against Web Platform Tests. We don’t even look at W3C DOM. I believe it’s the same for the other browser engines.
I wouldn't mind some extra information you have on this. When I've spoken to folks at the browser vendors one-on-one, they've talked about following WHATWG, rather than the W3C standard, but usually that was in conversations that were critical of the W3C, so it was hard to tell how ubiquitous that position was.
I am having trouble finding the background information on this. Basically, the WHATWG took the W3C DOM and wildly changed some foundational concepts without a thorough understanding of what those decisions mean.
It is important to understand the DOM wasn't created for HTML. The DOM, starting with DOM level 2, was created in parallel with XML Schema. This is evident when reading some of the W3C mailing lists and comparing release dates of W3C publications.
Browser vendors are extremely shy about adopting new technology that makes for breaking changes. They will do so, but you need to have an incredibly strong argument. WHATWG's changes to the DOM had no beneficial argument, except perhaps developer convenience for those developers who cannot figure out DOM walking.
The DOM is a pretty solid technology with regard to extensibility, predictability, and sturdiness. If you maintain a large major browser and somebody came to you with breaking changes and a bunch of weak bullshit for justifications what would you do? Also, imagine if you will, that if you ever challenge the people bringing you this pile of shit they will troll the hell out of you in a very visible and immature way.
The response from the browser vendors was to simply say nothing and ignore them like they were never there. I got into an argument about this with the WHATWG on a github issue once, and wish I hadn't. Ignorance is like a black hole that sucks everything in and it never stops to allow rational signals to escape undamaged.
This specific decision turned out to be mistaken, but W3C makes this type of mistake way more often and doesn't even always fix them. You can see in the record of this issue that the problem was eventually resolved. WHATWG Working Mode has also been updated since this change and would not allow this type of change to be made today without implementor support.
Regardless of issues like this, browsers track WHATWG DOM near exclusively. You can see devs from all of the major browser engines commenting in the issue you linked.
A big difference is that it took somebody new to the WHATWG (many years later) to admit failure and correct the problem very directly. In the past the WHATWG had a severe case of not invented here syndrome and would troll people to death who disagreed with them.
I know from my own conversations with the WHATWG that this wasn't something long-time WHATWG members would admit to (or even understand). It was the childishness, perhaps more than anything else, that meant nobody took them seriously.
> Regardless of issues like this, browsers track WHATWG DOM near exclusively.
I am going to disagree with you there. Perhaps they do now, extremely recently, but historically this is absolutely false.
> You can see devs from all of the major browser engines commenting in the issue you linked.
Yes, everybody participates in the WHATWG. This isn't new. Participation is different than adopting those recommendations back into your software.
It is important to keep in mind that the WHATWG doesn't do a lot of XML work, but the DOM is markup language agnostic. The DOM isn't something created or maintained in an HTML rich vacuum.
Do you work on a browser engine? I do (WebKit). Your claim that browsers actually implement W3C DOM 4.1 is just totally wrong. We don't even read it.
The person who ultimately fixed this problem in the DOM Living Standard is Anne Van Kesteren, who was not even remotely new to WHATWG at the time. The person who filed this issue (Philip) is also a WHATWG old timer.
"never intended to be archival" – HTML originated as an easy to handle stand-alone documentation standard (as a cut-down version of SGMLguid + links/anchors). The entire point of a documentation standard is backwards compatibility. Especially the just-ignore-what-is-not-implemented policy made this very promising regarding future usage, as long as major structural elements were to be honored. (Compare the drop of framesets, menus and manueitems as primary elements to represent structure and hierarchy, or the drop of major phrase elements conveying meaning and emphasis. Also, referring to the recommendation for substitutes, HTML is now not a stand-alone language anymore, but requires additional CSS.)
As opposed to this, HTML was not intended as a presentation layer for fancy web apps. (There were better options for this in the hypertext world, even then.)
This is about the exact opposite of archival: backwards compatibility. We don’t want to split the web into an old web and a new web. Having to switch browsers for decade-old pages as we encounter them raises the barrier to entry for that lore of old, effectively sepulchring it from the public.
99% of the web’s users are not going to understand when to switch browsers, how, nor why.
> This is about the exact opposite of archival: backwards compatibility. We don’t want to split the web into old web and new web.
It happens anyways regardless of what people want. The 90s era web doesn't work properly in modern browsers and 90s era browsers don't work with the modern web.
> 99% of the web’s users are not going to understand when to switch browsers, how, nor why.
This also happens naturally. Chrome is the most popular browser and it doesn't come with most operating systems. That is something users must switch to.
I'm really afraid that this is preparing the final drop of browser support. (We've seen similar in HTTP, where many of the HTTP/1.1 (1997) features, like multipart-http, for-headers, etc., haven't been supported by any client for years now.)
As for MS Word and HTML: when I finished my thesis in the mid-1990s, I saved it both in the MS Word version I used to write it (MS Word 5 for Mac) and in HTML (expecting future compatibility). I can still open the Word version, but I may soon be unable to conjure a formatted display of the HTML version. And I can still display a PDF 1.x...
What do you mean by "browser support"? What functionality do you expect to go away that would prevent you from viewing an old HTML file in a modern browser?
E.g., the drop of framesets. Many old documentations use them, as do most websites from the second half of the 1990s (so-called 2nd-gen websites). Without frames, the content can't be displayed in context anymore and consistency of presentation is lost entirely.
Oh gosh, that is absolutely true. A major change in presentation when the support is dropped, to be sure. On the other hand, what was the level of standardization then? Were there not massive inconsistencies across browsers when you got into the fine details of implementing frames? Especially parent-child relations among elements/sets/contexts, which is an integral concept in the definitive DOM - a grand achievement of the standardized HTML format.
In framesets, parent-child relations were absolutely defined and stable, as were the paths between individual frames.
(Current frame is "self" or "window", parent frame or frameset is "parent", and the top most entry point into the hierarchy "top". Moreover, "self", transcending the window context, is also the only reliable reference to the global object, thus also providing a valid reference to the context of a worker. Specifically, it was for framesets that the notion of hierarchy was introduced, which eventually resulted in the concept of the DOM. Some inconsistencies to this concept of strict parent-child relations were actually introduced by early implementations of the iframe-element, which is, BTW, still a valid HTML element.)
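For the curious, a minimal frameset of the era (hypothetical file names):

    <frameset cols="200,*">
      <frame name="nav"  src="nav.html">
      <frame name="main" src="content.html">
    </frameset>

A link in the nav frame could then target its sibling by name, or reach it from script through the shared parent:

    <a href="ch2.html" target="main">Chapter 2</a>
    <script>parent.frames["main"].location = "ch2.html";</script>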
That said, there was a small inconsistency with an early subversion of Netscape 3 regarding whether the frame source would be relative to the current location of the frame or rather relative to the frameset. (But this was an issue for a rather short period of time, two months or so.) A major difference in styling was the implementation of frame borders: whether they would be entirely invisible by just specifying `border="0"` (Netscape and others) or whether they required the two attributes `frameborder="0"` and `framespacing="0"` (MS IE). In practice, nearly all sites specified both schemes. And yet another, minor, implementation-specific detail was the sizing of framesets: while Netscape Navigator supported, like all other browsers, a size specified in pixels, this was internally translated to percents of the total width. Therefore, depending on rounding to integers, the presentation in the Netscape browser could be off by a pixel or two.
(The latter was, indeed, not unusual behavior at the time, just as MS Word and RTF used to translate any measurements internally to "twips", or twentieths of a point.)
This discussion involves details about content referenced from another domain (or a source which is not trusted). That implementation issue has finally been addressed with a standard, CORS, and new implementations are being worked on, but the problems are still hard to solve in practice.
I am super glad for all the hard work put into all this.
Personal MS Word anecdote: Word 2016 can successfully open and render my final year University project report, compiled in 1996 using Word 6.0. It contains a bunch of embedded images, tables and moderately complex diagrams drawn using Visio 2.0.
The report is saved across six .doc files, due to the size limitation of the 3.5" floppy disks we were using back then.
Not really. HTML rendering by email clients has never been especially standards-conformant; the publication of a new HTML standard (especially by W3C, as mentioned elsewhere) isn't likely to affect that.
What's wrong with W3C again? Are we on the same train we were on with the ill-fated XHTML 1 Strict and XHTML 2?
Anyway, what's going on with Google+Microsoft+Apple+W3C? Why is there such a big push to HTTPS and HTTP/2, and declaring old HTTP/0.9 and HTTP/1 and HTTPS/1 and HTML 5.0 as legacy!? And why is mail still sent in plain text, completely insecure, with no adoption hype to support S/MIME etc.? It is beyond fishy. Or is it just pure greed: no one cares about the non-walled-garden open web (aka everything has to live in LinkedIn/FB/AppStore/PWA) and there is no money in mail?
The reality is that the WHATWG (a) only writes descriptive standards, describing what already exists, usually with pseudocode and prose instead of ABNF or EBNF (see the URL standard replacement), and (b) only describes something once it’s actually been implemented on a larger scale.
On the topic of what standards are supposed to do – prescriptively shape and replace what exists – the WHATWG isn’t useful. WHATWG "standards" are the equivalent of Microsoft Office Open XML, a standards body just taking an existing implementation, defining whatever it does as standard, and doing it so incomplete that the result is useless.
Yes, WHATWG and W3C are doing the best they can do in the current climate (where Google can roll out QUIC and SPDY, before any standard is even defined, across websites accounting for 6% of global traffic, 65%+ of web browsers, and 85%+ of mobile phones), but this is just misleading. It helps no one to pretend to do standardization work when you don’t actually have any power to decide anything – neither WHATWG nor W3C can actually force, or even ask, Google to change SPDY or QUIC. They’re paper tigers.
I work for Chrome, as an editor of the HTML Standard. So let me give you my perspective on this.
In Chrome we ensure that all features we ship to the web go through a public standards process. This allows them to be developed by a collaborative community, including other browser vendors and web developers who would use them. It ensures that if we happen to ship a feature sooner than other vendors, there's a specification and a shared test suite (https://github.com/w3c/web-platform-tests) that allow others to quickly follow. Note that a specification is better than requiring them to read the Chromium source, because specifications are at a higher level that doesn't depend on individual browser architecture details.
In the WHATWG we don't only write descriptive standards. But we do ensure that whatever standards we write, are ones browsers are willing to implement. And we ensure that standards accurately describe how browsers operate, even for legacy features, because that is all part of the mission of allowing browsers to compete on an even playing field and build themselves from scratch without having to go through the kind of costly reverse-engineering that Firefox 1.0 did to catch up to IE6. In practice we've found that algorithmic specs are better for this than BNFs, as it's harder to specify error-handling behavior for BNFs while still staying compatible with the web (i.e. while still producing a standard browsers are willing to ship).
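To give one concrete example of the error handling a grammar can't easily express (results as I understand the current URL Standard): the parser is specified as an algorithm that recovers from malformed input instead of rejecting it, so browsers agree on fix-ups like these:

    <script>
      new URL("http:example.com").href;        // "http://example.com/" (missing slashes supplied)
      new URL("http:\\\\example.com\\x").href; // "http://example.com/x" (backslashes as slashes)
    </script>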
And yes, we're not interested in just creating a standard out of thin air, with no vendor collaboration, calling it "standard", and then hoping some magical power will force browsers to implement it. It is indeed much more collaborative than that.
But the fact that we require standards to be developed in tandem with implementations doesn't mean that implementations (such as Chrome) just go ahead and do whatever they want, and we at the WHATWG transcribe it into the spec at some lower level of detail. Instead, the public, collaborative standards process helps to extract out all testable and observable aspects of the feature into a codebase-agnostic description others can use, and provides a forum for them to comment on ideas before any final shipping decisions are made. And, per our working mode (https://whatwg.org/working-mode#changes), changes and additions do require multi-implementer support before they're ready to graduate to a WHATWG Living Standard; proposals not yet at that point are said to be in incubation, and are often developed elsewhere (see https://whatwg.org/working-mode#new-proposals) such as the W3C's WICG.
> In Chrome we ensure that all features we ship to the web go through a public standards process.
Ehm, basically every major feature Chrome has shipped was shipped before the standard was even discussed. SPDY shipped long before HTTP/2 was even finalized, and QUIC is doing the same. NaCl shipped in the same way, without any standardization, and to this date earth.google.com depends on it.
In general, your problem is that you only consider browser developers. In the past, the WHATWG has decided to redefine the URL standard, then shame cURL for not following the standard, without ever involving anyone from the curl project in the discussion. The URL discussion affects everything from Android’s IPC system to curl, from industrial machinery to the web. The WHATWG explicitly declared that the URL spec is designed to completely, and exhaustively, obsolete and deprecate any existing URL or URI spec.
Yet, the only people ever contacted about this, and who were given the ability to take part in the discussion, were representatives from the three large browser vendors.
Those are fair counterexamples. Perhaps I should be distinguishing between the Blink team and the rest of the Chrome team. I realize that distinction isn't very important to an outsider, but at least realize that there's a large portion of the Chrome team that cares very much about the web evolving through an open standards process.
The URL Standard was designed in the open with input from many different constituencies. The cURL author has chosen not to participate, for reasons of his own, but e.g. Node.js, PHP, Google's GURL (used by Android IPC, I believe), and others are quite involved.
The URL Standard didn’t even consider contacting any industrial vendor that relies on URLs - e.g. SIEMENS. There are entire industries out there that use these standards, and rely on them to be stable.
The only participants were all either browsers, affiliated with browsers, or a handful of web serving projects.
Other projects that rely on URLs include everything from KDE to Gnome, Microsoft’s OS to the systems used in your car.
Changing a URL standard and only involving web vendors is basically like changing the A4 paper standard and only talking to the Microsoft Office team, the Google Docs team, and HP’s printer team – while entirely ignoring paper manufacturers, envelope manufacturers, the mail companies around the world that will have to ship the envelopes, fax manufacturers that have to build faxes able to fax the new format, newspapers and magazines that have to replace their paper, newspaper shelf manufacturers that build newspaper shelves for newspaper stores, etc.
Most of the time, it’s easy to only think of the web as browsers and servers, but some of the specs the WHATWG touches go through entire industries, sometimes there are millions of companies that have to be notified months or years beforehand to replace their software, update it, potentially even do a recall, and standardize. Not everything moves as fast as the web.
And this entirely disregards the people trying to parse the web with their own HTML parsers, whom everyone loves to ignore. And so many other groups of people and companies.
https://github.com/whatwg/url/issues/118 is the curl issue, I believe. As I love curl more than any browser I personally side with it. It has support for ldap urls; curl wins. :)
The W3C version is a sporadically updated, bad-faith fork of the WHATWG version, created to maintain the fiction that it "owns" HTML, which it deems necessary to maintain the organisation's standing (and funding) in the eyes of other organisations and governments.
And yet, we don't get anything like that changes page for the WHATWG version. (Unless you want to dig through the whole commit history.)
It's absolutely a fiction, but at the same time, this at least attempts to be a standard.
The WHATWG version seems more like a reflection of "oh, by the way, these are the rules our browsers are following this month. Yours truly, the browser vendors."
If you omit the "Editorial:" or "Meta:" commits, I think it's actually at a similar level of detail as the W3C fork's changes log. (Not completely; scrolling through I do see a number of commits that wouldn't be relevant.) But the W3C fork has only managed to copy-and-paste a small subset of our changes, so indeed, the changes log for the last year of work at the WHATWG will be somewhat daunting compared to the small subset they managed to copy over.
There may be room for someone to compile a higher-level "this week/month/year in the HTML Standard" or similar; before I started working in the WHATWG, that actually used to exist: https://blog.whatwg.org/category/weekly-review (also in very amusing YouTube form: https://www.youtube.com/watch?v=1Bg5BPnmj68). So far we haven't had the bandwidth to restart that, but if you or someone else wants to contribute that sort of thing to the blog or elsewhere, I'd love to help you get started.
But this is kind of my point: As of now, this doesn't exist.
I don't think the commit history works. It doesn't give you any indication about which changes are relevant or irrelevant and it doesn't tell anything about the larger efforts taking place.
Actually, I don't think it would even make sense to create an equivalent of the W3C diff, because there are no versions or other structures to organize the changes around - there is just a constant stream of changes. (Which is kind of the point of the living standard concept after all)
The W3C fork's versions are arbitrary too though (yearly). You could organize a yearly update on what's new in the HTML Standard if you thought that would be valuable to people. It wouldn't change the fact that browsers release new features based on the ever-changing standard every six weeks. But it sounds like at least some people would find it useful.
Personally I'd tend toward weekly or monthly, although I admit that yearly is more likely to generate HackerNews posts ;)
I think of it like this: WHATWG is the git master branch of the "html" standard, and W3C regularly packages a modified version of it (changing things they disagree with) and "releases" it with a version number (HTML 5.x), going through alpha, beta, etc. By the time it's finalized, it's out of date.
I hope this doesn't sound too snarky, but as far as I know, the WHATWG standard is live, consistent and always up to date (including corrections), while the W3C recommendations are outdated snapshots of the WHATWG standard, which are labeled by arbitrary version numbers instead of the snapshot timestamp, for whatever reason.
EDIT: Apparently even that description was too charitable towards the W3C (see gsnedders' comments).
They stopped really doing snapshots of the WHATWG standard a while ago when they moved their authoring toolchain away from what the WHATWG document uses, and now just occasionally selectively copy over patches (sometimes incompletely) and make their own changes.
> But is that actually an improvement over the previous situation? (Serious question.)
No, it means we have two increasingly different documents purportedly defining the same things, and when they do copy patches over they've failed to also copy over other dependent patches too on a number of occasions leaving their spec as defined unimplementable.
I know the current manglement isn't explicitly malicious, but this is an atrocious state of affairs.
Practically speaking, the Web is a consortium of corporate foghorns that also happen to collectively be the majority ad-hoc directors of new media (translation: agendas with finance). Cable and daytime TV was the old media, which of course still exists, and social media has become a juggernaut majority of its own beside that.
So, you'd think the actual grassroots on-the-ground parts of a project that is ostensibly defined to be open and free, would actually be made of extremely smart people with straightforward management and as little bureaucracy as possible. Because, you know, the part where everything hits the ground needs to be well-oiled, have no chinks in the armor, and provide a secure foundation of independence.
And yet we have... chaos, infighting, politics and wars over (literally) nothing. And while all that's happening, corporations are progressively nibbling away at the capabilities we have today (to set up websites, to communicate freely) that we take for granted. One day we'll wake up checkmated by some incredibly well-engineered chess move...
Sighs
If the net neutrality thing is repealed, I will be exactly 0% surprised. It'll just be another EME, really.
At this point both W3C and WHATWG are not where innovation on the web is (or should be) happening. It's up to the individual browser makers to innovate. W3C and WHATWG's job should be to document any consensus among browser makers.
It shouldn't be their job to decide how browsers should work, that's the browser makers' decision. (Which happens to be large corporations, for the most part.)
That just gets the browser makers castigated by the tech community. Every time, say, Google invents something new, the entirely predictable incoherent screaming starts about how it's another Microsoft IE/ActiveX.
Nevermind the fact that the landscape has changed to the point where that isn't a realistic outcome anymore.
Nevermind the fact that in the instance I'm describing (which was something like WebASM or WebSockets... it was WebSomething and I can't recall the name), they had submitted their proposals to the standardization groups, with no change in the volume of the noise.
I wish people would decide whether they want browser makers trying New Stuff or they want New Stuff coming from standards bodies only. There are upsides and downsides either way, but I really don't believe that BMing browser makers whenever they try New Stuff is even sort of constructive.
> It's up to the individual browser makers to innovate.
This seems to make the most sense because the browser is the end product by which people consume their internet.
It seems to me they have been, and always will be, years ahead of the governing bodies that make these part of their "standards" decisions. By the time something finally makes it into the spec, we're already onto a dozen new things the browsers are capable of and implementing.
At this point it just feels like the spec is an afterthought, not necessarily keeping up with how fast the industry is changing.
In my opinion, one cannot call something a "standard" that changes every few days.
EDIT: In this sense W3C's HTML 5.x can be considered a rather badly authored (cf. other comments here) standard, while what the WHATWG releases is not something that even measures up to a standard, but it is the daily version of how HTML is supposed to be today.
The whole standard is more or less stable. There are some parts of it that describe new technologies that have not yet been implemented everywhere, but at this point those additions are only added after the design itself is pretty stable. Such additions must also have the support of two or more implementers, per our working mode[1].
Why are there no stable snapshots, or versions, of the standard?
In practice, implementations all follow the latest standard anyway, not so-called "finished" snapshots. The problem with following a snapshot is that you end up following something that is known to be wrong. That's obviously not the way to get interoperability!
This has in fact been a real problem at the W3C, where mistakes are found and fixed in the editors' drafts of specifications, but implementers who aren't fully engaged in the process go and implement obsolete snapshots instead, including those bugs. This has resulted in serious differences between browsers.
It's not enough to be stable to be a standard, you also need authority that enforces it. Either because people "respect you" (whatever that means), or because there's a central authority forcing them to implement the standard, people actually implement it. If they don't, then it's not much of a standard.
So, WHATWG is in constant flux, and W3C has about as much authority as I do. _Thankfully_ in practice WHATWG is "stable enough," but just saying that's what "we" consider a good enough standard for something used in creating all sorts of UIs, from trivial to vitally important, is indicative of a bigger problem.
> Just because something changes often doesn't mean it is unstable.
When you do a project contract, you surely want to define the exact standard against which the application is to be developed, so that one can decide whether the reason for something looking wrong is a browser bug (I can work around it - but it will cost extra money) or indeed a bug in my code that the customer found (i.e. I have to work extra hours for no money because I did bad work).
To be able to decide such questions is a central purpose of existence for standards.
> When I do a project I want to be able to code against the standards from which browsers were developed; that is the WHATWG standard.
I already argued that there is no WHATWG standard, but only a document that changes every few days. Even without this nitpicking: Which of these thousands of versions is the one on which the browser implementation is based on?
This one: https://html.spec.whatwg.org/multipage/ . Contrary to some people's perception here, everything that goes into this is implemented by at least 2 browsers.
Sorry, but you can't print that page and use that forever. I understand that you wish that you could, but you can't. I live in the real world, so rather than reading a snapshot and hoping it stays that way forever, I just read the up-to-date version since that's what is implemented by browsers, not the PDF I saved 3 months ago.
Given that the browsers with respect to which you implement the code change under your feet every 6 weeks, I think it's better the standard keeps pace with them than having it give a misleading impression of what you're developing against.
I hope we can agree HTML is used for text content first and foremost. A format that changes all the time at the whim of an ad company is basically useless for long-term preservation of legal documents, or documents in education, etc. Do you think having the latest web app fad is more important? Especially when the format has been around for 25 years now. "Innovation" on the Web is only happening so that Google can keep an edge in search tech, and for similar reasons.
This just speaks of a lack of experience: that's exactly what it means in the world of software. Do you not understand what "specification" means? "Specific" is even in the word.
You can't compare it to browsing on Amazon, because functionality doesn't just go missing and literally break buying things; functionality doesn't just suddenly get added and people rely on the exact font size and copy of a particular header in the men's clothing department to be precisely 2em and "Men’s Clothing," and now that it's changed to 1.5em and "Men's Winter Fashion" a third-party app can't render the header in an appropriate width size nor find the clothes to begin with.
Roughly, when interpreting qualified names, Chrome is throwing InvalidCharacterErrors when the acid test wants it to throw NamespaceErrors, in situations where you really have both. This leads to two tests failing.
So decide for yourself whether the WHATWG "standard" did breaking changes in the past or not.
The CSSWG is a W3C working group. It's true that there are sometimes breaking changes if usage is low enough. This is true of the W3C specs as much as it is of the WHATWG specs.
The advantage to following the WHATWG specs is that it reflects how browsers work today, not how they worked a few years ago.
A standard is something like ISO EN DIN A4. It is defined in cooperation with every stakeholder involved, it is specced, it is tested, and a stable definition is created. Everyone builds against this definition, and it works fine. The standard deprecates everything that existed before, and replaces it.
That is a standard. It’s authoritative, basically immutable, and it is prescriptive.
WHATWG "standards" come after the fact, only consider whatever browsers implement, refuse to ever deprecate anything (unless browsers have already deprecated it), and almost always just are "whatever Google Chrome does". That’s a disgusting abuse of the word standard.
WHATWG "standards" are the equivalent of Microsoft Office Open XML, a standards body just taking an existing implementation, defining whatever it does as standard, and doing it so incomplete that the result is useless.
Yes, WHATWG and W3C are doing the best they can do in the current climate (where Google can roll out QUIC and SPDY before even any standard is defined across websites accounting for 6% of global traffic, 65%+ of web browsers, and 85%+ of mobile phones), but this is just misleading. It helps no one to pretend to do standardization work when you don’t actually have any power to decide anything – neither WHATWG nor W3C can actually force, or even ask, Google to change SPDY or QUIC. They’re papertigers.
The WHATWG is for browser vendors. It's in a constant state of flux as new changes get proposed and those proposals get changed through implementation.
The W3C is for web authors. It presents a more stable recommendation and provides advice (based on research) for authors.
> The W3C is for web authors. It presents a more stable recommendation and provides advice (based on research) for authors.
Web authors usually use MDN instead, because it serves that purpose in a much better way. (Note that despite the name MDN is not Mozilla specific but a cross-browser resource, and that Microsoft and Google recently joined MDN.)
> The WHATWG is for browser vendors. It's in a constant state of flux as new changes get proposed and those proposals get changed through implementation.
The W3C recommendation is no different in that regard, they also describe stuff that's not fully implemented in all browsers yet. But the WHATWG version is more up to date, so you'll notice much earlier that the new feature you want to depend on will be abandoned or changed.
> The W3C recommendation is no different in that regard, they also describe stuff that's not fully implemented in all browsers yet.
Note for a W3C document to go to Recommendation there must be two interoperable implementations. Of course, that doesn't mean any browser has implemented any of it, just that someone has implemented each part of it.
The WHATWG specs contain a lot of innovation, but change more rapidly. In many cases WHATWG specs are living documents that gradually evolve and adapt to changes in real time.
Conversely the W3C specifications are fixed to versions and are occasionally patched with updates. The W3C process is incredibly slow and conservative, which frustrates developers on the bleeding edge. Due to the slow process, thoroughness of that process, and formal versioning most software vendors prefer to implement against the W3C publications as more stable or reliable.
WHATWG needs to add a <w3c-please-stop-plagiarising-the-whatwg-html-standard /> tag then gate some useful functionality behind it to see if W3C will dare include it
You can ask someone to stop doing something legal. I'd hate to live in the sort of world where you can't. HN moderators would send the police after you to seize your laptop if you make bad comments. The only way to get your roommate to stop eating your plums would be to charge them with larceny. Every relationship would end with a restraining order, or it wouldn't be over. Failing to turn in a homework assignment on time would lead to a court date. Buying more than ten items in the express lane would get you arrested for fraud.
If that isn't the world you want to live in, let's get rid of this idea that just because I have no interest in the government stopping you from doing a thing by threatening violence (and that's all a license is - a statement that the following activities are not copyright infringement), I'm totally fine with you doing the thing.
Sure but they are using the copyright to insist on attribution, which undermines the argument that they are simply against using the law. If they really didn't care they'd use CC0.
And in the quoted statement they are not saying the W3C should improve their process for forking WHATWG's work. They are saying the W3C shouldn't fork their work at all. So despite their specifically chosen license (with easy to understand layman's summary) are WHATWG against all forking? Or are they simply against the W3C?
Most of your points are addressed by Ian in an old email:
"In the case of the WHATWG specifications, the licenses allow broad re-use,
so that implementors can copy-and-paste text into their comment blocks, so
that tutorial writers can copy-and-paste text into their documentation, so
that experiments we haven't considered can spring up without inhibition,
and so that, if the WHATWG stops being a good steward (like the W3C
stopped being a good steward in the early 2000s), the next group of spec
editors doesn't have to start from scratch." (http://lists.w3.org/Archives/Public/www-archive/2014Apr/0034...)
Yes, they do in fact want to use state violence to insist on attribution. They don't want to use state violence to insist on the W3C going away, but they still want the W3C to go away. That seems reasonable to me.
This change to requiring attribution is actually fairly recent, and was made with some reluctance on the part of us editors, despite eventually agreeing it was the best path forward. See https://blog.whatwg.org/copyright-license-change
No, they are saying anyone is allowed to fork this for any reason, but we’d really prefer the W3C didn't fork this for the reason that they are, because it's confusing and counterproductive.
Whether someone should be permitted to do something is a different issue than whether they should actually do it.
I really wish we'd "simplify" the HTML spec. The "pave the cow paths" approach of allowing non-closed tags and a mix of various syntaxes has led to an explosion of complexity. That has regressed into terrible performance and memory-hungry parsers.
This was done once - it was called XHTML. It was, effectively, just HTML in XML form. Tags had a single syntax (no implicitly self-closing tags). Documents were required to be well-formed, syntactically, or they would not display.
HTML has had tag omission and other minimization features from day one, since HTML is based on SGML, which formalizes these notions. If by "mix of various syntaxes" you mean CSS, then I have to agree with you. There never was a need to define a new syntax for item/value pairs; plain markup attributes were and are sufficient for presentation properties.
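For example, all of these end tags may legally be omitted, and always could be:

    <ul>
      <li>First item    <!-- </li> is inferred -->
      <li>Second item
    </ul>
    <p>Paragraphs close implicitly
    <p>when the next one starts.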
I wish WHATWG would properly version their work. I don't like the idea of a "living standard" because it leads to checking for individual functionality and feature detection, rather than being able to say, "This is fully HTML 5.x.x compliant."
Regardless of the state of W3C, if I built an embedded renderer based on their specs, I could at least say "this renderer is based on <http-ref>" and link to the recommended spec version. Whereas if I did that with the living standard href, I'd be out of date any time they decided to rename an attribute.
But that's the point - web authors are supposed to use feature detection instead of writing to a particular standard version. It turns out to be a better model for large interfaces. Yes, in theory, you can ask "Is this OS POSIX.1-2008-compliant or not?" In practice, it takes a while to be fully POSIX.1-2008-compliant, and so you get autoconf, with its individual feature detection of specific functions. Less clean, but way more practical.
If you're writing an embedded renderer, you can always say "This is compliant with the standard as of 14 December 2017." If you're writing an embedded renderer that is being applied to the live web and not just to a fixed set of pages that are also embedded (e.g., you're shipping HTML documentation and a viewer, or a kiosk, or something), you will in fact be out-of-date when the living standard changes. There's no point in saying "I'm compatible with HTML 5.2.0" because the live web isn't targeting 5.2 any more. So you can either acknowledge that, or figure out how to get software updates.
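To make "feature detection" concrete, a minimal sketch (the API name is real; the fallback logic is illustrative):

    <script>
      if ("IntersectionObserver" in window) {
        // use the modern API
      } else {
        // fall back to, e.g., scroll events
      }
    </script>

The markup-only equivalent is fallback content: e.g. the children of <video> are rendered only by browsers that don't recognize the element.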
How does one perform feature detection in a static HTML page?
As far as I can tell the only way to author a compatible web page these days is by checking every damn feature of HTML you use against some humongous table like Can I Use? before assuming your audiences' browsers support it.
Compare to versioned specs, where I need simply determine the minimum spec version supported by my target audience (and any exceptions to the spec) and code against that spec.
There is some utility in naming sets of well-supported features...
This is the annoying part. It's a joke that a site like Can I Use needs to exist, and that browser vendors don't really have apt versions of their own compatibility tables.
Going to a third-party website to check to see if something is supported is disgusting.
The problem with autoconf is that it detects every conceivable Unix feature going all the way back to the 1980s, not that feature detection is itself that problematic.
Right, hence my assertion that well-known names of feature sets are useful. "HTML 5.2" is useful in the same way that "C99" was, because eventually there is a day I can just assume everything in "HTML 5.2" is present in all my targets. If I don't have such a name, if I'm forever at "HTML 5", I'm forced into the "autoconf" scenario of using feature detection forever for everything not in the base specification.
(That the W3C is apparently incompetent at associating feature sets with names is a separate issue.)
It's funny that you use C99 as an analogy. Please list all the compilers you support that support C99. I'll give you a hint: MSVC, Clang, and gcc all don't support C99 fully, and possibly never intend to. It's not just an idle "oh, no one cares about those features; they support it for all intents and purposes": gcc kept its default standard at C89 in part because it didn't support C99 fully.
What you're doing when you say that you assume C99 compliance is you're thinking of the features from C99 that you want to use and relying on that. Admittedly, the generally-unsupported features are very niche. But that means that you potentially have a dozen different ideas of what "we support C99" actually means, and that's before you start asking how reliable an implementation needs to be before it meets the definition of "support." Declaring support for versioned standards is often more problematic than helpful (versioned implementations is a different story).
The real problem with autoconf is that no one removes the unnecessary feature checks and no one audits it to see what's still necessary for the platforms that people intend to support.
"C99 is substantially completely supported as of GCC 4.5 (with -std=c99 -pedantic-errors used; -fextended-identifiers also needed to enable extended identifiers before GCC 5), modulo bugs and floating-point issues (mainly but not entirely relating to optional C99 features from Annexes F and G)." [1]
Sounds fully supported to me, for all practical purposes.
autoconf doesn't detect much of anything by default. A few commonly used boilerplate macros do a series of tests (e.g. AC_PROG_CC, AC_USE_SYSTEM_EXTENSIONS, AC_SYS_LARGEFILE), but for the most part each and every feature test autoconf does was explicitly and individually requested by the author.
The real issue is that people copy+paste autoconf tests from other projects without thinking about whether they're necessary, or even confirming whether they work for their use case. And because people just copy+paste autoconf tests instead of keeping a browser tab open with the (free) POSIX spec when writing their code, most tests people add are for stuff that no longer needs to be tested for (i.e. all the major Unix platforms support most standard POSIX features by default), and they lack the tests for the non-standard interfaces they actually use.
But there's no easy way to fix such poor development practices. A good start would be if people just stopped using autoconf, as well as libtool, cmake, maven, etc, unless and until it really became necessary. Follow the KISS principle. Keep your build as simple as possible and regularly test your code on at least one platform other than Linux/glibc, such as FreeBSD or OpenBSD, rather than misplacing your faith in overly wrought tooling.
It works the same way on the web. Don't use the latest and greatest feature if you don't need to. Like with performance optimizations, don't add the burden until there's relevant, empirical evidence that it's worth your while in the particular case. Nobody ever magically achieved high performance or strong portability by adopting overwrought tooling before the problems ever presented themselves. Doing so often ends up with the opposite result.
This specification should be read like all other specifications. First, it should be read cover-to-cover, multiple times. Then, it should be read backwards at least once. Then it should be read by picking random sections from the contents list and following all the cross-references.
Ah, hyperbole. I didn't know that humor could be specified.
The relevance is that W3C HTML5 standards are supposed to already be stable everywhere, while WHATWG and the browsers are a guessing game of what actually works and behaves the same way everywhere.
I've never seen anyone reference it in that way, which doesn't mean nobody does, but was the basis for my wording of "little actual relevance". (Admittedly, going to either HTML spec is not something that's needed very often for most devs, since most changes happen in other specs (CSS, web platform APIs at W3C, ...) and/or are widely documented outside, but while I've had occasional discussions involving quotes from the WHATWG spec, W3C HTML5 spec hasn't been referenced at all)
For the question "is this supported widely enough", caniuse.com + your local traffic stats is in most cases more relevant than inclusion in some spec or not.
What's the point of removing features such as "menu" from HTML standard? If there are browsers supporting it and webpages using it, would Mozilla (or Google or Microsoft) actually remove those features just because newest standard said so? I mean: marquee was deprecated long ago, yet browsers still render it correctly.
<marquee> has never been part of any HTML standard, ever. It is listed as an obsolete feature in the HTML5 standard for the purpose of making it obsolete (a weird reason) but is not mentioned anywhere else since time began.
<isindex> got removed from browsers, which I personally find kind of sad because that's what I learned in 1995 and I've written a web page that uses it. But it's weird and does nothing that a normal form couldn't do, so the browsers seem to want to deprecate it.
The biggest weirdness about it was that it was essentially a parser macro, not an element. That is, at parse time, it expanded into a form/label/hr/input set of elements in the token stream. Super-bizarre. See the removal patch at https://github.com/whatwg/html/commit/5c44abc734eb483f9a7ec7....
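Roughly, as I recall the old parsing rules (the default prompt text was localizable), writing

    <isindex prompt="Search:">

was expanded by the parser into something like

    <form>
      <hr>
      <label>Search: <input name="isindex"></label>
      <hr>
    </form>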
Whether or not to remove a web platform feature from a browser basically comes down to how many people are using it vs. what is the maintenance cost. I suspect that browsers continue to support <marquee> so they can continue to render all those great webpages from the late 1990s properly.
I can't speak for the spec authors, but IMHO, tags should be deprecated and eventually removed when they are deemed to be useless, especially when their functions and/or semantics are covered by another tag, and especially when their use is harmful (or rather, more harmful than beneficial).
In my (very personal) opinion, an HTML tag or attribute, and more generally a feature of any design/development framework, should be considered possibly harmful if it:
- presents possible security problems; for examples, consider some of the points listed here: https://html5sec.org/
- promotes poor usability or accessibility; e.g. interactive tooltips with links or controls in them, for example, are quite difficult to make accessible, and I wouldn't want an HTML <tooltip> tag without a lot of discussion about accessibility
- promotes anti-patterns; e.g., at this point I think <marquee>-style scrolling informational text is an anti-pattern in a web context, since it can make the text much harder to read, especially on small screens
Of course, none of these concerns should lead to immediate removal of a thing as soon as they're pointed out, but they should be discussed and considered. It's a cost-benefit analysis: what does this feature actually buy us that isn't easily achievable with other features, what problems is it causing and how severe are they, and are the benefits worth the problems?
As for <menu>, my guess, though I haven't been able to find the actual discussion, is that it was removed because its semantics are somewhat in conflict with <nav>, and probably its most common use was custom context (aka "right-click") menus, which bring a lot of accessibility problems with them. I don't know that I agree with the decision to remove it altogether, since I think its use to semantically identify and group web application controls is very valuable and not covered by any other tags (though I'd love to be corrected), but I do think that context menus, which to me seems like the most common use for the <menu> tag, are a very problematic design element. Again, it's a balance; is it worth the problems it causes? I guess the authors decided it wasn't.
(Just to reiterate, I don't know why <menu> was removed, I'm just guessing. If anyone can find any of the discussions about <menu> and the problems with it, I'd love to read more.)
I haven't looked at the W3C fork of our work, but in the actual HTML Standard (maintained at the WHATWG), menu was not completely removed – just the mostly-unimplemented context menu feature. We left menu as a semantic alternative to ol/ul for menu-like lists.
There's also the case of things like marquee, which are not removed, but just marked as obsolete and something that web developers must not use. (Which in practice means that conformance checkers like https://checker.html5.org/ are required to complain about them; it doesn't mean there's some godlike web-developer-enforcement committee going around preventing you from writing code that uses marquee.) Their implementation requirements are still in the spec; see e.g. https://html.spec.whatwg.org/multipage/obsolete.html#the-mar... and https://html.spec.whatwg.org/multipage/rendering.html#the-ma.... (Same for frame/frameset, by the way.)
Agreed. We should never have been putting "apps" on the web in the first place. Giving control from the client over to the server is a terrible idea, and I'm amazed it ever took off.
Where would you draw the line between a "website" and a "web app"? Would you like to see JS die entirely? Would you like the web to be non-interactive? Genuinely interested.
"Interactivity" can mean anything including hypertext itself.
And what sort of interactivity? Does backend logic count, or only logic in the browser? If only the latter - why does that matter, but not the former? Does any site that uses JavaScript qualify, regardless of how little?
Hacker News uses JavaScript, so is it a "web app" and not a "web site"? Would it suddenly become a web app if the mods hit their heads and decided in a fever delirium to turn the whole thing into an SPA, despite it having the exact same functionality?
In this model, would YC have to publish the static pages of HN on the "static" web but the forum on the "dynamic" web? But what if they cache the threads? Now they're static as well. And having every web developer divide their attention and work between two platforms based on which part of it is "static" and which part is "dynamic" seems needlessly complex and confusing.
I sympathize with the idea - HTML and javascript are terrible for building applications, but if you want the web to only be static HTML files then your "new" platform is going to contain almost every website in existence, including most of the brochure sites, articles and "legacy stuff." Most web apps are also documents, few are strictly one or the other.
It would make more sense to bifurcate the web along WASM, because that will lead to the distinction between HTML and compiled binaries (which, I know, we've already been there with Flash and Java) both in the browser. But even then, WASM is intended to work within the context of javascript and HTML, not necessarily to stand alone.