Being completely serious here: who cares? The Accept: header has never been useful. It's an overdesigned disaster of complexity, trying to handle in the client something that is trivially doable at the server level.
Ignoring for the moment that there are "good" and "bad" ways to specify it. Has there ever been a worthwhile application for this thing? In what way would a browser that simply left it off have trouble?
Perhaps you are thinking of the case where HTTP is used as a simple file-transfer protocol - where there is a 1-1 mapping between URLs and files on the server. However if you think of things at a higher level, where URLs map to generic "resources", it is plausible that a single resource could have multiple representations. Think of the Twitter API - a tweet can be represented in XML or JSON. This is the principle behind REST (granted, Twitter does not actually follow the HTTP standard, but you get the idea).
That said, for the overwhelming use-case of HTTP (web browsing), REST and the Accept header are probably overkill.
I understand what it's for; I'm dubious that it's actually used. By your own example, Twitter doesn't actually do this. Does anyone, to your knowledge? This header has been there for 12+ years. If no one's used it so far, tell me again why I should care that some browser sets a bad value?
Well I don't know of anyone who actually does the right thing, but then again I'm not familiar with that many web services. I also don't necessarily think you should care that some browser sets a bad value - I guess the biggest issue is that since browsers are the most common HTTP clients, other HTTP implementations must conform to what browsers actually do, rather than the HTTP standard.
This points to one of the "problems" with HTTP -- it's very easy to hack together your own solution to a problem instead of following the standard. This generally isn't a huge issue in any particular case, but it does increase the overall cognitive load of developers who deal with many APIs, make it harder to effectively utilize proxies and caching, and increase the amount of work that must be done by servers to accommodate the many clients out there, among other things.
For example, rather than the desired MIME type of the response being specified by the URL itself, or by a query parameter, or by a post body parameter, or by the Accept header, it should just be in one place: the Accept header.
With Apache you can set the server to provide specific resources that the browser supports.
For instance serving regular GIFs or whatever to MSIE, and SVGs or transparent PNGs to Firefox. You can also use it to serve proper XHTML to all browsers except MSIE.
These are real-world use cases that I know people use.
That's actually exactly the wrong thing to do. If you're serving content based on the browser, you want to be inspecting User-Agent, not Accept. And yes, people do that all the time. But they never do it based on what the browser claims to support; they do it based on what the client actually claims to be, based on what they have tested against real software.
When I was writing some of our internal HTTP APIs, I came up with a reasonable compromise -- honor the Accept header, but treat any file extension as a limitation of it. .../foo could return any supported type, but .../foo.html is, by definition, the HTML version of the resource (and will return a 406 Not Acceptable if the Accept header doesn't include text/html, text/(star), or (star)/(star)).
EDIT: the asterisks in my MIME type notations were interpreted as markup.
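For anyone curious, here is a rough Python sketch of how the extension-as-limitation compromise above might look. It is not the code from those internal APIs: the table of supported types and the names parse_accept, matches and negotiate are all made up for illustration. It also simplifies the real Accept rules. Ranges are tried by q-value only, so a more specific range does not override a wildcard, and a q=0 exclusion is not enforced when a wildcard also matches.

```python
# Hypothetical table of representations this server can produce.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".json": "application/json",
    ".xml": "application/xml",
}
SUPPORTED = list(EXTENSION_TYPES.values())


def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((fields[0].lower(), q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)


def matches(media_range, mime):
    """True if a range such as text/* or */* covers the concrete type."""
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = mime.partition("/")
    return r_type in ("*", m_type) and r_sub in ("*", m_sub)


def negotiate(path, accept="*/*"):
    """Return (status, mime) for a request path and Accept header."""
    candidates = SUPPORTED
    for ext, mime in EXTENSION_TYPES.items():
        if path.endswith(ext):
            candidates = [mime]  # the extension pins the representation
            break
    for media_range, q in parse_accept(accept or "*/*"):
        if q <= 0:
            continue
        for mime in candidates:
            if matches(media_range, mime):
                return 200, mime
    return 406, None  # 406 Not Acceptable


print(negotiate("/foo", "application/json"))       # unpinned: Accept decides
print(negotiate("/foo.html", "application/json"))  # pinned to HTML: refused
```

A missing Accept header is treated as "anything goes" here, which is the usual reading when the client states no preference.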
Firefox totally fucks up MIME types in a different way -- since version 2.0.1X it is extraordinarily anal about 'respecting' the Content-Type that the server returns. It basically means that if you want your text to be displayed in the browser, you have to declare it text/plain.
Oh, your webserver helpfully sniffs the extension in the filesystem, and serves up some example C code as 'text/x-csrc'? There's no way Firefox could ever possibly display that! The default should obviously be to 'open with' an external viewer, even if that's less or vim.
It's not very useful to open a downloaded document using a curses program, especially when stdout is not a tty!
This behavior is not configurable or reversible. I found and hacked together an ancient extension that adds 'View in Firefox' to the 'open with' list, but it's not on addons.mozilla.org
Wow. I never expected I would see someone complain about this.
Firefox is following the mime types exactly the correct way.
Compare to the disaster of IE which sometimes looks at the file extension to decide how to display the document - that caused me huge problems way back.
Yes, a view as text option would be nice, but firefox is doing exactly the right thing by listening to the mime type.
If I get text/csv I don't want it showing in the browser, I want it to launch it externally. If firefox gets text/x-csrc it checks the /etc/mailcap, and does what it's told. It has no way to guess that you prefer to view it in the browser - the server said: it's a C file, and firefox listens.
How would you program it? How is it supposed to know that text/x-csrc should show in the browser, while text/csv is an external program? Both are listed in mailcap.
However, like I said, an option "View as Text" in the download box would be nice.
Well then tell Firefox that. You can configure what its behaviour is in the preferences pane. The defaults are perfect, it's your preference that's different from the standard. Which is why they put it in Preferences.
Firefox is following the mime types in the obnoxious spec whoring way that ignores reality, common to ideology-driven FOSS.
Since in the vast majority of cases the declared Content-Type comes from Apache sniffing the filesystem extension (or at best, the magic number), all you've done is push the problem server-side.
The end result of Mozilla's anal-retentive 'bug fix' is that less content will be served with correct Content-Type, with everything as text/plain if you're lucky (all sorts of shit is served as text/html). I ended up just editing /etc/mime.types when I was a CS Lab administrator -- faculty like it when they can view goddamn text in a browser without spawning fucked processes in the background that eventually crash your session.
I don't know why the fuck they don't have a mailcap blacklist -- not everyone uses GUI-focused distributions that let GNOME defecate in every crevice of /etc/
How would I program it? Always display plain text as text, with an explanatory infobar at the top giving other options. I swear, the infobar idiom is the only good new thing to ever come out of Mozilla.
Blasdel has some good points here, I think, but (s)he has phrased it in an unnecessarily offensive way. Please allow me to translate. I don't entirely agree, but it's not all wrong either.
----------------
Firefox is following the MIME type spec with the sort of excessive zeal for standards documents at the expense of real world practicality that you often get with open source projects.
Usually, the MIME type is coming from the web server guessing the file's type, which is not itself reliable, so you've pushed the problem to the server side. [interpolation by jerf: Slavishly following these MIME types is not a valuable thing to do, since the MIME type is not very reliable here.]
The end result of this is to ruin the entire value of the MIME type by encouraging people in the real world to serve more content up as text/plain (or even worse, text/html). I just edited /etc/mime.types when I administered a lab, because my users just wanted to see the content without spawning extra processes.
I don't know why they don't have a mailcap blacklist; not everyone uses GUI-focused distributions that spew bad values all around /etc in a misguided attempt to be helpful.
How would I program this? Always display unknown MIME types that look like text as text, with an infobar at the top giving other options. The infobar idiom is a nice feature and they should use it.
-------------------
jerf: I do quite like that suggestion. I'd also point out that while deciding whether something is text may be, strictly speaking, undecidable, in practice it's about 98% feasible.
Thanks for tempering my nerd-rage -- having someone else paraphrase your text in a different tone is an alarmingly insightful proofreading praxis that I never would have considered!
-------------------
The problem with the extra processes was not that they existed independently of Firefox, but that they were silently useless and caused the browser to crash -- launching 'less' in the background with no tty hooked up is a stunningly stupid end-result.
The problem with /etc/mailcap on an install that doesn't have all the GNOME cruft installed is that it can have defaults like 'less' and 'emacs' at the head of the list for all sorts of types.
-------------------
RE: Decidability: Firefox is already doing content sniffing for the common Content-Types; even they are not anal enough to deny that all kinds of stuff is served as text/html. It's also fairly easy to detect known non-text types, as producers are kind enough to pick novel values for the first few words ("man 4 magic").
It doesn't really even have to sniff at all to get the obvious cases -- if it's unhandled and in text/, display it! (though that still wouldn't handle inanities like application/x-ruby in Ubuntu's mime.types, or the total retardation that is FF's handling of application/json)
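To make the "if it's unhandled and in text/, display it" idea concrete, here is a minimal Python sketch. It is a guess at what such a rule could look like, not anything Firefox does: the names display_as_text and looks_like_text, the 512-byte sample and the 5% control-character threshold are all assumptions picked for illustration. The magic prefixes are the well-known signatures for PNG, GIF, PDF, ZIP and JPEG.

```python
# Leading bytes of a few well-known binary formats.
MAGIC_PREFIXES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"GIF87a",             # GIF
    b"GIF89a",             # GIF
    b"%PDF-",              # PDF
    b"PK\x03\x04",         # ZIP
    b"\xff\xd8\xff",       # JPEG
)

# Control bytes that are normal in text: tab, LF, form feed, CR.
TEXT_CONTROL = {0x09, 0x0A, 0x0C, 0x0D}


def looks_like_text(sample):
    """Heuristic: no NUL bytes and very few other control bytes."""
    if not sample:
        return True
    if b"\x00" in sample:
        return False
    control = sum(1 for b in sample if b < 0x20 and b not in TEXT_CONTROL)
    return control / len(sample) < 0.05  # threshold is an assumption


def display_as_text(content_type, body):
    """Decide whether an unhandled response should be shown as text."""
    mime = content_type.split(";")[0].strip().lower()
    if mime.startswith("text/"):
        return True  # the obvious case: no sniffing needed
    if body.startswith(MAGIC_PREFIXES):
        return False  # known binary format
    return looks_like_text(body[:512])


print(display_as_text("text/x-csrc", b"int main(void) { return 0; }\n"))
print(display_as_text("application/x-ruby", b"puts 'hi'\n"))
print(display_as_text("application/octet-stream", b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
```

The infobar would then offer "open with" as the alternative, so a wrong guess costs one click rather than a stray background process.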
Upmodded for the translation in which you filter out the offensiveness. Blasdel's tone makes it very hard for me to take him seriously. Especially the "common to ideology-driven FOSS" phrase almost snapped a nerve.
Compare to the disaster of IE which sometimes looks at the file extension to decide how to display the document - that caused me huge problems way back.
Agreeing with most of your post, I think you are wrong on this one.
IE will do lots of stuff to determine what kind of data it is actually getting from the server, somewhat overriding the HTTP spec. Even if something is served as an application/octet-stream, IE might resort to looking at the extension of the URL (or the content-disposition header, if one is present).
If these checks fail, it will resort to MIME sniffing, by analyzing the first 256 bytes of the file to see if it can figure out what kind of content it contains. This method is surprisingly accurate, and you can test it yourself. [1]
In my opinion IE does a lot of things to make life easier for its users to overcome problems caused by people configuring their servers incorrectly, something which is getting increasingly common.
I don't think IE deserves to be ridiculed over this extra effort.
If you served http://example.com/foo.msg as text/plain, older IE versions used to helpfully assume the user wanted to view it in Outlook Express, which exploded because it only supported some binary monstrosity and not text/plain. Dots are not special in an abs_path, and you can't make an interoperable network by assuming your peer does everything the same way you do.
Maybe there's some utility in the Accept header as a signal of the capabilities of the browser/client. There's a nice discussion [1] on using the Accept header to suss out whether to send the 'correct' but generally less-usable 'application/json' mime type or just send the blatantly wrong but working 'text/javascript' instead. Given a client with the capability, you should go ahead and send it the way the client prefers, right?
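A minimal Python sketch of that idea, assuming the server only switches to application/json when the client names it explicitly (a wildcard such as */* does not count). The function name json_content_type is hypothetical and the q-value handling is deliberately crude.

```python
def json_content_type(accept):
    """Pick the Content-Type for a JSON response from the Accept header."""
    for part in (accept or "").split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "application/json":
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            return "application/json"  # client asked for it by name
    return "text/javascript"  # wrong, but the fallback that works


print(json_content_type("application/json, text/javascript, */*"))
print(json_content_type("*/*"))
```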