Seems like a good idea, but the wrong way to achieve it. The right way, as I understand it, would be to write it up as an RFC and submit it to the IETF; and to contribute code for it to some of the popular web servers (apache, nginx, etc). The site doesn't make any mention of either of those things.
Disclaimer: I work for an ISP and every ISP in Russia is legally required to do the censorship (and mirror all traffic to FSB black boxes, but that's another story). I'm not particularly happy with the situation, but can't do anything about that.
Nginx is totally relevant as many ISPs, including ours, use GNU/Linux boxes running Nginx as a highly performant transparent proxy (there are TPROXY patches for Nginx) to dive into HTTP traffic and do URL filtering (obviously, after initial crude IP-based filtering). It costs less than those fancy Cisco solutions, and it's not like we're willing to spend additional money on something that downgrades the service.
Also, there are cases where actual sites are legally forced to remove resources. Well, not really forced, but it's the sort of request that many sites can't really decline. You either comply and remove a single page (blocking it for Russian visitors only seems sufficient), or get blocked at the ISP level, and since many ISPs (including several giant ones) just blacklist a whole IP address, that means your site becomes completely unavailable.
> I work for an ISP and every ISP in Russia is legally required to do the censorship (and mirror all traffic to FSB black boxes, but that's another story).
I hate the censorship but I like the fact you're not gagged and can talk about the fact traffic is being mirrored.
Nope, it's codified in the law, in legalese, but right in the open.
The sad thing is, practically nobody cared about that, for years. The dissatisfaction became visible only when the government granted itself the ability not only to sniff others' communications (which is obviously invisible to the end user) but also to actively censor them.
Given that Russia appears to be engaged in even broader surveillance and monitoring than the U.S., I find it odd that Snowden would be granted temporary asylum there for speaking out about a similar program in the U.S. It makes the whole situation look more like a political game.
He shouldn't be required to fight that battle as well; he's done more than we could possibly ask of somebody already. Russia is providing him with a certain degree of safety, he shouldn't be obligated to reject that.
Authoritative: http://minsvyaz.ru/common/upload/prikaz_16-01-2008_N6.pdf (sorry, the document's in Russian and I can't find any translation, nor am I skilled enough to do that myself) - I'm not a lawyer, but in my understanding (as it was explained to me) this decree contains requirements for networks that ISPs must conform to (otherwise they can't get the license and provide services), and it states (in thick legalese) that all subscriber-generated traffic must be mirrored to an operational search activities control point ("пункт управления ОРМ"), which is usually (but maybe not universally) a black box sitting in a rack.
From what I've heard, SORM-2 hardware is a secured 1U *nix-based server (my peer was not sure whether it was a BSD or GNU/Linux variant), running some kind of sniffer (probably pcap-based) software with some of the FSB's in-house tools. They are supposed to be dormant most of the time, but nobody except the FSB knows what they're actually doing (and they don't have to report when they're doing a lawful interception).
I once worked somewhere where some resources could not be displayed to all clients. We chose to (ab)use HTTP 409 Conflict.
> 10.4.10 409 Conflict
> The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. The response body SHOULD include enough information for the user to recognize the source of the conflict. Ideally, the response entity would include enough information for the user or user agent to fix the problem; however, that might not be possible and is not required.
> Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the entity being PUT included changes to a resource which conflict with those made by an earlier (third-party) request, the server might use the 409 response to indicate that it can't complete the request. In this case, the response entity would likely contain a list of the differences between the two versions in a format defined by the response Content-Type.
"As a web user I want our ISPs/governments to give us a nice error page so we understand what is going on when they DNS block or seize websites"
Or is it saying:
"As a web-master, when I have to take down content due to legal proceedings I want a nice HTTP code to return"
They give an example of the first (Virgin Media), but that takes down an entire domain, so it's kind of irrelevant whether the correct HTTP code is returned; it's not like that is going to be resolved quickly. 503 would be the correct code here.
The second might be useful to spiders (who might want to back off spidering so often for a while), but then wouldn't you just want to show your users a 404 with a nice reason why the content has gone?
I'm also surprised (this error code was first mentioned, as far as I know, when Ray Bradbury died)… but I like the subtle element of not mentioning it.
Surely this should be within the 5xx range of status codes? I get there's a reference to be had using 451 but this is more of a server error than client.
Usually, 5xx means that the client could retry the request at a later time, and have it succeed. 4xx means the client should expect the request to fail forever unless something is changed.
Some HTTP clients (not browsers, but other things) take advantage of this by showing the user an error dialog on a 4xx error, but just retry at a later time on a 5xx error.
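To make that concrete, here is a minimal sketch of such a client policy in Python. The `next_action` helper and its return strings are made up for illustration, not taken from any real HTTP client:

```python
def next_action(status: int) -> str:
    """Decide what a non-browser client does with a response status (illustrative only)."""
    if 500 <= status <= 599:
        return "retry-later"   # server-side trouble: queue the request and try again
    if 400 <= status <= 499:
        return "show-error"    # retrying unchanged is expected to keep failing
    return "done"              # 1xx/2xx/3xx are not error cases in this sketch


for code in (403, 451, 503):
    print(code, next_action(code))
```

Under this policy a 451 in the 4xx range gets surfaced to the user right away, while the same block reported as a 5xx would be silently retried.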
"The 4xx class of status code is intended for cases in which the client seems to have erred"
vs
"Response status codes beginning with the digit '5' indicate cases in which the server is aware that it has encountered an error or is otherwise incapable of performing the request."
There's always that quote people chuck around a lot about censorship being an error so the internet routes around it. By that definition the server knows it has errored so it should be a 5xx response.
I think it is a 4xx client error, and not necessarily one that needs a new status code. It seems to me that it is a fairly simple case of 403 Forbidden:
The server understood the request, but is refusing to
fulfill it. Authorization will not help and the request
SHOULD NOT be repeated. If the request method was not HEAD
and the server wishes to make public why the request has
not been fulfilled, it SHOULD describe the reason for the
refusal in the entity. If the server does not wish to make
this information available to the client, the status code
404 (Not Found) can be used instead.
"The 4xx class of status code is intended for cases in which the client seems to have erred."
The client has not erred by requesting a document that exists and which the server can technically provide (separately, the server has not erred by refusing to provide the client with a document which exists and which access control would allow the client to have, because a government is threatening the server operator in some manner).
> The client has not erred by requesting a document that exists and which the server can technically provide
The client has erred in requesting a document which the server is legally forbidden to provide to that client. As specified for 403, the server understands the request and refuses to fulfill it.
Admittedly, a hypothetical 6xx Third-Party Interference series of error codes might be useful for these kinds of cases (and some instances currently handled by 503).
Well, the cool thing about HTTP error codes is that you don't need to run a campaign or get permission from the W3C; you can just start using them if you want.
I get what you mean but I don't think this is the same thing. HTML has a DTD, something that people conform to when writing, so your <haiku> tag would not follow that guideline, whereas companies like Twitter can implement their own error codes as they see fit. See error code 420, Enhance Your Calm.[0]
Well, sure, just like I can make up my own domain name like hacker.news and use my own IP address 1.2.3.4 without being allocated them. The Internet interoperates by everyone agreeing to follow the same conventions, but there is no rule that says you have to. The registry for HTTP status codes has no 420 code, so it could be assigned for a different purpose in the future: http://www.iana.org/assignments/http-status-codes
Yes, they can respond with 420. However, my browser does not have a predetermined response for that. Whereas if they respond with 200 or 302 or 404, all browsers know what to do; that's what it means to be a "standard".
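For what it's worth, the HTTP/1.1 spec does give clients a fallback here: an unrecognized status code is to be treated like the x00 code of its class, so an unknown 420 is handled as a plain 400. A rough Python sketch of that rule, where the `KNOWN` set and the `effective_status` helper are assumptions made up for the example:

```python
# Assumed for illustration: the codes this particular client handles specially.
KNOWN = {200, 302, 403, 404, 503}


def effective_status(code: int, known=KNOWN) -> int:
    """Return the code itself if recognised, else the x00 code of its class."""
    if code in known:
        return code
    return (code // 100) * 100  # e.g. 420 -> 400, 451 -> 400


print(effective_status(420))  # 400
print(effective_status(404))  # 404
```

So a client that has never heard of 420 or 451 should still know it is looking at a client-class error, just not which one.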
Tangentially to your point, that's only true of the 4.x and earlier versions of HTML, which are SGML applications; WHATWG HTML / W3C HTML5 is not SGML-based and does not have a DTD.
E.g., I can start putting <haiku> tags in my HTML if I want. The issue is whether anyone else will expect this tag or code and do anything meaningful with it.
Honestly, people who believe in this strongly enough should just start using them and provide themselves as examples of good use cases. That doesn't mean it shouldn't be campaigned for to get more people to use it.
There is a lot of discussion below on whether 451 is the right error code and how to implement it properly, but I'm missing one thing - what's the benefit of doing it as a status code at all?
If you're going to say that it raises censorship awareness - Internet protocols are intended as useful technical standards for programs to communicate, not vehicles for political goals.
What is the technical benefit of failing with a different error code? Is there a need for client software to react differently to a 451 and a 403? The status code is not intended for the human user. If we want to raise awareness, then we already have the means to do that: a 403 with a descriptive page citing the reasons. Many websites already do that when complying with DMCA takedowns.
Still, imho, 4XX could be a response for a given URL, but when "a website is blocked" (from the text of the previous URL), we should go to 5XX, as in 503 - service unavailable.
Legal restrictions on content are almost always issues on the client side, otherwise they should not be made available at all.
Consider for example the blocking of sites in the UK, or the blocking of the PirateBay website in the Netherlands. Those are all very limited audiences where the location of the client is causing a legal reason why the content can not be displayed.
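A minimal Python sketch of that kind of location-dependent refusal, using the proposed 451 code. The `BLOCKED` table, the path, and the `respond` helper are all hypothetical, and how the client's country gets determined is left out entirely:

```python
from typing import Dict, Set, Tuple

# Hypothetical block list: path -> country codes where serving it is legally restricted.
BLOCKED: Dict[str, Set[str]] = {"/some-page": {"RU"}}


def respond(path: str, country: str, blocked: Dict[str, Set[str]] = BLOCKED) -> Tuple[int, str]:
    """Return (status, body) for a request, refusing only where a legal block applies."""
    if country in blocked.get(path, set()):
        # Say why in the body, so the visitor is not left guessing.
        return 451, "Unavailable For Legal Reasons: blocked in " + country
    return 200, "OK"


print(respond("/some-page", "RU"))
print(respond("/some-page", "NL"))
```

The same page comes back fine for every other audience, which is the argument for treating the restriction as tied to the client rather than to the server.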
In this form I totally agree. The reason being that having a block at the web server level in essence means the blocking is done by the hosting site, based upon location and content. This places the onus of censorship upon the host, which they can already code for if they want.
I'm not sure about placing the onus of censorship in the hands of the host rather than the government, with their IP/DNS blocks or however they impose such blocks (China has a nice firewall for sites outside China, and I dread to think how they block a website inside China, though I can bet it is just as effective).
That all said, there is the possibility of voluntarily doing the blocking in a way that the powers that be will accept, so that people in their country can still see the parts of your site that are legal for them and not the illegal parts. Well, that could possibly have uses, and it keeps your site open instead of under a blanket ban.
Piratebay has legal torrents, yet they are blocked by a blanket ban, which is a form of discrimination. So it does have its possibilities, albeit along a dangerous path that should not be trodden lightly.
Is this really necessary? How about 456 - unavailable because someone spilled coffee on our backend server? Or 467 - unavailable because garden gnomes invaded our offices?
I can see some reasoning behind this, but the reasoning is that the emphasis of the problem is "people are angry at the site because something is blocked so let's show an error code reflecting the real reason." Using 451 would take the emphasis away from the site and onto the legal oppressor.
On the other hand, why not inverse all inaccessible content to legal oppressors? Change the default meaning of 403 for example to "Access denied for permissive or legal reasons".
Change the default meaning of 403? I don't think that's a good idea. When a user needs to be logged in to do something and they aren't, you show them 403. When access is restricted to people outside a network, they see 403. It'll be hard to force a new behaviour onto the existing web, easier to add a new HTTP code.
My understanding is that the server should respond with 401 Unauthorized when someone is attempting to access a resource that requires authentication. What is the case for using 403 instead?
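The way I read the spec, the split looks roughly like this in Python. The `access_status` helper and its two flags are made up for the example:

```python
def access_status(authenticated: bool, permitted: bool) -> int:
    """Pick a status for a protected resource (illustrative only)."""
    if not authenticated:
        return 401  # no valid credentials yet: logging in might help
    if not permitted:
        return 403  # we know who you are, and the answer is still no
    return 200


print(access_status(authenticated=False, permitted=False))  # 401
print(access_status(authenticated=True, permitted=False))   # 403
```

In practice plenty of sites return 403 for both cases, which is probably where the confusion comes from.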
In case you didn't know, Claire Perry is a British Member of Parliament, sadly. She is one of those people who know better than us "plebs".
IMHO, she is ignorant(1) and irrational(1), and therefore scared witless of the internet, so she seeks to control it. I don't think she knows how wrong she is, therefore I don't see her as evil, as such. However, it does seem our vote whore of a Prime Minister listens to her every word, hence the attempts to block porn and make people opt in to avoid the blocks, since her position ties in to an awful lot of right wing voters, who are equally, if not more, ignorant and irrational. They are the kind of people who genuinely believed Rock and Roll was the work of the devil, and that black people are a different species.
(1)I use these words for their real meaning and not the insult, judgement, or political value. I honestly think many people literally do not understand the technical issues, and act illogically as a result. I believe this has a lot to do with the traditional media, who politicians rely on, issuing scare stories about the internet because their businesses were and still are threatened by it. My "evil" in this is those who spread the lies, i.e., the media. I hate to admit this, but in many ways I see the likes of Claire Perry as well intentioned victims and mules of the media.
Yet it is, because the courts have a different opinion on this, and they decide what happens. So it's better to make it clear to the end user when it happens, so at least they are not kept in the dark, or are basically lied to, about the reason why they cannot see the document.
If the 451 code is returned by the web server because the site got a DMCA request, then HTTPS doesn't matter because the destination web server already decrypted the session to find the request that you're making before returning the code.
Edit: oops, I was wrong. There is an RFC and it's linked from http://www.451unavailable.org/what-is-error-451/