Show HN: “HTTP 419 Never Gonna Give You Up” for bots

gnyman · on Oct 28, 2021

I definitely agree bots are underserved. I do a few things to keep them entertained: ssh bots are tar-pitted to keep them connected but busy, and my hope is that I occupy at least one thread, if not the whole process.
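For anyone curious what such a tarpit can look like, here is a minimal sketch of one well-known approach (not necessarily the one in the instructions linked below). It relies on RFC 4253 allowing a server to send arbitrary lines before its version string, as long as they don't start with "SSH-". The names `banner_line`, `handle` and `main`, the port 2222 and the 10-second delay are all assumptions made up for illustration.

```python
import asyncio
import random


def banner_line(rng: random.Random) -> bytes:
    """Return one junk pre-banner line.

    Lowercase hex can never start with "SSH-", so the client keeps
    waiting for the real version string.
    """
    return ("%x\r\n" % rng.getrandbits(32)).encode("ascii")


async def handle(reader, writer, delay: float = 10.0):
    """Drip one junk line every `delay` seconds until the client gives up."""
    rng = random.Random()
    try:
        while True:
            await asyncio.sleep(delay)
            writer.write(banner_line(rng))
            await writer.drain()
    except OSError:
        pass  # client went away
    finally:
        writer.close()


async def main(host: str = "0.0.0.0", port: int = 2222):
    # Port 2222 is an assumption; a real setup would put the tarpit on 22
    # and move the real sshd elsewhere.
    server = await asyncio.start_server(handle, host, port)
    async with server:
        await server.serve_forever()

# To run it: asyncio.run(main())
```

Each connection costs the server one sleeping coroutine, which is why this is cheap for the defender.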

For wp-login bots I serve them a nice chunk of random (generated by a fuzzer) html in the hopes that 1. it wastes a bit of their bandwidth/memory and 2. it crashes their parser.

In reality I guess bots nowadays are sturdy enough to not get stuck or crash but who knows, feels good to do something :-)

Tarpit instructions https://nyman.re/super-simple-ssh-tarpit/

Wp-login page https://twitter.com/gnyman/status/1181652421841436672?s=20

And I remembered another nice trick which someone else came up with, zip bomb the bots :-)

https://blog.haschek.at/2017/how-to-defend-your-website-with...

SCHiM · on Oct 28, 2021

Although I think bots should be free to access the same content as humans do, I have a suggestion for your fuzzer anti-bot-spray:

It won't work on the more sturdy samples, but maybe try a GZIP bomb on https streams: https://www.infosecmatter.com/metasploit-module-library/?mm=...
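A rough sketch of how such a payload could be built, assuming the usual trick of compressing a long run of zero bytes and serving the result with `Content-Encoding: gzip` so that the client inflates it itself. `make_gzip_bomb` is a made-up name, and the sizes are only examples.

```python
import gzip
import io


def make_gzip_bomb(uncompressed_bytes: int, chunk_size: int = 1 << 20) -> bytes:
    """Compress `uncompressed_bytes` zero bytes into a small gzip payload."""
    buf = io.BytesIO()
    chunk = b"\0" * chunk_size
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=9) as gz:
        remaining = uncompressed_bytes
        while remaining > 0:
            n = min(remaining, chunk_size)
            gz.write(chunk[:n])  # write in chunks so memory use stays flat
            remaining -= n
    return buf.getvalue()

# Example: payload = make_gzip_bomb(10 * 1024 * 1024)
# The payload is a tiny fraction of the 10 MiB it inflates to.
```

Building it once and serving the bytes from disk or memory keeps the cost on the bot's side rather than yours.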

rascul · on Oct 29, 2021

Could there be legal repercussions for doing this?

theshrike79 · on Oct 29, 2021

It's your server and someone is accessing it. It's up to you what you serve them.

If you want to be clear, you can put the gzip bomb behind a link that says "do not click, gzip bomb". The bot won't know the difference.

dusted · on Oct 29, 2021

Pre "guy views html source gets home raided for haxx0ring" I'd have said "you silly!"

Now... I'd say "there shouldn't be, it's your server, people can choose to access it or not, but if the right kind of fool comes along, there's no knowing where the stupid ends."

rexfuzzle · on Oct 29, 2021

There are some very cool ways of doing these tarpits. This for example: https://nullprogram.com/blog/2019/03/22/

inside65 · on Oct 29, 2021

I blocked almost all wp-login bots just using bot fight mode in Cloudflare a few months ago, along with some CF page rules to run an interstitial. It seems to be losing effectiveness over time though, and since I do have WP-login, I wonder how I can implement something like your idea.

Maybe rename the legit login and put this in its place, but that would cause issues for redirects from the legit login link...

austinkhale · on Oct 29, 2021

Change your login path to something like /custom-admin. Then create a page rule to captcha any attempt to access /wp-login. What traffic other than bots is going to go to the old login page? You can change the login link to go to the new page.

danuker · on Oct 29, 2021

or better yet /custom-admin-07a4b58e-3880-11ec-904e-ba0baece2ff4
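A small sketch of generating such a suffix, assuming you want it unguessable rather than merely unique. The suffix above looks like a version-1 (time-based) UUID, which encodes a timestamp and node ID; a random token avoids that. `random_login_path` and the `/custom-admin` prefix are just illustrative names.

```python
import secrets


def random_login_path(prefix: str = "/custom-admin", nbytes: int = 16) -> str:
    """Return the prefix plus a cryptographically random hex suffix."""
    return f"{prefix}-{secrets.token_hex(nbytes)}"

# Example: random_login_path() gives "/custom-admin-" followed by 32 hex chars.
```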

weird-eye-issue · on Oct 29, 2021

There are some popular WP plugins that take care of changing the wp-login path

Arnavion · on Oct 29, 2021

Every time I read about ssh tarpits I wish I had a reason to set up one in my VPS. Alas it's much easier to use the VPS provider's network access rules to block all incoming traffic to tcp/22 that isn't from my IP.

gitgud · on Oct 29, 2021

> "And I remembered another nice trick which someone else came up with, zip bomb the bots :-)"

Just curious, is it legal to host a zip bomb on your website? I would think it would be classified under some kind of Cyber crime....

hoppyhoppy2 · on Oct 29, 2021

Legality aside, your web hosting provider may consider it as malicious software / cyberattack activity that breaks their TOS.

Piskvorrr · on Oct 29, 2021

Why would that be? It's not even executable code: someone would need to 1.actively request it, 2.actively save it somewhere 3.actively try to extract it.

krageon · on Oct 29, 2021

If the zip bomb explicitly targets bots it becomes not only a zip bomb, but a mitigation tailor-made to prevent abuse of your platform. Phrase it as the latter and it is probably okay.

pyuser583 · on Oct 29, 2021

It’s a bad idea.

zinekeller · on Oct 28, 2021

> I’m half joking, but if we can have HTTP 418 I’m a Teapot then there is enough room in the HTTP standard for the more useful HTTP 419 Never Gonna Give You Up error code.

Actually, there was a proposal to formally remove the 418 code, but in the end it was grandfathered in. Unfortunately, unless you convince a lot of people to allow 419, it would no longer be allowed (even in an April Fools' RFC) under the established protocol by which IANA controls the allocation of error codes. IANA no longer allows "joke" allocations unless an RFC clarifies why that particular code must exist in a non-joking manner (see 451, which is a homage to Fahrenheit 451 but is also the recommended code for an informed block). Even 418 was technically only reserved, in a way that allows it to be overridden if there is a good demonstration that 418 should be the code for some other error.

bigiain · on Oct 29, 2021

  HTTP/1.1 527 Railgun Error
  Server: Ballistic Research Laboratory - CHECMATE
  Date: Fri Oct 29 02:08:03 2021
  Connection: Keep-Alive-overridden
  Authorization: Rules-Of-Engagement-090624-2021-10-29
  Content-Type: uranium/depleted
  Content-Weight: 248kT equivalent

zinekeller · on Oct 29, 2021

And? So is this the C********* equivalent of C***** forcing web standards without even consulting others?

bigiain · on Oct 29, 2021

No. This is the HN execution of a lame joke on my part...

lofties · on Oct 29, 2021

For what it's worth, I loved the joke.

chrismorgan · on Oct 28, 2021

The thing that really disappoints me is that 418 I'm a Teapot isn’t registered—instead it’s reserved as “(Unused)”: https://www.iana.org/assignments/http-status-codes/http-stat..., https://www.ietf.org/archive/id/draft-ietf-httpbis-semantics.... As it stands, I suspect (as one that’s been involved in a couple and examined more back in 2013–2014) that most even vaguely recent HTTP libraries that have some kind of status code enum or constants defined take their data from the HTTP status codes registry, with a single exception for 418 I'm a Teapot.

As far as a 419 is concerned, I’d argue that 418 is already suitable anyway as a joke alternative to the more serious 429/503: “wp-admin.php? I’m not WordPress, I’m a teapot!” (Similar style to the joke about one cow warning another about the mad cow disease in the area, and the other responding that it’s not worried because it’s a helicopter.)

nikeee · on Oct 29, 2021

418 is defined in HTCPCP, an _extension_ to HTTP. That's why I never understood why people use it in HTTP. So it makes sense that it's only reserved in HTTP.

chrismorgan · on Oct 29, 2021

WebDAV is also an extension to HTTP. Its additional status codes and methods are added to the corresponding registries.

The entire raison d’être of such registries is to include any extensions; if you only cared about the core stuff, you wouldn’t define a registry.

anonymousiam · on Oct 29, 2021

Another good reason to have a 419 response code:

https://www.419eater.com/

https://en.wikipedia.org/wiki/419eater.com

bradgessler · on Oct 29, 2021

One day I will make an IoT teapot. It will have an HTTP API that responds with a 418 and legitimize the code once and for all.

unanswered · on Oct 28, 2021

IANA controls http codes only insofar as no one has told them to knock it off yet. There's no major interop risk from conflicting (200, 400, 500) codes in the way there is for other namespaces because the semantics are essentially contained only in the first digit.

spc476 · on Oct 29, 2021

I use 418 on my gopher server [1] to inform misinformed webbots that they're not talking to an actual webserver. It works remarkably well.

[1] gopher.conman.org
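A guess at what this could look like in general terms, not the actual code behind the server above: a gopher server reads one selector line per connection, so it can check whether that line is really an HTTP request line and answer with a 418 instead. `looks_like_http`, `reply_for` and `TEAPOT_RESPONSE` are hypothetical names.

```python
# Hypothetical helper names; not taken from any real gopher server.
HTTP_METHODS = (b"GET", b"HEAD", b"POST", b"PUT", b"DELETE",
                b"OPTIONS", b"PATCH", b"CONNECT", b"TRACE")

TEAPOT_RESPONSE = (
    b"HTTP/1.1 418 I'm a teapot\r\n"
    b"Content-Type: text/plain\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"This is a gopher server, not a web server.\r\n"
)


def looks_like_http(first_line: bytes) -> bool:
    """True if the line has the shape 'METHOD target HTTP/x.y'."""
    parts = first_line.split()
    return (len(parts) == 3
            and parts[0] in HTTP_METHODS
            and parts[2].startswith(b"HTTP/"))


def reply_for(first_line: bytes):
    """Return the 418 response for HTTP clients, or None for real gopher selectors."""
    return TEAPOT_RESPONSE if looks_like_http(first_line) else None
```

A real gopher selector (an empty line or something like `/phlog`) falls through to normal handling.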

Jaruzel · on Oct 29, 2021

Added to my gopher bookmarks[1] - is floodgap aware of you? I've not seen you on the lists of live servers.

---

[1] Shameless plug: http://jaruzel.com/gopher/gopher-client-browser-for-windows

spc476 · on Oct 29, 2021

Yes they are. I'm on the list of sites under ".com, .net, .info, .land and .org" (#22).

Jaruzel · on Oct 30, 2021

I realised they would be after I posted my comment - I recognised your username from the Gopher mailing list. :)

pyuser583 · on Oct 31, 2021

What are some good gopher resources?

Slade1 · on Oct 28, 2021

If you're aware that someone is doing penetration tests on your system, but their probing isn't significantly costing you resources, wouldn't you instead just give some generic response to not clue them into you knowing their intention? There's a lot of people who basically do that with scam callers by just leading them on and wasting the scammers' time.

LinuxBender · on Oct 28, 2021

I used to do something along this line. If I saw a bot then I would use ACLs in haproxy to serve up some static pages from memory that contained strings their request was looking for. This of course attracted more bots. It didn't cost me anything aside from making my logs a bit more noisy, so I disabled logging for the bots. Then I found a funny side effect of shodan showing my nodes being vulnerable to many things. That was a blemish so I disabled the ACLs. In hindsight, and knowing how bot farms work, it wasn't really wasting anyone's time or resources but was a fun little learning exercise.

verdverm · on Oct 28, 2021

I wonder if zip bomb like responses will still work for the majority of bots

https://blog.haschek.at/post/f2fda

LinuxBender · on Nov 4, 2021

Maybe sometimes, but you would just be the reason some random person said "Dammit my machine blue screened again." or "Why is my machine using so much ram?" The C2 machines would detect this node offline and use a different one. On the plus side, maybe a percentage of those people would re-image their machines and patch them.

hyperman1 · on Oct 28, 2021

Send them redirects to a Russian governmental site. They'll take care of it

arthurcolle · on Oct 28, 2021

This could be seen as abuse by the .ru and .su folks

WesolyKubeczek · on Oct 29, 2021

Those folks have been actively abusing international laws, sponsoring cybercrime, and responding with “so what?”

That’s what. Deal with it. Build your enclosed cheburashka internets or whatever. I couldn’t care less about hurting their feelings.

Waterluvian · on Oct 28, 2021

Redirect to a honeypot as a service that utterly wastes someone’s time.

t0mas88 · on Oct 28, 2021

You could but it's extra work to build that into the application while you could use a generic off the shelf WAF / IDS type solution that just blocks them. Won't fully stop a targeted manual attack but it is enough to make bots move on to their next target. And it slows down any manual reconnaissance work.

saurik · on Oct 28, 2021

Blocking someone is still more generic than returning a specific HTTP response code specifically designed to inform the other party of your suspicion.

_lqaf · on Oct 28, 2021

I like the spirit of the idea, but messing with bots and script kiddies is best kept a highly local thing.

You don't need a standardized error code to signal to a red team, you can say "hi" in a number of different ways, depending on what they're poking at. And if everyone is doing the same thing to script kiddies, well, where's the sport in that?

dmitrijbelikov · on Oct 28, 2021

https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#Unof...

419 Page Expired (Laravel Framework) Used by the Laravel Framework when a CSRF Token is missing or expired.

throwaway81523 · on Oct 29, 2021

419 error is a Rick roll? Ridiculous. It obviously has to be a once in a lifetime opportunity from a Nigerian prince.

lliamander · on Oct 29, 2021

HTTP 420 - Enhance Your Calm, could also be useful here if you are going to be explicitly rate-limiting the client.

omgitsabird · on Oct 29, 2021

Method Failure in Spring.

"Shut The Fuck Up" in my framework.

andrethegiant · on Oct 28, 2021

If it redirects then it should be in the 3xx class

bradgessler · on Oct 29, 2021

I was hesitant on the redirect. It would probably be easier to demand the spec displays "Never Gonna Give You Up" in the appropriate requested format.

willcipriano · on Oct 28, 2021

400's are errors caused by the client, I think that fits better.

ufmace · on Oct 28, 2021

Probably shouldn't be made an official thing, but it'd be funny to do this on all the various minor manually-adminned sites out there.

zeepzeep · on Oct 29, 2021

Just return random status codes that still render html on every page to troll bots.

https://www.youtube.com/watch?v=I3pNLB3Cq24

nunez · on Oct 28, 2021

a part of me is definitely in favor of this, but another part of me wants to avoid turning http error codes into a meme

dane-pgp · on Oct 28, 2021

No, this is turning a meme into an HTTP error code. You're thinking of:

https://i.redd.it/wmwqgt9kbop41.jpg

djbusby · on Oct 28, 2021

You're thinking of

https://m.youtube.com/watch?v=dQw4w9WgXcQ

dane-pgp · on Oct 29, 2021

You almost got me. ;-) I remembered the timeless advice, though: "XcQ - link stays blue" (or in this case, "black").

DeathArrow · on Oct 29, 2021

Why not redirect the bot to fbi.gov and let them scan that?

hoppla · on Oct 29, 2021

If the requirement is that client should follow the redirect, one should not use a 4xx status code. I think “319 never gonna give you up” is more adequate

omgitsabird · on Oct 29, 2021

Laravel uses 419 as Page Expired:

https://github.com/laravel/framework/blob/a2c557a1b697c46292...

dusted · on Oct 29, 2021

Superficially a fun idea..

Side effects may include:

* Helping bot authors improve their bot so it won't be identified.

* Revealing how good you are at detecting bots.

grodes · on Oct 29, 2021

I prefer to just return a 404 if I know for sure that it is a bot to try to cheat them

ChrisMarshallNY · on Oct 29, 2021

I'll vote for that (but no one asked me). I usually use 418 for similar stuff.

bencollier49 · on Oct 29, 2021

I mean, technically, wouldn't this make bot scanning more efficient?

pull_my_finger · on Oct 29, 2021

NGINX has a very nice status 444 that silently closes the connection, which I think serves as a great way to deal with uninvited connections.

zhte415 · on Oct 29, 2021

Limit request rate and be done with it other than reviewing 429s, 404s, 401s etc?
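As a sketch of what "limit request rate" might mean in practice, here is a simple token bucket that maps to a 200 or a 429. This is only one of several common schemes; `TokenBucket`, `status_for` and the injectable `clock` parameter are illustrative names, not any particular server's API.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the logic can be tested
        self.tokens = capacity
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def status_for(bucket: TokenBucket) -> int:
    """Return 200 if the request fits the budget, otherwise 429."""
    return 200 if bucket.allow() else 429
```

In a real deployment you would keep one bucket per client IP (or per key of your choosing) and let the reverse proxy do this rather than the application.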

noduerme · on Oct 29, 2021

hell, let's go up to 10000 response codes and sell them to the highest bidding meme of the year.

oshiar53-0 · on Oct 29, 2021

Scammer prank calls, but for bots

spiderfarmer · on Oct 28, 2021

I would like to target the Brave search crawler/bot, but they’re hiding themselves like every other spambot: https://twitter.com/tinusg/status/1453862793933897729?s=21

sp332 · on Oct 28, 2021

This says that the index is created from users' own web browsing, not from a bot.

zeepzeep · on Oct 29, 2021

Wait brave indexes what you browse? How has there not been some bug yet? Can't imagine that going well...

spiderfarmer · on Oct 31, 2021

It all seems quite scammy. They claim it's opt-in, but as a Brave user I can't find the option to opt-in or opt-out.

They look at the searches you do at Google and other search engines. The search terms and the results you click on get sent their way, including metadata from the page.

The content itself is downloaded by their bot (more like a fetcher). That bot has the user agent of a regular browser, so you won't see it.

You also can't specifically block their fetcher. It only adheres to disallow *.