Defending a website with Zip bombs

colanderman · on July 6, 2017

I would think Transfer-Encoding would be a better choice than Content-Encoding. It's processed at a lower level of the stack and must be decoded – Content-Encoding is generally only decoded if the client is specifically interested in whatever's inside the payload. (Note that you don't have to specify the large Content-Length in this case as it is implied by the transfer coding.)

Also worth trying is an XML bomb [1], though that's higher up the stack.

Of course you can combine all three in one payload (since it's more likely that lower levels of the stack implement streaming processing): gzip an XML bomb followed by a gigabyte of space characters, then gzip that followed by a gigabyte of NULs, then serve it up as application/xml with both Content-Encoding and Transfer-Encoding: gzip.

(Actually now that I think of it, even though a terabyte of NULs compresses to 1 GiB [2], I bet that file is itself highly compressible, or could be made to be if it's handcrafted. You could probably serve that up easily with a few MiB file using the above technique.)

EDIT: In fact a 100 GiB version of such a payload compresses down do ~160 KiB on the wire. (No, I won't be sharing it as I'm pretty sure that such reverse-hacking is legally not much different than serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.)

[1] https://en.wikipedia.org/wiki/Billion_laughs

[2] https://superuser.com/questions/139253/what-is-the-maximum-c...

masklinn · on July 6, 2017

> EDIT: In fact a 100 GiB version of such a payload compresses down do ~160 KiB on the wire. (No, I won't be sharing it as I'm pretty sure that such reverse-hacking is legally not much different than serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.)

http://www.aerasec.de/security/advisories/decompression-bomb... has a triple-gziped 100GB file down to 6k, the double-gzipped version is 230k.

I'm trying on 1TB, but it turns out to take some time.

mirimir · on July 6, 2017

> ... I'm pretty sure that such reverse-hacking is legally not much different than serving up malware, especially since black-hat crawlers are more likely than not running on compromised devices.

Sure. But gzip bombs don't do substantial (if any) permanent damage. At most, they'd crash the system. And indeed, that might attract the attention of owners, who might then discover that their devices had been compromised.

colanderman · on July 6, 2017

You only get two gzips, not three :) (transfer and content codings)

Unless you're serving up XML which is itself zipped, such as an Open Office document. But most clients won't be looking for that.

Filligree · on July 6, 2017

One useful trick is that, for gzip, d(z(x+y)) = d(z(x) + z(y)).

So you don't need to compress the entire terabyte.

masklinn · on July 6, 2017

I'd expect that to provide a lower compression, though it may not matter given the additional followup gzips.

The compression finally finished after 3h (on an old MBP), "dd if=/dev/zero bs=1m count=1m | gzip | gzip | gzip" yields a bit under 10k (10082 bytes), and adding a 4th gzip yields a bit under 4k (4004 bytes). The 5th gzip starts increasing the size of the archive.

Filligree · on July 7, 2017

It does, though I once used that trick to create a file containing more "Hello, World" lines than there are atoms in the universe. By, hmm, quite a large factor. It probably isn't a serious concern.

It still fit on a floppy disk. :)

colanderman · on July 6, 2017

That's true for the content stream but not gzip files themselves, which do have a minimal header.

Filligree · on July 11, 2017

Which gunzip will overlook / handle correctly, so concatenating the compressed files does work.

amenghra · on July 6, 2017

If you are smart, you craft your compressed files to be infinite by making them Quines: https://alf.nu/ZipQuine

cyounkins · on July 6, 2017

It's unlikely any of the parsers would apply recursive decompression, so a ZipQuine isn't going to help exhaust resources.

vog · on July 6, 2017

Indeed. ZipQuines are not targeted against servers, but against middleware. More precisely, these try to attack certain kinds of "security scanners". Those security scanners want to unpack as much as possible, by design(!), otherwise bad code could hide behind yet another round of compressed container formats.

masklinn · on July 6, 2017

Yes scanners and AV is what's targetted by quines.

pinpeliponni · on July 6, 2017

At least none of the Java libraries seemed to fall for this when I tried.

amenghra · on July 6, 2017

True. I wonder what's the largest file you can decompress to given some bound on the compressed result.

jwilk · on July 6, 2017

Most (all?) user-agents don't set the TE header (which specifies the transfer encodings the UA is willing to accept), so I doubt they would bother to decode unsolicited "Transfer-Encoding: gzip".

mnarayan01 · on July 6, 2017

The Accept-Encoding header is used to specify the acceptable transfer encodings. Certainly all browsers set it, and most "advanced" frameworks will at least handle the common values of Transfer-Encoding regardless; a "malicious" crawler will almost certainly have to, as plenty of sites use Accept-Encoding along with User-Agent to block undesirable bots.

colanderman · on July 6, 2017

No, Accept-Encoding pairs with Content-Encoding, not Transfer-Encoding (which is controlled by TE as GP indicated). [1] [2]

[1] https://tools.ietf.org/html/rfc7231#section-5.3.4

[2] https://tools.ietf.org/html/rfc7230#section-4.3

mnarayan01 · on July 6, 2017

Bah I can't read...you two are correct.

colanderman · on July 6, 2017

Good point, I didn't realize that.

rixrax · on July 6, 2017

I wonder how these techniques play along with random IDS/IPS/deep packet inspection/WAF/AV/DLP/web cache/proxy/load balancer/etc. devices that happen to look/peak into traffic that passes through network. I would wager my $ that more than a couple will need some admin(istrative) care after running some of this stuff via them.

And btw -- when you end up accidentally crashing/DoS:ing your corporate WAF or ISPs DPI, who are they going to call?

colanderman · on July 6, 2017

I worked on an IPS a few years back. It was specifically designed NOT to inflate arbitrary ZIP files. All decoders worked in a streaming fashion and were limited to a fixed (low) nesting depth and preallocated memory. Typically they would be configured to reject traffic such as the payload we've been discussing.

geek_at · on July 6, 2017

How come when I posted this (my blog post) here I only got 2 points? https://news.ycombinator.com/item?id=14704462 :D

z3t4 · on July 6, 2017

I'm also intrigued by this, it also happens in comments, that the top post with 20+ upvotes can have the same content as the most down-voted post with -3 points, but not as common as reposts getting to the front page, while the original only got 2 points.

After submitting something to HN I like to watch the HTTP logs, I get a lot of visitors from bots, but it's actually only ca 10-20 real people that actually read your blog. I don't know eneough of statisitcs to explain it well, but as 20 people is so small amount of the total HN readers, it's basically luck. And the representation of those who reads the "new" section might be a bit skewed from those who only reads the front page. If you want to help HN get better with more interesting content, you can help by actually visiting the "new" section.

mncharity · on July 6, 2017

We know how to deal with this, and have for years. A bot which instruments and invokes humans, learning about content and individuals both. Few humans are needed each time, and those need not be experts, if used well. 20 people is much more than enough. A candy machine, in an undergraduate lounge, can grade CS 101 exams.1 Ah, but discussion support - from Usenet to reddit (and on HN too), incentives do not align with need. Decades pass, and little changes. Perhaps as ML and croudsourcing and AR mature? Civilization may someday be thought worth the candles. Someday?

1 http://represent.berkeley.edu/umati/

Edit: tl;dr: Future bot: "I have a submission. It has a topic, a shape, and other metrics. It's from a submitter, with a history. Perhaps it has comments, also with metrics, from people also with histories. I have people available, currently reading HN, who all have histories. That's a lot of data - I can do statistics. Who might best reduce my optimization function uncertainty? I choose consults, draw the submission to their attention, and ask them questions. I iterate and converge." Versus drooling bot: "Uhh, a down vote click. Might be an expert, might be eternal September... duh, don't know, don't care. points--. Duh, done."

mncharity · on July 6, 2017

Hmm, two downvotes. My first. So comments would be especially welcome.

Context: The parent observed that with a small number of people up/down voting, the result was noisy. I observed the numbers were sufficient, if the system used more of the information available to it. And that the failure to do so is a long-standing problem.

Details, in reverse order: "Civilization": Does anyone not think the state of decision and discussion support tech is a critical bottleneck in engineering, business, or governance? "AR": A principle difficulty is always integrating support tech with existing process. AR opens this up greatly. At a minimum, think slack bots for in-person conversations. "crowdsourcing": or human computation, or social computing, is creating hybrid human-computer processes, where the computer system better understands the domain, the humans involved, and better utilizes the humans, than does a traditional systems. "ML": a common way to better understand a domain. As ML, human computation, textual analysis, etc, all mature, the cost and barrier to utilizing them in creating better discussion systems declines. "Usenet": Usenet discussion support tooling plateaued years before it declined. Years of having a problem, and it not being addressed. Of "did person X ever finishing their research/implementation of tool Y". "decades": that was mid-1990's, two decades ago. "little changes": Discussion support systems remains an "active" area of research - for a very low-activity and low-quality value of "active". I'm unclear on what else could be controversial here.

For anyone who hasn't read the paper, it's fun - it was a best paper at SIGCHI that year, and the web page has a video. A key idea is that redundant use of a small number of less-skilled humans (undergraduates grading exam questions), can if intelligently combined, give performance comparable to an expert human (graduate student grader). Similar results have since been shown in other domains, such as combining "is this relevant research?" judgments from cheaper less-specialized doctors with more-expensive specialized ones. On HN, it's not possible to have a fulltime staff of highly skilled editors. But it is technically plausible to use the exiting human participants to achieve a similar effect. That we, as a field, are not trying very hard, reflects on incentives.

isolli · on July 6, 2017

Maybe the "new" section should be the default one.

The Economist once changed their default comments view from "most recommended" to "newest". Suddenly the advantage of being the first to post a moderately good comment vanished. Design matters.

peterburkimsher · on July 6, 2017

You can try writing "Show HN" at the start of your title, but that hasn't even helped me too much. I still get more upvotes/karma from writing witty comments instead of publishing code.

em3rgent0rdr · on July 6, 2017

a bit of a witty comment there...upvoted. :)

pwm · on July 6, 2017

My pet theory is that very few people visit /newest and even less are consistently and persistently active over there. This means that, in general, only a handful of people's inherently subjective taste steers the ship. I find this fascinating.

bfung · on July 6, 2017

Lessons learned:

    1. Stop posting to HN and let others do it for you
    2. Spend more effort in writing blog and less on HN karma
    3. ... (monetize)
    4. Profit!!! (+ more blog posts about how you monetized and instantly gain more HN karma than posting will ever do)

</saracasm>

=) upvoted for original content.

Just1689 · on July 6, 2017

Upload time of day is also important

vacri · on July 6, 2017

Also if it doesn't make it to the front page, it's hardly seen at all.

vog · on July 6, 2017

In that respect, news aggregators are remarkably similar to growth-based startups.

mirimir · on July 6, 2017

I've read that all posts make it to the front page. But they don't stay there very long, unless they get enough upvotes quickly enough. But if they get too many upvotes too quickly, with few comments, they get kicked off the front page.

dTal · on July 6, 2017

I've read the converse, that a comment without a vote harms a story's ranking.

I'm certain there's something (unintentionally) perverse in the ranking system, but as it's secret it's impossible to say for sure.

Secret rules that manipulate our behaviour are damaging.

mirimir · on July 6, 2017

I may have misremembered. Whichever it was, I believe that the intent was to discourage low-quality discussion.

_0ffh · on July 6, 2017

Yup, I think an article needs to get over a certain threshold before it even stands a chance to collect lots of points. Having a couple of early upvotes helps an article get over that first hump.

gpvos · on July 6, 2017

Bad luck. There aren't very many people looking at the new posts page, and posts drop off quite fast.

ridgewell · on July 6, 2017

Sorry! Didn't realize you already posted this before.

amenghra · on July 6, 2017

Meritocracy /s

Scryptonite · on July 6, 2017

Reminds me of a time I once wrote a script in Node to send an endless stream of bytes at a slow & steady pace to bots that were scanning for vulnerable endpoints. It would cause them to hang, preventing them from continuing on to their next scanning job, some remaining connected for as long as weeks.

I presume the ones that gave out sooner were manually stopped by whoever maintains them or they hit some sort of memory limit. Good times.

reitanqild · on July 6, 2017

I do the same to "Microsoft representatives" that call me because I have "lots of malware on my computer".

Keep them on line by being a very dumb customer until they start cursing and hang up on me. : - )

Gaelan · on July 6, 2017

1-347-514-7296 is a phone number that automates this. Add it to a conference call, and frustrate the caller with no additional work. http://Reddit.com/r/itslenny is the closest thing it has to an official site.

ConfucianNardin · on July 6, 2017

There's also this person https://www.youtube.com/watch?v=EzedMdx6QG4, counter call-flooding the scammers' phone lines.

porjo · on July 6, 2017

Thanks, I hadn't heard of Lenny before. Jolly Roger is similar thing - creator gave a Ted talk: https://www.youtube.com/watch?v=UXVJ4JQ3SUw

throwanem · on July 6, 2017

It just rang forever when I tried it...

Gaelan · on July 6, 2017

Ah, apparently you need to be whitelisted, my bad. There’s more information in the Reddit link.

EDIT: Nope, apparently the person who runs it takes it down at night for some reason. Maybe to minimize people using it as a prank call?

cdubzzz · on July 6, 2017

You do have to get whitelisted though, last I checked. I was able to use it one time and now if I call it I get a message about whitelisting. Same for JollyRoger.

throwanem · on July 6, 2017

Doesn't that burn up a lot of time? I just cut straight to the hanging-up part, with optional cursing to taste.

alkonaut · on July 6, 2017

Yes, it burns their time. I just put them on speaker phone and continue working as usual on my computer, occasionally uttering a "yes" or "what's that?" when they tell me for the hundredth time to go to some phishing site. It's astonishing how long it takes for them to give up.

When working from home this is one of the few joys/social interactions if the day :D

morrbo · on July 6, 2017

This is actually a new take on a fairly well known security/DoS attack called the "slow post" attack (https://blog.qualys.com/securitylabs/2012/01/05/slow-read)

pretty novel to see it used the other way around though!

ruytlm · on July 6, 2017

Interesting and related re attacks on a Tor hidden service: http://www.hackerfactor.com/blog/index.php?/archives/762-Att...

And the follow up: http://www.hackerfactor.com/blog/index.php?/archives/763-The...

vgb2k11 · on July 6, 2017

Very interesting read indeed. I've a question about it; the article is about defeating malicious crawlers/bots affecting a TOR hidden service, so my question is, how might the author differentiate bot requests from standard client requests on a request-by-request basis? I mean, can I assume that many kinds of requests arrive at hidden service through shared/common relays? Would this mean other fingerprinting methods (user agent etc) would be important, and if so, what options remain for the author if the attackers dynamically change/randomise their fingerprint on a per-request basis?

Laforet · on July 6, 2017

Very entertaining read, thank you for sharing these. I used to dismiss black ICE [0] as cyberpunk tropes and it's good to be proven ignorant.

[0]:https://en.wikipedia.org/wiki/Intrusion_Countermeasures_Elec...

compguy · on July 6, 2017

Wait a minute... He is doing the exact same thing as the former RaaS (ransomware as a service) operator Jeiphoos (he operated Encryptor RaaS). It's know that Jeiphoos is from Austria. Exactly one year after the shutdown of the service, someone from Austria is publishing the exactly same thing an Austrian ransomware operator were doing a year ago.

jagermo · on July 6, 2017

Aha! the Hackernews detectives are on the case!

avaer · on July 6, 2017

Does anyone know if this kind of white hat stuff has been tested by law?

Because it seems in the realm of possibility that if a large botnet hits you and your responses crash a bunch of computers you could do serious time for trying it. I'm hoping there's precedent against this...

ajarmst · on July 6, 2017

He's got a pretty good defence in that all he's really doing is filtering requests and serving up a really large file to some of them. No active agency, and no executable code. If merely loading a large file crashes a computer, that's arguably the fault of the browser and/or OS.

shakna · on July 6, 2017

Intent really matters, especially in cases like these. He's serving up files deliberately, knowing they will likely cause problems.

Microsoft doesn't take the fall for malware, even if its a fault in SMB or the like.

The intent is damage.

ajarmst · on July 6, 2017

Probably his best defence is the fact that it's really unlikely that the attackers would ever swear a complaint or testify. Kind of a "robbing drug dealers problem". I'd be more worried about being targeted by a massive DDOS.

colechristensen · on July 6, 2017

What if their intent is to find things to sue for?

hobs · on July 6, 2017

They have a term for that, its "vexatious litigant." If you do this enough, the court generally makes it hard to get counsel, will make you get your lawsuits approved by a judge ahead of time, and more.

0xfeba · on July 6, 2017

eg. Jack Thompson

jfoutz · on July 6, 2017

They're probably already committing a felony accessing the computer. The scan is an intent to transmit malware. If that's true, you could make a pretty good fleeing felon argument.

shakna · on July 6, 2017

> They're probably already committing a felony accessing the computer.

He bases this attack on IP addresses. IPv4 addresses are regularly shared between consumers. He's tossing a knife into a crowd because he thought he saw someone.

> you could make a pretty good fleeing felon argument.

In a nation that allows you to attack, not just restrain, a fleeing felon.

But his attack may hit a nation that doesn't allow that.

nostoc · on July 6, 2017

He does not base the attack on IP address. He detects vulnerability scanner and send them the crafted content.

You ask for something a vulnerability scanner would ask for? You get a gzip bomb.

shakna · on July 6, 2017

> Awesome! My production implementation of the bomb also looks at 404's and 403's per IP and if there are too many of those it will send the bomb. [0]

[0] https://www.reddit.com/r/PHP/comments/6lfl6p/i_have_created_...

mcv · on July 6, 2017

But he's serving those files only to people looking to cause problems. It's self-defense.

shakna · on July 6, 2017

Self-defence is not normally an acceptable reason where technology and law collide.

Let's be frank.

He's serving up malware to potential users who hit too many 404s.

> Awesome! My production implementation of the bomb also looks at 404's and 403's per IP and if there are too many of those it will send the bomb. [0]

This could be exploited by a third party, which makes him complicit.

He targets IP addresses, and as the IPv4 world often shares those, he can attack innocent bystanders who happen to be in the same allocation as a miscreant.

Finally, self-defence is established as denial or dropped connections. As he's intentionally avoided established practice, and developed an attack instead, it becomes undue harm.

Let alone if he attacks someone in a nation that has an extradition treaty, but no concept of this sort of "fighting back".

[0] https://www.reddit.com/r/PHP/comments/6lfl6p/i_have_created_...

scaryclam · on July 6, 2017

In a perfect world, that's what he's doing. In reality, he's potentially being a big jerk to legitimate users and giving a tool that can allow malicious people to send victims his way. It'd be self defence to cut the connection, not to send harmful files.

ajarmst · on July 6, 2017

That's a good point, although I think the innocuosness of the action would be at least a mitigating factor. I wouldn't expect MS to take any blame, but the "damage" being due to faults in the OS or browser would also be mitigating---a minor rearend collision on a Ford Pinto could cause it to explode because of a design flaw, but the driver of the other car wouldn't be charged with arson. (Afterthought: he might be if he rammed it deliberately, so I guess that supports your thesis rather than mine)

vbezhenar · on July 6, 2017

There are laws allowing person to shoot intruder in their house. And I can't serve nulls from my own web server? That would be ridiculous.

test1235 · on July 6, 2017

From what I've read, in some parts of America it seems okay to shoot at intruders running away from your house, which I find unreasonable.

A farmer here in UK stirred up a whole load of shit when he shot two burglars [1] trying to escape from his property.

[1] https://en.wikipedia.org/wiki/Tony_Martin_(farmer)

rmc · on July 6, 2017

The UK (or English?) law about self defence is "back to the wall", i.e. you can invoke leathal force to defend your own life when your back is against the wall, when you have no other option, and no way to escape. In other words, if you can retreat from the situation, then you must retreat.

Some places in the USA have "stand your ground" laws. These say you aren't required to retreat, that you can "stand your ground", that you can use (legally) leathal force without requiring that your back is against the wall.

rocqua · on July 6, 2017

As I recall, stand your ground laws, based on castle doctrine, means that "but you could have fled your own home" does not invalidate self-defence. I think you are still required to retreat when on the street.

As for people running away, the only way I see self defence working is when they still pose an 'imminent threat to life' which seems rather hard to argue.

andrewla · on July 6, 2017

The castle doctrine is distinct from "standing your ground", though it is to some extent subsumed because most stand-your-ground laws say that you have no duty to retreat from a place that you have a legal right to be, which naturally includes your home.

Florida [1], for example, says:

> ... A person who uses or threatens to use deadly force in accordance with this subsection does not have a duty to retreat and has the right to stand his or her ground if the person using or threatening to use the deadly force is not engaged in a criminal activity and is in a place where he or she has a right to be.

In section 0776.013, the castle doctrine is also noted, but is more expansive, and includes the use of deadly force even if there is no threat of imminent harm.

[1] http://www.leg.state.fl.us/statutes/index.cfm?App_mode=Displ...

test1235 · on July 6, 2017

Your last line is the bit I've never been able to understand. If someone is running from you, do you have any legal argument for killing them?

rocqua · on July 7, 2017

Never say never, but it would be very rare for a fleeing person to pose immediate threat. Examples would be people running for a gun, running to get help, running to kill someone else, or running for cover.

I think all of those cases are covered by any imminent threat clause, and thus do not need special exemptions. Just like there isn't an exemption that you are not allowed to shoot a retreating person. It simply follows because (with exceptions) retreating people aren't imminent threats.

Filligree · on July 6, 2017

It gets much harder to argue, but you'd have a case if they were, for example, running back to their car to get weapons. I'm sure you can think of a hundred other scenarios as well.

That isn't normal, though. It's likely that you were already feuding, and so the law will look askance at you for not bringing authorities into it much earlier.

suneilp · on July 6, 2017

What if they have stolen your property. Do you not have the right to get it back by force? Does the value of the property matter? If so, who gets to decide that in the moment?

icefo · on July 6, 2017

If you live in Texas you can use deadly force if "the land or property cannot be protected or recovered by any other means". Source: http://codes.findlaw.com/tx/penal-code/penal-sect-9-42.html

I've read but couldn't find again the story of someone shooting a tief to get back his VHS player and walk free.

rocqua · on July 7, 2017

The value of the property does not matter. They do not pose a lethal threat and therefor cannot be shot. At least, that's how it should be, I don't know legally.

chillydawg · on July 6, 2017

You have zero right to kill someone for stealing.

technofiend · on July 6, 2017

In Texas you may make use of your weapon to stop the execution of a crime if you yourself are not also engaged in criminal activity. It's far larger than castle doctrine because it applies anywhere.

I'm not arguing for actually using the law to shoot people: I don't ever want to be in that situation myself, but I'm saying depending on the situation you do in fact have the law on your side.

dec0dedab0de · on July 6, 2017

what if they stole your laptop that contains a new branch you havent pushed to a remote yet.

technofiend · on July 6, 2017

No jury would convict.

andrewla · on July 6, 2017

UK common law as it pertains to "duty to retreat" and "self-defence", is largely a question for the jury. There is no fixed legal standard other than whether the actions were reasonable given the person's knowledge of the situation at the time.

The US tends to be a little more prescriptive, leaving a situation where different jurisdictions have more specific requirements for defining what constitutes self-defense.

Juries in the UK tend to have significantly more responsibility for making judgments like these, leading to a system where evolving views of what is right and wrong can result in standards naturally evolving over time, rather than being fixed by what people thought was okay thirty years ago.

kodt · on July 6, 2017

Yes, in Texas you can use lethal force to prevent a burglary, robbery or theft (at night) and can also use lethal force on someone fleeing with stolen property in order to recover it.

rmc · on July 6, 2017

Most of those laws are self defence laws. The US & the UK have slight differences, but you're often allowed to use leathal force to prevent yourself being killed.

vacri · on July 6, 2017

You're allowed to use equal force in the UK, as I understand it, which means if someone attacks you with fists, you can't shoot them in return. If you're in danger of being killed, then you'd be able to use lethal force.

blunte · on July 6, 2017

Do you have an NRA?

ianai · on July 6, 2017

A better outcome for an infected machine is complete failure than silent intrusion. The person then definitely knows something is wrong, AV software or not.

microcolonel · on July 6, 2017

I don't think there's a law against serving obscenely large files on the web, at least nowhere except Germany.

komali2 · on July 6, 2017

>I don't think there's a law against...

Connecting to a server...( A lot)

Putting random strings into forms...( A lot)

Moving your money between banks... (In different countries)

Buying stocks... (With insider knowledge)

A simple act doesn't spell the whole story, and fraud, computer crime, etc laws are written vaguely enough for a country to prosecute someone " sending large files."

MichaelRenor · on July 6, 2017

This is why web crawlers are built with upper boundaries on everything!

Nobody malicious brings down crawlers. It's just unexpected things you find out on the internet.

jacquesm · on July 6, 2017

> Nobody malicious brings down crawlers.

You're wrong about that. I've more than once brought down crawlers on purpose, especially the ones that didn't respect robots.txt.

eyuelt · on July 6, 2017

The article says that 42.zip compresses 4.5 petabytes down to 42 bytes. It should say 42 kilobytes.

I don't see a way to comment on the article itself, but hopefully the author reads this.

ben174 · on July 6, 2017

Thank you. I was going crazy trying to think of what the contents of that 42 bytes would have been.

rootlocus · on July 6, 2017

Without any headers, metadata or padding and using RLE one byte for 0 and 8 bytes for the number of zeroes: 10^15 will easily fit 9 bytes and can be used to generate a file filled with one petabyte of zeroes.

gberger · on July 6, 2017

Well, yes, but that's not ZIP encoding.

geek_at · on July 6, 2017

thanks I fixed it

philjohn · on July 6, 2017

Another small point, you don't need that starts with function, just use strpos === 0.

jacquesm · on July 6, 2017

It's actually good practice to wrap a bit of code like that in a function with a name, as long as it isn't in some kind of extremely frequently executed inner loop.

'starts_with' is descriptive and language agnostic where '=== 0' is neither.

TekMol · on July 6, 2017

I don't think this "Defends" your website. If anything, it draws attention to it.

Might also be used for some kind of reflection attack. Want to kill some service that let's users provide a url (for an avatar image or something) - point it to your zip bomber.

_pctq · on July 6, 2017

To be fair, people wanting to do that don't need author to have create a zip bomber, they can do it by themselves.

Actually, I don't see how to defend this. Is there any way to ask a gzip file which size it will be once unzipped, without needing to decompress it?

halomru · on July 6, 2017

>Is there any way to ask a gzip file which size it will be once unzipped, without needing to decompress it?

The closest is uncompressing it and counting and immediately discarding the bytes in the output stream.

But of course the proper defense is to give up if you exceed a predefined memory or time budget.

Qantourisc · on July 6, 2017

Yes, but your decompression middle ware might need an update/change: When you ask for a decompress, you specify the max size (if you are asking it to decompress everything).

danesparza · on July 6, 2017

I think this is exactly what the HTTP 'HEAD' verb is for: https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/HE...

_pctq · on July 6, 2017

Wouldn't a HEAD give the size of the zipped content, and not the size of the content once decompressed?

danesparza · on July 6, 2017

This is a great question. I'm actually not sure, since gzip encoding/decoding is built-in to several webservers and browsers.

c3833174 · on July 6, 2017

Decompress it in chunks and stop when a preset limit is reached.

jacquesm · on July 6, 2017

A friend of mine has a very useful little service that tracks attempts to breach servers from all over the world:

https://www.blockedservers.com/

It's a lot more effective to kill the connection rather than to start sending data if you're faced with a large number of attempts.

leipert · on July 6, 2017

Interesting page :) I'll write your friend an email with improvements for the site. The text shadow in the code blocks make them barely readable and the map color coding is bad for color blind people.

jacquesm · on July 6, 2017

I'm sure he will appreciate that. He's sysadmin at a Dutch IPSP and this is his side project.

cypherpunks01 · on July 6, 2017

This is like the soft equivalent of leaving a USBKill device in your backpack, to punish anyone who successfully steals it and tries to comb through your data.

Pitarou · on July 6, 2017

Way to make friends with the TSA. ;-)

rocqua · on July 6, 2017

Is this actually illegal? I'm sure they'd arrest you, but could they actually charge you?

I guess if you see them try to use the USB killer, you'd be obligated to report it. Otherwise I don't think its an issue.

Pitarou · on July 6, 2017

Regardless of what the law says, in practice, breaking TSA equipment, interfering with TSA duties and, above all else, pissing off a TSA officer are all arrestable offences.

If there’s any question about culpability, all they have to do is ask you, “Is there anything in your baggage you think we should know about?” and if you don’t disclose it then, you’re screwed.

What? You say you were never asked such a question? You say you even tried to warn them? Well I have sworn testimony from a TSA officer that says they ALWAYS ask that question, and you’re the guy who was caught carrying a piece of equipment designed for trickery and vandalism. Case dismissed.

danesparza · on July 6, 2017

Except that "I thought it would be fine" is a pretty reasonable answer at that point.

bberenberg · on July 6, 2017

I think the natural next step is to make this into a Wordpress plugin.

hirsin · on July 6, 2017

This would be an entertaining way of dealing with MITM agents as well, over HTTP. As long as the client knows not to open the request, you could trade them back and forth with the MITM spy wasting tons of overhead.

steego · on July 6, 2017

It would be an interesting way of streaming data if both sides used a custom decompression algorithm that skipped n bytes without allocating it anywhere.

The payload could be encrypted text of two chat bots talking jibberish.

mirimir · on July 6, 2017

Now that's very interesting. Maybe hack a custom ssh with this feature. Adversaries that intercepted data or attempted MitM would be inconvenienced.

Edit: Or even more useful, bbcp. Which is the best file transfer app that I've ever used.

danesparza · on July 6, 2017

This is similar to kippo or cowrie, SSH honeypots:

https://github.com/desaster/kippo

https://github.com/micheloosterhof/cowrie

mirimir · on July 6, 2017

I meant more than a honeypot. But rather, a functional SSH app that messes with standard SSH apps and libraries.

shabble · on July 6, 2017

Sounds a bit like a simplified "Chaffing and Winnowing"[1], where the chaff identification is pre-shared through your custom compression parameters.

There was a HN story[2] on Chaffinch[2], which is where I came across teh idea.

[1] https://en.wikipedia.org/wiki/Chaffing_and_winnowing [2] https://news.ycombinator.com/item?id=14408757 [3] https://www.cl.cam.ac.uk/~rnc1/Chaffinch.html#Chaffing

petre · on July 6, 2017

Another method is wasting attackers' time by sending out a character per second or so. It works so well for spam, that OpenBSD includes such a spamd honeypot.

SXX · on July 6, 2017

Sending out character per second would mean that you'll keep connection open for long time and even if your server is behind CDN this would eventually let attacker exhaust your resources.

dredmorbius · on July 6, 2017

Think this through.

If a large number of hosts treats some behaviour as deserving a slow-service attack, then clients exhibiting that behaviour are faced with a large set of slow-serving servers.

Any given server can monitor how many slow-service attacks it is currently providing. Given that a criterion for an SSA is having already determined that the connection is not a friendly one, then monitoring useful vs. useless (e.g., SSA) connections, and being prepared to terminate (or better: simply abandon) the SSA connections as normal traffic ramps up, is a net benefit.

Meantime, the hostile clients are faced by a pervasive wall of mud, slowing their access.

SXX · on July 6, 2017

If this monitoring and priorities already implemented within software that handle SSA that's okay, but using some custom untested trickery is simply dangerous.

This is why I want to point out that simply serving content with delay and increasing number of active connections create additional attack vector that more dangerous than script kiddies scanner was in first place.

catdog · on July 6, 2017

An open connection with no communication going on does not take away a lot of resources, it's just some values in some table maintained by the TCP stack. If you implement that slow down service event based so it can handle a lot of concurrent connections it should not take away much resources either. In the end you can always limit the amount of connections you treat that way to a value your system can easily bear.

DamonHD · on July 6, 2017

This was exactly what I did for a while, and I was able to tie up tens or hundreds of SPAMmer connections without hurting myself, on quite a small mail server.

hossbeast · on July 6, 2017

I would love to know how to configure this for ssh connection attempts

tyingq · on July 6, 2017

Ssh does support compression, but it seems to be only if the client requests it (ssh -C).

You could, though, write a pam module to trickle data out very slowly. Maybe pam_python would be easier to experiment with.

I use pam_shield to just null route ssh connections with X failed login attempts. There's no retaliation in that approach, but it does stop the brute forcing.

yorwba · on July 6, 2017

If you are being adventurous, I guess you can just let them log in for a special user that has the shell set to a program that sends single characters very slowly. It is probably quite insecure, though.

est · on July 6, 2017

I have something similar on my VPS, edit /etc/issues.net to this

    Permission denied. Please try again.

dredmorbius · on July 6, 2017

fail2ban, in a general sense.

pmlnr · on July 6, 2017

No, this is a tarpit. Fail2ban simply rejects or drops vua iptables.

dredmorbius · on July 6, 2017

https://gist.github.com/Belphemur/82d27b1b6dfd675d15f2

Tarpit Action for Fail2ban with rate limit

banku_brougham · on July 6, 2017

We need some legal advice in this thread.

What if the compressed file is plausibly valid content? How could intent be malicious if a request is served with actual content?

eksemplar · on July 6, 2017

In this day and age, finding a vulnerability in a system like a mistakenly open API and running a script to call it a few times to investigate the weakness is considered hacking.

It probably shouldn't be, but law is funny that way.

Intentionally sending a zip bomb could potentially get you in trouble as well. Especially if you're just one private person or a small company without a legal division to brush it off.

There isn't a real black/white interpretation though, at least not outside the US (where there may be history to influence ruling on the subject), and obviously most victims wouldn't report you, but more often than not you wouldn't want to test interpretation of IT related law.

ajarmst · on July 6, 2017

Reminds me a bit of Upside-Down-Ternet: http://www.ex-parrot.com/pete/upside-down-ternet.html

em3rgent0rdr · on July 6, 2017

Defending by throwing things back at the attacker, instead of simply locking your door.

ajarmst · on July 6, 2017

This is more that the thief is parked at your front door permanently trying to pick the lock, so you replace the valuables he's looking for with big chunks of lead.

bulatb · on July 6, 2017

More like hungry bears.

leggomylibro · on July 6, 2017

This might actually work well with fail2ban integration. Every time you block a connection, you also respond to the final request with a big ol' file.

Pitarou · on July 6, 2017

No, he’s doing both.

Theizestooke · on July 6, 2017

A great way... to provoke a war with people running botnets.

ianai · on July 6, 2017

This could also be seen as a bug on the browser side. I'd also be interested in the browser results for the petabyte version.

I wonder if there's room to do this with other protocols? Ultimately we want to crash whatever tool the scriptkiddy uses.

tyingq · on July 6, 2017

I thought of http2's hpack. It does have built in protection though...the client sets a maximum header table size. Which encourages client implementations to think about it.

ioquatix · on July 6, 2017

About a month ago one of my websites was being scraped. They were grabbing JSON data from a mapping system.

I replaced it with a GZIP bomb. It was very satisfying to watch the requests start slowing down, and eventually stop.

DamonHD · on July 6, 2017

Interesting!

That also crossed with another thought about pre-compressing (real!) content so that Apache can serve it gzipped entirely statically with sendfile() rather than using mod_deflate on the fly, so unless I've misunderstood I think that bot defences can be served entirely statically to minimise CPU demand. I don't mind a non-checked-in gzip -v9 file of a few MB sitting there waiting...

http://www.earth.org.uk/note-on-site-technicals.html

merricksb · on July 6, 2017