The Trouble with CloudFlare (torproject.org)
620 points by tshtf on April 1, 2016 | 351 comments



Tor has acknowledged their "botnet problem" since at least 2013:

https://research.torproject.org/techreports/botnet-tr-2013-1...

That same paper walks through the challenges of dealing with it and doesn't find any satisfactory solutions.

As I wrote in our post on the topic, there's a trade off between security, anonymity, and convenience. CloudFlare provides security to our customers. We believe in the importance of anonymously accessing the Internet. Unfortunately, that means we have to sacrifice some convenience. If you haven't read it, I encourage you to see the post I wrote on the topic:

https://blog.cloudflare.com/the-trouble-with-tor/

The two long-term solutions we proposed — blinded tokens or CloudFlare supporting .onion addresses — we believe could reduce the inconvenience, but they'll require help from the Tor developers. While public posts like this are discouraging in terms of coming up with a better solution, I'm encouraged by private conversations we've had with Tor developers who acknowledge this is a hard problem and want to find solutions.


The post from the Tor project does not claim anywhere that Tor is never used by botnets or for malicious purposes. They specifically question your assertion that 94% of Tor traffic is malicious. I'm not surprised, it's quite a statement and it calls for some supporting evidence.

Surely you can see how, given the amount of outreach they do to educate regular people about the positive uses of Tor, putting forth unfounded statements like that might be perceived negatively.

I am glad that CloudFlare has put effort into this problem and as a Tor user I appreciate it (though obviously, this problem goes far beyond Tor, as mentioned in the article - your systems will have the exact same problems with large-scale IPv4 NAT, so really it's not optional for a CDN provider).

But please, stick to facts when presenting the case.


The calculation must be based on their internal data. How exactly are they supposed to show the supporting evidence without compromising their users' privacy?

You seem to be engaged in goalpost shifting. CloudFlare have no incentive to make this figure up. They aren't going to give random people on the internet root access to their servers to recalculate the figure themselves. By claiming they aren't "sticking to the facts" all you do is show a closed mind.

Tor proxies tons of bad stuff. Everyone who has run a big web site knows this. Remember you only need one or two bad guys with fast enough tools to generate a flood of malicious traffic that completely overwhelms thousands of legit web browsing users. It's just so trivial for a minority of bad actors to end up dominating traffic profiles. So, I believe Cloudflare.
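To make the arithmetic behind that point concrete, here is a minimal sketch. Every number in it is made up for illustration (this is not CloudFlare's or Tor's data), and the function name is hypothetical. It only shows how a couple of heavy senders can dominate the request count while staying a tiny share of the users.

```python
def malicious_shares(legit_users, reqs_per_legit, attackers, reqs_per_attacker):
    """Return (malicious share of requests, malicious share of users)."""
    legit_requests = legit_users * reqs_per_legit
    bad_requests = attackers * reqs_per_attacker
    request_share = bad_requests / (legit_requests + bad_requests)
    user_share = attackers / (legit_users + attackers)
    return request_share, user_share


# Hypothetical inputs: 1000 people browsing normally, 2 scripted attackers.
request_share, user_share = malicious_shares(1000, 50, 2, 400_000)
print(f"malicious share of requests: {request_share:.1%}")
print(f"malicious share of users:    {user_share:.2%}")
```

With these invented inputs about 94% of requests are malicious while only about 0.2% of users are, so a per-request figure and a per-user figure can both be true at once.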

Tor guys love to talk about journalists, whistleblowers etc. That must be a really tiny amount of their overall traffic compared to people who just want to torrent, be assholes on forums etc. Just because they love to "educate" anyone who disagrees with them doesn't mean they're right.


If the calculation is based on data, they should show it. Since they did not, the Tor project made a best-effort guess as to how they could have come up with this (obviously ridiculous) number of 94% of traffic being malicious.

I have run a Tor exit node, so I have some intuitive idea of the amount of malicious traffic. CloudFlare are full of shit.

The rest of your comment is engaging in the exact same goalpost shifting you accuse me of, suggesting that because there are one or two bad guys it's really no big issue that they block thousands of legitimate users.

Also you completely ignored the most important point of my comment, which is that this problem is not restricted to tor!

If you wish to further the conversation, please actually respond to my points. Thank you.


Yes, that botnet dig is bullshit. Some botnets use Tor for C&C. But jerks don't need botnets to have lots of Tor exits. That's what VMs are for. Botnets give you lots of residential IPs.


C&C? VMs? What?


C&C = command and control [0]

VM = virtual machine (local or remote VPS) [1]

Let's say that I have a box with a couple quad-core Xeons and 64GB RAM, and a 100Mbps uplink. I can easily run 150-200 Debian VMs, each running one or more tor processes.

[0] https://security.radware.com/ddos-knowledge-center/ddospedia...

[1] VPS = virtual private server


Ah right, didn't think about running several Tor instances in VMs. I'm sure there are better ways to run Tor from VMs though, if you know how the protocol works just run instances of some program that runs it instead of running the whole thing in a VM.


Sure. You can also run many tor processes. Or you can use light virtualization, so there's not much overhead. But the point is that it's easy to create lots of Tor instances. Much easier than building a botnet.


You still purposefully resolve spam domains on at least tim.ns.cloudflare.com and leah.ns.cloudflare.com and refuse to do anything about it.

I see a certain hypocrisy in claiming to protect your customers, and at the same time enabling criminal operations through allowing them use of your infrastructure.

(For the record and because you tell me this at every point of contact, I know your main business is reverse-proxy. I don't care. You run DNS infrastructure, you are responsible for it.)


Yep, and on top of that they have for years allowed DDoS 'booters' to stay online, with the claim of "we have no way to remove the content, we're just a reverse proxy." If they're not actually hosting it, they think it's OK for whatever it is to pass through their network.

http://www.crimeflare.com/damon.html


I've had one incident that was very similar to this (DDoS raiding forum proxied through Cloudflare staging an attack against our servers).

When I reported it, they said they had informed the attackers of my report, which is sort of like having the police tell a gang you snitched on them and could have enabled retaliation.

When I asked them if they had indeed leaked my personal contact information in this report, they responded with this:

As indicated at https://www.cloudflare.com/abuse/form and to which you expressly agreed: "By submitting this report, you consent to the above information potentially being released by CloudFlare to third parties such as the website owner, the responsible hosting provider, law enforcement, and/or entities like Chilling Effects"

And then when I followed up again, they responded with this:

Again, to re-iterate, by submitting a report at https://www.cloudflare.com/abuse/form, you expressly agreed: "By submitting this report, you consent to the above information potentially being released by CloudFlare to third parties such as the website owner, the responsible hosting provider, law enforcement, and/or entities like Chilling Effects"

In this case, it appears that we chose not to forward your report to the website owner. However, we reserve the right to, and you should assume that this should happen when making reports to us.

I don't know what the legal implications of this are, suffice to say, protecting DDoS attackers for free while asking for legitimate sites to pay (in my case, the $6000/mo plan would be needed) feels a hell of a lot like extortion.

As for liability, ISPs aren't liable for hosted content, but there are exceptions (DMCA, CP), and legally, Cloudflare absolutely has legal liability here. They're not just linking to that content like with a BitTorrent tracker, they're literally serving it through their nginx servers.


HN won't let me update this comment for some reason, so I'll just add this here:

The real issue here is that the web is becoming increasingly centralized, which means that we're becoming more dependent on the internal processes of a small handful of venture capital corporations for the web to work. Regardless of Cloudflare's current policy on Tor (they seem to be trying which is good), they could also just arbitrarily change that policy anytime they want, and this is a scary situation for the future of the web. It's a single point of failure for a large chunk of the web, for political manipulation and for advertiser and government spying. Tor (your last chance at a privacy web) users being unable to access major swaths of the web just happens to be the first sign of the implications of this. It's no surprise to me at all that we've seen so much interest in distributed web technologies lately (IPFS, ZeroNet).


They're not unable to access large swathes. It is inconvenient however. And site owners can disable the captchas.

Also moving away from Cloudflare is just a DNS update away.


Aye, thanks for the write-up!

I don't really get them either, surely they have enough paying customers to be able to afford some basic data hygiene.

I'd really like to find some ways to make it more costly for them to keep the scum on their networks than to send out stupid "we r a reverse proxy!!!" replies, that clearly have nothing to do with anything.

But yeah, keep complaining about Tor, good job CloudFlare /s


Let me get this straight.

People are criticizing CloudFlare for inconveniencing Tor users - a tool which, among other things, can be used to fight censorship.

At the same time, people are calling them out on their abuse policy which essentially boils down to "We won't take down sites based on content unless we receive a court order telling us to."

That's interesting, to say the least.


One is a tool that can be (and is being) used for all sorts of good purposes; run by mostly volunteers and whose entire reason for existence is not policing their network, because that would defeat the entire purpose of the endeavor.

The other one is a for-profit organization whose CEO's rationalization for taking money from internet scum is that if he doesn't take it, someone else will [0].

And to be clear, you are not quoting their abuse policy correctly. They say they will MITM phishing and malware sites to insert warning banners and take down child porn.

They will however not tell a scammer "Hi, please take your business elsewhere" based on some misguided "CloudFlare will save the internet"-fantasy.

[0] https://blog.cloudflare.com/thoughts-on-abuse/


Both are services which are used by a variety of people for both good and nefarious reasons. To paint TOR as the white knights of the internet is at best incredibly naive. There is a good reason that most mature security operations groups maintain blocklists of TOR exit nodes (in case it's not clear, that reason is the amount of alerts triggered by traffic from those nodes, relative to any other set of addresses).

Below is a direct quote from the article you have cited. It refutes your underlying premise, namely that cloudflare is in it for the money and doesn't care. The knee-jerk reaction that corporations take money for services, therefore they're evil, needs to die in fire.

""" LulzSec and other problematic customers tend to sign up for our free service and we don't make a dime off of them. When they upgrade they usually pay with stolen credit cards."""


Maybe my previous post might have not been especially nuanced for the sake of illustration.

The point is: Tor is trying to provide a useful service that requires it to be very hard to block abuse, yet as you say: they publish their list of exit nodes and everyone is free to block them.

CloudFlare, on the other hand, could extremely easily respond to abuse requests the same way that almost every other legitimate network service provider responds to them: investigate abuse by their users and terminate malicious users.

You misunderstood: corporations taking money for services is excellent.

Corporations continuing to provide service for criminals even after they've been notified of, and have acknowledged that they are hosting problematic customers mixed in with their legitimate customers is extremely problematic.

They're not evil, they have responsibility.


Hi,

I'm completely sympathetic to your problem - Tor is used by lots of spammers - totally understandable to try to prevent this spam from hitting your customers.

But most of the services I run can't really be affected by this sort of spam (no public comment systems for example). I use CloudFlare on a few of my domains, if I don't care about bot traffic and just want to turn this CAPTCHA system off entirely, is there a way with just a pro account to do so? I appreciate the anti-DDoS protection and certainly having it automatically kick these CAPTCHAs on with only extremely high volumes of traffic could be appropriate, but enforcing them is just too much.

I have security level set to minimum, but recently spun up a hidden service because users were still getting CAPTCHA'd over Tor.


Yes, you can whitelist Tor. This is available at no cost for all customers regardless of plan level:

https://support.cloudflare.com/hc/en-us/articles/203306930-D...


Thanks. Can I whitelist the whole internet (by using /0 perhaps?) using this method? I've seen people complaining about this issue on VPNs and such too.


You can whitelist CIDR ranges of /16 and /24 currently. So yes, you can to some extent whitelist everyone.


If I can only do /16s at the largest, I would have to add on the order of 65k ranges to accomplish it... not exactly a desirable way to spend an afternoon. Even via API I don't think they'd be too happy with me.


I agree, this is not the most efficient way to whitelist everyone, but my point was to outline you can whitelist whoever you want. Hopefully support for wider ranges will be added.


Yes, just add 65536 exceptions!


No need to add 10.0.0.0/8, 127.0.0.0/8, 172.16.0.0/12, and a few others...
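For anyone who wants to check the count, here is a small sketch using Python's standard `ipaddress` module. The helper name `slash16_blocks` is made up for this example, and it only skips the three ranges named above, not the "few others". It enumerates the /16 blocks locally and does not talk to any CloudFlare API.

```python
import ipaddress


def slash16_blocks(excluded=()):
    """Yield every IPv4 /16 that does not sit inside an excluded network."""
    excluded_nets = [ipaddress.ip_network(n) for n in excluded]
    for block in ipaddress.ip_network("0.0.0.0/0").subnets(new_prefix=16):
        # subnet_of() needs Python 3.7 or newer.
        if not any(block.subnet_of(net) for net in excluded_nets):
            yield block


all_blocks = sum(1 for _ in slash16_blocks())
after_skipping = sum(
    1 for _ in slash16_blocks(["10.0.0.0/8", "127.0.0.0/8", "172.16.0.0/12"])
)
print(all_blocks, after_skipping)  # prints "65536 65008"
```

Skipping those three ranges saves 256 + 256 + 16 = 528 entries, which still leaves roughly 65k rules to add.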


Yeah, I don't get why CloudFlare are so overaggressive with the captchas.

The vast majority of captcha'd pages by CloudFlare on Tor which makes secure web browsing so cumbersome are completely read-only, while some may have a comment system hosted by a third party like Disqus and Facebook (and are therefore protected already). Other sites should have the captchas on a different level than the front page, like the login forms and other pages where write/spam access is somehow enabled.

I can think of very few sites that are in need of a total ban on anonymous users/bots or that are write-access enabled by default without any login process.


They also protect against ddos, sqli, overuse, bots etc. so comment spam is only one problem and get requests are not always safe.

You'd think they could be far more sophisticated about reputation though and adjust it in realtime so that ips are by default trusted and are marked down temporarily for bad behaviour.


The problem is that with Tor the IP isn't much use as an identifier to establish reputation. As CloudFlare say in their post, they do use reputation of the exit node IPs to some extent, but the trouble is that there's so much abuse from Tor, most exit nodes always have bad rep. The zero knowledge tokens that they talk about are a way of establishing reputation without losing anonymity.
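A rough sketch of the general idea behind blinded tokens, for readers who haven't met it. To be clear about assumptions: this is textbook RSA blind signing with a toy key, used here as a stand-in. It is not the scheme from CloudFlare's proposal, the function names are invented, and the key is far too small to be secure. The point is only that an issuer can sign a token (say, after a solved CAPTCHA) without seeing it, so it cannot later link the redeemed token back to the solver.

```python
import math

# Toy RSA key, for illustration only (hypothetical values, not secure).
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def blind(token, r):
    """Client: hide the token from the issuer using blinding factor r."""
    if math.gcd(r, N) != 1:
        raise ValueError("r must be coprime to N")
    return (token * pow(r, E, N)) % N


def sign(blinded):
    """Issuer: sign the blinded value without learning the token."""
    return pow(blinded, D, N)


def unblind(blind_sig, r):
    """Client: remove the blinding factor, leaving a signature on the token."""
    return (blind_sig * pow(r, -1, N)) % N


def verify(token, sig):
    """Anyone holding the public key: check the signature."""
    return pow(sig, E, N) == token % N


token, r = 1234, 7
sig = unblind(sign(blind(token, r)), r)
print(verify(token, sig))
```

A real deployment would use a proper key size, random tokens and blinding factors, and some way to stop tokens being spent twice. None of that is shown here.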


What they do at present appears to be to assign long term reputation to tor nodes, but if instead they assigned reputation over a few minutes and reset it soon after, wouldn't that work better and avoid being overly broad? More load for them obviously but could possibly be done just for tor nodes.
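Something like the following is what I have in mind. It is a sketch under my own assumptions: the class name, the half-life, and the threshold are all invented, and nothing here reflects how CloudFlare actually scores IPs. Each bad request bumps the score, and the score halves every `half_life_seconds`, so an exit node recovers on its own a few minutes after the abuse stops.

```python
class DecayingReputation:
    """Per-IP threat score that decays exponentially over time."""

    def __init__(self, half_life_seconds=120.0):
        self.half_life = half_life_seconds
        self._state = {}  # ip -> (score, time of last update)

    def score(self, ip, now):
        """Current score for ip, after applying decay since the last update."""
        score, last = self._state.get(ip, (0.0, now))
        return score * 0.5 ** ((now - last) / self.half_life)

    def report_bad(self, ip, now, weight=1.0):
        """Record one bad request from ip at time now."""
        self._state[ip] = (self.score(ip, now) + weight, now)

    def should_challenge(self, ip, now, threshold=3.0):
        """True while the decayed score is at or above the threshold."""
        return self.score(ip, now) >= threshold


rep = DecayingReputation(half_life_seconds=60.0)
for t in (0, 1, 2, 3):                      # four bad requests in four seconds
    rep.report_bad("198.51.100.7", now=t)
print(rep.should_challenge("198.51.100.7", now=3))    # shortly after the burst
print(rep.should_challenge("198.51.100.7", now=603))  # ten minutes later
```

The extra cost is one timestamp and one float per tracked IP, which seems manageable if it is only kept for the known exit list.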

The token scheme looks interesting though as long as it was truly anonymous and tokens were different on each request to avoid tracking.

https://github.com/gtank/captcha-draft/blob/master/captcha-p...

I wonder if they could do a plugin for the tor browser so that they don't have to wait for tor?


When are GET requests unsafe?


ddos or sqli, see the cf article


Other CDN/WAF providers are indeed doing that.


"Read-only" pages are still fertile ground for layer 7 DDOS.


It wouldn't be hard to only display the captchas if sudden increased traffic from Tor exits indicated that a DDOS through Tor was in action.


Not only that but you could only display CAPTCHAs on the actual target of the DDoS.
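A minimal sketch of that idea, with loud caveats: the class name, window, and limit are invented for illustration, and a real CDN would need something far more robust than a fixed threshold. It counts requests arriving from Tor exits per target host over a sliding window and only says "challenge" for a host that is currently over the limit.

```python
from collections import defaultdict, deque


class TorSurgeDetector:
    """Flag a host only while Tor-exit traffic to it exceeds a rate limit."""

    def __init__(self, window_seconds=60.0, max_requests=100):
        self.window = window_seconds
        self.max_requests = max_requests
        self._hits = defaultdict(deque)  # host -> timestamps of recent requests

    def record(self, host, now):
        """Record one Tor-exit request to host; return True if it should be challenged."""
        hits = self._hits[host]
        hits.append(now)
        cutoff = now - self.window
        while hits and hits[0] <= cutoff:  # drop requests outside the window
            hits.popleft()
        return len(hits) > self.max_requests


detector = TorSurgeDetector(window_seconds=10, max_requests=5)
burst = [detector.record("a.example", now=t) for t in range(6)]
print(burst)                                  # only the last request trips the limit
print(detector.record("b.example", now=5))    # a different host is unaffected
print(detector.record("a.example", now=100))  # the target recovers once traffic drops
```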


The Tor egress bandwidth is sufficiently small and sparsely located that this wouldn't be an issue for a CDN.


It remains an issue for the origin server if the DDoS targets uncacheable content.


Why don't you just drop IP-based reputation system for Tor IPs completely and develop something else for these IPs, something based on data from actual requests and responses?

Because it sounds like you want to preserve an incorrect system and are pushing this problem on Tor.


Quoting from our post (https://blog.cloudflare.com/the-trouble-with-tor/):

At CloudFlare we've not explicitly treated traffic from Tor any differently, however users of the Tor browser have been more likely to have their browsing experience interrupted by CAPTCHAs or other restrictions. This is because, like all IP addresses that connect to our network, we check the requests that they make and assign a threat score to the IP. Unfortunately, since such a high percentage of requests that are coming from the Tor network are malicious, the IPs of the Tor exit nodes often have a very high threat score.

With most browsers, we can use the reputation of the browser from other requests it’s made across our network to override the bad reputation of the IP address connecting to our network. For instance, if you visit a coffee shop that is only used by hackers, the IP of the coffee shop's WiFi may have a bad reputation. But, if we've seen your browser behave elsewhere on the Internet acting like a regular web surfer and not a hacker, then we can use your browser’s good reputation to override the bad reputation of the hacker coffee shop's IP.

The design of the Tor browser intentionally makes building a reputation for an individual browser very difficult. And that's a good thing. The promise of Tor is anonymity. Tracking a browser's behavior across requests would sacrifice that anonymity. So, while we could probably do things using super cookies or other techniques to try to get around Tor's anonymity protections, we think that would be creepy and choose not to because we believe that anonymity online is important. Unfortunately, that then means all we can rely on when a request connects to our network is the reputation of the IP and the contents of the request itself.


> But, if we've seen your browser behave elsewhere on the Internet acting like a regular web surfer and not a hacker, then we can use your browser’s good reputation to override the bad reputation of the hacker coffee shop's IP.

Look, please correct me if I'm misunderstanding or taking your words out of context.

But what I hear you saying is that CloudFlare is fundamentally opposed to user privacy at a business and an architectural level.

I.e., if you don't agree to let CloudFlare track you around the web (perhaps by simply declining cookies) CloudFlare is likely to degrade your user experience to the point of being borderline unusable and then point the blame at you for coming from a bad network neighborhood.


Yes, reputation is a form of tracking. And if you show up to a site with no reputation of your own, from an IP that has a known-bad reputation, it is in the best interest of the site to challenge (not block) you. You are 97% likely to be malicious traffic.

Edit: 97% is a real number, not an exaggeration, based on numbers from the report linked in the article.


Yes, but Tor flips through IP addresses regularly so you'd get challenged every few minutes.

Similarly, if you block cookies/supercookies/etc to avoid being tracked ... you'll be challenged every view.


Yes, you will. But the point is, with no other information to go on, that is the best option for the website. If you don't want to be challenged constantly, you need to give the website operator some incentive to accept your traffic.


There is plenty of information to go off of. They just don't want to put in the engineering effort required to utilize it.

Is there really a constant DDoS attack on all of these sites from users with no cookies?


On a given site, not necessarily, but on some sites that cloudflare is protecting, pretty much always would be my guess... cloudflare doesn't know you're not the spammer/bot/malicious actor using the same exit node... it only knows that you don't have any cookies, and that means you look a lot like the bad guys coming from the same IP.

One could setup an IDENT-like service that delivers a hash for the source's route, and that would enable better scoring, but also could be used as a tracking measure... you can't have one without the other.

Even then, it would take either the user allowing cookies, or the TOR system to change their exit nodes.


There are a number of other ways to identify abuse.

Listed those in other responses.


Honestly, it probably would be beneficial to my productivity if I dropped all of Cloudflare's IP ranges since it'd keep me from going on HN, Reddit, etc. :P

The need for incentive you mention is silly. The website operator [much like a job searcher with a resume] wants to be in front of as many non-malicious people as possible. And while you might argue .04% of malicious traffic comes over Tor, I've operated sites where 20%+ came over some sort of proxy with poor IP reputation.

You know what?

Fuck it. I'll just build my own site that doesn't use Cloudflare for such a purpose.


Global traffic patterns don't tell the story. I've seen sites where 100% of Tor traffic was malicious, so the fact that 0.04% is typically malicious is meaningless to the people operating that site.

It all depends on what you do. I had a customer a few years ago that was forced to geo-block IP addresses from China, most African nations and Bulgaria. The nature of that customer's business made that an easy solution.

A company like Cloudflare serves everyone without a lot of context. If your site serves a Tor-heavy niche, it's not the right solution.


Privacy oriented niche, but yeah.


Even better, treat the website that does so as malicious and stop visiting it.


That's an entirely valid response. Just don't expect anything to change due to your boycott, since only 0.04% of all legitimate traffic comes over Tor. They won't miss you.


> I.e., if you don't agree to let CloudFlare track you around the web (perhaps by simply declining cookies) CloudFlare is likely to degrade your user experience to the point of being borderline unusable and then point the blame at you for coming from a bad network neighborhood.

It seems like CloudFlare is not absolutely committed to this position because they're willing to explore things like the blinded tokens approach.


Can you conceive of an alternate way to score traffic on the Internet? What might that be?


So there are two problems here, right? Spam and DoS.

Comment spam isn't a CloudFlare-level problem. If sites want to allow anonymous comments then they get the consequences of anonymous comments (or have their own CAPTCHA for them); if they want to require account registration and some vouching or proof of work or payment to get an account then they can have that as well.

DoS is a CloudFlare problem, but you don't need historical IP reputation for that, you only need what that IP address is doing right now.


Comment spam is one of the things CloudFlare advertise as protecting sites from, so yes it is their problem.


It's only their problem because they chose to claim they could solve it when their solution is the one that causes all of this trouble.

They're not in a position to do it accurately.


There are a million ways.

Order of requests for that IP in the last n minutes, timing of requests, request headers order, type of content requested, captcha content timings, specific-for-site content requested, etc


None of those sound particularly effective


Not without decreasing anonymity...


Since traffic from Tor is so low (0.4%) why do you even care to do IP-based blocking? Won't all your other threat detection models kick in when necessary anyway?


> Unfortunately, that then means all we can rely on when a request connects to our network is the reputation of the IP and the contents of the request itself.

So, essentially, Cloudflare relies on defense in depth as a security company, but a side effect of this is that Tor [and IP anonymity services in general] is affected.

Fair enough, but you may want to seriously consider just giving your customers the ability to ignore IP reputation altogether [except during a DDoS], since it isn't just Tor but also VPNs, etc. that are impacted by this sort of strategy.


Customers can toggle how much they want IP reputation to be taken into account on a site by site basis. Agree that it should be a customer's choice, which is why it has been since the day we launched in 2010.

The seeming disconnect is that the vast majority of our customers ask us to provide them a way to block Tor entirely. And we've resisted that because we believe the anonymity Tor provides is a good thing. Same reason we don't allow the vast majority of customers to entirely block traffic from an entire country, even though it's one of our top customer support requests.


For what it's worth, as somebody who used to manage a site that was under constant attack and whose users were regularly victims of phishing...I appreciate blocking Tor.

We weren't using Cloudflare but our own systems that were using IP threat rating services like MaxMind but eventually we had to totally prevent anything important from being done on the site via anonymous proxies. Bids, Listing Creation, Payments of any kind had to be completely blocked from those sources. People were using Tor to create fake listings on fake users with stolen credit cards that we were then paying charge back fees for. Using Tor to bid up their own auctions. Direct messages soliciting users to take the transactions off site.

Blocking those systems was one of the most effective things that we had to do and our users were vocally happier about it.


> Same reason we don't allow the vast majority of customers to entirely block traffic from an entire country, even though it's one of our top customer support requests.

https://www.cloudflare.com/features-security/

> In addition to CloudFlare’s automatic detection, you can easily add an IP address, IP ranges or entire countries to your Trust and Block list.

Umm, the vast majority is non-paying, I take it, since I believe it's available on every paid plan?

https://support.cloudflare.com/hc/en-us/articles/200170056-W...

> A low security setting will challenge only the most threatening visitors. A high security setting will challenge all visitors that have exhibited threatening behavior within the last 14 days.

I'm guessing you mean the "Essentially Off" option which implies Cloudflare basically stops providing security?


How do people not understand the problem? There are not "temporarily good" and "temporarily bad" Tor IPs. Every single Tor IP is at any second a possible threat. This is the nature of the network - any Tor client can connect through any given Tor exit node at any given time. One single client can hit through every single Tor exit node in less than an hour's time. When that client is malicious, and any "well-written" piece of software that is using Tor for malicious purposes would be, they rotate through servers quickly specifically to avoid IP blacklisting as much as possible.

Couple that with most clients passing all the same fingerprintable data (browser brand and version, OS version, etc), and you can't uniquely identify different clients coming from the same IP with any level of accuracy.

There is no solution where everyone wins. Simple as that.


I don't use Tor but I sometimes use a VPN, which occasionally has the same problem. I don't tend to solve the captchas, I tend to use the back button - and I don't think I am alone.

You should take this into account as well, since your customers are presumably not interested in losing business.


The bots in the "botnet problem" were connecting to a hidden service that was hosted in the Tor network. That isn't quite the same thing as bots that connect to hosts outside of Tor. Therefore, this isn't an issue for CloudFlare, and a bad example.


More than a hard problem, this is arguably an intractable problem. Tor is already vulnerable, and trading anonymity for convenience (or even appearing to do so) would be unworkable. There are other anonymity systems waiting in the wings, and they would arguably gain support if the Tor Project sold out to CloudFlare. The only viable approach for Tor, as I see it, is to treat CloudFlare as damage, to be routed around.


As a developer I will direct my clients away from CloudFlare services as long as CloudFlare continues this sort of attack on Tor which is ultimately an attack on privacy.


  * facepalm *
No room for nuance, huh? Or appreciation for the position CloudFlare is in and their obligation to their clients?

How would you solve this? Abuse from Tor IPs is a known and documented problem. If you have a solution, I'll bet CloudFlare has a job opening.


If CloudFlare keeps blocking Tor, the blocks will be circumvented. Unfortunately, it's the assholes who will figure that out first. But so it goes in war.


If CloudFlare would at least let their client decide by themselves, that would be an awesome start. Even if it's enabled by default.


They do this already. See their blog post. You can explicitly whitelist Tor IPs. I thought it was limited to the enterprise plan, but it seems it's actually available to everyone.


It's a fundamental problem with IP-based reputation. Same goes for VPNs where there isn't a specific list of hosts you can whitelist.


IP-based reputation has served us well, but it's no longer relevant. IPv4 has reached exhaustion, carrier-grade NAT is being deployed and the idea that an IP address correlates with one person or a very small group of people no longer holds. You can sort of pretend it does if you only serve America and Europe, but that will change too.

Any company that currently bases their offering on IP-based reputation better be working on different solutions to the problem or they're not going to stay relevant for long. This is an existential issue for CDNs.


Absolutely not. I believe in privacy and have zero tolerance for big businesses who throw their weight around at the expense of minority communities (Tor users in this case). I understand CloudFlare's need to make a profit. That is why we need to turn treating Tor traffic like normal traffic into a good business decision.


Why so much hate against CloudFlare? What's the big deal about being a "company" if you can see the CEO replying to you right here, right now?

On top of that CloudFlare as a free service helped lots of small controversial, hated-by-some-government websites to stay alive. They protect >ANYONE< that is being DDoS-ed usually for free.

If keeping freedom of speech ( as in "being online" ) for all those users they protect, for you is "at the expense of minority", I think you are totally biased.


You misunderstand the problem. The issue is that CloudFlare is treating Tor traffic like any other traffic. What you want is special treatment for Tor above and beyond the whitelisting feature CloudFlare already offers to site owners.


> CloudFlare is treating Tor traffic like any other traffic

I don't see this in any practical fashion. I can visit a CloudFlare hosted site from the regular internet for hours (even scrape automatically) with no problems; the first time I hit the same site through Tor, it gets a double or triple captcha.

Perhaps it should be a blacklist instead of a whitelist. Defaults matter.


Of course you can visit it for hours, because you are most likely one of very few people who are accessing the CloudFlare network from that IP. If you were to go through a public VPN, then the chance of a captcha would also go up. The issue is that with higher traffic out of a single IP, there is a much higher likelihood of malicious activity.

A blacklist would do nothing to solve this, since the fundamental problem is the way Tor and VPNs work, by aggregating traffic into exit nodes at specific IPs.

Edit: And upon further thought, it most likely is a blacklist. A bunch of malicious requests go out from one IP, so that IP is blocked. Because it's an exit node, it also blocks a bunch of other legitimate people.
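
To put rough numbers on the shared-IP point: if each user behind an address is independently malicious with some small probability p (an assumption for illustration, not a measured figure), the chance that at least one of n users is malicious is 1 - (1 - p)^n. A minimal Python sketch, where `p_any_malicious` is a made-up helper name:

```python
def p_any_malicious(n_users: int, p: float) -> float:
    """Probability that at least one of n_users independent users is malicious."""
    return 1.0 - (1.0 - p) ** n_users


# Compare a home connection, a small office NAT and a busy exit node or VPN.
# The 0.1% per-user rate is an assumed value, chosen only for illustration.
for n in (1, 10, 1000):
    print(n, round(p_any_malicious(n, 0.001), 4))
```

Under these assumptions the figure climbs quickly with the number of users sharing the address, which is the same effect whether the sharing comes from Tor, a VPN or carrier-grade NAT.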


> A blacklist would do nothing to solve this

I mean blacklist "Tor" to give them captchas, instead of having to whitelist them to not give them captchas.


Eh. No thanks. "Default deny" is the only sane security default.


> I don't see this in any practical fashion.

You don't?

> I can visit a CloudFlare hosted site from the regular internet for hours (even scrape automatically) with no problems

Ah but this is not the same. Try doing so from an IP which is also sending malicious traffic, and you will see the same issue.


Technically, no, it's not the same thing. But, for a user, it is the same thing. Ultimately, it's the user's experience that matters, not the technical details.

Much like I don't care which bus gets me from point A to point B, or if I'm the only one on the bus or not... it's the experience of the trip between points that matters.


If you don't care which bus gets you from point A to point B, then stay off of the bus that all of the malicious packets are riding...


"Absolutely not. I believe in privacy and have zero tolerance for big businesses who throw their weight around at the expense of minority communities (Tor users in this case)."

You're fine with the anonymity industry throwing their weight around at the expense of a handful of CDNs, though. I understand Tor's need to protect users. That's why they need to find a way to treat malicious traffic differently from normal traffic. Then adopting them might be a good business decision.


I think you underestimate how much of the traffic from Tor that is malicious.

I have barely ever seen any legitimate Tor traffic, it has all been spam bots or other kinds of traffic I would rather avoid.


A lot of services block TOR exit nodes. Go to any large ecommerce site and try to buy something while using a TOR client. You can't even search on Google most of the time while using a TOR client.


> Tor has acknowledged their "botnet problem" since at least 2013:

> https://research.torproject.org/techreports/botnet-tr-2013-1...

> That same paper walks through the challenges of dealing with it and doesn't find any satisfactory solutions.

But that's all utter nonsense! These botnets only access stuff inside the Tor network (i.e. the C&C, which the operators want to hide); they don't use Tor to access clearnet content. Even a small botnet will have more nodes than there exist Tor exit nodes (966 as of right now), so what possible benefit could there be for the botnet operator in doing that?


Maybe I'm a cranky, old-school network operator, but this is a very cut and dry problem. Tor runs a network that is rife with abuse and fraud. Tor needs to clean up and police its network. If it doesn't, it will be put on blacklists and customers will take active measures to block traffic from it.

This is no different than a network or AS that is spammer friendly, botnet friendly, carder friendly, etc. All of those networks eventually end up on blacklists or Spamhaus lists and their efficacy goes down. Eventually, the network dies out and the criminals move somewhere else. Yes, it's a game of whack-a-mole, but it's proven to work well.

I know Tor doesn't want to be in the network regulation business, but they need to be if they want their product to thrive. Otherwise, good bye Tor.


This is exactly what is wrong with this form of idealism. People create these things which remove accountability/reputation, it works great for a while and is lots of fun (just like a mask party), and then the leeches move in and use it for spam/trolling/illegal stuff. It's usually the leeches who are the real long-term beneficiaries of these kinds of networks. However, the idealistic people who originally created it don't want to admit that their experiment failed and they actually created something which is now serving the interests of something not so idealistic and perhaps even quite sinister.

Bitcoin has the same issue: there are lots of legit uses of it, but to make it a good widely used currency, a reputation system is going to emerge, and from there you've already erased half the benefits of using Bitcoin. However, in the meantime Bitcoin is used by a bunch of people purely interested in speculation or as a way of avoiding taxes/money laundering laws/etc. There are people, just like with Tor, using it for legit reasons, but my bet is that it's mostly used for reasons nobody in the Bitcoin community likes to admit.


Exactly. But I strongly disagree.

You don't blame mask manufacturers for malicious people wearing masks.

It's like city guards banning everyone with a mask from entering and issuing IDs to them. Then they're using those IDs to determine what they should and shouldn't see in the city, tracking them everywhere "across cities" etc.

In the interest of privacy, it is best to instead use the dynamic nature and types of the requests to figure out what the behavior is like.

Going with the mask analogy, they should instead check if a person is brute forcing lock combinations. Maybe even condition on the fact that they're wearing a mask.


> Going with the mask analogy, they should instead check if a person is brute forcing lock combinations. Maybe even condition on the fact that they're wearing a mask.

That's what they're doing. They are seeing brute forcing come from a bunch of IPs and they're blocking those. What do you expect them to block on? The people using the anonymous service voluntarily identifying themselves on every request (cookies, browser fingerprinting, or pretty much anything else coming from the client side that can be faked)?


Instead of having IP-based reputation system, that persists for quite a while, they could have a time limit per IP for specific kinds of requests.

Like if you fail to log in to a site, 2^(attempts) timeout from that IP for that page only. Can also integrate a combination of request headers. Sure, it's still IP-based reputation, but it doesn't persist and is much less intrusive.

Most sites require specific cookies on consecutive requests, and such blocking should be on the app side only.

There are solutions in each case and all of them are harder than IP-based blocking. However, in the interest of privacy, they should adopt these more nuanced solutions.
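
A rough Python sketch of the 2^(attempts) idea, scoped to one IP and one page. The class name `LoginBackoff`, the one-second base and the one-hour cap are all assumptions for illustration, not anything CloudFlare or any particular site is known to use:

```python
import time


class LoginBackoff:
    """Per-(ip, page) lockout whose length doubles with each failed attempt."""

    def __init__(self, base_seconds: float = 1.0, max_seconds: float = 3600.0):
        self.base = base_seconds
        self.max = max_seconds
        self.failures = {}       # (ip, page) -> number of failed attempts
        self.blocked_until = {}  # (ip, page) -> time the lockout ends

    def record_failure(self, ip, page, now=None):
        """Count a failure and return the new lockout length in seconds."""
        now = time.time() if now is None else now
        key = (ip, page)
        attempts = self.failures.get(key, 0) + 1
        self.failures[key] = attempts
        delay = min(self.base * 2 ** attempts, self.max)
        self.blocked_until[key] = now + delay
        return delay

    def record_success(self, ip, page):
        """A successful login clears the state for that IP and page."""
        self.failures.pop((ip, page), None)
        self.blocked_until.pop((ip, page), None)

    def is_blocked(self, ip, page, now=None):
        now = time.time() if now is None else now
        return now < self.blocked_until.get((ip, page), 0.0)
```

Because the key is the (IP, page) pair, a lockout on /login does not affect read-only pages from the same address, and it lapses on its own once the timeout passes.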


So a single IP address can DDoS each page of a website for a little while before CloudFlare blocks them? That makes the whole protection pretty useless. I guess it would stop someone from brute-forcing password attempts, but that's not the only thing they're trying to protect against here.


Not necessarily. These work in combination.

If they're requesting specific type of content like images or some weird request that queries DB, these would be grouped together.

What I'm saying is gather more information for each request and use it more wisely to expire IP reputation quicker - within minutes as opposed to months.
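
One way to picture reputation that expires in minutes is a score with a half-life. This is only a sketch: `DecayingScore`, the five-minute half-life and the unit weights are assumptions chosen for illustration.

```python
import math


class DecayingScore:
    """Per-IP abuse score that halves every half_life_seconds."""

    def __init__(self, half_life_seconds: float = 300.0):
        self.decay = math.log(2) / half_life_seconds
        self.state = {}  # ip -> (score at last update, time of last update)

    def score(self, ip, now):
        """Current score, decayed from the last recorded value."""
        value, last = self.state.get(ip, (0.0, now))
        return value * math.exp(-self.decay * (now - last))

    def report(self, ip, now, weight: float = 1.0):
        """Add weight for a suspicious request and return the new score."""
        value = self.score(ip, now) + weight
        self.state[ip] = (value, now)
        return value
```

A burst of bad requests pushes the score up fast, but an address that goes quiet drifts back towards zero without anyone having to delist it.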

The DDoS problem is actually easier than the rest because you need a large volume of requests to do anything. Usually these requests are very similar, come in rapid succession and come from the same bunch of IPs.

Edit:

Going with the mask analogy again, it's like you see 1000 masked people rush into a bar and block the entrance with their bodies.

Is the solution really to ban wearing masks everywhere?


A single IP can't "DDoS" anything.


Ha! Seriously though if some set of IPs is DoSing then they have to take action against at least some of the IPs in the set.


> It's like city guards banning everyone with a mask from entering and issuing IDs to them.

The flaw in this analogy is that in this case the mask makes every person completely indistinguishable from every other person wearing the mask. In this case, one ID is issued to every person wearing the mask.

When 90%+ of the people with this ID are criminals and vandals, blocking anyone with this ID is a pretty obvious and effective way to prevent crime and vandalism. It seems pretty reasonable to me when presented this way.


That's a crazy thing to do. Why would you block everyone? This would completely erode privacy online.

As I said elsewhere, if you see 1000 masked people rush into a bar and block the entrance with their bodies, is the solution to block all masked people from going to all establishments?

Clearly, if this happened IRL, people would just put a limit on the number of masked people entering that bar until there wasn't a group of 1000 of them trying to get in.


>Clearly, if this happened IRL, people would just put a limit on the number of masked people entering that bar until there wasn't a group of 1000 of them trying to get in.

IRL, the bar would call the police and anti-riot forces would move in with crowd control equipment. Tear gas would be launched at the masked people and a lot of the masked people would be hauled to the police station where their identity would be recorded and a background check would be performed. It's not pretty but it's reasonable.

I have used Tor out of a legitimate wish for privacy. I have cursed Cloudflare and Google in passing to myself for their captchas presented to me when I've browsed through Tor.

Captchas in general are a royal pain in the butt, but they are among the most effective at protecting sites from abuse, so even though they annoy me at times, I hold the view that they are a net positive.

If you want to help preserve anonymity, I think the best course of action is not to focus on Cloudflare, but instead to help maintain one or more communities on onion sites. The change must come from within. Once it has been shown that an onion site is able to provide useful services over time with privacy but with the same level of protection from abuse and bad people, then, in my view, it is time to reach out and educate the wider 'net on how this can be done.


Except you can't tell the individuals apart, so you can't limit it to 1,000 individuals. You can't just let some in if you can't tell them apart. There is no door that works that way in this case.

It's best to imagine it's a walk-up bar rather than one with a door.


> As I said elsewhere, if you see 1000 masked people rush into a bar and block the entrance with their bodies, is the solution to block all masked people from going to all establishments?

Try wearing a mask into a petrol station or convenience store. They've already performed the assessment of 'potential sale vs getting robbed', and decided the risk factor they'd like to accept.


That's because IRL you're already pseudo anonymous.

Now imagine a convenience store that demands you to tell them where you've been this month and doesn't let you in otherwise.


It's a problem inherent in any system offering pure anonymity in an unrestricted way.

It's really a shame. An opinionless platform offering anonymity cannot flourish in an opinionated world. At some point if these things want to succeed, they need to play by the rules of the world that they exist in. But I don't think anyone's figured out a common set of systemic restrictions that Tor, 4chan, etc. can implement that avoid taking away their primary affordance: freedom.


The main point of Tor is that nobody knows where the traffic comes from. Realize you're asking them to break their own service.

Your premise seems to be that you can't be bothered to protect your networks so you want to put that responsibility on someone else. It's called intermediary liability and it's terrible because the intermediary has all the wrong incentives.

You demand that the intermediary eliminate malicious traffic but they suffer much less than individual users if they also eliminate non-malicious traffic, so they set up a system with a high rate of false positives and harm many honest people. YouTube does this with Content ID. Spam registries do this with innocent small mail servers. CloudFlare does this with Tor.

What you're doing is called externalizing costs. It's generally recognized as antisocial behavior. So if you're going to claim benefits to yourself at the expense of other people, at least recognize that you're doing it.


> What you're doing is called externalizing costs. It's generally recognized as antisocial behavior. So if you're going to claim benefits to yourself at the expense of other people, at least recognize that you're doing it.

Remember his preface - cranky old-school network operator.

Let's say you have a hundred networks all connected together into some sort of "inter-net" system. If one AS starts sending out malicious traffic, what makes more sense:

1. That AS starts policing their users.

2. The other 99 ASs have to deal with the malicious traffic.

You're expecting the other 99 groups that are being targeted by the one group to bear the cost of dealing with that group's malicious users. Who exactly is externalizing costs here?

In a system without any real rules or authority, I think "those adversely affected choosing to block the bad actor" is a fairly democratic solution to the problem. You either play nice or you get voted off of the island.


> In a system without any real rules or authority, I think "those adversely affected choosing to block the bad actor" is a fairly democratic solution to the problem.

That's the part which is adverse to the rest of your argument. You're not voting off the bad actor, you're voting off everyone in the bad actor's country.

We know how to deal with this problem. You go to a website, you sign up for an account, it can be pseudonymous but to get it you have to put up some collateral. Money/Bitcoin, proof of work, vouching by an existing member, whatever you like. Then if your account misbehaves you forfeit your collateral.
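
Of the collateral options listed, proof of work is the easiest to sketch. Below is a hashcash-style toy in Python: the site hands out a challenge, and the client must find a nonce whose SHA-256 digest falls below a target. The function names, the challenge format and the difficulty are assumptions for illustration, not a description of any deployed scheme.

```python
import hashlib
from itertools import count


def _digest_value(challenge: bytes, nonce: int) -> int:
    """SHA-256 of challenge plus nonce, read as a big-endian integer."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return int.from_bytes(digest, "big")


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: search for a nonce with difficulty_bits leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))
```

The asymmetry is what makes it collateral: solving costs about 2^difficulty_bits hashes on average, checking costs one, and the site never learns who did the work.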

But this isn't a CloudFlare-level problem. They're trying to solve it at the wrong layer of abstraction. Identity isn't a global invariant, it's a relationship between individuals. Endpoints identify each other with persistent pseudonyms. The middle of the network should have nothing to do with it.


> That's the part which is adverse to the rest of your argument. You're not voting off the bad actor, you're voting off everyone in the bad actor's country.

The bad actor is the organization or person responsible for administering the network where the abuse is originating.

When I'm being attacked by someone's VPS, I report them to their host. After the fourth time I report them only to have their host pass along my report but take no further action, the host becomes, maybe not a bad actor but, a "bad citizen".

My choices are to allow them to externalize the costs of their lack of enforcement (or decision not to enforce) and attempt to find a way to block the specific actor under their purview, or just to block that host and accept whatever collateral damage that occurs. (And yes, sometimes that "bad citizen" may end up being most of a country - it doesn't change the equation for me.)

It's the only method I have to exert any pressure on the host to act responsibly. If enough people agree with me, then it quickly becomes "their problem" rather than "my problem" as they get blackholed from everywhere on the internet.


> The bad actor is the organization or person responsible for administering the network where the abuse is originating.

The bad actor is the individual who acts bad. The Post Office is not a bad actor for delivering letters.

> allow them to externalize the costs of their lack of enforcement

Tor is not an enforcement agency. Neither is CloudFlare. The costs of bad actors are your costs. You have the technical ability to retaliate against common carriers for not allowing you to push those costs onto them, but that doesn't make you right to do it in any sense other than might makes right. And you should realize that in doing it you're knowingly hurting innocent people.


Of course they're not an enforcement agency, that's kind of the point - there is no enforcement agency. We all have to contribute by being good citizens.

I subscribe to the idea that it's an ISP's responsibility to police its own network for abuse and my responsibility to police mine.

You apparently subscribe to the idea that it's my responsibility to just accept whatever shit you fling at me and it's my problem to deal with and yet somehow I have a responsibility or moral obligation to still provide services to you and your customers.

I suspect I'm never going to agree with you.


There are two ways to deal with badness. The first is to try to identify bad people and then stop them from doing anything whatsoever. The second is to identify bad acts and stop anyone from doing bad acts.

The second one is the only one that works without massive collateral damage.

Identifying bad people is only an abstraction over identifying bad acts and it leaks like a sieve. A reformed thief is entirely capable of buying an apple without incident, because not all acts by bad people are bad acts. But a thief has no reputation as a thief until after they steal for the first time. The only way to stop bad things is to detect bad things.

But the true failure of reputation systems is that as soon as multiple people share an identity they disintegrate entirely. Innocent people get blamed for malicious acts of other people through no fault of their own and with no ability to prevent it. The only way reputation systems can work at all is if people can prevent other people from using their identities.

Which means that IPv4 addresses can't be identities, because we don't have enough of them for them not to be shared.

And forcing common carriers to stop doing business with anyone who has ever done anything bad has another problem. It imposes the death penalty for jaywalking. You send spam once -- or get falsely accused of sending spam -- and you're blacklisted. It puts too many innocent people into the same bucket as guilty people and then the innocent people fight you alongside the guilty. It creates the market for these VPN services because too many servers are wrongly using IP addresses as identities. Then the bad people also use the VPN services and bypass your "security" because it was never security to begin with, so you block the VPN services which destroys those and they're replaced with others you haven't blocked. Meanwhile the real bad people also use botnets which are unaffected, so you aren't actually blocking the bad people, you're only blocking the one IP address that they share with the good guys.

You don't want this fight. Most of the people you're fighting are innocent. People need to learn to detect bad acts, not "bad IP addresses."


"The bad actor is the individual who acts bad. The Post Office is not a bad actor for delivering letters."

If they actively ignored death threats/didn't send them off to the police when it came to their attention, they become responsible.

"The costs of bad actors are your costs."

Cloudflare is sick and tired of getting attacked by people from the TOR network. I don't blame them for the ban. It's costing them money, and TOR isn't going to raise the funds to pay them for the lost bandwidth and customer revenue.

We used to have a bigger problem with mail spam because server operators constantly would leave anonymous relaying open. How did we stop a great deal of it? By blacklisting the IP until the problem is fixed. It has worked out pretty well.

"And you should realize that in doing it you're knowingly hurting innocent people."

I guess we have to determine which 'innocent' people are more important: The ones getting their websites attacked and hacked anonymously, the people that can't access those websites because they are down/DOS attacked, or the random people that want to use the TOR network.


> If they actively ignored death threats/didn't send them off to the police when it came to their attention, they become responsible.

The Post Office reads your mail?

> We used to have a bigger problem with mail spam because server operators constantly would leave anonymous relaying open. How did we stop a great deal of it? By blacklisting the IP until the problem is fixed. It has worked out pretty well.

Only if you're willing to disregard innocent people.

> I guess we have to determine which 'innocent' people are more important: The ones getting their websites attacked and hacked anonymously, the people that can't access those websites because they are down/DOS attacked, or the random people that want to use the TOR network.

The "random people" who use Tor aren't doing it because it's trendy. They're doing it because it's the only way they can access the internet. Or because not using it would get them stoned to death by religious fundamentalists or imprisoned by an oppressive government.


We're not dealing with 99 ASs blocking another bad actor. We're talking about one service that sits in front of many popular services on the internet deciding to block another for dubious reasons. Cloudflare has near monopoly power here.


I guess I look at it differently - every site using CloudFlare made the decision to delegate their web security to them. I don't see it as "one entity blocking another" but "all of those individual sites blocking a single network".

In that context, it's a lot of votes for Tor to find a solution to this problem.

Really, I'm surprised at CloudFlare's restraint here. A lot of their customers probably couldn't care less about Tor, but they've been putting a lot of effort into trying to avoid blocking Tor users (actually blocking, not inconveniencing) or compromising their anonymity.


Developers care and they are very important to Cloudflare. It is in their best interest to actually do it properly.


That's exactly the problem: customers who don't care.

Based on what I've seen in the thread of the bug report that spawned this debate, most of the browsing activity that these CAPTCHAs get in the way of is read-only. The only ways (nominally) read-only requests can cause harm are DDOS and exploiting vulnerabilities in the server software. Tor doesn't have enough bandwidth to be a big contributor to a DDOS, and sticking CAPTCHAs before some users is at best a probabilistic solution, as it only avoids exploits that:

(a) are untargeted (scanning the whole internet - if a human attacker cared about your site in particular then they could easily switch to one of the following methods to circumvent the CAPTCHA);

(b) use Tor rather than going to more effort to get access to less tainted IPs (VPS, botnets...) - assuming that the attack itself doesn't gather bad reputation (in cases where CloudFlare can detect malicious traffic by inspection, it can do better than IP blocking);

(c) don't use a service that farms CAPTCHA solving out to humans - which increases the attacker's cost but not by much.

Since the harm reduction is so minor, I suspect that for most sites, if the administrators had even a small incentive to support Tor users and the time to think about it, they would not choose CloudFlare's coarse-grained CAPTCHA approach. Rather, they'd make sure to have their own CAPTCHAs before anonymous write actions and before user registration - which they should be doing anyway - and leave read-only access alone. And the benefits of Tor to users in repressive countries should be enough to provide that small incentive, if they cared.

But they don't care. They don't want to change anything (like adding CAPTCHAs) unless there's a problem, and if CloudFlare can reduce that problem without their having to think about it, then that's the path of least resistance and they'll go with it even if there are consequences. I suspect most site owners, if asked about Tor, would say "just block it", which is why CloudFlare has - admirably - gone out of its way to make doing so in its UI difficult. This is (a large portion of) who CloudFlare represents and I agree with you that they're showing restraint.

But here's where I differ: I don't think mass apathy counts as "votes for Tor to find a solution to this problem". While the magnitude of harm is of course completely different, that's like saying that in the case of discrimination against a minority group, since most majority group members just want the issue to go away, they're voting for the minority to "find a solution" - when the only real solution is for the majority to change and stop discriminating. I mean, maybe they will vote that way in actual elections, but apathy votes don't reflect the "wisdom of the crowd" as much as others do; the minority shouldn't just consider themselves overruled and give up.

CloudFlare is already "defying" those votes to some extent, and if there is no good solution that can make both parties happy, I'd say it would be the right thing to do for them to go a little further and open up a little more for Tor users, even if it's not what their customers would decide in a knee-jerk reaction. I hope that this blinded CAPTCHA idea will turn out to be such a solution, though. It's not ideal, since having any kind of CAPTCHA blocks potentially-legitimate automated traffic, but I think it's a good enough compromise for now - sites that care could still turn it off entirely. I hope the Tor developers won't let the perfect be the enemy of the good.
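
For anyone unfamiliar with what "blinded" means here, the textbook building block is a blind signature: the signer signs a value it cannot read, and the signature still verifies once the blinding is removed. The Python toy below uses RSA with tiny hard-coded primes purely to show the arithmetic. It is not CloudFlare's proposal and is not secure; every name in it is made up for illustration, and it needs Python 3.8+ for the modular inverse via `pow`.

```python
# Toy RSA key with tiny primes. Illustration only, never use in practice.
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the signer


def blind(token: int, r: int) -> int:
    """Client hides the token with a random factor r that is coprime to N."""
    return (token * pow(r, E, N)) % N


def sign(blinded: int) -> int:
    """Signer (say, after a solved CAPTCHA) signs without seeing the token."""
    return pow(blinded, D, N)


def unblind(blinded_sig: int, r: int) -> int:
    """Client strips r and is left with an ordinary signature on the token."""
    return (blinded_sig * pow(r, -1, N)) % N


def verify(token: int, sig: int) -> bool:
    """Anyone holding the public key can check the signature."""
    return pow(sig, E, N) == token % N
```

The point is that the signer only ever sees the blinded value, so it cannot later link the token it is shown back to the session in which the CAPTCHA was solved.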

Oh, and - I think there is one act for which CloudFlare deserves some blame: signing up those customers in the first place with the promise to provide "security" at the CDN level. It's not that what they do is useless, but given the fundamental limits of (all) "web application firewalls" that only see the application from the outside and thus can only guess heuristically what is an attack, less technical customers are probably misled somewhat about the necessity and benefit of them. Most people, including less technical site administrators or owners, don't even understand the difference between DDOS and "real" attacks, let alone what WAFs do or, say, what concerns apply to Tor in particular. I'm not sure what CF could do to fix this short of not advertising security at all, and that would undersell what they do provide. Even so...


So Tor should be a democracy that lets people vote spammers off the island? But again, the promise of Tor is anonymity, and this breaks anonymity.


Google does this with Linode servers. I route my HTTP traffic through a proxy on a Linode server. Google blocks me all the time for no reason other than that some other IPs in my range are doing bad things. I've tried to contact Google about this but they couldn't care less about a handful of users.


I've experienced the same issue from multiple service providers, including but not limited to Google, Akamai, Cloudflare, and more.

My Linode IP address was assigned to me years ago. I do not use it maliciously, do not share it with other people, and have never used it for Tor. Yet I find myself regularly blocked for no logical reason when I proxy my web requests through it.


Complain to a Google dev here on HN, I had the same issue with my OVH proxy.


The problem with this kind of thinking is that yes, while Content ID does catch a lot of false positives, and yes, it results in creators being dinged for no reason, the reason it was created in the first place was that too many people were abusing YouTube to distribute pirated material. It's the same cause/effect here with CloudFlare: too many people are abusing the anonymity Tor provides to do shitty things to their network.

It's not as if CF set out to screw over Tor users; by the nature of Tor they'd have no way to do it with any kind of ease. Tor traffic just happens to have a whole lot of bad actors using it and that causes the reputation of those IPs to go down.


Content ID was created because Google wanted Hollywood content and Hollywood wanted to externalize costs.

Copyright in the context of the internet has costs that can only be paid by innocent people. It will either have many false positives or many false negatives. So the question is whether those costs should be paid by the innocent people who benefit from the system that created those costs or the innocent people who don't.


"...you can't be bothered to protect your network..."

Huh? Isn't this exactly what CF is attempting to do? And Tor traffic tends to be abusive so the good is caught up with the bad, but it's all in the name of protection.


> Isn't this exactly what CF is attempting to do?

No. The real bad people have botnets and can cycle through a million random IP addresses every time you block one. A real solution needs to be secure against someone you don't yet know is bad.

This is going to get a lot worse as we've run out of IPv4 addresses. ISPs are going to start to NAT many users behind one IP address. Some already have. Then you have the same issue as Tor where one user is malicious but shares the same IP address as a thousand innocent users. IP blocking isn't going to work anymore so you might as well find an alternative solution now.


TOR exit nodes could run an IDENT-like service that offers only a hash based on the source of the traffic... it would still be pseudo-anonymous, but easier to filter by the target of said traffic.
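
As a very rough sketch of what such a tag could look like: an exit could derive a short keyed hash per stream source and rotate the key regularly. Nothing like this exists in Tor as far as this sketch assumes. `ExitTagger` and `circuit_id` are hypothetical names, and an exit relay only sees a circuit, not the original client, so the circuit is the only "source" it could hash.

```python
import hashlib
import hmac
import os


class ExitTagger:
    """Hypothetical per-exit tagger: same circuit gives the same opaque tag."""

    def __init__(self):
        self.key = os.urandom(32)  # secret that never leaves the exit

    def rotate(self):
        """Replace the key so that older tags can no longer be linked."""
        self.key = os.urandom(32)

    def tag(self, circuit_id: bytes) -> str:
        """Short keyed hash. Reveals nothing about circuit_id without the key."""
        mac = hmac.new(self.key, circuit_id, hashlib.sha256)
        return mac.hexdigest()[:16]
```

The trade-off is linkability: while a key is live, the target can tell that two requests came from the same circuit, which is exactly what makes filtering easier and also what erodes some of the anonymity.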


At a point in the recent past, around 90% of all E-mail traffic was spam. Now it's down to around 50% or so [1]. What happened? It could have been due to thousands of ISPs simultaneously cleaning up and policing their networks. But it also could be due to blocking tools getting better. Maybe the spammers moved away from E-mail to more profitable spam channels. Or is there just more legit traffic now, and the percentage is down because the denominator is larger?

1. http://www.bbc.com/news/technology-33564016


It's definitely because of the major botnets being taken down. I remember the McColo bust specifically, spam volumes never recovered to their previous levels after that bust. (PDF) https://www.rsaconference.com/writable/presentations/file_up...


Email didn't outrun the bear. It outran the other hiker.

The honeypot space for spam is now systems other than email. Social media. Blogspam. Web advertising. Dating services. SMS.

(I'm not sure precisely which, but pick from among there and you'll likely turn up the issues.)

Email is fairly well defended at this point, though not without considerable collateral damage (small / self-hosted email is quite difficult, most of us rely on a small number of high-volume providers who may present a considerable privacy risk).


> I know Tor doesn't want to be in the network regulation business, but ....

That is exactly why there is a Tor. Tor is for enabling anonymous communication. Deciding who can do what, or why, would limit its use, and that would limit its ability to enable anonymous communication.


Definitely, but they shouldn't complain when the public Internet (Cloudflare) blocks them or views their traffic differently. Anonymity comes at a price, and this is one of them. I think TOR is an important project, but this blog post by them is completely ridiculous and ignores reality.

Why should TOR get a pass on this when network operators need to protect their network from abuse? The choices are either 1) let the abuse continue or 2) try to block the specific attack traffic in question by heuristics which is often difficult or impossible or 3) block abusive IP addresses.


Complain is exactly what they should do. People should care about privacy even when it's other people's privacy.


No one's privacy is being violated here.

The right to privacy is not the same thing as the right to access. If someone doesn't want to allow you anonymous access, it is their right to block you. Anyone using CF and not whitelisting Tor is effectively saying: if you want to visit my site, you need to verify that you aren't a bad actor. If you don't want to do that for privacy reasons, then you can't visit.


Haha if CF get this kind of response to their gentle observations I'd hate to see what a frank honest complaint would get...


It's not a binary thing. You can regulate out fraud, abuse, and DDoS attacks without harming the legitimate use cases.

It's like selling alcohol (cigarettes, porn, gambling), but not to minors.

You can say "it's OK for this crowd not OK for this crowd". Otherwise you'd probably claim that said regulation would go against the very thing that the merchant is trying to do: make as much money as possible.


"It's like selling alcohol (cigarettes, porn, gambling), but not to minors."

It's not like that at all, because selling to everyone other than minors requires identity. Sure, you just need to verify the subject is over 18, i.e. you don't need to know name, birth date or address. But you DO need to know identity to issue the token that provisions the token that proves an age > 18.

And that breaks Tor's raison d'être.


It's very easy to say that one can filter out certain things, but in practice I think it's far more difficult. Especially in Tor's case where the goal is to provide truly anonymous access.

Do you have a proposal for filtering out that kind of traffic while allowing other traffic? I can't think of a way off the top of my head and I'm sure the people behind Tor would at least consider a logical solution.

Edit: This sounds kind of confrontational, but I don't mean it like that. I honestly would like to hear of a potential solution to this because I really can't think of one.


>You can regulate out fraud, abuse, and DDoS attacks without harming the legitimate use cases.

How? If you think this was possible without completely compromising the Tor protocol, don't you think it would have been done already?


> It's not a binary thing.

Well the thing is how do you regulate totally anonymous users? How would you know what the binary 0 and 1 are doing in tor?


> Tor needs to clean up and police its network

I think you don't understand what Tor is or how it works. Tor is a way to anonymize its users. You have no way to analyze a packet until it reaches an exit node, and you have no way to analyze that packet if it's done over https, and you have no way to block an ip from that exit node because it comes from another node where plenty of other ips are coming from. If you start blocking this node, then the spammer can choose a different path and come from another node, or just change his exit node.

tl;dr: what you are saying goes against Tor's principles.


I think it's clear he understands how it works.

> what you are saying goes against Tor's principles

That's why it will probably never be cleaned up. That's also why more and more people will probably block access from Tor. CloudFlare says they get a 95% attack rate from it. A blog post the other day said FotoForensics gets about 91% attacks from Tor.

No one is going to put up with 91% attacks for long. And if that means Tor becomes its own walled garden that doesn't 'interact' with the public internet, so be it.


> I think it's clear he understands how it works.

Agree to disagree :)

> if that means Tor becomes its own walled garden that doesn't 'interact' with the public internet, so be it

Most websites are not using Cloudflare and have no idea how to block a range of IPs. So no, Tor is not going to become its own walled garden.


>> I think it's clear he understands how it works.

> Agree to disagree :)

I know very well how it works thank you very much. It doesn't mean I have to like it.

Since you're apt to throw around unfounded accusations, I'll join the party. I have more experience than you do at running large networks, dealing with fraud, dealing with abuse, dealing with malware, and dealing with law enforcement/government agents. The Internet has enough problems with the well-run networks that actually care. We couldn't care less about Tor users that might be blocked.


I work for a company that makes appliances to secure companies' networks.

I promise you that CloudFlare isn't the only group that offers to block Tor as an option (or just has it blocked by default).


Filtering of outbound traffic from Tor exit nodes can only be done by Tor exit node operators. The Tor project can kick out some defaults (like, say, blocking SMTP) but ultimately it wouldn't matter...the bad actors will just find ways around the filters, or devise new ways to abuse the system. Static filters will never work. It's like hoping everyone will do source address verification on egress.

Cloudflare just have to get smarter here. It's their business to do dynamic filtering and balance this stuff, not Tor's.


"Cloudflare just have to get smarter here. It's their business to do dynamic filtering and balance this stuff, not Tors."

It's not their business given CloudFlare doesn't make shit off Tor: it loses bandwidth and time fighting malice instead. It's actually their business to block it. Tor needs a reputation system or some other method to deal with this stuff.


The main point I took away from the article is that many users originate from one exit node. Some of those users are spammers, and they contaminate the exit node's IP. CF blocks an IP for spam, but does not remove the block after some time (when the spammer has moved on).


That's definitely a legitimate point, but not the main point IMHO. The main point is that Tor makes zero effort to clean up the problem and uses the legitimate Tor users as helpless, scapegoated victims and a bullying tactic. "But think of the oppressed users!" Sorry, not buying it.

The blacklisted IP lifetime problem is real though. It's a problem I've had to raise several times with our product and network teams. People would see an abusive IP and just ban it...without a TTL or lifetime. This really upset me, as they seemed to think that was not just OK for the time being, but good enough. When I describe IPv6 to them, their faces just melt as it sinks in that they can't just keep banning IPs and must do something higher up the stack to detect and block fraud.
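To make the TTL idea concrete, here is a minimal sketch of a ban list whose entries lapse on their own. The class name, the injectable clock, and the example addresses are all hypothetical, chosen only for illustration; this is not how CloudFlare or any particular product implements it.

```python
import time


class ExpiringBlocklist:
    """Block IPs for a limited time instead of forever (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock        # injectable so the expiry can be tested
        self._expires_at = {}     # ip -> time at which the ban lapses

    def block(self, ip):
        # (Re)start the ban; it ends ttl_seconds from now.
        self._expires_at[ip] = self.clock() + self.ttl_seconds

    def is_blocked(self, ip):
        expires = self._expires_at.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self._expires_at[ip]   # ban has lapsed; forget the entry
            return False
        return True
```

With a shared exit node or NAT address, the TTL bounds how long legitimate users pay for one abuser's behaviour; choosing the TTL is a policy decision this sketch does not address.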


Interesting point, what would "Tor cleans up" mean?


What will happen when IPv6 takes off and anyone can just go get a unique, random, throwaway IP address each day? Do "IP reputation" systems work in such a world?


I guess reputation will be based on /64, that's the most practical solution.
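As a rough sketch of what tracking reputation per /64 could look like, the snippet below maps each IPv6 address to its enclosing /64 before counting abuse reports. The function names and the choice to keep IPv4 at /32 are assumptions made for the example.

```python
import ipaddress
from collections import Counter


def reputation_key(address):
    """Return the network that reputation would be tracked under."""
    ip = ipaddress.ip_address(address)
    if ip.version == 6:
        # Collapse the address to its enclosing /64.
        return ipaddress.ip_network(f"{ip}/64", strict=False)
    # Assumption: IPv4 addresses are still tracked individually.
    return ipaddress.ip_network(f"{ip}/32")


def count_abuse(addresses):
    """Count abuse reports per reputation key."""
    return Counter(reputation_key(a) for a in addresses)
```

Under this scheme, a client that hops between throwaway addresses inside one /64 keeps accumulating reports against the same key.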


Maybe coalesce neighboring bad IPs and subnets into larger bad subnets.
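One possible reading of "coalesce", sketched under stated assumptions: once enough /64s under the same parent are flagged, flag the parent instead. The /48 parent size and the threshold of 3 are arbitrary illustrative values, and the function assumes IPv6 inputs with prefixes longer than the parent prefix.

```python
import ipaddress
from collections import defaultdict


def coalesce(bad_networks, parent_prefix=48, threshold=3):
    """Promote a parent network to 'bad' once enough of its children are."""
    by_parent = defaultdict(set)
    for net in map(ipaddress.ip_network, bad_networks):
        by_parent[net.supernet(new_prefix=parent_prefix)].add(net)

    result = []
    for parent, children in by_parent.items():
        if len(children) >= threshold:
            result.append(parent)             # enough bad neighbours: widen
        else:
            result.extend(sorted(children))   # keep the narrow entries
    return sorted(result)
```

The obvious cost is collateral damage: widening the block also catches every well-behaved subnet under the same parent.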


The somewhat-faster way to reduce this annoyance (I've been hit by this) is to 'block'/captcha the offending IPs for a time (depends on whatever metrics CloudFlare think is best) then unblock it.

At least this will reduce legitimate users' annoyance, instead of them being blocked indefinitely.

My own experience: Tried accessing wikialpha from work LAN, 'blocked' by endless captcha (for a few months already) and opening the said wiki from home network, all is perfectly fine...

Well, at least now I know why; it still does NOT make it OK from an end user's perspective.


Given how much of TOR traffic is malicious (94%), what you propose would get all TOR traffic always considered malicious.


Care to share a source for "TOR traffic is malicious (94%)"?

What does malicious percentage of traffic in this case mean? Malicious sessions? IPs used? Packets? Users?


The number is in the CloudFlare blog post that started this.

> Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious. That doesn’t mean they are visiting controversial content, but instead that they are automated requests designed to harm our customers.


Actually, I think the real problem is the idea that networks are responsible for policing their users, rather than the idea that servers should be responsible for policing their clients. The former is what people want (because it's easy: blame an IP, ban it, be done), but the latter is the reality.

CloudFlare's CAPTCHAs are an attempt to deal with that reality, but they're heavy-handed. Worse, they're at the wrong level: the protected site may have already verified that the user is legitimate, but CloudFlare imposes its block when the user's source IP changes again.

CAPTCHAs belong at the application layer, not the transport layer.


You've made one of the most compelling points I've seen in this discussion so far. Really - why mix an application security mechanism into the transport layer? DDoS attacks will happen whether Tor is blocked or not. Add to this some other interesting points I've read here regarding: IPv4 exhaustion, NAT and IPv6 growth. All signs seem to point to the need for an application level solution.


>they need to be if they want their product to thrive. Otherwise, good bye Tor.

The ironic thing is actually that by applying any kind of "network regulation" the Tor project would abandon its own primary purpose. The only way it can continue to exist is actually if it doesn't practice any kind of censorship of its users.


This is a very simplistic, binary view of the world. Just because the Internet's view of a product differs from the author's, it doesn't make the Internet wrong. It means the author probably needs to step back and re-evaluate things.

By your logic, gun owners would be able to shoot whatever they want. The primary purpose of a gun is to shoot things. To paraphrase: "To regulate what you can and can't shoot would go against the primary purpose of a gun and the gun manufacturers can't exist that way."


Something working or not is a binary.


As is CloudFlare. I have lost count of sites that pirate our software that are using CloudFlare.

CloudFlare know of the problem and refuse to do anything about it.


Thankfully. They're already the "Wifi Captive Portal" of the internet, be glad that they're not also the police of the internet.


I hate to admit it, but I think you and Tor both have fair points.

The facilitator of much abuse isn't anonymity per se, but impunity. The ability to act without consequence.

Finding a way to imbue reputation across an anonymised connection seems to be one way to operate. Not an easy problem. There's some work toward solutions, though none are yet widespread.


> I know Tor doesn't want to be in the network regulation business, but they need to be if they want their product to thrive. Otherwise, good bye Tor.

I'm not sure if you know what you're talking about. From your comment, it's crystal-clear that you don't understand what their service is.

Plus, calling a free service based largely on volunteers, universities and non-profits a product is derogatory.


I think Cloudflare's blog post was incredibly nuanced, well thought out and (dare I say) pro-Tor. They implemented a way for their users to whitelist Tor traffic (bypassing all Captchas), without allowing their users to blacklist Tor traffic.

This response seems a bit of a childish knee-jerk reaction from the Tor project, which could've been worded more maturely.


I didn't spot anything worded immaturely. What specifically do you think could be more maturely worded?


Instead of addressing the very real problems with usage of Tor, they try to pick holes in the 94% figure from cloudflare (which isn't actually very important), and go on to cite a study by cherry picking stats: https://news.ycombinator.com/item?id=11405101. They don't mention that it explicitly states something which backs up cloudflare's position:

> Tor exit nodes were far more likely to contain malicious requests

and even:

> Risk averse companies may wish to block all Tor traffic

The article then goes on to suggest that it is perfectly reasonable to use the word 'block' to mean showing a captcha - in common usage block means block - deny requests, not attempt to determine if a user is human with a captcha or some other method - that's not blocking, it's annoying and potentially pointless, but it's certainly not simply blocking users and it's disingenuous to describe it as such.

All of that adds up to a response which seems to be more interested in scoring points than finding a solution for legitimate Tor users. I'm not sure I'd describe it as immature, but it's not a very constructive response, to an article which went out of its way to be Tor friendly and propose solutions. It would be much easier for cloudflare to really block Tor traffic, they would probably suffer very little from doing so.


They are being defensive.

> Tor exit nodes were far more likely to contain malicious requests

They also were far more likely to issue requests in general. This data point has no meaning. But that's statistics for you :)


Thanks, agreed. I'd like to know exactly what was childish, and I'm happy to own up to it if true and attempt to fix the issue.


The number of hn users replying solely to tone and not content is pretty disappointing.


It's possibly tone, but given the relatively low volume of voting, and since it's unclear to me what steps, if any, HN takes to reduce abuse and fine-tune community comment policies, it's very possible that something else is going on. For example, some users appear to have highly irregular comment histories: one comment suggests Snowden should be invited to return to the US without any chance of being behind bars or worse, and another appears to state that the US has not overstepped its rights. That is possible, though it strikes me as a bit odd. I'm 100% sure my comments seem a bit odd to some, though I do try to respond if a direct statement is expressed.


I [I'm CloudFlare's CTO] have been engaging with the Tor folks through their Trac interface here for about 6 weeks: https://trac.torproject.org/projects/tor/ticket/18361 and have been very open about how CloudFlare is addressing this.

My plan is to continue to do so through that ticket as I've made various commitments there (some of which, like whitelisting, we've already rolled out). It's worth reading the entire ticket to get a sense of the conversation. We are in no way finished improving the situation.


Hello, please also consider VPN usage. Unlike Tor, we even pay for this service, because we take it so seriously.

Despite using the most reputable VPN provider I could find with a serious privacy policy, I've seen a steady increase in Captcha requests from CloudFlare to simply view read-only pages. And all I can think is that the Captcha page often requires the same amount of bandwidth as the page I was requesting in the first place.

The net effect is that it's not saving you any CPU usage or bandwidth (if anything, it's costing you more as we still request the actual page after the Captcha system runs), it's making customers like me abandon your customers' sites out of frustration, and it's eroding the last line of defense we have against invasive tracking.

I'm sympathetic to the problem you're trying to solve, but surely there must be a better way for simple GET requests.

This doesn't just affect people like me as a user. Having experienced this, I would be averse to deploying or recommending CloudFlare in its current state.


Which VPN are you talking about?


It's happened to me with Astrill, ExpressVPN, and vpn.ac so far. Usually I can reconnect to get a new IP to avoid captchas again, so it seems only certain IPs are being poisoned. And no doubt because they are doing things they shouldn't. Hence the sympathy part.

If anything, it's been surprising to me to learn just how many sites are using CloudFlare ;)

I don't have a site handy that's triggering it right this moment, but here's a somewhat recent screenshot I grabbed of the captcha wall hitting my VPN: http://i.imgur.com/OnvK05l.png ; I received this for simply trying to view a product page with an ordinary GET request. Ironically, I couldn't solve the captcha, despite being human >_<


Thanks. I am hopeful that the solution we come up with for Tor will be applicable to situations like this.


Awesome news!! Thank you very much for taking the time to respond, and for looking into a solution for Tor/VPNs. It's very much appreciated :D


I'm a long-time Tor user that's been affected by CloudFlare captchas for a few years. I appreciate that you (CloudFlare) are trying to tackle the problem, but I feel that both CloudFlare and the Tor community have defeatist attitudes toward this issue.

I have the following suggestions for CloudFlare:

1. Can you provide better documentation for your customers about what Tor is, and reasons for/against white/blacklisting Tor? For example, when a customer selects to Block or Captcha Tor, a tiny link could show up somewhere that says something like "This affects users who seek privacy, find out more."

2. In addition to better docs, can you set up something that lets site operators view the site as a Tor user? The screenshot on this site does not do the Tor+Captcha experience justice:

https://support.cloudflare.com/hc/en-us/articles/203306930-D...

In particular, the screenshot assumes the Tor browser is running Javascript and that all the user has to do is click that button. In reality, Tor users have to click a bunch of checkmarks, try again once or twice, then copy a code into a textbox then hit submit.

If you provided a "view site as Tor user" demo (JS and non-JS versions), then site operators might be more reluctant to enable Captcha for Tor users.

3. The latest CloudFlare blog post on Tor says "you can do a lot of harm just with GETs." I wish you would give more thought to the idea of a read-only option for non-whitelisted Tor users. If GET requests are harmful (I'm skeptical), reducing the harm of GET requests seems like a much easier problem than the overall problem. After all, what good is a CDN that can't handle lots of requests for static content?

I also have the following suggestions for the Tor community:

1. Continue to improve the Tor user experience to gain users! The more people use Tor, the harder it is to ignore (by CloudFlare and others) and the safer it gets. Acquiring more users is one of the best ways to fight back against Captchas. One way to get lots of new users is Firefox integration.

2. Fix hidden services. It's awesome that CloudFlare wants to give the option to set up hidden services for their customers. Make that possible for them!

3. If CloudFlare doesn't want to provide some sort of read-only mode, build that functionality yourselves! For example: when the Tor Browser detects a CloudFlare captcha, it could give the user the option to read a read-only cache from some other CDN.


> 1. Can you provide better documentation for your customers about what Tor is, and reasons for/against white/blacklisting Tor? For example, when a customer selects to Block or Captcha Tor, a tiny link could show up somewhere that says something like "This affects users who seek privacy, find out more."

I'll make sure the product team sees that suggestion.

> 2. In addition to better docs, can you set up something that lets site operators view the site as a Tor user?

That seems like an enormous amount of work when anyone can just get the Tor Browser and test it out.

> 3. The latest CloudFlare blog post on Tor says "you can do a lot of harm just with GETs." I wish you would give more thought to the idea of a read-only option for non-whitelisted Tor users. If GET requests are harmful (I'm skeptical), reducing the harm of GET requests seems like a much easier problem than the overall problem.

If you read the Trac thread you'll see that I've answered that. In short, I don't want to do it because that diverts engineering resource away from the right thing to work on (which is reduce the need for CAPTCHA).


> If you read the Trac thread you'll see that I've answered that. In short, I don't want to do it because that diverts engineering resource away from the right thing to work on (which is reduce the need for CAPTCHA).

I agree, but thinking about GET-only requests is one approach to reducing the need for CAPTCHA. For example, maybe CloudFlare could have better Tor defaults for sites that are serving only static content, and default to Captcha for sites that are POST-heavy (just a high level idea).

Basically, most of the time, GETs have nice properties: idempotent, pure, etc. I think a solution to the captcha problem could take these into account.
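A toy version of that idea, purely as a sketch: challenge flagged IPs only when the request could change state or the site is not static. The function name and the `site_is_static` flag are hypothetical; the set of safe methods follows the HTTP specification. It deliberately ignores the counterpoint raised in this thread that GET floods can still cause harm.

```python
# Methods defined as "safe" (read-only) by the HTTP specification.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})


def needs_challenge(method, ip_is_flagged, site_is_static):
    """Decide whether to show a CAPTCHA before forwarding a request."""
    if not ip_is_flagged:
        return False
    if method.upper() in SAFE_METHODS and site_is_static:
        return False    # read-only request for cacheable content
    return True
```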

I do not think the other solution proposed in the CF blog post, using proof of work with some sort of blinded tokens, is going to work well. A hashcash style proof-of-work is easily defeated with a botnet or FPGA, and reputation-based systems are an ongoing research area.
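For readers unfamiliar with the term, a hashcash-style proof of work looks roughly like the sketch below. It is a simplified variant (SHA-256 over a hypothetical "challenge:nonce" string), not the real Hashcash stamp format and not anything CloudFlare has proposed in detail.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge, nonce, difficulty):
    """Cheap for the server: a single hash."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge, difficulty):
    """Expensive for the client: about 2**difficulty hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

The asymmetry is the whole point, and also the weakness mentioned above: the cost is paid in raw hashing, so an attacker with a botnet or dedicated hardware pays far less per token than a person on a phone.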

It's possible there is a silver bullet that we haven't found yet. Have Tor or CloudFlare considered putting out a call for research into the problem?


> I agree, but thinking about GET-only requests is one approach to reducing the need for CAPTCHA. For example, maybe CloudFlare could have better Tor defaults for sites that are serving only static content, and default to Captcha for sites that are POST-heavy (just a high level idea).

To be honest I'm not interested in solving the CAPTCHA problem just for Tor. That doesn't make a lot of sense. What I am working on is an overall solution so that the need for CAPTCHAs at all is diminished.


> What I am working on is an overall solution so that the need for CAPTCHAs at all is diminished.

I like that idea, but my worry is it will take years to reach that point, and in the meantime Tor/VPN users will just have to suffer. I'd rather see some short-term fixes now and long-term solutions on the horizon.

I admit: I have not read the entire Trac thread, so I'm not sure what your current roadmap is.


> I like that idea, but my worry is it will take years to reach that point

It won't.


>> 2. In addition to better docs, can you set up something that lets site operators view the site as a Tor user?

> That seems like an enormous amount of work when anyone can just get the Tor Browser and test it out.

Sure, but most site operators aren't going to get the Tor browser to test it out -- especially if they don't realize that that is something they should do.

By "view site as Tor user" I simply meant having a way for site operators to interact with the captcha page that CloudFlare presents users. That should be easy to set up; it can even be just a static HTML page that's linked from the documentation on Tor. I didn't mean that you should display the page through Tor.


But Tor users are complaining about the specific situation where the Tor Browser is used. I don't think that's easy to simulate (or at least I think it's easier for people just to get the Tor Browser).


Tor users are complaining about the captchas. I'm not sure why you think it's so hard to create a sample captcha page to demonstrate the experience. You just need two pages: one with JS-enabled captchas and one with JS-disabled captchas. Do you need me to do this for you?

Alternatively, you can create a dummy CloudFlare instance with /0 under Captcha, and put the URL to this in the docs, but this won't let users try the JS-disabled captchas.

The only difference between this demo and the actual Tor experience is that over Tor, these pages load more slowly. But people are more annoyed at seeing the captcha in the first place, not the fact that they sometimes load slowly...


> Tor users are complaining about the captchas. I'm not sure why you think it's so hard to create a sample captcha page to demonstrate the experience. You just need two pages: one with JS-enabled captchas and one with JS-disabled captchas. Do you need me to do this for you?

It's not that simple.

reCAPTCHA makes an on the fly decision about the strength of the CAPTCHA served depending on the visitor. In the case of Tor there's no visitor information other than IP address (and whatever the browser gives as User-Agent etc.).

So, I can dummy up a CAPTCHA page trivially, what I can't do is dummy up the experience of a Tor Browser user hitting a reCAPTCHA. The way to do that is run the Tor Browser.


Good point. That did not occur to me.

edit: Thanks for the dialogue thus far.


What's your timeline for being as open as the Tor folk about abuse of your network, and for starting to take responsibility for all sorts of internet scum being protected by your network and allowed to use your infrastructure?

You seem keen on locking out malicious use of Tor; when can we expect you to lock out malicious use of CloudFlare?


Out of curiosity, did you also write the previous blog post from CloudFlare that sparked this reaction from Tor?


No, that was written by the CEO (eastdakota here). He asked me to copy edit it which I did. I wrote the conclusion (because he was sleeping) and I added the two charts.

On the CloudFlare blog, if someone's name is on it, they wrote it.


That's just flawed reasoning all around. I can't even find any e-commerce-specific data in their sources.

> A report by CloudFlare competitor Akamai found that the percentage of legitimate e-commerce traffic originating from Tor IP addresses is nearly identical to that originating from the Internet at large. (Specifically, Akamai found that the "conversion rate" of Tor IP addresses clicking on ads and performing commercial activity was "virtually equal" to that of non-Tor IP addresses).

Actual data from the report:

  • Comparison of Tor and non-Tor Traffic:

  	Of legitimate requests, non-Tor IPs accounted for 99.96 percent of requests, while Tor exit nodes accounted for 0.04 percent

  	Of malicious requests, non-Tor IPs accounted for 98.74 percent of requests, while Tor exit nodes accounted for 1.26 percent

  • Tor exit nodes were far more likely to contain malicious requests:

  	1:11,500 non-Tor IPs contained malicious requests

  	1:380 Tor exit nodes contained malicious requests

  • However, traffic from Tor exit nodes yielded a conversion rate virtually equal to non-Tor IPs:

  	Conversion rate for non-Tor IPs was 1:834

  	Conversion rate for Tor exit nodes was 1:895
Source: slide 7 of the report they link in the article – https://i.imgur.com/TcstnWD.jpg
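To put the slide's figures side by side, here is the arithmetic using only the numbers quoted above; the variable names are mine, and exact fractions are used to avoid rounding surprises.

```python
from fractions import Fraction

# Tor's share of each request class, as quoted from the slide.
tor_share_legit = Fraction(4, 10000)        # 0.04 %
tor_share_malicious = Fraction(126, 10000)  # 1.26 %
over_representation = tor_share_malicious / tor_share_legit

# Fraction of IPs that "contained malicious requests".
ip_rate_non_tor = Fraction(1, 11500)
ip_rate_tor = Fraction(1, 380)
ip_ratio = ip_rate_tor / ip_rate_non_tor

# Conversion rates.
conv_ratio = Fraction(1, 895) / Fraction(1, 834)

print(float(over_representation))    # 31.5
print(round(float(ip_ratio), 2))     # 30.26
print(round(float(conv_ratio), 3))   # 0.932
```

So Tor's share of malicious requests is about 31.5 times its share of legitimate ones, a Tor exit IP is about 30 times as likely to be flagged as a non-Tor IP, and Tor traffic converts at roughly 93% of the non-Tor rate. None of these figures says what fraction of Tor requests is malicious; the slide does not give absolute request counts.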


> [IP addresses of] Tor exit nodes were far more likely to contain malicious requests

> However, traffic from Tor exit nodes yielded a conversion rate virtually equal to non-Tor IPs

You just described every busy IP address: if you handle more requests, you are more likely to handle a malicious one. This is the problem with IP based reputation.


Yes, I think that the more revealing ratio would have been total malicious requests to all requests for each class of IP. If each Tor exit node is sending out 30x as much traffic, with an average of 30 unique users per IP, then the cited ratios are meaningless.
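A back-of-the-envelope check of that point, as a sketch under a strong simplifying assumption: each user behind an IP is independently malicious with the same probability. Taking the non-Tor figure of 1:11,500 as the per-user rate (itself an assumption) and 30 users per exit IP (the hypothetical number from the comment above):

```python
def ip_flag_probability(per_user_rate, users_per_ip):
    """Chance that at least one of the users behind an IP is malicious."""
    return 1 - (1 - per_user_rate) ** users_per_ip


p = ip_flag_probability(1 / 11500, 30)
print(round(1 / p))   # 384
```

That comes out at roughly 1 in 384 shared IPs, close to the 1:380 reported for Tor exit nodes. In other words, under these assumptions the per-IP gap could be explained by sharing alone, which is why a per-request ratio would be more informative.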

The only thing that can be drawn from that data is that Tor makes IP-based reputation tools ineffective. The thing is, for many people that may be enough to justify what Cloudflare is doing.


That's what CloudFlare did in their blog post:

> Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious.


That number does not fit with the 1:380 ratio of Tor IPs that emitted malicious requests, though. That is, unless the vast majority of all Tor traffic is routed out of only a couple of exit nodes, which does not seem consistent with the way that Tor works.

How can Cloudflare assert that 94% of all requests over Tor are malicious, when Akamai seems to be saying that less than 1% of Tor IPs contain malicious requests?


> That's just flawed reasoning all around.

Starting with a blanket blaming statement isn't that great either. Conversion rates and e-commerce sorta go together, so I would say it's fine to entangle them logically. Tor exit nodes entangle users with each other through an IP. If someone's lens is limited, they'll run the risk of blanket blaming the legitimate traffic as well.


> Tor exit nodes were far more likely to contain malicious requests

That's comparing apples to oranges though. A lot of different people send requests from tor exit nodes. That might be comparable to some corporate networks, but many IP addresses are used by only one person (or a family).

Intuition would suggest that tor traffic is more likely to be malicious than average traffic, but suppose there are 500 tor exit nodes in the world, and 1 malicious tor user: then 1 in 500 tor-exit node IPs would have sent a malicious request!


I don't know what the solution is here.

One of my sites enjoys a ridiculous number of fraudsters trying to make purchases, many - but very much not all - from the tor network.

The easy solution is to punish everyone and ban tor exit nodes from access, and woo, a significant reduction in my fraud rate.

The way I justify this to myself is that the site only accepts payment via PayPal and/or credit cards, and paying with those in itself gives up a good amount of privacy.

For sites that don't make a profit and have to use unpaid time to clean up the mess from some tor nodes, I really don't know what the solution is.

It definitely sucks for legitimate users.

Edit: one more difficulty is that I don't know if I was targeted by one or two lazy-yet-determined fraudsters who only use tor, and so make tor look worse than it is with their repeated attempts. No idea even where to begin with that one.


This is where 3D Secure truly shines; instead of completely refusing a transaction, you can request the issuing bank (= bank of the card used to pay with) to accept the liability in case of fraud (normally, it's the merchant who has to give the money back). Usually the issuing bank will then ask the customer to complete an additional challenge, e.g. a 2FA token, a code sent by SMS, or just their birthday. Some don't even require anything. To you as a merchant, though, that doesn't matter: if the person who bought your product was a fraudster, the money is still yours. Now it's the issuing bank on the hook for fraud, instead of the merchant.

The argument against 3DS is it kills conversion rates (meaning: lots of legitimate customers don't complete that extra challenge, who would otherwise have made the purchase). But for legitimate customers using TOR, I wouldn't be surprised if that number were very different :) I know I wouldn't mind whipping out a 2fa device every time I purchased something over TOR. I mean, fair play, right?

3DS is available today in Europe, Asia and Africa! :'( but not ubiquitously in the USA. Yet. It's getting there.

More info: https://en.wikipedia.org/wiki/3-D_Secure


3D Secure is a complete disaster. It encourages users to put ridiculously sensitive information like social security numbers and bank credentials into an iframe in the merchant site. This trains users to be phished.


Really depends on the bank, some of them have a decent 2fa system for 3d secure.


Some of my credit cards redirect to a page hosted by the issuer. Others require me to authorise the transaction in an app on my phone.


Social security numbers?! Who the hell implemented it like THAT?

3D Secure redirects to the bank's site (not in an iframe! a real window with a visible address bar) where you enter a one-time code from SMS!


I have worked with systems implementing 3D Secure and have multiple credit cards that trigger it. I can assure you that the standard deployment for US-based banks and merchants uses iframes and in the majority of cases will ask for enough personal information to steal your identity or drain your bank account.


I assume this is a US problem because in the US card fees are high enough that banks don't need to worry too much about fraud yet. Look at European banks if you want to see reasonable 3D Secure.


Eh. 3D Secure is a protocol which can be implemented in reasonable ways. My bank has a 2FA in there.


Why is the choice of using 3DSecure up to the merchant? If 3Dsecure is enabled it shouldn't be possible to make an online purchase without going through 3D secure. Or is this just for cards that don't have 3D secure enabled yet?

For example, I have 3D Secure enabled, and all my online purchases (in my own country) always require me to type in the password on the bank's gateway site. If I see a site that allows me to make a purchase without going through 3D Secure I'd be worried and probably call my bank.


Because the liability is never with the customer in the first place. It's fine that you're worried, but, in reality, you don't really have to be; card gets stolen? See a charge you don't recognize? Not your problem! Call your bank, get your money back.

The one who has to fear fraudsters is not the card owner; it's the merchant! Without 3D Secure, they have to pay the money back! + an extra fine, for good measure. And, of course, the product / service is already gone.


That conversion rate tanking is real though, I don't buy anything that requires it, and we know that extra steps means fewer purchases.


> The way I justify this to myself is that the site only accepts payment via PayPal and/or credit cards, and paying with those in itself gives up a good amount of privacy.

Prepaid credit cards are essentially anonymous, as far as I know.


>Prepaid credit cards are essentially anonymous, as far as I know.

My understanding is that you can only buy prepaid cards after showing ID in many jurisdictions, and many other places require you to register them with ID in order to use the cards.


What jurisdictions are those? You don't need ID to purchase or register Visa, MasterCard or American Express prepaid/gift cards in the US. I've bought all of those with cash, and registered them all with nothing more than the card number and CVV code.


In Canada, the prepaid cards I got once required me to submit a government ID. These are the ones from Canada Post in case anyone is curious.


I have never bought them from Canada Post, but I have bought them countless other times in Canada and never showed ID for any of them.

However, in order to make certain purchases online, it's often been the case that you need to "Register" the card with the provider, and supply details that would match billing information for the selling party. I can't see any reason why you couldn't fudge that, although shipping information for real goods would leak information.


I am not aware of any places that require ID for purchasing prepaid cards, let alone prepaid burner phones.

One of the ways scammers get "cash" is to buy Amex gift cards with stolen credit cards. They then use those Amex gift cards to buy more Amex gift cards until they feel confident the trail is murky enough. Or they use a combination of store gift cards to buy Amex/Visa gift cards.


Bitcoin thieves and malware scammers buy gift cards and prepaid cards on Rakuten with bitcoin to convert their gains to non-Internet-funny-money. Ship to the foreclosed house at the end of block. Done without anyone being the wiser.


You could offer Tor users only the option to pay with Bitcoin. No chargebacks and less loss of privacy.


Tor can't possibly be the only signal of fraudulent activity though? It may be one that has an easy "solution" however, but one that's easily circumvented (one of the many free VPN services out there).


>Tor can't possibly be the only signal of fraudulent activity though?

Yes and no.

Yes: many fraudsters are ridiculously lazy, and fraud rates go down when the tor block is in place.

No: plenty of fraudsters access from elsewhere (and I put them through a separate fraud-detection-SaaS)


According to Cloudflare, the signal isn't "TOR" itself; rather, the IPs used by TOR exit nodes get banned because they are shared and abused enough to get tagged as a source of trouble. I personally don't buy this, since I know of IPs they don't block that would get enough abusive traffic to merit the same treatment, yet don't get the treatment TOR's IPs get.


Since you think CloudFlare is lying, what do you think the truth is?


Not sure, though given enough dialog on the topic, I believe that a better solution will be found or it'll become clear that Cloudflare is not responding to the issue.

Simple answer would be that the original analysis is flawed: they've forgotten that they wrote a script to block TOR exit IPs; TOR intentionally provides a list of these IPs to the public.

Might be worth noting that TOR users are often the target of National Security Letters, that Cloudflare based on their own report received National Security Letters, and as such, would be unable to say if those letters impacted code on the topic.


What kind of "better solution" do you envision? Right now you seem to be insisting that there must be one, which I must say does not make a very compelling case that one actually exists or is possible.


Given Cloudflare appears to have received National Security Letters, it's possible there are no answers.

That said, based on what I know, the answer is to whitelist the TOR IPs, give TOR users a global session that the user has the option to opt into (it likely makes sense for TOR to publish what the impact of this is and for Cloudflare to link to it from that page), and always let users know a global session is set, in case the user believes that by using TOR they reset the session; resetting it via Cloudflare would be meaningless. The general gist, though, is that humans are not bots and don't behave as bots, and Cloudflare treats every request from an IP the same, which is a poor way to block bots.


> That said, based on what I know, the answer is to whitelist the TOR IPs

That's an option CloudFlare is offering to their customers now.

> give TOR users a global session that the user has the option to opt into (it likely makes sense for TOR to publish what the impact of this is and for Cloudflare to link to it from that page), and always let users know a global session is set, in case the user believes that by using TOR they reset the session

Has this been researched or suggested by the Tor project at all? I think it's fairly dangerous to suggest CloudFlare starts offering something like this before it has been vetted.


> I don't know what the solution is here.

Should the ToR network be doing some amount of self policing? (Can it?)

I know this might be against some of its principles, but it seems that ToR is there to create privacy, not to be used for criminal activity. And yes, criminal activity differs based on jurisdiction, but I think fraud is generally something everyone agrees should not be allowed.


If this were possible, it would be a serious flaw in Tor's design.


How do you classify traffic as from the Tor network? I am interested in how this is done for a real e-commerce site in production.


We get a list of Tor exit nodes and regex our logs. We were under attack from Tor for about 6 weeks. We didn't block Tor as we are pro-Tor/privacy etc. The attacker was basically running very crappy pen testing kiddie script stuff against us. We just logged everything and enjoyed the free Pen test.


The list of Tor exit nodes is public and updated live.

https://check.torproject.org/exit-addresses
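
For anyone wanting to try the log check described above, here is a minimal sketch in Python. It assumes the exit list contains `ExitAddress <ip> <date> <time>` lines and that each access-log line starts with the client IP (as in the common log format). The sample data, the placeholder addresses, and the function names are all made up for illustration; a real setup would download the list and read actual logs.

```python
import ipaddress

# Sample text in the shape of the exit-addresses file. The addresses are
# documentation-range placeholders, not real exit nodes.
SAMPLE_EXIT_LIST = """\
ExitNode 0011223344556677889900112233445566778899
Published 2016-04-01 10:00:00
LastStatus 2016-04-01 11:00:00
ExitAddress 192.0.2.10 2016-04-01 11:05:00
ExitNode 99887766554433221100998877665544332211AA
Published 2016-04-01 09:00:00
LastStatus 2016-04-01 11:00:00
ExitAddress 198.51.100.7 2016-04-01 11:07:00
"""

# Hypothetical access-log lines; the first field is the client IP.
SAMPLE_LOG = """\
192.0.2.10 - - [01/Apr/2016:12:00:01 +0000] "GET /wp-login.php HTTP/1.1" 404 162
203.0.113.5 - - [01/Apr/2016:12:00:02 +0000] "GET / HTTP/1.1" 200 5120
198.51.100.7 - - [01/Apr/2016:12:00:03 +0000] "GET /phpmyadmin/ HTTP/1.1" 404 162
"""


def parse_exit_addresses(text):
    """Return the set of IPs found on ExitAddress lines."""
    exits = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "ExitAddress":
            exits.add(ipaddress.ip_address(parts[1]))
    return exits


def tor_log_lines(log_text, exits):
    """Return the log lines whose client IP is a known exit address."""
    hits = []
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            client = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP; skip the line
        if client in exits:
            hits.append(line)
    return hits


exits = parse_exit_addresses(SAMPLE_EXIT_LIST)
tor_hits = tor_log_lines(SAMPLE_LOG, exits)
for line in tor_hits:
    print(line)
```

Exit nodes come and go, so matching old logs against today's list will miss some traffic and mislabel some; the timestamps on the `ExitAddress` lines are there for that reason.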


I feel like Tor is burying their head in the sand here.

I think Tor is great, but I don't find it at all surprising or unlikely that 94% of traffic (not users) is malicious (spam, vulnerability scanning, scraping, etc) because it's likely that malicious traffic is automated while legitimate traffic is not.

That said, I'd also like to hear more about CloudFlare's methodology.


Exchanged comments with Cloudflare's CEO on the topic and in my opinion it appears that they simply don't understand that their view of the situation is skewed.

Here's hoping that given they truly do appear to care about TOR users that they'll revisit the situation and find a better solution.

Here's a link to Cloudflare's blog post and the related comments on HN:

https://news.ycombinator.com/item?id=11388560


Of course CloudFlare's "view" of the situation is "skewed".

So's the Tor project's.

So's the view of the website operators receiving this traffic.

Cloudflare's post acknowledges the fact that there are at least three major points of view on this problem. The Tor project, by contrast, is increasingly striking me as taking on a petulant tone by refusing to acknowledge that and acting (implicitly if nothing else) as if their view is the only one.

To be honest, the core problem here is not Cloudflare. The core problem is that their customers don't really want Tor traffic. Cloudflare is, to my eye, bending over backwards for Tor compared to what I'd expect from a corporation, however it may feel to Tor. I would suggest the Tor project and its users, however annoyed they may be at their day-to-day experience, are ill-advised to take a petulant tone here, lest Cloudflare indeed give their customers the ability to whitelist and blacklist Tor as a whole... because I completely agree with Cloudflare that effectively nobody is going to whitelist it.


Wrong. Google is a Cloudflare partner, so it is the opposite: at the very least, one company (Google) loves the fact that Cloudflare is doing what they're doing; my estimates peg the value of the data in the hundreds of millions, based on what Google already pays to get the same type of data from users.

Second, volume counts do not equal session counts, and I find it very hard to believe that a non-abusive human session looks the same as an abusive session. If true, then it's Cloudflare that's abusing users and exploiting the situation, not TOR.

Also, TOR users are not blocked, but flagged to provide data to Google. Also, Cloudflare's clients likely don't even know about the issue, since according to Cloudflare they're flagging IPs, not TOR.


Did you intend this as a reply to something else? I can't even connect your comment to what I said. It starts with "wrong" but doesn't seem to address anything I said.


Please bullet/number your concerns as self-contained statements and I'll explicitly reference them. And yes, my response is to your comment, though I do see how it's possible it's ambiguous how I'm addressing your concerns. Thanks for the comment.


It's not his job to format his comment to make it convenient for you to rebut. I agree that it's not at all clear how your reply addresses the original comment. Perhaps you could fix that by quoting the portions of the original comment you're replying to, rather than requiring line numbers.


Correct me if I am wrong, but I should format my comment because he refuses to do so, right? He refused to respond to my response to his, even in part, yet I should spend more time on it, right? My reading of his comment is that he didn't read my comment, and if he can't read, there's nothing I am able to do.


Look, I'm just letting you know that from a third party perspective, you come across as the unreasonable one, both in this sub-thread, and throughout most of these comments. It seems like people are bringing up good points and you're refusing to engage, perhaps because you're so invested in your own point of view that you're unable to recognize the legitimacy of others'.

To your specific question, the parent comment's points were well reasoned, and your response read as completely orthogonal. If it wasn't, then yes, the onus is on you to demonstrate the relevance.


Plain English please; I have no idea what you mean beyond your meta ranting about how I comment on a topic, when you're not even expressing a logical response to my comment. More to the point, if you have something to express about the topic, express it; if you want to be meta about comments, post an "Ask HN:", state a position in plain English, and post a link to it here. Cheers!


You know, I worried a bit that I was overshooting with my choice of the word "petulant". Thank you for putting that worry to rest.

You are not doing the Tor project any favors here.


Cloudflare's purpose is to make money. If anyone thinks they are here to help make the world better, that's a naive view. Tor's purpose is to help people access data that may be inaccessible to them without it and to help guard against invasion of privacy. While those things can be used for illicit purposes (as shown by the amount of rogue traffic on Tor exit nodes) the return on quality of life for the whole is greatly improved. Tor literally makes the world a better place, regardless of the fact malicious traffic also comes out of it.

Cloudflare does not make the world better for everyone, but it does a great job of making the world better for those who have access to resources, such as dedicated IPs and venture capitalists.

Capitalism has failed, and it's starting to force that fail onto the Internet in a big way. We need to be smarter about how we approach building trusted infrastructure. All that starts with how we approach building infrastructure companies.

If it's an infrastructure service model and it ain't bootstrapped and sustainable, don't use it.


That's a sad statement. Cloudflare is one of my role models for publicity and profit tactics. Everyone can use Cloudflare for free. Companies are their only customers.


I hear that statement may make you sad, but statements themselves hold no emotions, unless you are speaking for the feelings of those who made the statement. I don't allow others to speak for my feelings and I try to not speak for theirs.

The hard fact is that capitalism is a complex type of game theory, with the objective of winning and making more money. If there are those that think that building the best infrastructure we can for all is dependent on building companies that make VCs and limited partnerships even more money, I will do whatever is in my power to dispel those beliefs.

I do this because I believe our future is dependent on it, not because I'm sad about it. If anything, I don't trust the current process.


Cloudflare literally and knowingly sucks the life out of people by consuming their time and forcing them to perform labor for free to the benefit of their business partners to the tune of hundreds of millions of dollars.

Unless they attempt to change, sorry, but they are a bad company, potentially evil if the data is being used to dox TOR users via an NSL.


The only correspondence you had with the CloudFlare CEO in that thread was:

> eastdakota: I work for CloudFlare. We don't get anything from Google for using reCAPTCHA.

I think you might have gotten a username confused.


Why do you believe I'm mistaken? eastdakota, is the Cloudflare CEO's account.


It's Cloudflare, not Cloudflair.


Cloudflair is how their employees express themselves.


Thanks, fixed the typo!


:)

Actually, I got it wrong as well, because it's really CloudFlare.


Also Tor, not TOR...


It was TOR, now it's Tor. I like TOR better, since it makes it clear the letters have a meaning.


> 5) A report by CloudFlare competitor Akamai found that the percentage of legitimate e-commerce traffic originating from Tor IP addresses is nearly identical to that originating from the Internet at large. (Specifically, Akamai found that the "conversion rate" of Tor IP addresses clicking on ads and performing commercial activity was "virtually equal" to that of non-Tor IP addresses).

This point seems rather odd. I'm not following the connection between a large percentage of Tor requests being malicious and the fact that Tor users have almost the same conversion rate. Malicious requests are coming from botnets and/or fraudsters. They're, for the most part, not in the subset of Tor users which click ads or do anything else that would be tracked as part of a site's conversion rate. What's funny about this is that the linked report even confirms that requests from exit nodes are far more likely to be malicious:

    Tor exit nodes were far more likely to contain malicious requests:
      • 1:11,500 non-Tor IPs contained malicious requests
      • 1:380 Tor exit nodes contained malicious requests

I'm a huge supporter of Tor and have been running a relay node for years, but it seems their stance on this topic is quite fundamentalist and they chose to ignore any arguments or facts that they don't like while basically grasping at straws in their counterarguments.

It's okay to be concerned about CloudFlare having such a huge market share. They're a huge target for nation states and others alike. Global passive¹ adversaries are a problem for things like Tor, and they might very well be forced to become one at some point. It's essential to have more competition in this area, and that's a fair argument to make. However, with regards to how they're handling Tor, I don't think there's anything wrong with what they're doing, and the explanations presented in their blog post seemed sound to me.

¹ Or, rather, possibly an active adversary too?
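
To put the two Akamai ratios quoted above side by side, here is a quick bit of arithmetic in Python. It only restates the quoted figures (1:11,500 and 1:380); note these are per-IP rates, so they say nothing directly about what share of requests is malicious.

```python
from fractions import Fraction

# Figures quoted from the Akamai report above (per IP, not per request).
non_tor_rate = Fraction(1, 11500)   # non-Tor IPs containing malicious requests
tor_exit_rate = Fraction(1, 380)    # Tor exit nodes containing malicious requests

# How many times more likely a Tor exit IP is to show malicious requests.
relative = tor_exit_rate / non_tor_rate

print(f"non-Tor IPs:    {float(non_tor_rate):.4%}")
print(f"Tor exit nodes: {float(tor_exit_rate):.4%}")
print(f"relative rate:  {float(relative):.1f}x")
```

So by that report's own numbers a Tor exit IP is roughly 30 times more likely to be a source of malicious requests, while the absolute per-IP rate is still well under 1%.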


Is this a wording issue?

> Akamai found that the "conversion rate" of Tor IP addresses clicking on ads and performing commercial activity was "virtually equal" to that of non-Tor IP addresses

So when seeing actual web traffic things are identical. That only measures real web traffic. It doesn't measure all the SSH attacks, SPAM being sent, possibly checking for vulnerabilities and unpatched software/etc.


CloudFlare doesn't do anything other than web traffic. It's basically an nginx reverse proxy on steroids.


Oh, right. Well you still have automated scans for vulnerable servers on HTTP that don't try to really 'access' the website.


> I'm not following the connection between a large percentage of Tor requests being malicious and the fact that Tor users have almost the same conversion rate.

The point is that blocking or de facto blocking an IP address which is shared by many different users just because one is malicious is costing CloudFlare's customers money.


The reason some e-commerce sites are blocking Tor is not because of low conversion rates (that would be silly), but because of fraud (and attacks) coming from Tor users. Those two numbers are not related, and it has nothing to do with why CloudFlare shows captchas for Tor users. The argument doesn't address the fact that a large percentage of Tor traffic is malicious at all. It's a straw man argument, really.

On top of that, it's not as easy as "blocking some legitimate users = losing money". The cost of fraud caused by Tor users might very well exceed the additional revenue Tor users generate - or not.


The point of the argument is that preventing fraud using IP blocking is costing you money that you could have in your pocket if you would instead prevent fraud using signature detection or some other method.


With signature detection, you're referring to browser fingerprinting? Because that's not going to work for Tor users (or, more specifically, TBB users).


I'm talking about, people who commit credit card fraud have a credit card whose billing address is in New York City but try to get the product shipped to Nigeria.


That's only one type of credit card fraud. There's fraud against digital goods and gift cards. There are even carders who test credit cards online with real information before coding them to a magnetic strip or selling them off. Heuristic-based detection is very limited if there is no IP reputation or JavaScript to do fingerprinting and other tricks to unmask the user.

Ecommerce knows full well the cost of not supporting TOR, just like we know the full cost when they deprecate browsers like IE9. It's the opportunity cost of building out more advanced systems to detect fraud compared to just blanket banning open relays and TOR. I didn't even factor in the hardware cost when ecommerce sites get hit with a bot running through a TOR endpoint or open relay.

Tor IPs are not the only ones that get blocked. Ecommerce sites frequently blacklist Azure, AWS and other hosting providers. They have the data, they crunch it, and they know full well who they will affect and what the cost is. There is always collateral damage.

You don't have the right to use any ecommerce site while using TOR much like you don't have the right to walk into a bank with a ski mask on and get service.


> That's only one type of credit card fraud.

It's also only one type of bad act. Yet no others exist for which IP blacklisting is the only possible solution.

> There's fraud against digital goods and gift cards.

So treat gift cards as passthrough. Don't ship something to Nigeria if it was paid for with a gift card purchased with a credit card with a billing address in NYC.

And the idea that any meaningful number of people are going to use stolen credit cards to buy digital goods instead of just torrenting them is ridiculous.

> You don't have the right to use any ecommerce site while using TOR much like you don't have the right to walk into a bank with a ski mask on and get service.

Yet I can use an ATM or online banking while wearing a ski mask or using coffee house wifi with no trouble at all for either me or the bank, because we know how to solve that problem and it doesn't require IP blacklisting.


Original Cloudflare blog post that this is a response to: https://blog.cloudflare.com/the-trouble-with-tor/


That post also suggests two things that Tor could do to improve the situation for their users:

* Support a stronger hashing algorithm to make it possible for CloudFlare to make .onion versions of all of their customers' sites.

* Implement "client-side" CAPTCHAs.

I don't know how feasible either of these is, but it seems strange (evasive?) that Tor Project's blog post does not discuss either.


I believe Tor is working on a new version of hidden services which would address the first concern.


This is a tough situation. I don't know about 94% of TOR traffic being fraudulent but I'm sure it's high. But I'm one of the legit users that gets taken out by blacklisting. I use a VPN service pretty regularly and it makes accessing my Cloudflare account and sites using it incredibly annoying.


> I don't know about 94% of TOR traffic being fraudulent but I'm sure it's high.

I was curious and ran a quick check on my servers. 5 servers, about 300 domains, checked my logs going back 1 week. I could find just ONE legitimate session. Everything else was something trying to break WordPress or PHPMyAdmin or something else. I'd love to support Tor but I feel like I can't fight this fight.


While still higher than regular ISP addresses, VPN abuse should be significantly lower than Tor abuse on account of these services costing real money. And nearly all of them not accepting anonymous forms of payment.

I expect my VPN use to keep me hidden amongst a crowd from various internet companies trying to track and profile me; but I don't expect for a minute that it offers protection against criminal activities (and I have no intention of engaging in such things.) Any intelligent criminal would likely feel the same way and not use a service with their real billing information to commit crimes. Especially with Tor available.

Yet despite this, I am constantly hit by CloudFlare captchas on sites that are very clearly not being hit by DoS traffic. Further, it seems site operators don't even realize this is happening. When I reported the captcha issue to Zotac's Twitter account, they had no idea CloudFlare was doing this.

Google is also a huge offender with the captchas. I'm this close to switching to Duck Duck Go. Facebook is too, but I'm fine with not ever going there.


I assume the actual claim is that 94% of fraudulent traffic comes via tor. Which is quite a different claim.

There's a pretty obvious calculus. If you approach the question as 94% of the fraudulent traffic comes from the x% of total traffic that comes via tor... deciding to block tor exit nodes seems rational (particularly if x% is particularly small... say <1%).


> I assume the actual claim is that 94% of fraudulent traffic comes via tor. Which is quite a different claim.

The claim is thus:

> Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious. That doesn’t mean they are visiting controversial content, but instead that they are automated requests designed to harm our customers.

Meaning that for any given request coming from Tor, the odds are heavily in favor of it being malicious.


Yeah, 94% seems very high, and although I guess it could be possible, I can't imagine it is quite that high. Cloudflare is zeroed-in on Tor users, but am I crazy for thinking that there are several other ways bad actors could create issues not using Tor?

It seems like they are trying to come up with an amicable solution, but for the moment, legitimate Tor and VPN service users suffer. If Cloudflare really refuses to acknowledge that their view may be a tad skewed, I don't see how this is readily or easily remedied.


A large percentage of malicious traffic is most likely generated by bots, which are quite naturally better at creating a lot of requests.
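
A back-of-the-envelope sketch of that effect, in Python. Every number here is invented purely for illustration (nothing comes from CloudFlare's data): it just shows how a small fraction of automated clients can account for the overwhelming majority of requests.

```python
def bot_shares(n_humans, req_per_human, n_bots, req_per_bot):
    """Return (bot share of clients, bot share of requests)."""
    human_requests = n_humans * req_per_human
    bot_requests = n_bots * req_per_bot
    share_of_clients = n_bots / (n_humans + n_bots)
    share_of_requests = bot_requests / (human_requests + bot_requests)
    return share_of_clients, share_of_requests


# Hypothetical mix: 980 people making 30 requests each,
# 20 bots making 20,000 requests each.
client_share, req_share = bot_shares(980, 30, 20, 20_000)
print(f"bots are {client_share:.1%} of clients")
print(f"bots send {req_share:.1%} of requests")
```

With those made-up inputs, 2% of clients generate about 93% of requests, which is why a per-request figure and a per-user figure can look wildly different without either being wrong.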


The really questionable thing CloudFlare seems to be doing is that they captcha traffic depending on the overall reputation of only the source IP rather than whether the source IP is attacking that specific site or even whether the site is under attack.

What they should do instead is this:

1. If the server is not overloaded, do not captcha any traffic at all

2. If the server starts being overloaded, only captcha traffic from IPs that have been detected as attacking THAT specific site

3. If the server is still overwhelmed, only then switch to captchaing all IPs with "bad reputation"

Most websites are probably almost never under attack, so this would make encountering CloudFlare captcha extremely rare in the wild while still providing DDOS protection.

They could even only do this for Tor exit nodes and other IPs that are known to be used by lots of people.

If a site is being DDOSsed a lot and the slower start up of this technique is a problem, then they can revert for those sites to the current behavior of using reputation.
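
The escalation proposed above can be sketched as a small decision function. This is only an illustration of the three steps plus the reputation-only fallback, not how CloudFlare works; the `Load` levels, the `Request` fields, and the `reputation_only` flag are hypothetical names, and treating step 3 as "site attackers plus bad-reputation IPs" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Load(Enum):
    NORMAL = 0        # step 1: server is not overloaded
    OVERLOADED = 1    # step 2: server starts being overloaded
    OVERWHELMED = 2   # step 3: still overwhelmed after step 2


@dataclass
class Request:
    ip: str
    attacking_this_site: bool     # detected attacking THIS specific site
    bad_global_reputation: bool   # flagged by network-wide IP reputation


def should_captcha(req, load, reputation_only=False):
    """Decide whether to show a captcha under the tiered policy."""
    if reputation_only:
        # Fallback for sites attacked so often that slow start-up hurts:
        # behave like the current reputation-based system.
        return req.bad_global_reputation
    if load is Load.NORMAL:
        return False
    if load is Load.OVERLOADED:
        return req.attacking_this_site
    # OVERWHELMED: assumed to be cumulative with step 2.
    return req.attacking_this_site or req.bad_global_reputation


tor_reader = Request("192.0.2.10", attacking_this_site=False,
                     bad_global_reputation=True)
for load in Load:
    print(load.name, should_captcha(tor_reader, load))
```

Under this sketch a well-behaved visitor on a shared exit IP only sees a captcha once the site is overwhelmed, or when the site has opted into the reputation-only mode.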


I find Cloudflare's argument analogous to that of cash: I'm sure some huge percentage of all illegal transactions are with cash, but that does not mean the solution is to ban cash... though some would probably disagree.


It's different in that I can accept a cash payment from you without having to worry that your cash will somehow harm me. Not so with a request coming from an IP address from which malicious requests are known to originate.

So with cash, people can disagree on its benefits and drawbacks to society as a whole, but as long as it's legal, there's little reason for me, individually, to not use it, regardless of my opinion on its value to society. With IP addresses, I might conclude that I'm much better off blocking some even if I lament the consequences of this for causes that I approve of.


> without having to worry that your cash will somehow harm me.

Really? What if it is counterfeit? What if it has some kind of poison, germ, or disease on it?

I get your general point I think, but cash can definitely harm you.


But to be fair, in Cloudflare's case they aren't "banning" access, they are putting it behind a captcha.

Still far from ideal, but it's not banning.


In fact, paying with cash sometimes requires a sort of captcha equivalent: the pens[1] that are used to detect counterfeit bills.

Depending on the area you're in, when you pay with a $20 bill (the most commonly counterfeited), the cashier will mark the note with a pen before accepting it.

The business is simply profiling the transaction - a $20 cash bill brings with it a higher risk of fraud - and behaving accordingly. I can't think of a way to argue that it's unfair to the customer that the business does this.

An astute observer will point out that the captcha presented by CloudFlare is an order of magnitude (or two) less convenient than the counterfeit detection pens, but I would argue that this doesn't support a position that CloudFlare is wrong to do what they do.

1: https://en.wikipedia.org/wiki/Counterfeit_banknote_detection...


When every .html resource requested is met with a captcha, access is effectively banned.


Well that's another problem. When you fill out the captcha you are given a cookie that can allow cloudflare to let you through next time.

If you are blocking that cookie for privacy reasons (which is not a bad thing!), then cloudflare has no way to verify you again (short of doing nefarious things). It's a bit of a self inflicted problem at that point.

That's not to say that the answer is "deal with it", but that we need to find a better way. A way to verify that someone isn't a bad actor without having them take a pretty significant chunk of their time to answer captchas, or give up some of their privacy.

It's a tough problem, and i think the "proof of work" solution proposed in the original could work, but it would need participation and collaboration from "both sides" of the problem. And of course it won't happen overnight.


Any proof of work concept again is just a cookie – because the proof I present will be the same.


But you could possibly re-do the proof of work every page load without the cognitive load of multiple captchas.

It still isn't ideal for mobile or low-end clients, but it's something.
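
A minimal hashcash-style sketch of what "re-do the proof of work every page load" could look like, in Python. This is an assumption about how such a scheme might be built, not anything CloudFlare or Tor has specified: the server hands out a fresh random challenge per page load, the client searches for a nonce whose SHA-256 hash has enough leading zero bits, and the server checks it with a single hash. The function names and the 12-bit difficulty are made up for illustration.

```python
import hashlib
import os

DIFFICULTY_BITS = 12  # illustrative; a real deployment would tune this


def new_challenge():
    """Server side: a fresh random challenge for each page load."""
    return os.urandom(16)


def _has_leading_zero_bits(digest, bits):
    """True if the first `bits` bits of the digest are all zero."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def solve(challenge, bits=DIFFICULTY_BITS):
    """Client side: brute-force the smallest nonce that meets the target."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if _has_leading_zero_bits(digest, bits):
            return nonce
        nonce += 1


def verify(challenge, nonce, bits=DIFFICULTY_BITS):
    """Server side: one hash to check the client's work."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return _has_leading_zero_bits(digest, bits)


challenge = new_challenge()
nonce = solve(challenge)
print("proof accepted:", verify(challenge, nonce))
```

Because the work is bound to a per-load challenge, the proof is not a reusable cookie: a nonce found for one challenge only passes for another by chance (about 1 in 2^12 here). The server would still need to remember which challenges it issued and accepted, which is not shown.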


A huge percentage of illegal transactions may be in cash, but a huge percentage of transactions in cash are not illegal.


Which is why large cash transactions are heavily regulated and reported on. In the US, one cannot just withdraw $10k or a series of smaller transactions that add up to $10k or more without the bank reporting on that to the authorities. That's the balance that law makers decided to strike.


That's an interesting point.

Cloudflare is doing something somewhat similar. If you are deemed "possibly a bad person" then you are asked to solve a captcha. If you want to give up some of your anonymity, you can keep the cookie they give you as a token to "prove" you are a good person.

If you don't want that though, there is nothing cloudflare can do to know you aren't a bad person.

It's much less than "heavy regulation", but i can easily see how it could be both a pain and a security issue for some.

This is a shitty problem for all involved with no good solutions...


An interesting aspect is that the Bank Secrecy Act was passed in 1970 but has not been adjusted for inflation, so when the law was passed it was more like a $60k limit in today's money, and it has been encroaching on us ever since...


This is an interesting point, one which would apply to a number of topics. Nice.


Some huge percentage of all stupid arguments are made with logical fallacies, but that does not mean that any argument that uses logical fallacies is stupid. Thus the solution is not to assume that any argument using logical fallacies is wrong... though some would probably disagree.


I find the 94% figure believable (for requests, not source IP addresses), Tor is after all the obvious choice for low bandwidth DoS attacks and unwanted scraping (i.e. a few individuals will generate a large percentage of Tor-routed requests at any time).

The real issue with CF for me isn't the hassle with captchas, but the fact that CloudFlare can track users across all its sites, generate profiles and even read unencrypted traffic. It's a privacy hazard by design that makes Tor particularly attractive. But as long as Tor is used only by a small minority, it will be treated this way.


I would expect most of the malicious traffic coming out of Tor isn't using Tor browser. I wonder what the attack numbers look like for Tor browser vs not Tor browser. Cloudflare has client side checks already, which could be extended to check whether the browser is Tor browser, and if so, don't block it.


I understand what CloudFlare is saying, but I still think that the benefits of allowing legitimate TOR users to access websites freely (without cumbersome captchas) outweigh the troubles malicious users might cause. Public computers such as those in libraries are also often used to do reprehensible things, but still, we understand the benefits of having them.

It is also worrying that CloudFlare has this much power. One of the greatest things about the internet is the openness of the platform and the non existence of gate keepers.

Also, here is an annotated version of the TOR paper for those who want to read more about it http://fermatslibrary.com/s/tor-the-second-generation-onion-...


Payments originating from TOR IP addresses absolutely are more likely to be fraudulent. Anyone running an online business could tell you that.


Hassling Tor users shouldn't become the Internet's default. If you're having trouble, consider informing Tor users checking out that you won't process the payment without their providing additional information. This raises the cost to carders a lot more than needing to rent a SOCKS proxy in a residential area.


It's not worth the development costs and extra verification costs to try to weed a small number of legitimate purchases from a sea of illegitimate ones though.


It isn't that there is a sea of illegitimate traffic so much that their methodology is incredibly flawed, and they have little financial incentive to fix it. As a society, we have to give them that incentive, or we will lose access to a shared resource.

We can have our Internet heavily censored or heavily censored with a ray of sunshine. Is that worth the engineering costs?


I wouldn't even consider doing any payments through Tor. All I want is to read fucking blog posts without CAPTCHAs!


> Users are either blocked outright with CAPTCHA server failure messages, or prevented from reaching websites with a long (and sometimes endless) loop of CAPTCHAs

Is it really a loop or are users just failing to solve the CAPTCHAs? A loop would be obnoxious: Just tell the user they are blocked; giving them more than 2 or infinite CAPTCHAs is a passive aggressive way to communicate.


My best guess is that reCAPTCHA doesn't just have two states (pass/fail), but rather something like a confidence factor and a threshold you have to reach to continue to the site (which might depend on your reputation).
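A toy model of that guess, to show how a reputation-dependent threshold could look like an endless loop from the user's side. Everything here (the function name, the threshold formula, the gain per solved puzzle) is invented for illustration; it is not how reCAPTCHA is known to work.

```python
def challenges_needed(ip_reputation, gain_per_solve=0.15, max_challenges=10):
    """Return how many CAPTCHAs a visitor must solve, or None if the cap is hit.

    ip_reputation is in [0, 1], where 1 is a spotless address.
    The threshold formula is a made-up assumption for this sketch.
    """
    # Worse reputation means a higher confidence bar to clear.
    threshold = 0.5 + 0.45 * (1.0 - ip_reputation)
    for solved in range(1, max_challenges + 1):
        confidence = solved * gain_per_solve
        if confidence >= threshold:
            return solved
    return None  # from the user's side this looks like a loop


print(challenges_needed(0.9))                    # a well-regarded address
print(challenges_needed(0.0, max_challenges=5))  # a badly-regarded exit node
```

With these assumed parameters, a visitor from a low-reputation address can solve every puzzle correctly and still never reach the threshold before the cap.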


This is a terrible reply; it basically says, "It's all your fault, we're all good over here."

Then, either because they legitimately can't understand the problem (which would be scary) or because they're being stubborn, they fail to address CloudFlare's suggestions for resolving the issues.


The trouble with CloudFlare is that they receive a disproportionate amount of attention on Hacker News. Sometimes HN feels like an extension of their marketing machine. I'm not so sure that every single blog post of theirs needs to be an item on HN. Anyway, that's my $.02.


Do you use Tor?


I have a question that I'm hoping will spur some discussion and maybe I can learn some stuff.

"Is anonymity in Tor incompatible with low-latency?"

I ask this having read this: http://freehaven.net/anonbib/cache/pets13-flow-fingerprints....

I suspect that countermeasures to defeat deanonymization all have a negative impact on latency (e.g. inserting extra packets, pausing between sends).

If the answer to my question is yes, then maybe the best thing the Tor project can do is abandon its push for low latency, and instead focus on anonymity. If Tor were a much higher-latency network, attackers would probably find it less interesting.


To me personally all of this just seems like fluff. I can't be the only one that feels this way.

I don't want to 'prove I'm a human' to view your crappy site. I'll go and look at the other bits of the Internet instead.

As an individual browsing, the only contact I have with CloudFlare is a bouncer telling me 'no shoes no entry'.

Your entire company to me feels like a pointless gatekeeper because of these shenanigans (on and off of Tor).

To be perfectly clear - CloudFlare, as a brand, is tainted to me, and I expect to many others.

Fundamentally I don't think CloudFlare cares because their customers are not the viewers of websites - and if the viewers of websites come to think of CloudFlare as toxic - it still doesn't matter to them directly.


That post doesn't really offer any solutions.

It would be interesting to find out how CF came to the 94% figure but a lot of the other claims made are not countered and presumably valid.

I doubt CF's (paying) customers are particularly saddened by Tor users being inconvenienced.


CloudFlare looks for ways to justify doing less. First ANY queries, then "free" HTTPS stopping at the first CloudFlare hop, and now the stuff with Tor. I don't trust CloudFlare at all, because they say they're holding a torch for the good of humanity, when actually, they're just making "cut costs" business decisions. If you want to do something because it costs less, I understand; then do that. But don't sit there and try to tell me that you're somehow doing it to make the world a better place. That, to me, is super scummy.


The main problem with CloudFlare is how dumb their "protection" is.

It doesn't make sense at all to block Tor users from just accessing read-only content, like CloudFlare does today. Forms/login pages/comment boxes etc should be protected of course, and most people wouldn't have anything against solving a captcha for logging in, but preventing people from just reading stuff anonymously/securely is borderline evil from a user experience point of view.

However, it's obviously much easier from an engineering standpoint to just block people outright.
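The proposal in this comment amounts to a policy that looks at the request method before deciding to challenge. A minimal sketch, assuming a hypothetical per-address reputation score in [0, 1] and an arbitrary threshold; none of these names come from CloudFlare:

```python
# HTTP methods that are defined as "safe" (read-only) by the HTTP specification.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def should_challenge(method: str, ip_reputation: float, threshold: float = 0.3) -> bool:
    """Challenge only state-changing requests from low-reputation addresses.

    ip_reputation and threshold are hypothetical values for this sketch.
    """
    if method.upper() in SAFE_METHODS:
        return False  # reading content is never gated in this policy
    return ip_reputation < threshold


print(should_challenge("GET", 0.0))   # False: read-only request
print(should_challenge("POST", 0.1))  # True: form submission from a poor-reputation address
```

This policy deliberately ignores abuse that only needs read requests, so it is a statement of priorities rather than a complete defense.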


This was addressed in CloudFlare's blog post:

> One suggestion has been that we treat GET requests for static content differently than we do more risky requests like POSTs. We actually already do treat more dangerous requests differently than less risky requests. The problem is Tor exit nodes often have very bad reputations due to all the malicious requests they send, and you can do a lot of harm just with GETs. Content scraping, ad click fraud, and vulnerability scanning are all threats our customers ask us to protect them from and all only take GET requests.


Hm, I didn't notice that part and I hadn't thought about ad click fraud and vulnerability scanning.

Preventing content scraping for publicly accessible content is kind of dumb and impossible in my view anyway though. For the few that care about making scraping more cumbersome the option should be there to enable the captchas, but I don't think it should be a default.

Ad click fraud is a harder problem to figure out, but should be an issue for the ad-networks to deal with and not the sites with the ads.

I'm torn about preventing vulnerability scanning. While I do it on my server using mod_security, it's also a form of "security by obscurity", which I'm not that much a fan of.

Anyway, disabling captchas completely should be a very visible option.

The protections on "dangerous GET requests" are probably only needed for a small minority of CloudFlare sites, so what should be the default setting becomes a question of what's valued most - safety for the many (making Tor usage easy and not cumbersome) or security for the few (who rely on the default settings protecting them from "niche" security issues).

I think making Tor usage easy by default would be best for the security of the web at large, but I can see why CloudFlare has different priorities.


That argument only holds true if the person operating the site has no idea at all about the HTTP standards.

* GET requests have to be idempotent.

* Security by Obscurity is not Security.

* Content Scraping is nothing you have to protect against, or should protect against – DRM just does not work.


Right, but this is about abuse and solutions for that. Obviously, blocking Tor is not going to prevent a determined attacker from scraping your site or trying some SQLi vectors. It might, however, prevent a large number of bots from scraping your site for emails, scanning for vulnerabilities, or doing click fraud. It's not a perfect solution, but those rarely exist. CloudFlare sees a lot of malicious traffic, so they probably have a better view of what works and what doesn't compared to everyone else.


I see a lot of fraud on my site as well, and I can say that there are some ISPs in Eastern Europe and Asia that are just as likely as Tor to be the origin of a malicious attack.

I don’t block them either.

Instead, I use fail2ban with a 30min ban for the IP for all my servers, and have the rest of the system hardened. Also, I add an additional delay that’s just below the timeout that browsers have for each request from an IP of that block for the next 30min.

Reduced my logs of "123.456.789.012 tried to authenticate as "root" with invalid password" from several gigabytes a day to a few kilobytes.
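The key property of this approach is that bans expire. A simplified sketch of a temporary ban list follows; it is not how fail2ban is implemented, and the class name and failure limit are assumptions made for illustration (only the 30-minute duration comes from the description above).

```python
import time

BAN_SECONDS = 30 * 60  # the 30-minute ban described above


class TempBanList:
    """Ban an IP for a fixed time after too many failed attempts."""

    def __init__(self, max_failures=5, ban_seconds=BAN_SECONDS, clock=time.monotonic):
        self.max_failures = max_failures  # hypothetical limit
        self.ban_seconds = ban_seconds
        self.clock = clock                # injectable so the logic can be tested
        self.failures = {}                # ip -> failures since the last ban
        self.banned_until = {}            # ip -> time at which the ban expires

    def record_failure(self, ip):
        self.failures[ip] = self.failures.get(ip, 0) + 1
        if self.failures[ip] >= self.max_failures:
            self.banned_until[ip] = self.clock() + self.ban_seconds
            self.failures[ip] = 0

    def is_banned(self, ip):
        expires = self.banned_until.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self.banned_until[ip]  # ban expired: the address gets a fresh start
            return False
        return True
```

Because the ban lapses on its own, a legitimate user who later shares the same address is locked out for at most half an hour.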

________________________

Sure, I don’t host a large site, or get much traffic, but saying "Tor is the only issue" or even "Tor is the largest issue" isn’t true.

And there are solutions for these cases. Solutions which allow legitimate users to continue using the services.


But Tor is not specifically treated differently... their exit IPs cross a threshold and CAPTCHAs are applied, just like with your fail2ban solution.


There is a difference: They ban each IP from all their services. And eternally, not just for 30min.


Have you ever had problems with malicious botnets/spam targeting your site and they are all behind TOR? It's not really that simple.

If an IP, or IP range, behaves badly - goodbye. That is Tor's problem to solve, not the web servers' and not CloudFlare's.


That's my whole point.

Feel free to require a captcha for the login form etc and other pages where botnets/spam can be a problem, but don't ban people from pages where they only have read-access anyway.

It's like preventing people from reading books just because a few authors write a lot of crappy/illegal stuff.

Sure - make it a little bit harder to "become an author" (by using a captcha), but don't make it harder to read stuff.


How do you stop people from scraping your site? Databrokers frequently will mine social sites to build profiles on people. There are legitimate reasons to block Tor for read-only content.


It is 2016. The web is public. People scrape sites.

Don't show them things they shouldn't see. Don't pretend you're going to stop 80legs from scraping you by blocking Tor.


Scrapers visit many unique URLs. Such request patterns should be fairly easily distinguishable from those of the rest of the visitors. To do it on a large scale in realtime, something like a LogLog[1] counter could be used.

[1] https://en.wikipedia.org/wiki/Flajolet%E2%80%93Martin_algori...
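As a rough sketch of what such a counter looks like, here is a small HyperLogLog, a later refinement of the Flajolet-Martin / LogLog idea (so a swapped-in relative, not the exact algorithm linked above). The class name and the choice of SHA-1 as the hash are assumptions for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Approximate count of distinct items using 2**p small registers."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p                              # number of registers
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant, valid for m >= 128

    def add(self, item: str):
        # Take 64 bits of a hash: the first p bits pick a register,
        # the remaining bits are scanned for the leftmost 1-bit.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> float:
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            estimate = self.m * math.log(self.m / zeros)
        return estimate


hll = HyperLogLog()
for i in range(20_000):
    hll.add(f"/page/{i}")
    hll.add(f"/page/{i}")  # repeat visits do not inflate the count
print(round(hll.count()))   # an estimate near 20000, not an exact value
```

The appeal at scale is memory: the counter keeps a fixed 4,096 small registers no matter how many URLs pass through it, at the cost of a few percent of error.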


How do you id the scraper from request to request? Cookie? Bots can just discard. IP? Now we're back to the same problem. Some sort of super-cookie browser fingerprinting? Sure, maybe possible, but it's super sketchy and sacrifices anonymity which is important.


I don't think it is necessary to identify the scraper. Something simple should work: count how many unique URLs on a particular website were accessed in some period of time, and compare that to the same count for known non-bot users (normalized). If one is much bigger than the other, start requiring CAPTCHAs, but only for those requests that are not in the set of URLs visited by known non-bot users. That set could live in our probabilistic counter of requests or in a separate Bloom filter, depending on which is more efficient and more accurate.
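One possible reading of this idea, sketched with exact Python sets for clarity. At scale the sets would be replaced by a probabilistic counter or a Bloom filter; the function name, the per-client normalization, and the 5x ratio are assumptions made for illustration.

```python
def urls_to_challenge(suspect_urls, baseline_urls,
                      suspect_clients, baseline_clients, ratio=5.0):
    """Return the URLs that should require a CAPTCHA for the suspect group.

    suspect_urls / baseline_urls: URLs requested in the same time window by the
    group under suspicion and by known non-bot users.
    ratio: hypothetical factor by which the suspect group must exceed the baseline.
    """
    suspect_set, baseline_set = set(suspect_urls), set(baseline_urls)
    # Normalize the unique-URL counts by group size so the groups are comparable.
    suspect_rate = len(suspect_set) / max(suspect_clients, 1)
    baseline_rate = len(baseline_set) / max(baseline_clients, 1)
    if suspect_rate <= ratio * baseline_rate:
        return set()  # nothing unusual: no CAPTCHAs at all
    # Only gate the URLs that ordinary visitors are not reading anyway.
    return suspect_set - baseline_set


baseline = ["/", "/about", "/blog/1"]
crawl = [f"/user/{i}" for i in range(500)] + ["/"]
gated = urls_to_challenge(crawl, baseline, suspect_clients=10, baseline_clients=100)
print(len(gated), "/" in gated)
```

The design choice worth noting is the last line of the function: pages that ordinary visitors read stay open to everyone, so a human behind the same address still reaches them without a challenge.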


I think CloudFlare's security measures are insane. I use a VPN and I can tell which sites use CloudFlare because I consistently get an Error 520, where it claims the browser and CloudFlare are working, but the website is not responding. Yet I turn off the VPN and magically it works fine. That's dishonest. At least own that you are the one blocking my visit.

I'm also developing with Dwolla's API, and CloudFlare blocks all HTTP requests from my local IP, so I can't develop locally. Thanks CloudFlare.


Yeah, this is beyond just Tor - they're breaking VPN and carrier-grade NAT traffic too. Even if Tor bowed to their demands those would stay broken, and scammers would still fill out the captchas manually. But they seem very set on their chosen solution!


Someone with more knowledge of these things, let me know:

Why does Tor not "charge" per request? i.e. Using some decentralized currency, to pay for requests.

1. Make it cheap enough that users don't care, yet expensive enough to financially disincentivize spammers/malicious users.

2. It would continue to be anonymous. - cycle through wallets - all transactions would also be proxied.

3. It would incentivize proxying and exit nodes (exit nodes would effectively collect a bunch of virtual money to be resold to clients for USD).


If users don't care, spammers don't care; spammers will get a financial return.


Services like CloudFlare are responsible for more and more of the DNS. When they are poor net citizens, they are poor net citizens at a massive scale. Heuristics that end up being equivalent to "Tor users are guilty until proven innocent" can't become the default mode of the Internet. As customers, Tor users, and just people who have a stake in the Internet as a shared resource, we need to demand that they try harder than that.


Anonymity ("privacy") and security are conflicting requirements. Tor users take a legit stance, and would be served an equally legit CAPTCHA (if lucky).


You're assuming anonymity and security belong on opposite sides. They don't.


How about this solution: (yes, it's only 5% serious)

From every publicly available internet connection, do something that appears malicious until CloudFlare's servers annoy everyone. At that point they'll be forced to find a new solution.

This only occurred to me because I get their captchas on public wifi in Starbucks and other public wifi in Japan.


I'm getting

"Attackers might be trying to steal your information from blog.torproject.org (for example, passwords, messages, or credit cards). NET::ERR_CERT_AUTHORITY_INVALID"

When trying to visit this blog post.


>> Laaw: "But I don't want end to end encryption"

Sorry, but I thought you didn't want encryption. Bit puzzled, just click ignore error to fix the issue.

Clearly this advice is based on you not wanting end-to-end encryption; heads up, NSA flags users that visit Tor's website, though clearly, you've done nothing wrong.

(Yes, I'm making a point, hope it's clear.)


Since they use HSTS, I literally can't click "ignore" or I would have. I, and (I assume) anyone else who uses the latest version of Chrome cannot access this content right now. Or it might just be me, but I don't know what the solution is.

You also missed the point if you think what I said included the words "all the time" in the other thread we were talking in.


I'm on Tor and able to see the file, no idea why you're getting that error, or I'd try to help.

Assume you know this, but Google has a cached version as text if you Google...

[cache:http...]

^^ where you remove the open/close brackets and insert the full URL after "cache:"


I'll try that, thanks.


CloudFlare uses a flawed algorithm that penalizes developing countries and anyone who uses 1 IP address for many users. And that means that it censors Tor users and impedes human rights.


I dislike CloudFlare adoption. More and more I come to sites and need to wait 5 seconds, caused by their DDoS protection. Such things make the net more and more awful.


The GET solution seems too lightly waved off considering that 90% of Tor requests will be nearly identical to those from trusted IP addresses.


The trouble with CloudFlare: the lawyers of the internet. Making money on other people's problems but not really solving anything.


Maybe I either missed this or forgot, but what percentage of overall internet traffic handled by Cloudflare is deemed malicious?


In fact, I'd like to see CloudFlare segment out other populations and provide similar statistics. With carrier-grade NAT, corporate proxies, VPNs, ... Surely Tor is not the only segment that behaves similarly. As I see it, CloudFlare is taking one group and applying stereotypes to justify a draconian technique. I suspect they may be doing the same with other undeserving groups. Seems to be the way the world goes, when money gets involved.


We are free speech advocates, and yet we had to make a cron that downloads and adds Tor IPs to an ipset, due to botnets.
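A sketch of the kind of script such a cron job might run: turn a downloaded list of exit addresses into lines in the format that `ipset restore` reads. The set name `tor-exits` is hypothetical, and the download step is left out on purpose; the function only converts text that has already been fetched.

```python
import ipaddress


def ipset_restore_lines(exit_list_text: str, set_name: str = "tor-exits"):
    """Convert a newline-separated list of IPs into `ipset restore` input lines.

    set_name is a hypothetical name chosen for this sketch.
    """
    lines = [f"create {set_name} hash:ip"]
    for raw in exit_list_text.splitlines():
        raw = raw.strip()
        if not raw or raw.startswith("#"):
            continue  # skip blanks and comments
        try:
            addr = ipaddress.ip_address(raw)
        except ValueError:
            continue  # never pass unvalidated text to a firewall tool
        if addr.version == 4:  # a default hash:ip set holds IPv4 addresses
            lines.append(f"add {set_name} {addr}")
    return lines


sample = "# sample list\n203.0.113.5\nnot-an-ip\n198.51.100.9\n2001:db8::1\n"
print("\n".join(ipset_restore_lines(sample)))
```

The addresses in the sample are documentation-range placeholders, not real exit nodes.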


> the site only accepts payment via PayPal and/or credit cards, and paying with those in itself gives up a good amount of privacy.

I think both methods are actually not private and have proven not to be private at all.


lol, CloudFlare vs Tor (hope it's not disappointing like BvS)


Clearly CloudFlare's customers prefer this behavior. It's their website; they are free to block Tor traffic if they like.




