NSA uses Google cookies to pinpoint targets for hacking

Smerity · on Dec 11, 2013

There are two primary issues here: the prevalence of Google Analytics and the unencrypted nature of the majority of websites.

Google Analytics is on a substantial proportion of the Internet. 65% of the top 10k sites, 63.9% of the top 100k, and 50.5% of the top million[1]. My own partial results from a research project I'm doing using Common Crawl estimates approximately 39.7% of the 535 million pages processed so far have GA on them[2].

That means that you're basically either on a site that has Google Analytics or you've likely just left one that did.

If the page you're on has Google Analytics and isn't encrypted, the Javascript request and response is in the clear. That JS request to GA also has your referrer in it, in the clear.

The aim of my research project is to end with understanding what proportion of links either start or end in a page with Google Analytics. If it starts with Google Analytics, your present "location" is known. If the link ends with Google Analytics, but doesn't start with it, then when you reach that end page, the referrer sent to GA in the clear will state where you came from. All of this is then tied to your identity.

If people are interested when I get the results of my research, ping me. I'll also write it up and submit it to HN as it would seem to be of interest.

[1]: http://trends.builtwith.com/analytics/Google-Analytics

[2]: http://www.youtube.com/watch?v=pkoIUmP5ma8 (GA specific results at 1:20)

mike-cardwell · on Dec 11, 2013

Why do we still have referrers? They don't allow us to do anything that we wouldn't be able to do without them. If Mozilla and Google made a statement today saying, "We'll be removing referrers from cross site requests in 6 months time for Chrome and Firefox.", the tiny tiny proportion of sites that are using them for real functionality will have plenty of time to update.

Of course, as a web developer, it's useful to be able to see where people came from. But we don't have any right to that information. As an end-user, why the hell is my browser giving you this information for no reason when it doesn't have to?

I've been using RefControl for Firefox for years now. It fakes the referrer, setting it to the root of the domain being requested. This hasn't ever caused me any problems, so there can't be that many sites that rely on it.

I don't give a shit about your analytics or how much money you think you'll lose from referrers disappearing. Privacy is more important.

JohnTHaller · on Dec 11, 2013

For lots of us using basic CDN services, we enable referrer checks to ensure that folks aren't hotlinking images or direct linking downloads from other sites. These CDNs allow basic blocking based on referrers. You usually set it to only permit when there is a referrer from your own domain as well as blank referrers (if the CDN supports it) since most privacy conscious folks will disable referrer rather than fake it. We don't actually care that you don't provide a referrer, we just don't want other sites using our images in their own pages (leeching our bandwidth we pay for) or direct linking to binary downloads (bypassing our site with our advertising or revenue possibilities or branding but using our bandwidth).

chavesn · on Dec 11, 2013

I can think of one more; some sites use referrer to allow you to bypass a paywall if (and only if) you came from search results.

The problem is, all of these "features" allowed by referrers are user-hostile actions.

If referrers went away tomorrow, users wouldn't notice the difference or care. Publishers would get angry and think "we can't milk our content/visitors for as much money anymore!" But that doesn't really change the relationship with the customers who value your product or business so I personally can't believe it will make a sizable difference in the end.

JohnTHaller · on Dec 11, 2013

It's only hostile to sites that try and steal bandwidth resources by hotlinking/leeching images or direct linking downloads. It's not about milking visitors. It's about preventing unethical behavior by other sites.

I've spent 10s of thousands of dollars hosting free and open source software for millions of people over the years and I make sure to prevent bandwidth theft from other sites that cut into my ability to provide that service. Ad revenue (all responsible ads... no popups, no sound, etc) doesn't cover the cost of hosting and bandwidth even when sites are prevented from being unethical.

Take away referrers and it will be replaced by more complicated technology that serves the same purpose. CDN providers have secure links, for instance, that use an API to allow sites to generate a one-time use or limited time window link for a download from the CDN of a given file. It's more complex, but it's what I'd switch to tomorrow for downloads if referrers went away.

DerpDerpDerp · on Dec 11, 2013

So it sounds like there's already a solution to your problem that doesn't require leaking privacy all over the internet, since presumably the one-time links are on request from a page on your website, and tells you nothing but someone on your website wanted something from your website.

How is this a bad thing? How would removing referrals harm you in any way?

JohnTHaller · on Dec 11, 2013

Well, it would only work for downloads, not images. And it would require dynamic sites, so static sites couldn't take advantage of it. And it would require some coding ability as opposed to just knowing how to upload a file and point to it. So, it would mean expending additional resources programming-wise just to keep files moving instead of doing whatever actual service we're really working on, since we don't have money to throw at it. And it would increase the load on the website server, too, which would require additional resources, which means more money.

For images, the whole point of a CDN is to keep them in one place with a long expiry (a week or more) possibly downloaded from a nice geographically close edge node so that visitors load the images very quickly once and then cache them for the next pages and later visits of the site. The only current way CDNs implement of keeping folks from leeching/hotlinking images is to check referrals. The unique download link bit would negate the whole benefit of the CDN (you'd lose caching and the back and forth to generate the unique URL would slow it down), so that's out. Basically, lots of folks would ditch CDNs and host internally, possibly using server log checking to see if said IP recently hit a page. Otherwise they have to deal with lots of bandwidth leeches. The end result would be slowing down visitors' experience.

So, for both images and downloads, users wind up losing if referrals go away. It's far better to just leave it as is. Enable referrals by default. Let the privacy conscious disable them (sending blank ones). And build systems to take into account both userbases. Again, as a software developer, publisher, and host, I don't really care about referrals in terms of violating privacy, so I don't care if you disable them and send blank ones. I purposely set up my redirects and CDNs to allow for that. I care about them in terms of continuing to deliver services effectively to my users without competitors stealing my resources.

mintplant · on Dec 11, 2013

What if the page containing a one-time link was cached, but the resource itself was not? The "secure link" solution doesn't seem to work in all cases.

JohnTHaller · on Dec 11, 2013

Correct. And that end user would have to reload the page or clear cache. It's not as effective as just checking the referral. But without referrals, it would be what we'd have to resort to.

AnthonyMouse · on Dec 12, 2013

> For lots of us using basic CDN services, we enable referrer checks to ensure that folks aren't hotlinking images or direct linking downloads from other sites.

This is actually an interesting problem, because it's already solved but most people aren't using the solution: If you have a large file do distribute to a large number of people without authentication, use BitTorrent. As far as I can see there are two primary impediments to this:

A) Most browsers can't by default download large files P2P. You can actually write a BitTorrent client in javascript using Web Sockets if you really want to, but that's just horrible. What would be really nice is to be able to just e.g. embed a video into a webpage using a magnet link. There is no technical reason why this couldn't be implemented and rightly should be for large files.

B) Images are exactly the wrong size. They're big enough that you can't just ignore hotlinking but not big enough that you want to pay the overhead of connecting to 50 different peers instead of one to get a good transfer rate. But that just requires some adjustments to the protocol; if you're looking for realtime retrieval for display in a webpage you would probably want to use UDP and then use erasure coding to deal with slow/broken peers and packet loss. If you have a 60KB image, you can send a ~50 byte packet to each of a dozen peers and have ten of them each send 6KB (approximately four packets) to the target with 6KB worth of erasure bits from each of the others (which also allows the image to be constructed once 60KB of data is received in total from any collection of peers), and now the image is costing you ~600 bytes instead of 60KB. And if the image hasn't been received in 150ms, add more peers.

JohnTHaller · on Dec 12, 2013

Our software is used from portable devices (usually usb) as users move between computers (PortableApps.com). As such, using bittorrent would be a technical option within our platform's app store/updater. It would, however, get our platform banned/blocked at many companies and universities that have policies forbidding bittorrent use. (And we can say all day long that it's a legit protocol with lots of legit uses like downloading linux ISOs, it doesn't change the facts and policies on the ground.) Additionally, most users will be behind NATs that they can't poke a hole through to be able to properly share.

As for images, the bittorrent protocol would just be way too slow even with some changes when compared to HTTP with SPDY and all the internal tweaks done at the geographically close CDN edge nodes to make them as fast as possible. 150ms before adding a new peer is an eternity in an age when 47% of people expect a web page to load in 2 seconds or less and the abandon rate increases with each second that passes with 40% abandoning at a little after the 3 second mark.

toomuchtodo · on Dec 12, 2013

Couldn't you use an inexpensive CDN like Cloudflare? Your origin would see transfer to Cloudflare, but they'd be able to offload the majority of your download traffic. You also wouldn't be charged per GB as happens with Cloudfront.

JohnTHaller · on Dec 13, 2013

I'm not that familiar with Cloudflare and unsure how well it would work with a Drupal-based site. They do appear to have a module ( https://drupal.org/project/cloudflare ) but it seems like it is a beta and hasn't been developed in a few months. It does seem like it would be more expensive for the level with an SLA ($200 for business plan) than the CDN we use now (which is $79 a month for our images and includes 1TB of bandwidth, about what we need. excluding binary downloads, of course). Do you have any direct experience with CloudFlare?

toomuchtodo · on Dec 13, 2013

I haven't used it with Drupal before; we're using it to cache a read-only JSON service in the event of failure, heavy load, etc. We've been pretty happy with them. You could always try their free or $20/month plan on a separate subdomain.

mike-cardwell · on Dec 11, 2013

There is a trivial solution to this. Introduce a new HTTP response header 6 months before phasing out the Referer header. This header would be optionally delivered with content and would specify which third party domains are allowed to access the content. Perhaps Content-Security-Policy could be extended for this purpose.

JohnTHaller · on Dec 11, 2013

Sure. And have it on by default with a correct content security policy. If it were off by default, it wouldn't be used by most folks and the bandwidth thieves would be content hotlinking images and direct linking binaries, just ignoring the small percentage of users who turned it on.

Of course, even if this was released today and referrers were phased out in June 2014. We'd still be able to use them for at least 5 years until you could safely assume that they were gone. Likely longer.

mike-cardwell · on Dec 11, 2013

If they released support for this header with Firefox and Chrome, almost immediately, people wouldn't bother hotlinking to sites which utilise it because a good proportion of their users wouldn't be able to see the content at all.

leephillips · on Dec 11, 2013

You ask, "Why do we still have referrers?", but then you answer your question: "as a web developer, it's useful to be able to see where people came from."

You are of course correct that we don't have a "right" to this information. But I've discovered, many times, through the referrers in my logs, links to my pages from some very interesting places that I might not have discovered otherwise (because the link information that Google discloses is woefully incomplete).

Any user who wants to hide referrer information can easily to do in a variety of ways. For example, I wrote a bookmarklet that does this for you: http://lee-phillips.org/norefBookmarklet/

mike-cardwell · on Dec 11, 2013

Defaults are important. Most users of the web don't know referrers exist.

It's irrelevant how useful you find the information. You'd probably find it useful to know the name and email address of everyone that visits your site too... So?

j_s · on Dec 11, 2013

Google is already killing referrer on search results (by redirecting), to force people to pay for Analytics

andrenotgiant · on Dec 11, 2013

What? That is not true. Paying for the Google Analytic premium products gets you no additional data on referrer or search terms.

AdWords paid search clicks still send the info, if that's what you're talking about.

But this discussion is on completely removing referrers, not just stripping search keyword.

ceejayoz · on Dec 11, 2013

I'd pay for an intermediate level of GA in a heartbeat.

Right now, once you hit 10M pageviews a month you either have to sample or pay $150k/year for Premium.

I don't need support, an account manager, four-hour turnaround on data, an SLA, etc. I just need more pageviews sometimes.

richbradshaw · on Dec 11, 2013

Set up 2 GA accounts, then in your embed give even IP addresses version 1, and odd ones version 2, then you can track 20M pageviews. Admittedly it's a little rubbish. Guess you could use the API to combine them again for your own backend use.

ceejayoz · on Dec 11, 2013

Clients aren't huge fans of the "check both of these accounts and sum them together" approach, and ours like direct access so the API isn't a great solution.

I've heard that running a $10/month AdWords campaign gets you higher caps, but it may be an internet old wives tale.

dangayle · on Dec 11, 2013

Hate to burst your bubble, but even the premium GA uses samples. They don't give you a firehose of real data.

ceejayoz · on Dec 12, 2013

Some of the reports, yes, but not overall pageviews. Per their capabilities page:

> 1 billion hits per month

> Up to 3 million rows of data in unsampled reports

If you're doing conversion tracking etc. you're going to start getting sampled data at some point, but it's the pageviews our folks care for.

reginaldjcooper · on Dec 11, 2013

I use a RefControl and just don't send the header, although I feel that's close to being the least of my worries.

I just thought I'm already using NoScript, AdBlock, RequestPolicy, BetterPrivacy, Cookie Monster, Blender, and HTTPS-Everywhere; might as well go all-in.

quesera · on Dec 11, 2013

Please do post your research when it's cooked. It sounds like useful stuff.

Firefox, ABE, NoScript, Request Policy, Ghostery, HTTPS-everywhere, hygiene.

The irony of my militant approach toward privacy is that I probably make myself more interesting to would-be eavesdroppers by my carefulness than I would if they could see it all -- I'm just not that interesting.

On the plus side, the LCD of legitimate-threat hostiles is greatly increased. I'm fairly boring even to neighbors and law enforcement and copyright holders and scam artists and advertisers. I imagine I'm pretty stultifying to nation-state actors. :)

Still, I'd like everyone else to join me so that I can get lost in the crowd. The untracked, encrypted, well-rested crowd.

Come on in, the water's fine.

sdoering · on Dec 11, 2013

Greetings. I nearly have the same policy, regarding surfing and my addons.

I would only advise against Ghostery, as they whitelist some trackers, if being paid. With every update I had to reselect these trackers.

And Evidon (Ghostery's mothership) selling usageinformation really bugs me: http://venturebeat.com/2012/07/31/ghostery-a-web-tracking-bl...

I would recommend the FF-addon Diconnect: https://addons.mozilla.org/en-US/firefox/addon/disconnect/

Does anybody have an idea, how I could make my own sites secure in a relatively cheap way? Just a personal site with not that much traffic, so spending much money seems a bit off to me.

Ideas?

byoogle · on Dec 11, 2013

This.

I work on Disconnect. I don't understand why any hacker would still put Ghostery on their machine:

* Ghostery is run by former ad execs (7/9ths of their executive team): http://www.evidon.com/our-team

* They make their money (I've heard tens of millions of dollars per year) selling user data to ad co's and data brokers: http://www.evidon.com/#block-views-from_our_partners-block

znowi · on Dec 11, 2013

Sadly, Disconnect detects less trackers than Ghostery (e.g. 5 vs 7 on washingtonpost.com).

I also like how Ghostery provides URLs for each tracker source (actual payload) that you can easily view on their site.

There's also a database with short description, affiliations and privacy terms for each tracker (e.g. https://www.ghostery.com/apps/google_analytics).

I really appreciate an ethical alternative to tainted Ghostery and hope you guys will catch up soon.

byoogle · on Dec 11, 2013

Not true, Disconnect detects 13 trackers on http://www.washingtonpost.com/. If you're running multiple filtering extensions all at the same time, install order matters as far as which extension sees which HTTP requests.

gorhill · on Dec 11, 2013

What was not blocked:

== Disconnect:

* s3.amazonaws.com

* cloudfront.net

* echoenabled.com

* troveread.com

* trove.com

== Ghostery:

* s3.amazonaws.com

* echoenabled.com

* platform.twitter.com

== HTTP Switchboard

* echoenabled.com

For HTTP Switchboard, I could easily identify glancing at the matrix that what was requested was a CSS file, I then proceeded to block with one click anything coming from `echoenabled.com`. The page still displayed properly for all three blockers.

Also of note, with Disconnect and Ghostery, there were some scripts running requesting data from api.echoenabled.com and echoapi.wpdigital.net every few seconds. These requests were blocked by HTTP Switchboard without my intervention to prevent this.

byoogle · on Dec 11, 2013

[You think maybe you should start identifying that you're promoting your own product with these comments?]

The first thing I tried when I wrote Disconnect was to block every third-party domain. Within an hour, I realized that I broke the whole web. So I built a crawler to identify and categorize the most prevalent third-party services instead. The domains you list under Disconnect would all be categorized as content by our crawler - resources that most people would consider pages broken without and that Disconnect doesn't block by default.

gorhill · on Dec 11, 2013

> "[You think maybe you should start identifying that you're promoting your own product with these comments?]"

Your going personal. I am talking about the extensions, not you. If for-profit companies are going to claim to care about user privacy, expect this claim to be taken to task, especially in the current era.

Above I am providing hard data, not an opinion.

> "The domains you list under Disconnect would all be categorized as content by our crawler"

The page works fine if whitelisting only the page domain. If someone want the comments, then it's a matter of whitelisting `echoenabled.com`. The rest doesn't appear so important, so I personally rather not ping them. But the point is, I am of the opinion that people need to have the ability to know exactly where their browser connects, then they can agree/disagree/not care. I don't see how one can make an informed decision without proper information.

Now regarding:

http://p.typekit.net/p.gif?a=219379&f=175.10294.10295.10296....

There is no way this 1x1 pixel gif would break a web page. And yet it's not blocked by Disconnect as reported in another comment below. I also reported how adobetag.com is reportedly blocked by Disconnect and yet a script from adobetag.com was downloaded by Disconnect.

Can't you appreciate why I am rather skeptical? Going personal rather than provide a credible answer is not going to dispel this skepticism.

andrewcooke · on Dec 11, 2013

oh, interesting - can you explain the details? is it simple or random? if i install disconnect and then ghostery wil ghostery help with anything disconnect misses or is it more complicated than that? thanks!

byoogle · on Dec 11, 2013

As of the last time I tested:

* Chrome, Safari, and Opera give precedence to the newest extension - if that extension blocks a request, older extensions don't see the request.

* Firefox gives precedence to the oldest add-on.

john-p · on Dec 11, 2013

ABP has several privacy lists that is managed publicly so I don't see the need for these untrusted and unproven extensions.

byoogle · on Dec 11, 2013

That's a pretty cheap shot:

* Disconnect has more than a million active users and Ghostery has more than two million.

* Disconnect is also as public as ABP: https://github.com/disconnectme/disconnect.

lemming · on Dec 11, 2013

So I just tried Disconnect. A few comments:

* I can't seem to see a list of all Google trackers. Some sites have multiple Google trackers, but if I click on the icon to see them it just turns off blocking for them. I'm assuming sites don't have 6 GA trackers, what are the others?

* I can't seem to turn off Content trackers for all sites. Configuring this site-by-site seems clumsy, to say the least.

byoogle · on Dec 11, 2013

Thanks.

#2 first: Code to block everything marked as content with one click was either just checked in or is about to be.

For #1: Disconnect groups tracking requests by company. If you want to see all the Google services that Disconnect filters, you could look through the filter list (services are grouped by category and company here, so start at lines 33, 1,882, and 2,326): https://github.com/disconnectme/disconnect/blob/b27abbf033c6....

quesera · on Dec 11, 2013

So this is awesome, Ghostery has been a unsettling compromise for years now. I'm very happy to learn that you guys are doing it right. Thank you!

I've never been able to detect any nefarious network traffic caused by Ghostery (and I've looked), but I don't like the games they play, so I'll be pleased to ditch them without ceremony.

byoogle · on Dec 11, 2013

Ghostery's game seems to be tricking users into sending their data to Evidon. Going off the company's own numbers, something like 45% of Ghostery users send Evidon data (by comparison, only 2% of Firefox users share data through Telemetry).

out_there · on Dec 11, 2013

Ghostery's default setting is to disable submitting data. You have to explicity opt-in. How is that tricking?

I was interested in your project but your smearing of 'competitors' with FUD is seriously disconcerting.

byoogle · on Dec 12, 2013

Congrats on your first post! [1]

1. http://www.catb.org/jargon/html/A/astroturfing.html

rand456 · on Dec 14, 2013

Congrats on the non-answer.

dictum · on Dec 11, 2013

Why is the Disconnect trackers list (https://services.disconnect.me/disconnect.json, referenced in https://github.com/disconnectme/disconnect/blob/master/firef...) an encrypted blob?

That seems to go against the OS nature of the project.

byoogle · on Dec 11, 2013

See the commit message at https://github.com/disconnectme/disconnect/commit/691897e21d....

The unencrypted list is at https://github.com/disconnectme/disconnect/blob/b27abbf033c6....

And formatted as JSON at https://disconnect.me/services-plaintext.json.

The encrypted list is also trivial to decrypt with the SJCL code in https://github.com/disconnectme/disconnect/blob/master/firef....

dictum · on Dec 11, 2013

Thanks! I was setting up a proxy for devices that can't use Disconnect or Adblock. I thought of adapting the Disconnect list to the proxy's block list, but a cursory reading of the source only showed the URL of the encrypted list.

byoogle · on Dec 11, 2013

Cool, ping me if you need help (byoogle everywhere).

fixanoid · on Dec 11, 2013

Hehe, this makes me chuckle every time.

fixanoid · on Dec 11, 2013

Oh, hai Brian!

Aren't you a former doubleclick-turn-righteous? And don't you also employ an ex-NSA dude?

And no, we do not sell user data, just tracker data.

Cheers!

quesera · on Dec 11, 2013

I get that you two are rivals, but he does have a point that you haven't addressed.

Ghostery seems to be in the business of selling the data that I forget to tell them they may not collect. This is intrinsically a sneaky thing to do.

Yes I know about and use the "default to blocking" setting, but I don't think there is much argument that Ghostery users download your software with the expectation that the default would be anything else. But it is. And that's sneaky.

So you offer a very useful product, for free, and make money off of the people who fail to configure it so that it performs the only service they would ever purposely download it for.

Again, I have sniffed Ghostery looking for violations of my configuration settings, and never found any. I believe that it follows its configuration settings, and I am thankful for its existence. And I recognize that development and maintenance of it is not free. Presumably you are not a volunteer.

I have gotten value out of Ghostery, but apparently that has been on the backs of other users who want the same thing, but are less-careful than me about reading configuration options, and that doesn't sit well.

fixanoid · on Dec 11, 2013

Hi quesera, thanks for asking.

This is somewhat wrong: Ghostery, ever since version 1, had Ghostrank feature in it. It has always been an opt-in deal, the users who trust us may turn it on so provide us with data. For the first 4 years the data sat without any use until recently where Evidon figured out how to turn it into money. Even so, the data Evidon sells has nothing about any user, merely tracker data. Here are some samples of whats actually delivered to clients: http://www.knowyourelements.com/ and http://www.evidon.com/evidon-trackermap/tagchains-static.htm....

As I said, we do not trick the users into anything, and are as transparent about where the data goes as possible, if you have suggestions how to increase this then please let us know. We currently cover this question in every listing Ghostery has, all options pages, web site, FAQ, and many posts on our blog.

As far as defaults: originally, Ghostery was a detection software designed to "reveal the invisible web", but has added blocking since. Our official stance is that we do not make decisions for the user, but we do run every user through an install wizard that explains whats up. Disconnects stance here is a different, they do offer default blocking, tho they also have their own "whiteliest" built into it without telling the users about it. We are going to add some easy configuration in the near future that will pre-block stuff, but this is still in the works.

Finally, Ghostery source is available for review for "sneakiness" since every extension is pure javascript. We host it here if you're interested: https://www.ghostery.com/ghosteries/chrome/ and you can just unzip any other extension to extract source.

aroch · on Dec 11, 2013

You have to opt-in to 'Ghostrank' which is the data that they sell. I don't really see what's sneaky about this. Hell,if you click the 'see more...' toggle on the prefs page it tells you what Ghostrank does.

>Online marketing companies need better visibility into real-world applications of their technologies and those owned by their competitors. GhostRank data is sold as reports to businesses to help them market to consumers more transparently, better manage their web properties, and comply with privacy standards.

byoogle · on Dec 11, 2013

quesera described what's sneaky about Ghostery - their users think they're protected but aren't and don't know they're sending data to Evidon that the company sells but are:

> So you offer a very useful product, for free, and make money off of the people who fail to configure it so that it performs the only service they would ever purposely download it for.

I gave some numbers above that show, in practice, just how many users are in one of these unexpected configurations:

> Ghostery's game seems to be tricking users into sending their data to Evidon. Going off the company's own numbers, something like 45% of Ghostery users send Evidon data (by comparison, only 2% of Firefox users share data through Telemetry).

byoogle · on Dec 11, 2013

[Replying to aroch, who's too nested.]

> And how exactly is it trickery if users have to opt-in to the program and they're told what the program does?

Ghostery seems to rely on vague messaging (last I looked, they don't actually say anywhere in their extension that they sell the data you share to ad co's and data brokers) and UX "optimization" (what quesera dubbed the "reconfigure-on-update dance", for example) to get less attentive users to leave blocking off and to send data - as the numbers show.

aroch · on Dec 11, 2013

In the second paragraph (though really, its just a statement...) on the preferences page -- no need to navigate to another page, and they tell it to you in plain english. Once again, you have to opt-in, so if you opt-in without knowing what it does it's your own fault and you're being a dumb user:

When you enable GhostRank, Ghostery collects anonymous data about the trackers you've encountered and the sites on which they were placed. This data is about tracking elements and the webpages on which they are found, not you or your browsing habits.

Online marketing companies need better visibility into real-world applications of their technologies and those owned by their competitors. GhostRank data is sold as reports to businesses to help them market to consumers more transparently, better manage their web properties, and comply with privacy standards.

quesera · on Dec 11, 2013

Actually, I'd completely forgotten about GhostRank, the opt-in data collection service. The sneaky part I was referring to was just the default setting to add new trackers but not block them. I don't think any users have the expectation that updates will work that way.

I'd argue that Ghostery should come with a default configuration of ALL trackers and cookies blocked. I'd argue even more strenuously that after the user configures Ghostery manually to do so, ALL should continue to mean ALL even after updates. Ghostery currently has 700 3P cookies in their database, and almost 1700 trackers. There is no valid argument, imho, that a user who configures to block ALL really means "block ALL right now, but if you see any new ones, I would really like to try them out first!"

However, I mostly agree that Evidon has been up front and straightforward about what they do and how they do it. I want to like Ghostery. I do like Ghostery. This little bit of sneakiness though, honestly, taints the whole operation. You can call it an oversight, and I will agree that it can't possibly have much marginal value to Evidon...but it's somewhere between tone-deafness and carelessness, two qualities that call for heightened vigilance.

fixanoid · on Dec 11, 2013

Your argument would be much better placed in our support forum. There, we do see much more opinions that are not the same, and the conclusions we draw are based on a bunch of inputs, but our support forum is probably the #1 place for issues we look at. Blocking pre-selections are currently slotted for mid-2014 because its actually a low priority item according to the votes we see.

If you feel strongly that your opinion is important and should be prioritized, please create relevant topic here: https://getsatisfaction.com/ghostery/ and gather support to change it so we address it quicker.

DerpDerpDerp · on Dec 11, 2013

To be fair, if those recordings are time-stamped, it probably does leak information about user browsing habits.

Anonymizing data is hard.

aroch · on Dec 11, 2013

Assuming they're adhering to privacy standards, the anonymization should be reported in aggregates and not like "{UUID} at {TIMESTAMP} reported {TRACKERS} at {URI}"

aroch · on Dec 11, 2013

And how exactly is it trickery if users have to opt-in to the program and they're told what the program does?

fixanoid · on Dec 11, 2013

Hi sboering, how are you?

Ghostery does what the user tells it to do. If you are seeing unblocked trackers, most likely, its because we've added new trackers and you didn't select "block" by default for the new trackers when the list gets updated. You can change this preference by going into Ghostery options, Advanced, and review the "auto-update" section.

And heres a full explanation as to what Evidon gets and what it does with it: http://purplebox.ghostery.com/?p=1016023438

sdoering · on Dec 17, 2013

Nice: First I would really prefere it, if my name was spelled right. That much of respect should have been the least to do.

But back on topic: Why should I (as fairly technically adapt person) have to search deeply inside the configuration, to maybe find a feature, that I expect to be active by default?

So sorry - Ghostery is so far off my radar nowadays, as I felt tricked and victim of a dark pattern [1]. And as I nowadays have a strict "zero tolerance" policy regarding sites/services/tools that act this way. Ghostery was, is and will be on my list of tools, that I would never ever recommend to anybody.

Btw.: I do not mind downvoting - as it shows me, I must have done something right:

"Methinks thou dost protest too much." (English proverb)

[1]: http://darkpatterns.org/

andrewcooke · on Dec 11, 2013

folks please don't downvote people when they're giving useful info. there is a button in the advanced section that makes blocking work by default on new data. that's useful to know - i've just enabled it, and you should too.

just because you don't like someone (likely based on one comment on a web site...) doesn't mean that they should be downvoted...

quesera · on Dec 11, 2013

I didn't downvote him, but I can understand why others might.

His comment seems willfully ignorant of the problem, which is that Ghostery calls itself a tracker-blocker, but squirrels that obviously-desirable config option away under "advanced" settings.

If Ghostery was on our side, really and truly, that would be the default. Indeed, it probably wouldn't even be an option.

Of course when the tracker list is updated, I want to block the new ones!

No post on HN, no matter how helpful, correct, and civil, changes that this operating model is essentially a trick.

I just spent 45 seconds explaining it, but it would have been faster, and pretty defensible, to just downvote.

On the other hand, nuking his comment into gray-land would obscure useful instructions for making Ghostery do what it is assumed to do in the first place. So I agree, downvoting here is destructive.

aroch · on Dec 11, 2013

> His comment seems willfully ignorant of the problem, which is that Ghostery calls itself a tracker-blocker, but squirrels that obviously-desirable config option away under "advanced" settings.

Err, the option under Advanced just lets you set it to auto-block new elements as they're added. When you first install Ghostery, the walk through lets you pick that option as well without having to be "advanced" (oo, scare quotes!) in your preference setting.

quesera · on Dec 11, 2013

"advanced" here is a string literal, not an adjective, so it belongs in quotes.

fixanoid · on Dec 11, 2013

Quesera, this is a good point I didn't cover.

Ghostery does not call itself a tracker-blocker, our users do. This is an obvious oversight for most users, and its somethign that we will address, but at this time, Ghostery is designed to reveal the invisible web and give user the control over it, not make decisions for the users...

As far as the feature, at the implementation time, we've queried a set of users that agreed that when new trackers are added, there is no need to let the user know until s/he encounters it for the first time and reviews it. Obviously, this is another setting we will be moving away from advanced and into wizard so the users may review and select this option at install time.

quesera · on Dec 11, 2013

> Ghostery does not call itself a tracker-blocker, our users do.

Sorry, I cannot accept that answer.

From your home page, in big letters, right now:

https://www.ghostery.com/

  > Knowledge + Control = Privacy
  >
  >    See which companies are tracking you
  >    Block over    1 6 0 0    trackers
  >    Learn how they track
  >    Ghostery is FREE

What do you call yourself then?

Please be honest with us. How do you view your operation internally? What services do you provide, and to whom?

Thank you.

fixanoid · on Dec 11, 2013

As I said, we are adjusting to fit what our users think rather than what we preached. The site is 2 month old, and it reflects the new updated stance, and as I said earlier, the extensions will also be updated with pre-configured settings.

I'm not sure what you mean by that question? Ghostery is a separate team inside Evidon with full control over what we do. I'm one of the people managing the product and my customers are users of Ghostery. As such, my primary goal isn't improved blocking, its education - to let users know that they are being tracked, to provide relevant info on who are the trackers and where to find out more about them, and finally, provide control in the form of blocking.

quesera · on Dec 11, 2013

I appreciate you taking the time to reply in these threads.

I won't needle you with follow-up questions, but for the record, I think there's more than a little cognitive dissonance here regarding customers and conflicting goals.

This is why people who spend time thinking seriously about the issue are concerned about Ghostery, but I accept that "trust" doesn't pay the bills.

fixanoid · on Dec 11, 2013

Thanks for asking relevant questions and I agree about the dissonance. Our teams job is to make sure they are minimized and we're working on fixing those things. Something to keep in mind tho is that our team is very small (4 people), so it takes time to get stuff done.

quesera · on Dec 11, 2013

Yes, Ghostery's origin story and reconfigure-on-update dance is bothersome. Thanks for the pointer to Disconnect, I will check it out.

Regarding your sites' security, what sort of advice are you looking for? OS-level hardening? Service config?

bigiain · on Dec 11, 2013

Idea: A service to allow people to anonymously "share" Google tracking cookies. Perhaps a local transparent proxy that MITMs all your cleartext Google requests, and re-writes the tracking codes on the fly, submitting the ones you've been given and retrieving other "real" ones from a service (probably a TOR hidden service?)

dunham · on Dec 11, 2013

I use Ghostery/Adblock, but I have found that it breaks some web sites (whose javascript expects the tracking code to be loaded).

BTW, if you're using Chrome, you might also want to look into the "Users" section of preferences. You can create multiple user profiles with separate history, cookies, cache, etc. You can have a different user profile per window at the same time. (After you create a second user, there will be an icon in the top right corner of the window to open a window as another user.)

I like to use this to protect against CSRF. (I do financial stuff as another profile and facebook as another profile.) It's also useful for QA if you need to be logged in as multiple people at the same time.

goooderybucket · on Dec 11, 2013

I also recommend installing them and running the extra mile by changing your default search engine to DuckDuckGo (HTML SSL) without javascript (use NoScript). They temporarily track you and HTML one gives you direct links as opposed to redirects.

You can even modify you Firefox by changing values in about:config

geo.enabled ---> false

keyword.URL ---> Your Search engine query url

browser.urlbar.trimURLs ---> false

noscript.ABE.wanipCheckURL ---> 0

network.http.sendReferheader ---> 0 network.http.sendSecureXSiteReferrer ---> false ^these ones break some site functionality that rely on it. It's rare at least.

r0h1n · on Dec 11, 2013

I would replace Ghostery in your list with Disconnect. I would also add the 'Self-destructing Cookies' browser plugin. In its settings, whitelist a very limited set of sites you want to allow persistent (or session) cookies.

gorhill · on Dec 11, 2013

I don't know about either...

https://github.com/gorhill/httpswitchboard/wiki/How-does-HTT...

byoogle · on Dec 11, 2013

I work on Disconnect and don't quite understand what your page is getting at (you probably ought to be disclosing that this is your page and project, btw). If you consider the Guardian page, for instance, all the domains you've listed as third parties except Google and Twitter actually look to be first parties serving content for the page. In other words: If you go to the Guardian, you're going to be tracked by the Guardian. If you'd like, you can also prevent their pages from working properly by blocking some secondary domains they use.

gorhill · on Dec 11, 2013

> "look to be first parties serving content for the page"

"look to be"? How would javascript code know that?

I measured something, and that is the result of my measurement. People can make an informed decision with proper information. I found that the page served well without all the extra requests that Ghostery and Disconnect allowed.

Given the results, I am quite surprise you would say "look to be first parties serving content for the page".

> "If you go to the Guardian, you're going to be tracked by the Guardian"

* facebook-web-clients.appspot.com * guardian-notifications.appspot.com * related-info-hrd.appspot.com * static-serve.appspot.com * cdnjs.cloudflare.com * ajax.googleapis.com * discussion.guardianapis.com * s.ophan.co.uk

Aside `discussion.guardianapis.com`, others are clearly 3rd-parties.

It's seems my definition of "3rd party" aligns more with that of the EFF: https://www.eff.org/deeplinks/2013/06/third-party-resources-...

Now you focused on the Guardian, how about the two other cases I measured?

I'm sure you don't like the result, but this is what came out when I decided to audit. Your response: You don't think it is a problem. That is settled.

byoogle · on Dec 11, 2013

To answer your reply to my reply:

> Given the results, I am quite surprise you would say "look to be first parties serving content for the page".

I believe every single domain name you listed (except the Google and Twitter domains, like I said) is a domain owned by or a CDN used by the Guardian or hosts an app run by the Guardian - prove me wrong:

> facebook-web-clients.appspot.com

> guardian-notifications.appspot.com

> related-info-hrd.appspot.com

> static-serve.appspot.com

> cdnjs.cloudflare.com

> discussion.guardianapis.com

> s.ophan.co.uk

gorhill · on Dec 11, 2013

> "prove me wrong"

This is a terrible answer: you are suggesting that Disconnect knows exactly which 3rd-party is legit when visiting a web page, and somehow you can vouch that none of these hostnames is a threat to privacy (this is what your defense of this implies).

`static-serve.appspot.com` is no different than `ajax.googleapis.com` (you didn't list this one, why?): they are 3rd-party hostnames, some are CDN which is exactly why they are not to be trusted, you can end up hitting these hostnames from other places than just the Guardian, which is the problem.

In any case, the legitimacy of their their purpose is not the point. They are 3rd-party hostnames: Unless being told, the user wouldn't know that he is also hitting these hostnames.

I will note that you completely disregarded the other results which are even more embarrassing to explain (like `simplereach.cc`: "SimpleReach tracks every social action on each piece of published content to deliver detailed insights and clear metrics around social behavior.")

byoogle · on Dec 11, 2013

> You are suggesting that Disconnect knows exactly which 3rd-party is legit when visiting a web page.

Yes! You now know how Disconnect works - Disconnect's filter list is based on weekly crawl data that identifies what the most prevalent third parties on the web are.

> `static-serve.appspot.com` is no different than `ajax.googleapis.com` (you didn't list this one, why?)

You think that URL might belong to Google, which I already called an exception 2x?

> I will note that you completely disregarded the other results which are even more embarrassing to explain (like `simplereach.cc`: "SimpleReach tracks every social action on each piece of published content to deliver detailed insights and clear metrics around social behavior.")

I examined and debunked the entirety of the first example on your page, so I'm not inclined to waste any more time on your so-called "science".

gorhill · on Dec 11, 2013

I just tested the front page of wired.com with this new page I just assembled to quickly check what blockers are not blocking: http://raymondhill.net/httpsb/har-parser.html.

Despite "Adobe tag" marked as blocked by Disconnect, these requests were not blocked:

http://p.typekit.net/p.gif?a=219379&f=175.10294.10295.10296....

http://www.adobetag.com/d1/condenast/live/Wired.js

This is the part that bothers me: fooling people into thinking they are shielded against this kind of thing. That is not ok. I accept bugs can happen, but so far your position has been to rationalize why these 3rd-party domains are not blocked.

Oh and in this particular case, Ghostery blocked everything it said it blocked.

fixanoid · on Dec 11, 2013

Hi gorhill, interesting study. Ghostery runs its own to see how effective privacy extensions are, here it is: http://www.areweprivateyet.com/

Ghostery database is not static either and we update it very often, if you feel we are missing something, please let us know.

anonymous123941 · on Dec 11, 2013

Could you include RequestPolicy in your comparison please?

https://addons.mozilla.org/en-US/firefox/addon/requestpolicy...

gorhill · on Dec 11, 2013

I tested what is available on Chromium. Request Policy is not available for this platform.

gavinpc · on Dec 11, 2013

> I would also add the 'Self-destructing Cookies' browser plugin. In its settings, whitelist a very limited set of sites you want to allow persistent (or session) cookies.

Curious, how would this differ from whitelisting cookies in the browser's own settings?

salient · on Dec 11, 2013

It doesn't really matter if you're "interesting" or not. They're using keyword filters anyway, so if you don't encrypt your stuff, and say something that they consider suspicious, you'll probably show up in their alert system anyway. Welcome to the surveillance state and self-censorship. I'd say it's worse to not have your data encrypted.

reginaldjcooper · on Dec 11, 2013

> Still, I'd like everyone else to join me so that I can get lost in the crowd. The untracked, encrypted, well-rested crowd.

There's this add-on called Blender that is supposed to make your browser send headers like the average browser, you might be interested in that.

tunap · on Dec 11, 2013

ABE, NoScript, HTTPS Everywhere, Blender. Run em all in Sandboxie for pr0n and spelunking.

Like to point out what may be obvious to some but not others, when using NoScript you may want to remove Goog, Yahoo, etc from the default whitelist.

Before Blender I used various iterations of FF for testing & different surfing types(Waterfox/PaleMoon/ESR), but it appears I'll only be doing that for testing purposes anymore. https://addons.mozilla.org/en-US/firefox/addon/blender-1/?sr...

rosser · on Dec 11, 2013

My approach is similar to yours. I also take the step of using services I'm persistently logged into in different browsers than the one I do my general browsing in.

I also share your concern that my (lack of a) footprint makes me an outlier, and thus inherently more interesting to an adversary with the power and reach of No Such Agency. There's precisely zero I can do about that, without compromising my local objective of not being followed by every damned website, though, so I just carry on.

reginaldjcooper · on Dec 11, 2013

What is this hygiene add-on?

DannyBee · on Dec 11, 2013

"My own partial results from a research project I'm doing using Common Crawl estimates approximately 39.7% of the 535 million pages processed so far have GA on them"

This is interesting. I would have actually expected more. The last time I remember someone analyzing this, I believe the result was "<script ... ga.js>" was the most popular tag on the web by far. This was, however, a few years ago.

declan · on Dec 11, 2013

I wish you luck with your research, but I'm not sure your look at Google Analytics is relevant to this discussion or the Washington Post article that started this thread.

First, as <bdt101> pointed out, you "cannot track a unique visitor across the web using GA cookies" because of the way they're designed: https://news.ycombinator.com/reply?id=6889120&whence=item%3f...

Second, the NSA doc as excerpted in the WashPost article talks only about Google's PREF cookie, which is set only when you go to say Google.com, not when you go to a non-Google property. It's a first-party cookie used for things like saving language preferences when you're not logged in, not for advertising across other properites. (That's what the Doubleclick cookie is for.)

droopybuns · on Dec 11, 2013

+1'ing more encouragement.

I've been in the internet industry for a while now.

For what it's worth, I think your thesis has significant value.

bdt101 · on Dec 11, 2013

Google Analytics uses only a first party cookie to identify visitors. That means your id is different on each site you visit. You therefore cannot track a unique visitor across the web using GA cookies.

Google Analytics requests are also only unencrypted if the site itself is unencrypted, so the fact that the GA request includes the referrer doesn't seem relevant (since the referrer would have already been transferred in the clear in the Referer header on the initial HTTP request.)

cataflam · on Dec 11, 2013

> That JS request to GA also has your referrer in it, in the clear. (...) If the link ends with Google Analytics, but doesn't start with it, then when you reach that end page, the referrer sent to GA in the clear will state where you came from.

I'm curious. In that case, the GA JS is requested from what you call the "end page", so the referrer it has should be the "end page", not the one before it.

aethertap · on Dec 11, 2013

Regarding the prevalence of GA, I've been experimenting with self-hosting analytics on my own domains using piwik [1]. So far, I'm pretty pleased with it and I feel somewhat better knowing that my users' data doesn't go to any third parties.

It does everything I want it to do (so far), but I'm not an analytics power user by any means.

[1] - http://piwik.org/

g8oz · on Dec 11, 2013

Does blocking third party cookies stop Google Analytics from being used by the NSA in the way the article described?

dangrossman · on Dec 11, 2013

Google Analytics uses first-party cookies. They're set and read on each website's own domain by JavaScript. Blocking third-party cookies wouldn't have any effect.

malandrew · on Dec 11, 2013

Can't we simply block all requests to Google Analytics so that the GA javascript is never loaded?

joshfraser · on Dec 11, 2013

Sure, just black hole www.google-analytics.com and ssl.google-analytics.com in your hosts file.

fjarlq · on Dec 11, 2013

I've done this for a year or so now.. it works fine.

Add this line to your hosts file:

  0.0.0.0 google-analytics.com ssl.google-analytics.com www.google-analytics.com

Hellenion · on Dec 11, 2013

if you have iptables you can try amici: http://github.com/upwhere/amici

suprgeek · on Dec 11, 2013

A perfect reason to NOT let Google own all layers of the stack between you and the internet (or indeed the real world).

Search - Check (goog.com)

Mail - Check (Gmail)

Browser - Check (chrome)

Devices - Check (Android/Chrome books)

Websites - Check (Double click/AdMob, Unknown number of other companies)

Google Analytics - Check

Your DNA - Check (23&Me)

Cars - Check (self-driving cars)

I am probably missing large chunks of tracking even with this list.

Where do you draw the line so that organizations like Google do not handover (willingly or inadvertently) our life to NSA, GCHQ, ASIO, CSIS & whatever New Zealand's Intelligence spooks go by, on a platter?

Heterogeneity - Make the buggers at least have to work a little bit to invade your privacy.

eli · on Dec 11, 2013

Your larger point might be true, but has nothing to do with the current revelation.

If every site switched from Google Analytics to, say, Mixpanel... nothing would change. The NSA would just target the equivalent mixpanel cookie. So long as their are popular third-party cookies, this will be a problem.

jdubs · on Dec 11, 2013

It also seems like does the risk of being eased dropped upon outway the benefit of having data to drive business needs?

IBM · on Dec 11, 2013

Search was easy to replace. Bing and DuckDuckGo are both good. Firefox was an easy switch now that Chrome seems to be much more of a resource hog (and the extensions are better). I don't have an Android phone. For e-mail I've switched to Fastmail but Outlook is also a good alternative if you want something free. I don't use anything else. I'd say e-mail was the hardest friction point of them all, but overall it was pretty easy to leave Google.

psbp · on Dec 11, 2013

Switching to Microsoft from Google is not an ethical accomplishment.

girvo · on Dec 11, 2013

Despite the fact that I'm not the kind of person who "hates" Microsoft, you've got a point there: they were implicated in the NSA docs as much as Google was. In fact it sounded like they cooperated more than Google. Stick to smaller independent companies that have a good privacy track record.

dangrossman · on Dec 11, 2013

If you care about e-mail privacy, you should consider desktop/native e-mail clients instead of webmail and IMAP.

Having all your mail sitting on someone else's server means it can be handed over by that company in response to a government request, legal or otherwise. After 6 months, it's not even a fourth amendment issue and no warrant is required; it's not "your" mail when it's data on someone else's server.

This doesn't require technical prowess the average person doesn't have. You can use your ISP's mail server, or a professional service like Rackspace Mail. There are free native e-mail clients for every desktop and mobile platform. You can still get instant mail notifications with IMAP Push. Just set at least one of your computers to delete mail from the server after downloading it.

lallysingh · on Dec 11, 2013

I think you're focusing on the wrong (or at least, non-primary) axis: destination. I'd suggest your on-wire image instead: packet header, protocol, encryption, etc.

This is all assuming you consider your adversary to be the NSA. If it's google, well, choose other vendors. If it's both, you'll have to consider both your destination and wire-protocol axes.

FWIW, if your traffic is split evenly between 3-4 main vendors (e.g., google, amazon, bing, etc), and all HTTPS, it's hard to tell what you're doing.

jodrellblank · on Dec 12, 2013

I am probably missing large chunks of tracking even with this list.

Enormous amounts of things have some connection into Google. Other connections into Google's equipment potentially include Voice, Talk, Hangouts, embedded Google Plus +1 buttons, embedded YouTube, Blogspot sites, embedded Picasa images.

Google runs ReCAPTCHA. ( http://www.google.com/recaptcha/ )

If you email someone with a GMail account your email address is in Google's servers with the email header containing your IP address.

Google's SafeBrowsing URL check built into FireFox which normally works by hashed URLs but could still track that you are using it, but has a simple version of the API so applications could send plain text URLs to it without you knowing ( https://developers.google.com/safe-browsing/ ).

Sites hosted on Google AppEngine ( https://developers.google.com/appengine/ ).

If you have IOS, Safari defaults to Google suggestions - i.e. sending everything you type in the address/search bar to Google.

Google Maps, built into other websites and services. Google Geolocation API built into other software ( https://developers.google.com/maps/documentation/business/ge... ).

Google DNS (last time I read the privacy policy, it said queries are not combined with other data Google collects).

Sites loading popular JavaScript from Google's hosted libraries ( https://developers.google.com/speed/libraries/devguide ).

Sites embedding Google Sparklines ( https://developers.google.com/chart/interactive/docs/gallery... )

Links going via Google's URL shortening service Goo.gl

Not counting things you choose to use (Chromecast, music, docs, drive, Now, voice search, News, Groups, Finance, Toolbar, Android sat nav, Chrome's open tab sync between your devices via Google Cloud, etc.).

That's not to say they are good or bad, or they are or are not tracked. Just that it's way to late to "avoid Google" just by switching away from GMail and blocking Google Analytics.

http://en.wikipedia.org/wiki/List_of_Google_products

gress · on Dec 11, 2013

So all that paranoia about being tracked by Google... wasn't paranoid at all.

Yes, I know Google likely didn't cooperate in this, but they built a giant tracking engine, so it's not surprising to see it repurposed.

psbp · on Dec 11, 2013

"I know Google likely didn't cooperate in this"

I'm sure they have plausible deniability.

eli · on Dec 11, 2013

It is indeed quite plausible.

001sky · on Dec 11, 2013

There was a quote here from oppenhimer the other day, about when it comes to cool technical solutions, you shoot (solve) first and deal with the morality (ask questions) later. Pity that everyon seems to be "hacking" the hell out of pandora's box, though.

PavlovsCat · on Dec 11, 2013

If you build it, they will come.

sehugg · on Dec 11, 2013

Interesting choice of cookie:

http://blogs.wsj.com/digits/2012/02/28/the-google-cookie-tha...

https://bugzilla.mozilla.org/show_bug.cgi?id=368255

PavlovsCat · on Dec 11, 2013

Gotta love the arrogance of "This is intend behavior of the feature. WONTFIX for me.", without the ability to explain why this cookie would be required for the feature to function.

simfoo · on Dec 11, 2013

Also "...I am not worried that google is misusing this data...". This clearly isn't acceptable in a post-Snowden world.

driverdan · on Dec 11, 2013

> On Firefox, the cookie shows up automatically as part of Safe Browsing, unless a user disables “third party” cookies.

Disable 3rd party cookies. It solves a lot of these types of tracking issues.

gorhill · on Dec 11, 2013

What a coincidence... I was just a few seconds ago, before taking a break to read HackerNews, investigating an issue with a Chromium blocker (https://github.com/gorhill/httpswitchboard/issues/79#), and was puzzled finding that the `pref` cookie of `.google.ca` changed every single time the tab of the page lost focus. Even went to Google privacy page to understand what this cookie did, with nothing in their statement that could explain this. Now this?

gorhill · on Dec 11, 2013

That part (value of `S`) changes everytime the tab loses focus: pref=[...]:S=J3ITrb9DNMWLQBzc

What kind of "preferences" changes in that way each time the user browse away the page and how does it help "user experience"?

samstave · on Dec 11, 2013

So, are you saying that through this - NSA can see exactly which tab you are viewing at which time?

gorhill · on Dec 11, 2013

No. I prefer the scientific approach. At this point, I just reported what I observed. Maybe somebody will come up with a sensible hypothesis as to why a value changes so often. Google could just come forward and tell us the exact meaning of each field in its cookie. That would be a start.

grey-area · on Dec 11, 2013

Google analytics tracks your time on page, and is probably storing values to do with that when you lose focus on the page - this is used in analytics for showing site owners engagement etc. Re the pref cookie, it could well be to track your interaction with searches, as they do for site analytics, if it changes when you leave a page, that's the most likely explanation.

cromwellian · on Dec 11, 2013

Don't even need cookies if you have JS enabled (https://www.eff.org/deeplinks/2010/05/every-browser-unique-r...) Without JS and with HTTP headers alone, you might be able to reduce entropy by using Geo-IP.

rl3 · on Dec 11, 2013

To speculate: For connections that utilize NAT devices, NSA probably has analysis tools designed to attempt segregation of network traffic on a per-user basis.

Browser string, viewed content, frequency and magnitude of access, user authentication cookies, and ad-tracking cookies all would be tremendously helpful for this purpose.

Also, I'm betting they can easily tell when specific computers on a network are powered on or not based on fixed-interval network traffic from anything that polls regularly, such as anti-virus, news readers, mail clients and background updater services.

All of the above could aid in painting a more complete per-user picture behind the NAT, without actually having to compromise the local network or individual computers in question.

salient · on Dec 11, 2013

Relevant:

http://betanews.com/2013/12/09/tech-giants-surveillance-refo...

As long as these companies build the best tracking engines the world has ever seen, that can identify anyone and everything they're doing, it's just a matter of time before governments get their hands on that data, legally or illegally. It's just too tempting to pass.

If I were Google I'd start thinking long and hard about how to solve this problem, and try to make money by actually being on the user's side when it comes to privacy, not against them. Google will ultimately fail if their goals aren't aligned with those of the users anymore.

drawkbox · on Dec 11, 2013

So not only are businesses like cloud services, video games and messaging/devices affected by anti-business NSA trust breaches. But now we have the advertising industry that is going to be affected by the anti-privacy and anti-business practices of over the top spying on individuals. If any private company was doing this there would be legal issues.

jimworm · on Dec 11, 2013

Let's be charitable to the NSA for a minute, and imagine that they are following the plot of the God Emperor of Dune[1], where in seeing the danger posed to the Internet by the formation of cloud service giants, they became the fearsome yet benevolent tyrant, strategically planning an engineered leak, so that on their death the Internet would react by distributing its services among many providers in The Scattering, thus ensuring the safety and continued survival of the Internet.

[1] https://en.wikipedia.org/wiki/God_Emperor_of_Dune

chroem · on Dec 11, 2013

Hah, the joke is on them: I browse with cookies disabled.

Of course, I'm sure they have some other way to pwn me, but it's nice to know that I was doing something right.

misiti3780 · on Dec 11, 2013

if you browse with cookies disabled, that means you cannot successful browse arounds sites logged in - correct ? you basically do a ctrl+shift+N in chrome every time you open a new window ?

chroem · on Dec 11, 2013

I have a select few sites whitelisted, but they're disabled by default.

Also, I'm on Iceweasel/ Firefox instead of Chrome. It's probably nothing to worry about, but you can never be too careful these days.

kzrdude · on Dec 11, 2013

It's interesting but also annoying how my browsing is now diverged from the web as others see it. I mean, with increased amounts of blocking addons, the difference between an adblocked, ghostery'd, etc browsing experience to the vanilla experience is growing bigger.

kissickas · on Dec 11, 2013

I see a lot of you are using Ghostery, which I've never even downloaded because they get paid to whitelist and are run by ad executives. Is there a reason why I would want Ghostery in addition to Noscript, or is all of the (privacy-protecting) functionality redundant?

This news makes me happy to see there's a point to me having Google Analytics blocked the last two years. I've noticed a new thing, Google tag manager, lately. Any point in whitelisting this? Anyone know what it does?

fixanoid · on Dec 11, 2013

Heh, Ghostery is not paid to whitelist anyone, I would know since I run the database for Ghostery.

As to your question: NoScript does a different thing -- it concentrates on limiting known security issues by disabling Javscript. Tracking is accomplished in a variety of ways, and only some of them are Javascript based. Ghostery looks for all of these and lets users know who is tracking them on any given web page.

koide · on Dec 11, 2013

Why would you know how or why Ghostery is paid just because you run their database?

Unless you are something more than Ghostery's DBA.

fixanoid · on Dec 11, 2013

Indeed, I am Ghostery Lord & Master =) -- one of the people who run, develop, and see to the success of it.

bottled_poe · on Dec 11, 2013

In my opinion, browsers should block all third party website content by default. Yeah, I know, the interwebs will break if they actually did this. Well perhaps someone should come up with some kind of website quality rating which indicates that a site can be viewed withing worrying about the prying eyes of FaceBook, Google, Twitter, LinkedIn, etc.

Anonymous823 · on Dec 11, 2013

I made a post the other day, but it got pushed off 'new' in a few seconds. Anyway, I thought someone should setup a simple one or two page site that summarizes the importance of not tracking visitors. Then, it has a few 'this site respects your privacy' images in a variety of sizes that you can copy and paste into your own site, if you agree to respect those rules.

It would need to be a recognizable image and symbol. The image would link to the site above, that informs users how you respect their privacy and do not track them. Personally, I'd add it to my sites, because with all the recent concern about privacy, I think my users would appreciate this change, and it would provide some advantage over competing sites. I'd like to visit a site, see that image in the footer, and feel more confident using their service.

I think it would be a good way to encourage change from developers. Very few are going to pull Google Analytics on their own. However, if they get pressure from their users to follow a certain privacy standard, and by doing so they can drop an image on their site to illustrate the change and potentially increase trust and improve their reputation, we might see some improvements.

vu3rdd · on Dec 11, 2013

I have more or less the same opinions as you. I wrote my browsing setup optimized for privacy here: http://rkrishnan.org/posts/2013-12-01-firefox-privacy.html

Comments and further improvements welcome.

dobbsbob · on Dec 11, 2013

German Privacy Foundation does this

pavanred · on Dec 11, 2013

Firefox + Third party cookies blocked + Ghostery + NoScript

Can be a little inconvenient at times but seems justified now.

greenyoda · on Dec 11, 2013

If you're using Firefox, you might like a little add-on called "Cookie Monster", which lets you easily control which sites can set cookies (permanently or temporarily) and indicates whether the site you're on has attempted to set them.

https://addons.mozilla.org/en-US/firefox/addon/cookie-monste...

LoganCale · on Dec 11, 2013

In Chrome, you can use Vanilla. It may not have as many options, but it works fairly well.

pktgen · on Dec 11, 2013

Here's my ideal security policy:

- Cross-site requests not allowed without whitelisting. This means some setup will be required at first (for example, for separate image domains used by Amazon, Google, Yahoo, etc.), but after a bit it shouldn't be a problem. This also serves as a "better adblock" in some ways, as it blocks ad networks without relying on a database that needs to be updated.

- All cookies blocked by default; whitelist as necessary

- JavaScript disabled by default; whitelist-enable as necessary

- No Flash or Java, period. If I need Flash for something, I'll launch a VM.

Sadly, Safari doesn't support whitelisting for any of this. Chrome supports whitelisting of cookies and JS by default, but the Chrome UX is worse than Safari's IMO (for a few reasons, but that's another topic entirely).

RequestPolicy handles the first one quite well, but is unfortunately Firefox-only.

quesera · on Dec 11, 2013

Safari is effectively ungovernable, and Chrome is part of the problem.

Firefox is the answer. No other option makes any sense, if you're serious about this stuff. I understand that some people like the UI or process model of other browsers better, and that's where the evaluation of priorities comes in.

The good news is that the days of Chrome's technical superiority are truly over.. Speed, memory consumption, rendering engine...Firefox is all there and sometimes better.

Firefox is also the only browser with an ability to sanely handle tabs on the side, which is the only sane place to put tabs on modern screens. If I had to choose between sane tabs and sane privacy policies, I might have some soul-searching to do. I understand that everyone has their own equivalent, but be sure not to dismiss Firefox based on historical issues.

ars_technician · on Dec 11, 2013

>but be sure not to dismiss Firefox based on historical issues.

It's incredible how much inertia there is with that. The majority of the people I know that switched to chrome did it back when firefox was blatantly slower and that's the image that's stuck in their head. It's incredibly hard to remove and to get someone to try it long enough to change their mind again.

Firefox has a tough issue with marketing right now. They need to start a nice "firefox is faster" campaign.

gorhill · on Dec 11, 2013

Doesn't that fulfill all your points: https://github.com/gorhill/httpswitchboard

Posted on HackerNews two days ago.

noinsight · on Dec 11, 2013

Since browsers have varying support for even creating plugins for this, maybe it would be possible to create a proxy server that could handle this stuff?

driverdan · on Dec 11, 2013

Disabling 3rd party cookies works just fine. I've been doing it for years.

eli · on Dec 11, 2013

Quality rating? I don't see how that could possibly work in practice...

hnha · on Dec 11, 2013

this would just lead to sites using subdomains pointed to the ip addresses of those.

gress · on Dec 11, 2013

Also, it's worth pointing out that the tracking isn't for search. It's for more profitable advertising.

chanux · on Dec 11, 2013

For anyone who would find this useful: Self destructing cookies add-on for Firefox https://addons.mozilla.org/en-US/firefox/addon/self-destruct...

reginaldjcooper · on Dec 11, 2013

This is a polished and wonderful add-on.

judk · on Dec 11, 2013

Is there a way for mobile browsers to block analytics cookies JS , a la ghostery and adblock?

maxerickson · on Dec 11, 2013

Adblock Plus is available for Firefox on Android.

fixanoid · on Dec 11, 2013

Ghostery is available as a stand alone app for iOS, and has extensions for mobile Firefox and Opera.

usrnam · on Dec 11, 2013

Last weak i create extension for Firefox:

Disable Google tracking, log off user FROM Google search engine: * keep login into Gmail * also remove ads * remove Cookie,Sess~/localstorage __ First run, need refresh Google page to log off ~~

-- Also remove Google anal-itics Cookie :)

https://addons.mozilla.org/pl/firefox/addon/googleantyspam/?...

elwell · on Dec 11, 2013

The problem with this is that most of the general public will read it as "Google helped NSA intentionally ..."

bosch · on Dec 11, 2013

Can someone answer this question:

From a business perspective why is Google and Facebook getting involved in this and calling for the government to not track users. Won't that just bring more attention to their two business models of... wait for it... tracking users and selling their information?

arbitrage · on Dec 11, 2013

Because their customers are pissed off, and if they don't do something to mollify them, they'll lose money.

Previously, when the customers didn't care, they did nothing to involve themselves with this, and almost certainly aided the government.

It's purely business. Google and Facebook don't have morals, they have a bottom line. You can understand their actions by following the money.

goldvine · on Dec 11, 2013

This is beyond ridiculous at this point. Wondering what else is still to come...

tejaswiy · on Dec 12, 2013

I mean, disgust aside, technically NSA is doing some seriously cool shit. I wonder what you could do if you had access to a de-identified data dump from the NSA.

dangayle · on Dec 11, 2013

As someone who works closely with several web marketing folks, this hits close to home. Each time they open a Snowden file, things get weirder and weirder.

timbro · on Dec 11, 2013

No website has to have Google track their users. If you do it, you choose to do it (you're disrespecting your users).

You can get your open-source and locally running web analytics here: https://prism-break.org/

timbro · on Dec 11, 2013

> it lets NSA home in on someone already under suspicion

Like OWS protesters, for example.