Facebook caught sharing secret data with advertisers (arstechnica.com)
225 points by ferostar on May 21, 2010 | hide | past | favorite | 56 comments



Here's the paper giving details:

http://www2.research.att.com/~bala/papers/wosn09.pdf

There are three ways info leaks:

1 - Referer header, eg facebook.com/profile.php?id=1

2 - Request, eg analytics.google.com/script.js?page=facebook.com/profile.php?id=1

3 - Cookies, eg z.digg.com points to an omniture server, and so passes all digg cookies to them!

1 and 2 are easily exploitable by advertisers who want to, but 1 especially seems like a very standard way of building URLs on most services. It will definitely get them hammered, and for good reason, but there's not necessarily any bad intent.
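
To make leaks 1 and 2 concrete, here is a minimal Python sketch of what a third party could do with the URLs it receives. The function names are made up for illustration, and the URLs are just the examples from the list above with a scheme added; nothing here is taken from the paper.

```python
from urllib.parse import urlsplit, parse_qs


def id_from_referer(referer):
    """Leak 1: return the `id` query parameter of a Referer URL, or None."""
    values = parse_qs(urlsplit(referer).query).get("id")
    return values[0] if values else None


def id_from_request(request_url):
    """Leak 2: the visited page is passed along in a `page` parameter."""
    pages = parse_qs(urlsplit(request_url).query).get("page")
    if not pages:
        return None
    # The embedded page URL has no scheme, so add one before parsing it.
    return id_from_referer("http://" + pages[0])


print(id_from_referer("http://facebook.com/profile.php?id=1"))
print(id_from_request(
    "http://analytics.google.com/script.js?page=facebook.com/profile.php?id=1"
))
```

In both cases the receiving server needs no cooperation from the site; the identifier simply arrives as part of an ordinary request.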

3 seems a lot worse. Are there legit reasons I'm missing for hosting ad servers on the same domain, and so puncturing the browser security model?


> Are there legit reasons I'm missing for hosting ad servers on the same domain, and so puncturing the browser security model?

Avoiding generic (not targeted to your site specifically) AdBlock URL filtering.


Also, many browsers block third-party cookies by default, which may screw up your analytics.


Many browsers can, but few do. Firefox, Chrome and IE don't by default.


True. I misspoke about many, but you will still improve the accuracy of your analytics by doing this.


Omniture isn't an ad server, it's an analytics system. Perhaps they do allow sharing with 3rd parties (i.e. ad servers) but I'd guess that's somewhere in the settings.


re: 1 - For user shared links, Facebook redirects to anonymize the referring profile. I suspect they forgot to do the same for ads, and it was an honest if frustrating mistake.
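
A rough sketch of how such a redirect hides the referring profile, in Python. The endpoint `http://www.example.com/out` and the `u` parameter are hypothetical stand-ins, not Facebook's actual implementation; the point is only that the outbound link carries the destination and nothing about the page it was shown on.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

REDIRECTOR = "http://www.example.com/out"  # hypothetical redirect endpoint


def wrap_outbound(target_url):
    """Build the link that is placed on the page instead of the raw target."""
    return REDIRECTOR + "?" + urlencode({"u": target_url})


def unwrap_outbound(redirect_url):
    """What the redirect endpoint does: recover the target to send the user to."""
    targets = parse_qs(urlsplit(redirect_url).query).get("u")
    return targets[0] if targets else None


link = wrap_outbound("http://ads.example.net/landing?campaign=7")
print(link)
print(unwrap_outbound(link))
```

Because the final hop comes from the redirect page, the destination would at most see the redirector URL as the referrer, not the profile URL the user was originally on.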


They didn't use to do this. The only reason they redirect now is so that if a link is deemed a virus of some sort they can easily stop it from spreading. You can also enable a setting so that before visiting any link you get an interstitial telling you that you are leaving Facebook.


I've sometimes subdomained a few, select third party services on the same domain. For example, if a third party hosts your landing pages and you wish to own the urls to those, subdomaining is the best way to handle that.

That said, you should practice decent subdomain level security with cookies. You can and SHOULD restrict cookies to subdomain levels. The only exception is for SSO related cookies (that are stored at the domain root) that still need at least a second, shared secret verification at the very minimum.
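
For anyone unsure what restricting cookies to the subdomain level buys you, here is a simplified Python model of the browser's decision. It is only a sketch of the usual domain-matching rule (it ignores paths, IP addresses and public suffixes), and the host names are hypothetical.

```python
def cookie_is_sent(request_host, origin_host, domain_attr=None):
    """Simplified model of whether a browser attaches a cookie to a request.

    origin_host is the host that set the cookie; domain_attr is the cookie's
    Domain attribute, or None if the cookie was set without one.
    """
    host = request_host.lower()
    if domain_attr is None:
        # Host-only cookie: goes back to the exact host that set it.
        return host == origin_host.lower()
    domain = domain_attr.lstrip(".").lower()
    # Domain cookie: goes to that domain and every subdomain of it.
    return host == domain or host.endswith("." + domain)


# Cookie set at the domain root reaches a subdomain run by a third party.
print(cookie_is_sent("stats.example.com", "www.example.com", "example.com"))
# Cookie restricted to the host that set it does not.
print(cookie_is_sent("stats.example.com", "www.example.com"))
```

Under this model, only cookies deliberately scoped to the domain root (the SSO case) ever reach the third-party subdomain, which is why those need the extra verification.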


"Caught" is a little strong. It's not like they were selling the information to advertisers -- in fact, several of the advertisers who were receiving the information have said they were unaware it was even being sent, much less doing anything with it.

They didn't write any code to "share" this data; they just failed to put safeguards in place to prevent it leaking via HTTP referrers.

I'm willing to put this down to incompetence rather than malice, though of course incompetence is still not great.


Honestly, I think the vast majority of facebook's privacy gaffes have been due to incompetence and not malice. My concern, however, is that we get in the habit of excusing these failures because of this incompetence.

Software can be made (more) secure and can be tested (better). The question then is when it makes financial sense for facebook to put the money and manpower into tackling these issues on the front end... If users / consumers don't take note of the problems and move to another (currently nonexistent) platform, facebook will never have a motivating reason to change.

Perhaps, users have done this to themselves by demanding low cost (free) software with fast release schedules for new features.


OP's #3 above with the cookie pointing seems to give the lie to, well, that lie.

You can't put it past big co's (or even small co's, and individuals) to come up with the strategy:

1. We'll do this naughty thing

2. We'll make it look accidental

3. Then if anyone finds out, we'll pretend to be bumpkins

It's a classic foil. Basically, it's a reverse pool shark hustle.


The cookie thing is definitely a different matter. I was referring to the portion that is making all the headlines, which is Facebook "giving advertisers names and ages of people who clicked ads" (see http://www.businessinsider.com/facebook-myspace-busted-for-t... ). They did no such thing.


"Reverse pool shark hustle"

I would call this pulling a W.


It never ends does it?

FB has a real problem. I hear my totally clueless (when it comes to computer related things) family members discussing their facebook privacy and whether or not they should quit.

I never expected to see that happen.

And all that in the space of about 2 months.


It hit the news cycle, and now technical details which have existed unchanged for years and which no user actually cares about (HTTP referrers) provide new grist for the mill. And, of course, it is distorted beyond all recognition: "Anyone who runs a web page on the Internet -- including advertisers -- is passively informed of the page you were looking at when you clicked a link to their site. This is built into the Internet and is the way it has always worked" becomes, quote, "Facebook, along with MySpace, Digg, and a handful of other social-networking sites, have been sharing users' personal data with advertisers without users' knowledge or consent."

I don't feel all that sorry for Facebook, but man, am I sure glad I have never had my business interests aligned against a media narrative.


Peter Bright of Ars Technica points out in the comments on their story:

here's why this is particularly objectionable: Facebook bounces user links through a redirect to strip the user data out of URLs. Facebook already has the technology, understands it, and uses it elsewhere. But not for adverts. The failure to use the existing technology is peculiar.

The original article was sensationalist, and I think this was much more likely an oversight than something malicious, but still... oops.


> The failure to use the existing technology is peculiar.

Only if you have never coded software professionally in your entire life. A junior engineer on team B did not use the library code written by team A several years ago, which is probably documented mostly as a matter of oral lore among members of team A. Instead, mistakenly believing the problem to be trivial ("I have the URL they're going to! All I need is to output it. Hah, psych, I'm going to run it through our HTML escaper to make sure there is no cross-site injection. Security++ I am the awesome."), they handwrote a one-liner which worked fine. Two years later it is the subject of a WSJ article.

This only happens every single freaking day on every project I've ever been on. Heck, I have missed opportunities for re-use (and caused subtle side-effects through doing so) frequently when I was the only coder on the project.


At this point, it seems FB could benefit from a thorough third party security audit of their web technology.


That is definitely FB caught red handed.

Amusingly, an alleged employee of Facebook here challenged me to find a single example of Facebook selling private information, and this seems to be the clearest example so far.

http://news.ycombinator.com/item?id=1312016


My challenge stands. There is no indication that Facebook made a cent off of this bug, nor that any advertiser was aware of the fact that a small percentage of ad clicks contained a user id.

"Alleged" employee? My name is Keith Adams, and here's an entry I posted to the Facebook engineering blog this week:

http://www.facebook.com/#!/notes/facebook-engineering/the-li...


Your challenge does not stand. It gets weaker every day.

There is a reason why Facebook is more appealing than other advertising venues. They offer more personal information. Facebook is smart enough to use a redirect cloaker for other content, why didn't they do it for ads? The reason is quite clear to me.

And yes alleged. Your comments and profile offered no proof of your employment so I was careful to represent that in my statement. Do you find anything wrong with that?


Easy there, crusader. There was no clear intent, and no selling of anything involved here. Read the story, not just the title.


I think that what makes this case particularly special is that Facebook referring URLs share much more data about you than the average site. A typical Facebook URL can be something like:

http://facebook.com/#!profile_id=123/reqs.php/456/v=photos&ref=pymk

This means "I am user 123 and I'm looking at the photos of user 456 after having clicked through to their profile. I found this user's profile through Facebook's friend recommendation page."

Why does Facebook have to put all that info in the URL in the first place?

The referring URL for an average site would simply share "I am an anonymous user that's looking at 456's photos".

An advertiser could use Facebook's Graph (where your name, picture and other information is now forced to be public and indexed via the above IDs) and get extremely detailed info about someone and their Facebook activity.

Note: It looks like Facebook has stripped the part of the URL that needlessly self-identifies now, so that's good.
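
As a sketch of what that kind of stripping can look like in Python: the helper below drops identifying parameters from a URL before it can end up in a referrer. It assumes the identifier travels as an ordinary query parameter (named `id` here, as in the profile.php style of URL); the function name and default parameter list are made up for illustration and say nothing about how Facebook actually did it.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_identifying_params(url, names=("id",)):
    """Return url with the listed query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_identifying_params("http://facebook.com/profile.php?id=1&v=photos"))
```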


It's like watching a snowball roll down a hill at this stage.

Imagine what you could do if you could harness the power of that narrative in the other direction.

It's interesting to see how people react to realizing what has been going on under the hood pretty much for as long as I remember. I think that when the doubleclick trouble hit people just couldn't make the mental connection and for the media it was much too dry. Facebook is very close to home and it ties in to everybody's lives at such a close-to-home level that they seem to feel threatened way out of proportion.

Not sure if digg belongs in that list.


> Imagine what you could do if you could harness the power of that narrative in the other direction.

Facebook got to where it was by riding the media narrative up (from the start Zuck pulled strings to get positive coverage in the Crimson and from then on it was off to the races). They made Facebook and now will destroy Facebook. Fun to watch from the sidelines at least.


I think Apple is an example of a company that rode the upside of a narrative. Microsoft is evil. Microsoft is insecure. Microsoft is old and crufty, etc. There were a lot of practical reasons for Apple machines to never take off (no support, poorly supported software, no one uses it, hardware investment, etc), but they made decent products, and more importantly, fit in the story.


"Imagine what you could do if you could harness the power of that narrative in the other direction."

diaspora


Diaspora has already had its run in the media. At their peak they were pulling in $4500 per hour in donations; they've fallen back to < $1000 per day now.

The media has given them a nice old time of it (especially a major article about them taking on facebook and pointing people to kickstarter) but they failed to fan the fire as far as I can see. They're well into the 'valley of despair' now media-wise, unless they cook up some stunt.

Otherwise their next shot at a media slot is launch day, and they better not mess it up.

News is fickle that way.

And they have a bit of a delivery problem ahead of them, the expectations are way beyond reasonable at this point.

If they manage to pull it off I'll be most surprised, if they manage to take > 1% marketshare away from facebook without active help from facebook I'll be even more surprised.

But facebook may yet oblige them.


Yeah, sucks to be them. They only raised 10x what they needed without giving up any control. Now all they can do is build the app they wanted to build and try to squeeze by as a well funded internet startup with great PR.


Right. Because all you need to take on the #2 company on the web with 400 million registered users is a few hundred grand and some newspaper articles.

Really, seriously. The Diaspora guys are probably great people but it takes a bit more than that and the above ingredients to make this happen. They'll have to keep drumming that PR motor without any news at all if anybody is to even remember them by launch day, and they have a very high bar to cross in terms of expectations.

At some point the amount of money you have doesn't matter.

Let me give you one small example: In the Netherlands there was a small local site called 'marktplaats' that had nested itself in people's consciousness when it came to buying and selling second hand goods.

In the end, eBay, with a marketing budget that would dwarf most other companies' turnover, just gave up and bought them, so strong was the power of being the entrenched party.

On that scale 200K bucks and a bit of press amount to nothing.

The party that determines the future in this respect is facebook, and if they don't mess up royally (and there's always a chance for that) the outcome of all this is fairly predictable.

Given everything I know about all this today, and the fact that fall is about 5 months away and that they'll be able to hire an additional 35 man-months of coding time (assuming they themselves will only use that 10K they originally budgeted), that translates into a team of 11 people that still needs to be broken in and that needs to produce a relatively large amount of software in a very short time.

I put the odds at significantly less than 5% of this succeeding in a way that the first batch of users will be happy. If they find an investor that will give them several years of runway it's a totally different story, but then they still have to unseat facebook.

I hope they'll give it their best shot and that something good will come out of it, instead of just a signal to FB they have a public relations and a privacy issue.

Anything over that and I'll consider it a bonus.


I don't recall these guys ever saying they were trying to take down Facebook. That was the media's spin. A lot of people only understand change in terms of bloody revolution.

They're some geeks with a solid idea and they've got way more cash to build it than most successful open source projects ever see. There is absolutely no problem here. But I guess if you swim with sharks...


It's not so much taking down Facebook.

A social app, by definition, is governed by the network effect. For it to be successful, it needs much more than a great codebase. It needs users.

Diaspora will need to attract users, and that probably means enticing them to come from elsewhere. The purpose isn't destructive against FB, it's constructive for diaspora.


> Otherwise their next shot at a media slot is launch day, and they better not mess it up.

(cough) Cuil (cough)


They might see a resurgence after today's xkcd.


I note a 'whenever' in the hint.


Yes, the rumour is out and people have started to question. And it's actually a good thing if people become more cautious about what to share and where.

For IT professionals the Internet can be a great opportunity to show our work. Equally, anything you share can bite you in some future job interview, for example. Discrimination based on race, religion, sexual orientation, family situation etc. is illegal, so you cannot really ask questions about those topics, but if all that info is easily available online... Now I'm not saying it would be a negative thing to be open about yourself, but I'm sure to monitor what my family is putting online.


"Not surprisingly, Facebook appears to have gone farther than the other sites when it comes to sharing data."

This isn't really the expectation you want your users to have.

Interesting to note that Google comes up in this though.

This is leading to regulation. Hard and swift.


I think the answer to this lies in regulation, but I think we also need to start treating the thriving market for our personal data differently. Privacy and regulation are super important. But I think we, as "products", also need to become active, engaged participants in the economic market for our personal information. We should have profit sharing agreements with Facebook to resell our data, should we consent to data sharing. Only then, I think, will we really have a stake that is worth more than writing angry articles and blog posts.

I wrote an expanded version of this comment as a blog post here for anyone who wants to read it and comment, here or there. http://edwardbenson.com/facebooks-product-is-you


That sounds like people being paid to exist


This is exactly the sentiment I think we need to get rid of. I have it too. Why? Because it is depressing to think of ourselves as a product; we'd rather just dismiss that idea as "being of a culture we do not choose to belong to".

But the problem is that's like an ostrich sticking its head in the ground. People are making money off your existence, off of every click you make online, off of your gender and your religion and what you read last weekend.

Until we are able to accept that reality as active, willing participants, we won't be able to demand better legislation to give us agency. The ostrich never had any agency in the stampede rumbling by him.


I don't think that's necessarily wrong, if some money is being made off a common resource. Every Alaskan citizen gets annual oil-fund checks, for example.


Following the general rule in economics that special interest groups are more powerful than the masses in passing legislation, I think any regulation would make this worse, i.e. "enforced real-id online to make us safer".


>>"Not surprisingly, Facebook appears to have gone farther than the other sites when it comes to sharing data."

>This isn't really the expectation you want your users to have.

It kind of is - Facebook's raison d'etre is sharing information easily.


Yes Facebook is evil. When I saw that Zuckerberg called users "dumb f*s" for trusting him with their data, and I mused on how criminals could exploit that data via phishing, pw guessing, and social engineering schemes, I joined the Perma-Delete revolution.


Really could have just as easily been a silly joke that is now taken out of context.

I wonder if any of us will ever make it to the level of Zuckerberg, but if you do, are you sure you never sent an IM or an email that might be used against you like this?

I don't even recall most of them.



Not a lot of details. Is this a story about the HTTP referrer header? (aka "Referer")

But don't let facts get in the way of a good story...


Yes, it looks like it. And it's unclear how Facebook somehow shares more than other sites...


We don’t share your information with advertisers. Our targeting is anonymous. We don’t identify or share names. Period.

-- Elliot Schrage, vice president for public policy at Facebook. May 11, 2010.

http://bits.blogs.nytimes.com/2010/05/11/facebook-executive-...

ouch.


This is the part that troubles me: "It wasn't until WSJ contacted them that changes were made."

How do you interpret that?

1) Too busy to care enough to prioritize this? 2) Indeed there was intent? 3) Too dumb to realize the consequences?

Maybe I'm too biased now, but I can't think of a good way to put a positive spin on that.


There is an interesting related thread on Quora.

http://www.quora.com/How-did-Elliot-Schrage-not-know-that-Fa...

Here is what one of the Facebook guys says about the situation:

The Wall Street Journal article is not exactly factually false, but the implication you're drawing from it is incorrect -- the actual issue is that in some cases (e.g., after performing some editing operations) the viewing user's ID is contained in the page URL. If the user happens to click on an ad on such a page, the browser will send a Referer header line that has the URL with the ID in it. On the other hand, if the user clicks away to a different page then clicks on an ad there, the ID will no longer be present.

This by no stretch of the imagination represents Facebook "going out of its way" to pass user information to advertisers.

In any event, the accusation makes little sense given the context. If Facebook wanted to leak user IDs to advertisers, surely it would be far more profitable to do it reliably, on every ad click, rather than doing it via a mechanism that (even according to the WSJ article) only discloses user IDs a small percentage of the time when the user happens to be viewing certain pages in certain ways.


I'm curious but I can't see the Quora thread.


The spin people are putting on this is just unbelievably sensation-mongering. ReadWriteWeb of all places is calling them on it - http://www.readwriteweb.com/archives/unbelievable_wsj_calls_....

It's so disappointing to see Hacker News be a part of this mob mentality.


Sorry, but RWW's subtext that this is nothing more than regular referral URLs is disingenuous.

Providing advertisers with personally identifiable information, particularly information that can be used to both gather additional data and target you later, is a pretty significant privacy failing.


Disclaimer: I have always thought Facebook was the devil -- it uses a growth model that co-opts human behavior in a manner not in the best interests of the participants

Having said that, the media coverage is starting to get the feeling of piling on. Reporters have decided the media narrative around FB is something like "Big company goes evil. Users revolt"

I think we may have reached the point where the leaders of FB really want to do this correctly, but the momentum of the company and the overriding media narrative may continue to drive lots of stories like this.

So. I'm going to be careful to double-check the "Facebook is killing your grandma!" types of stories. The media is famous for getting tech wrong. My guess is that most all of them will have a grain of truth. And most all of them will need some technical clarification before we can make heads or tails of it.



