ACLU exposes Facebook, Twitter for selling surveillance company user data (arstechnica.com)
296 points by AdmiralAsshat on Oct 11, 2016 | 63 comments



This is a very messy, multi-layered issue.

First of all, a large part of a social network's utility is derived from sharing data about oneself. This data is public, for the benefit of other participants of the social network. Some of those participants may have agendas that are contrary to one's own: stalkers, adversaries, data aggregators -- it's impossible to know all of them ahead of time, and a granularity of permissions that satisfies these competing goals is... tough.

Second, it ought to be a person's reasonable expectation that law enforcement is a benevolent actor. Actions such as this, which amount to targeted crowd surveillance, don't instill much faith in this assumption. This is a very serious issue that merits more discussion than my terse comment.

Lastly, this data aggregator should not be the target of the outrage. They're an intermediary in a system where we willingly share data with private companies to get some value-add, who then sell that data to fund their operations; some of the purchasers of our data happen to be law enforcement. Anger directed at any particular intermediary -- like anger at repossessors or ambulance chasers or debt relief counselors -- is woefully misdirected; we should instead be asking why corporations feel the need to sell our data in the first place, and why we have law enforcement that feels the need to employ mass surveillance. If we ponder these questions, we'll have accomplished something more than driving an inconsequential company out of business while leaving the systemic faults in place.


1. If the data in question were public, Facebook would not need to be the ones selling it. There would be a whole market of third party data aggregators.

2. If this were true, why would we need the Bill of Rights? The entire point of the Fourth Amendment, for example, is to protect citizens against unreasonable searches -- by law enforcement.

3. This would be true if the data were only shared on a need to know basis, which has not been demonstrated. Facebook allows users to give it data under one set of pretenses, and then turns around and uses/sells this data under a different set. I agree the other questions are important, but none of them would be possible if Facebook were not abusing its role.

The NSA FISA orders are getting endless complaints, and yet Verizon and others are handing this data over under a "court" order. In this case, Facebook is doing effectively the same thing, but for profit.


1. The data is freely viewable to anyone on the internet. There is just no practical means of searching / discovering it without the companies' feeds.

2. They shouldn't call it user data, as if Twitter / Facebook were selling private information they store about users. This tool just searches the public stream of posts.


1. Practical and apparently legal means: https://lumendatabase.org/notices/2037976

"• Accessing Facebook or collecting user content or information using automated means without Facebook’s prior permission;"

So it's clear that if I know of someone's public data feed and gather it, that's fine. But what if I gather data from a list of 10,000 people's public data feeds? So I'd argue there are two points of contention here:

(1) Does making something hard to discover make it less public? Assuming there is such a thing as "public"

(2) By creating a legal barrier to a particular kind of discovery (automated), and then selling access to this data, does this make it less public?

2. I'd argue that there are different kinds of public. For example, in one kind, I can look at a room full of people, find a man, learn that his name is Mr. Brown, and see that he is wearing a green hat. In another kind, I'd need to create a request to find out if Mr. Brown exists, if he is wearing a hat, and if so, what color. In both instances of "public", this information is readily available to me, but in the second, I have to have some idea of what I want, and specifically request it.

To me, what Facebook is doing seems like selling access to the first kind of public, while allowing everyone else free access to the second kind of public, within limitations: I cannot write a script to cycle through a list of names and clothes, and I'm probably restricted in how many requests I can make within a certain timespan.
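The restriction described above is essentially a per-client rate limit: one-off lookups go through, but cycling through a long list of names hits a cap. A minimal sketch of that mechanic, with an entirely hypothetical `RateLimitedLookup` class and made-up limits (nothing here is a real platform API):

```python
import time

# Hypothetical sketch: why a sliding-window rate limit keeps one-off
# lookups easy while making bulk enumeration impractical.
class RateLimitedLookup:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.request_times = []  # timestamps of recent requests

    def lookup(self, name, now=None):
        """Return a fake public record for `name`, or None if throttled."""
        now = time.monotonic() if now is None else now
        # Drop requests that have aged out of the sliding window.
        self.request_times = [t for t in self.request_times
                              if now - t < self.window_seconds]
        if len(self.request_times) >= self.max_requests:
            return None  # throttled: back to the "second kind of public"
        self.request_times.append(now)
        return {"name": name, "hat": "green"}

# A single lookup succeeds; cycling through 10,000 names hits the cap.
limiter = RateLimitedLookup(max_requests=100, window_seconds=3600)
results = [limiter.lookup(f"person-{i}", now=0.0) for i in range(10_000)]
served = sum(r is not None for r in results)
print(served)  # only the first 100 of the 10,000 requests are served
```

Under limits like these, enumerating 10,000 profiles takes days instead of seconds, which is exactly the practical barrier that firehose-style paid access removes.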


Re 1): As this situation stands now, Facebook threatens via C&D, or sues, other companies that try this, invoking US law.

However, if you incorporate in countries that are actively hostile to companies like Facebook/Google (e.g. Indonesia, China), you can operate aggregators of such data more effectively. That said, jumping over that legal barrier is a bit different from overcoming the barrier created by Facebook's incumbent status.


Can you provide evidence of that? I find this interesting, because a lot of people are saying "it's public data." I would argue that if you get a C&D for scraping/etc, it's no longer public.


Facebook's C&D against the company my friend and I bootstrapped to about $100k annualized, before we decided to close things down due to changes stemming from this: https://lumendatabase.org/notices/2037976

I'm in Indonesia now, and still have everything pretty much ready where I left off. (Technically it's even easier than before to exfiltrate data from this walled garden without a user account, because of the huge surface of OAuth keys out in the wild, as there is for every website with similar protocols and APIs.) So I'm just doing due diligence now to make sure that, like how my friend and I handled CNIL [0] from the US, we can effectively send any Facebook requests to /dev/null.

So I say to anyone else: if you think it's "public data", try to start a company within the US that builds a derivative product leveraging such "public data" ;)

[0]: https://en.wikipedia.org/wiki/Commission_nationale_de_l%27in...


>Second, it ought to be a person's reasonable expectation that law enforcement is a benevolent actor.

I totally disagree. Care to expand on that?


I think the point is that police etc should not behave in such a way that many people immediately assume that an encounter would likely be antagonistic.

Basically, if you're afraid to go ask a police officer for help (which, apparently, is fairly common among black people in the US, for example), then something is very wrong about police, and needs to be fixed.


Thank you, this is exactly what I meant in the ancestor post.

It's a difficult situation, because different people have different opinions on what the role of law enforcement should be in proactively addressing behavior that they believe may develop into something criminal. By engaging in targeted mass surveillance ahead of a suspected protest, the police in this situation are literally preparing for a confrontation, contributing to an antagonistic relationship between (this slice of) the community and the police.

I don't believe this approach is productive, but I recognize some people hold a different opinion. However, when this approach causes knock-on chilling effects and reduces the likelihood that people needing help will call upon police [1], I believe it's a detriment to the health of society.

[1] http://www.theatlantic.com/politics/archive/2016/09/police-v...


It's worse than that. Some prominent lawyers advise that you should never talk to the police.


Which is very good advice, but it's because of the fact that testimony given to a police officer can only ever be used against you (it's hearsay otherwise). Not to mention that if someone slips up and accidentally tells a lie in the middle of a testimony, it casts doubt on the entire testimony ("I didn't kill her, I've never killed anyone, I've never even used a gun, I wasn't in the area that night" sounds fine until the last part, if it turns out you were in that suburb that night).

Basically, to a lawyer, a client talking to the police is giving the prosecution ammunition. Not to mention that nobody (not even the Supreme Court) can claim to know every federal law that applies to a person at a particular time (there was a quote for that but I can't find it on my phone right now).


> Which is very good advice, but [..]

Wrong conjunction? It seems like you're explaining why it is good advice, not offering an opposing view.

I think you have given a perfect illustration of the GGP's assertion that "if you're afraid to go ask a police officer for help [..] then something is very wrong about police". This is a good example where the law sets up the police as antagonists against the general population.


My point with the "but" was that the reason why lawyers advise this is not because police are trying to harm you. It's because in general, people are very bad at law and providing testimony. So maybe I should've clarified that -- the police are not at fault for doubting flawed testimony.

To be fair, police do engage in manipulative tactics in questioning, but that's kinda their job. The job of the person being interrogated is to ask for their lawyer and say nothing else.


It's good to keep in mind how police are incentivized. They are paid to arrest suspects and close cases. Whether or not they nab the wrong person hardly matters to them. So although they generally may not intend to harm you they will gladly do so if it increases the numbers on which they are graded. And if you happen to have some cash on you when you make the mistake of talking to them then there is the bonus opportunity of asset forfeiture.


You can tell how bad things are when people are seriously arguing that police can't be expected to be a benevolent actor.


Meaning the police ought to earn that expectation, not that people should think that way right now.


Can I ask why you disagree?


Sure. I think that law enforcement is necessarily there to keep us from doing things. That is enough for me to think it's not necessarily the same as benevolence. At the risk of flying my ultra-nerd flag, it's like the lawful-chaotic vs good-bad axes. To paraphrase the comment I was questioning, I think it ought to be a person's reasonable expectation that law enforcement is a lawful actor, but A) we aren't even there yet, and B) that is orthogonal to benevolence. The connection between lawfulness and benevolence is complicated by so much history and disagreement about what is actually "good" that I hesitate to just casually lump them together like niftich's comment did.

It might even be worse: that we ought to expect law enforcement to be agents of order above all, and that just kinda makes me nervous. I'm not sure how rational those feelings of nervousness are. (Both of those "ought to" statements are in the sense niftich used them in.) Maybe law enforcement is fundamentally just a necessary evil, in that they're going to prevent someone from doing what they think is ok. Hence the orthogonal axes, where this is just a statement about order. I don't really know how I feel about this, hence the terse question. ¯\_(ツ)_/¯

Anyways, I didn't want to put my own opinion directly in my question because of the risk of sidetracking discussion towards my own opinion and away from my curiosity about why niftich felt that it was true enough to state without justification.


You raise some good points and it's helpful for me to understand how you arrived at them.

I don't disagree with most of what you said, but I feel that law enforcement ought to be agents of societal order, where that benefit is to the good of society at large. They're the enforcers of a social contract -- one that we happen to write down and call 'laws' -- and only by willfully violating these laws would I incur the antagonism of the police.

My rationale for using the construction "... law enforcement is a benevolent actor" does reveal my view that certain types of proactive policing result in tactics indistinguishable from a malicious actor's: stalking suspected future offenders, aggressively policing minor offenses that in other circumstances would be excused, and creating an environment where the proper etiquette for not ending up detained, injured, or deceased is getting exceedingly difficult to follow.

See also, this comment [1] where I address the original question from the context of another reply.

[1] https://news.ycombinator.com/item?id=12690543


I expect police to be good people. Guess we differ there. I expect police to not follow evil laws. That's a bit hyperbolic though. More importantly, I expect police to use their judgment to do the right thing.

Police have an incredible amount of flexibility inherent in how they interpret laws. They choose when to put someone in jail and when to give them a break. They choose when to pull their gun.

The law doesn't provide adequate guidance here. It cannot. Laws are made to be interpreted by people. Police, prosecutors, judges, and juries all apply their humanity to the ridiculously coarse and simplistic letter of the law.

I expect police to be the best of us. They need to be. It's an impossible goal, but we could do a lot better than we do now.

Are you a programmer? If so, I hope it wouldn't offend you to suggest that you might be thinking of the law as a computer program, and police as machines that run it. If we ever create a benevolent machine to serve as our lawbook, I'll agree with you that humans should simply put their judgment aside and be simple agents of its will.


I think that flexibility is what puts laws as they exist (imperfect, maybe even bad) right next to the "ideal" social contract (good) in their mandate. With both of those in the job description, yeah, they ought to be good. That flexibility in enforcement and the judiciary can have bad unintended consequences (leaving the door wide open for the same action being legal for people in power and illegal for everyone else, or for bias to creep in), but on the whole we need it. You're totally right. Thanks for the insight.


I'm astonished at how lazy this comment is.


Sorry if that came across as lazy. I answered with my own opinion in more depth in another comment, but I didn't want to have niftich have to focus on details of my disagreeing point of view. I just wanted to know why they felt it was so plainly true that they stated it without justification, and I wanted to ask it without distracting the conversation. I was trying to respectfully say, "There are (hopefully) rational people out here who don't take that statement for granted. Can you explain?"


Companies feel the need to sell our data because they are offered tons of money for it by other companies such as Geofeedia. I agree that the blame does not rest on Geofeedia alone, but they aren't innocent either.

I don't know a good solution to this issue other than making it illegal to buy or sell information about users of public corporations' products.


The bigger issue is "why is it that companies can make more money selling personal data about consumers than selling products to consumers?" If companies had a viable business model that involved ordinary consumers paying money for services rendered, then their responsibility would be to the people paying them money, i.e. the consumer.

This is a really complicated question, and probably results from a combination of:

1.) Consumers not having disposable income at all, because real wages for a large slice of the population have been stagnant or falling for a generation and necessities like housing, education, and health care keep going up.

2.) Consumers having certain expectations of what they can get "for free", and not wanting to pay money for things that competitors will offer for free.

3.) Consumers having a poor understanding of the long-term consequences of putting data out there, and what it can be used for.

4.) Network effects where even if a small minority of consumers are aware of the above and would rather pay for products, they can't because a majority of the market would rather pay with information rather than money.

If you just ban the sale of information outright, you will either end up with work-arounds (eg. Google doesn't sell information, but it sells better ad-targeting and behavior manipulation based on that information), or consumers will lose access to services that they depend upon for much of their daily life.


I agree with most of what you said, but I don't think that banning the sale of user information is a problem for Google or similar business models.

I may not have a problem with Google collecting information about me and using it to target ads. I do have a problem with Google turning around and selling that information to just anyone. I think there is a distinction there that makes a difference. I may trust Google and choose to trust them with my information, but I don't necessarily trust their partners.


And frankly, if it's going to happen in a capitalist society, why aren't I getting my cut of the profits?


You made a lot of great points. In regards to:

"3.) Consumers having a poor understanding of the long-term consequences of putting data out there, and what it can be used for"

This is so true, and unfortunately there are few resources for helping people understand the implications and consequences. I think at the very least we should be educating young people, perhaps in school.

There's a decent documentary that came out a few years ago called "Terms and Conditions May Apply"; it's an independent release and so lacks some polish in places, but I think it's worth a watch:

https://www.youtube.com/watch?v=7HPw_hx7OME


> The bigger issue is "why is it that companies can make more money selling personal data about consumers than selling products to consumers?" If companies had a viable business model that involved ordinary consumers paying money for services rendered, then their responsibility would be to the people paying them money, i.e. the consumer.

Because advertising doesn't fulfill demand, it generates demand. It generates demand by preying on people when they are weak, when they are distracted, on what they are insecure about, and on what they are not knowledgeable about.


https://www.aclu.org/blog/free-future/facebook-instagram-and...

Instead of ArsTechnica, why not link to the ACLU's website?


> Instead of ArsTechnica, why not link to the ACLU's website?

I prefer journalism to the original source's post. The ACLU is an advocate for its cause; a secondary source can provide context, other sources, opposing points of view, etc. For example, compare a post by a political candidate, with its natural extreme bias, to a newspaper article about it, which includes fact-checking, expert analysis, their opponent's reaction, etc.

That's the reason Wikipedia prefers secondary sources for its citations and discourages primary sources.

EDIT: Fleshed it out a bit.


Thank you.

>Twitter did not provide access to its “Firehose,” but has an agreement, via a subsidiary, to provide Geofeedia with searchable access to its database of public tweets.

I really would like Twitter to share what search words were entered. It would give us a first-hand window on what government agencies are trying to monitor, usually secretly.


That's nice that Twitter is able to keep this kind of thing at arm's length by allowing resales of the firehose.


If the data is public, why is it a problem? If they're sending private info to cops, that's different, but a reasonable person expects all public info to be visible to the last person you want to see it. They're just feeding the customers the same data users can see, in a more condensed form, aren't they?


Probably because it violates the terms of service of the data access agreement. I would imagine those terms prohibit this use because it's the kind of use that discourages people from using the service the data is sourced from, and it is therefore in the service provider's interest to officially prohibit it and, at a minimum, to take action against misuse once it has become public.

The chilling effect of surveillance has commercial as well as liberty implications, and commercial entities have rather strong reasons to care about the former even if they aren't concerned with the latter.


This conflation is made too often. The problem with assuming that someone who made something public willingly doesn't care is that, if people knew they were under surveillance, they might not speak their mind, or speak at all.

If people who use the internet expect all their activities to be logged and mined for patterns that can potentially be used against them, they may behave very differently.

Since we all act in self-interest, this means no one will speak freely. This is what surveillance does; this is what the chilling effect is.

This reduces the value of the internet for the free exchange of ideas and discussion. People who seek this will have to resort to encryption and other tools, and will be forced to behave in a secretive fashion as if they were doing something wrong.

Protests and activism are supposed to be a normal part of every democracy, and it's disconcerting to see the strange discomfort our 'democratic societies' show toward activism, and the sheer amount of effort devoted to managing it, from infiltration to surveillance.


> If people who use the internet expect all their activities to be logged and mined for patterns that can potentially be used against them, they may behave very differently.

Unfortunately you see this already. Though usually people are worried about adversaries taking screenshots of conversations out of context and then painting them in a negative light, not the government doing that. But ultimately it's the same issue, there's a chilling effect on speech which is caused by the overwhelming fear of mass surveillance.


Mass databases, in a format that can be read and cross-referenced by computer, are an example where a difference in degree really makes for a difference in kind.


But almost all location data is "public". I could follow you around, staying in public places, maybe have a friend or two help me out. And hire an army of private investigators to do the same for a bunch more people.

But that doesn't scale, so much so that the difference in degree becomes a practical difference in kind. That's just an exaggerated version of the difference here. One of the most effective practical tools to combat bulk surveillance is to increase the cost.

I'm also of the opinion that people are kinda asking for this by giving that data away and posting it publicly, but that doesn't mean those already collecting this data have no responsibility if they package it up and make it easy to use for surveillance.


IMO the average person does not know enough about the freemium services they use to make an informed decision regarding their own privacy, and it's kept that way intentionally by companies like Facebook, LinkedIn, Apple, Google etc. Long terms and conditions that constantly change, user interfaces built to delay gratification until a box or button is checked (and then giving that checkbox a significance that goes well beyond common sense), planned obsolescence and deprecation that force migration to other services..


If something is public, it does not follow directly that you can collect it or share it with a third party. It all depends on the underlying legal framework, especially when additional spatial and temporal parameters are involved.


The firehose isn't public.


Access is not, but the content is.


Only if you follow everybody, which is impossible as a practical matter.


COTY.


The data may be public, but they get greater access to the entire data set than the general public. This greater access is what was cut off.

Also, most people aren't aware that this level of access is even possible, much less that it's being sold to the highest bidder. Which, in my opinion, is a problem.


The same reason ALPRs (with long retention periods) are a bad idea.

"No reasonable expectation of privacy in public" is the US norm today. I'm suggesting that maybe it should be revisited in light of the pervasive surveillance that has come into being.


I laugh at twitter playing stupid to what their information was being used for.

Government loves its surveillance. Try this, though: if you live next to a cop, FBI agent, or public official, place a high-res video camera on your property facing their house and leave it live streaming. Post it to a website with photos of all their family and children, their license plate numbers, the time they leave for work, etc. All things that are publicly available and captured with a simple video camera. Start having multiple people do the same, and collect it all on your website. Then watch how fast the hypocrisy rears its head.


It's interesting to note that in some cases the data was being fed through an intermediary... who's to say there aren't other integrations doing the same thing?


> ... The companies need to enact strong public policies and robust auditing procedures to ensure their platforms aren't being used for discriminatory surveillance."

Hm, is it okay to use it for non-discriminatory surveillance? Can there ever be surveillance that is not discriminatory, since the surveilling entity usually has no interest in surveilling itself and providing that data to the public?


From a story posted yesterday about the CIA crunching data to predict social unrest, it is fairly obvious the feds have access to social media firehoses:

https://www.engadget.com/2016/10/05/cia-claims-it-can-predic...


I find it strange that it would be perfectly OK to sell this data to a commercial entity, but it's not OK to sell it to the government.


I think that what the buyer does with the data has a lot to do with this. They would be in even more trouble if they had sold the data to a private mercenary firm who takes out contracts on business rivals, for example. Conversely, if they had been selling to, say, the U.S. Census Bureau or FEMA, they would likely be lauded.

It does say a lot about how little people trust the police these days, though.


As a non-native speaker this headline took me way too many attempts to parse correctly. For anyone wondering the same, it's meant to be read as:

ACLU exposes Facebook and Twitter for selling user data to a surveillance company

The way I read it the first few times I thought they sold user data of a company, for surveillance.


As a native speaker, it's a poorly written headline.

One term for this form of misleading / confusing sentence (and especially headline) is a "crash blossom".

http://www.quickanddirtytips.com/education/grammar/fun-with-...

https://en.m.wikipedia.org/wiki/Syntactic_ambiguity


I think in this case it's made worse by the wording "selling X Y" as a substitute for "selling Y to X" being relatively rare and in this case both X and Y being compound words (surveillance company and user data).

So not only is it a crash blossom in the sense that it's ambiguous whether it's [surveillance] [company user data], [surveillance company] [user data] or [surveillance company user] [data], but it could also be read as [surveillance company user data]. There are at least four possible ways to attempt to parse the sentence and the only one that makes sense relies on unusual phrasing.


I'm pretty sure I have seen other apps do similar things. The data they are feeding them does seem to be mostly public data. I'm not exactly sure how I feel about this, or how outraged / surprised I should be.


Just because the data is public doesn't mean it can be easily accessed or aggregated en masse for real-time surveillance.

If Geofeedia was only using public data, easily available to anyone else, then Twitter and Facebook wouldn't have been able to cut off their access.


"Public data" doesn't mean easily accessible.

Twitter sells access to their data. They can cancel that account, but the data is still public. Without the access contract it is inconvenient and against the terms of service to access in bulk.


I love that "user data sales" has become such a normal thing that when someone stops selling it to ONE company out of like several thousand, it's considered newsworthy.



Shame on Facebook! Shame on Instagram! Shame on Twitter! I consider this a despicable act on the part of these companies. They stopped this data selling only because they were caught. Using data internally to feed to algorithms to show targeted ads is one thing (though privacy wise it's bad). Selling data outright to other companies is a terrible state of affairs. It's sad that this won't make a big dent in the user base or make people flee these platforms in large numbers.



