For what it's worth, the most private data here is shared with analytics companies for Grindr's own analytical use. My guess is that Grindr's agreement with Apptimize and Localytics asks for the strictest possible protection of that data. If anyone at Apptimize or Localytics has access to that data, I'd be incredibly surprised.
This sort of deal isn't the same as sharing HIV status with Google or Facebook so that advertisers can target or exclude users based on that information for the purposes of advertising.
For people who think this is still wrong, I'm curious what their pragmatic alternative is. How else are app developers supposed to analyze their app performance? The open source, self-hosted pickings are slim. (I can only think of Piwik, which in my experience has a dated feature set and severe performance issues.) Not everyone can afford to perform their own product analysis. Using a third-party analytics saas is kind of the only way to go and seems like a reasonable tradeoff of security for product visibility.
As someone who has been working in security for a long time, and has seen how the sausage is made at even the biggest, most reputable companies who “take security very seriously”, the “strictest possible protection of that data” means approximately nothing. The only serious way to protect sensitive data is not to take it in the first place. Hell, not even the NSA can keep a lid on their sensitive data.
> For people who think this is still wrong, I'm curious what their pragmatic alternative is. How else are app developers supposed to analyze their app performance?
Remember, customers first, your “needs” come second. That goes double when they are placing their trust in you by allowing you to be a custodian of their data.
Not long ago, desktop software phoning home would have been a scandal. Not long before that, it was offline and couldn’t phone home. Yet, we still had software. Unfortunately, developers have taken the slippery slope all the way to outright abuse of their privileges in order to collect information that customers don’t know about or understand. This has led us to things like GDPR. It doesn’t matter if your intentions are good or your usage is benign. It isn’t yours to begin with, those aren’t your decisions to make, and developers need to learn to seriously respect that.
1) At least don't send any personal data over http. It's 2018 for fuck's sake. I can't believe there are companies out there with such a hand-wavy approach to this. Is it so hard to do https in this day and age? It's so basic wrt a security audit, my head hurts. The fact that extra data is sent over https shows that they made an active decision to partition this data into non-important/important.
2) Just don't fucking send it to a third party. Every single time you do that you yield control over the data and introduce another party to the mechanics, thus doubling the risk of disclosure, and then you cry 'breach of trust'.
> Not everyone can afford to perform their own product analysis.
Then don't do it and don't store sensitive information. You're taking on a risk, and if you don't have the money to roll your own analytics then you probably don't belong on the market. This is no longer a playground, this is the real world, especially for this kind of information. People can get killed based on Grindr leaks. It's a big boys' game, and if you don't have the backing, you shouldn't play in the first place. And this app specifically should not have any problems with funding, give me a break.
So not used for performance, but instead "A people-centered and personalized approach to app marketing and analytics". I am not sure if this is better or worse.
> My guess is that Grindr's agreement with Apptimize and Localytics asks for the strictest possible protection of that data. If anyone at Apptimize or Localytics has access to that data, I'd be incredibly surprised.
Honest question: are you in the SaaS analytics industry, or is there anyone else here who can comment on this? I am not (though I do do data work) and I would actually be surprised if the SaaS company _didn't_ have access to the data.
That would require some kind of dedicated setup so that Grindr's data was not at rest with other companies' data, which is a) super expensive, b) still no reason to expect that the SaaS company would not have access for maintenance/troubleshooting, and c) kind of defeats the purpose of using SaaS.
For startups of their sizes, it's unlikely they have strict data controls. So, probably anyone working on the product side of things, support, engineering, services, has access to their analytics data. Basically, most of the company likely has access to that data. Grindr really shouldn't be sending that data to their analytics providers.
Why would you ever default to an opt-out for that information? That's like saying "people should read the contracts" while waving about a 10,000-word EULA in 6-point type, or burying an option checklist so deeply that most users don't even know it's there.
“But the plans were on display…”
“On display? I eventually had to go down to the cellar to find them.”
“That’s the display department.”
“With a flashlight.”
“Ah, well, the lights had probably gone.”
“So had the stairs.”
“But look, you found the notice, didn’t you?”
“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.”
I don't think the distinction between "third party service" and "hosting company" is all that clear. You're sending data to a third party service when you host an app on AWS. The only data protection you have is contractual.
> You're sending data to a third party service when you host an app on AWS.
Amazon neither receives nor requires access to the raw underlying data (in this case: data in your database indicating HIV status, or decrypted bodies of requests sent over TLS indicating same) when you host your web services on AWS. While, yes, it's possible for a dedicated attacker to intercept and snoop on this data, it's (a) not easy, and (b) very much outside the scope of the relationship you have with them.
Contrast to the setup described here, where the third parties in question both received and required access to the raw underlying data in order to perform the services they were explicitly contracted for.
You may not think this is an important distinction, but legally, it is, and it makes a world of difference.
I honestly don't get the distinction you're making here. I understand how people _can_ use AWS without ever letting sensitive data touch their disks, but most apps hand everything over wholesale (and frequently in a nicely structured format on RDS).
The legal distinction you're making doesn't sound right to me. Contracts with companies that access your data aren't usually about whether or not an attacker can get at it, but about what kind of access an employee of the service itself has.
Amazon _technically_ has complete access to your data when you run on AWS, but they're contractually limited in how they can use it. The same goes for third party SaaS services. The major difference is "who writes the logic".
But I'm not a lawyer and won't ever have to argue that somewhere it matters.
Amazon is selling an abstraction, and goes to great expense to not have access to customer data. If you are a HIPAA covered entity, they sign a BAA that puts them on the hook.
It's like the difference between putting your papers in a storage locker versus your friend's garage. The storage company ultimately has access to the locker, but is less likely to snoop (either consciously or accidentally) than any of the folks with access to that garage.
AWS does not care about the data, does not want to see the data, and goes out of its way to make it damn hard for it to see the data. The data is a black box to them, and this is by design. You are not sending them the raw data in a format that they require for analysis. You are just sending them bits and bytes that they store for you.
The analytics third parties in this case are the exact opposite. They explicitly require access to the data in a certain format for analysis. In fact, their business fails if they don't have access to this data.
They are both technically third parties but the way they handle the data is completely different.
One has every incentive to avoid reading the data, the other has every incentive to hoover everything it can.
I just don't think that's a meaningful distinction. There's no distinct line between "company that hosts all your data but doesn't analyze it" and "company that does data analytics on your data". It's a gradient, there are all kinds of companies that fit on that gradient, and it's weird to lambast people for using those companies as if it's a technical choice, when what we really want is people making good choices about the data protections their providers have in place.
AWS even has analytics products that require access to your data. I generally trust those more than sketchy analytics companies, but it's entirely because of the contractual protections AWS has in place, not because they're inherently different.
What is the privacy distinction between a third party with a contractual agreement and an employee with a contractual agreement?
Remember that Russian intelligence got a spy hired by Microsoft: https://www.theguardian.com/technology/2010/jul/14/russian-s... Will your interview questions find a foreign spy, or someone who isn't even a spy but is interested in looking at private data for personal amusement?
If Microsoft implemented proper security policies, I imagine that guy didn't have access to all of Microsoft's user data.
So that would be the main difference. Virtually all of a company's employees shouldn't have access to user data at all, and those that do would only have access to parts of it.
>If Microsoft implemented proper security policies, I imagine that guy didn't have access to all of Microsoft's user data.
This is precisely the sort of thing Microsoft takes incredibly seriously internally. Tim Cook may be a more vocal spokesman for treating user data with care but Microsoft is fanatical about it internally. They recognize the risk they face in the event of compromise and have made just enough mistakes in the past to appreciate how hard it is to actually protect their customers’ information.
This might be a naive view, but I think companies that are good at this sort of segmentation will also be good at picking trustworthy third parties and limiting (both technically and legally) what they can do, and conversely, that companies that just send a bunch of sensitive data to third parties out of laziness have no meaningful internal controls either.
"Avoid third parties" is an occasional effect of conscientious care of data, not a cause of it.
The reasonable tradeoff, in this case, would be to continue using the third-party analytics saas, but exclude personally identifiable information, or at the very least, exclude this extremely sensitive information.
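To make that concrete, here's a minimal sketch of what excluding the sensitive fields could look like before an event ever reaches the analytics SDK. (The field names and the send_event call are hypothetical, not any real vendor's API.)

    # Hypothetical filtering of analytics payloads; field names and the
    # analytics client API are made up for illustration.
    SENSITIVE_FIELDS = {"hiv_status", "last_tested_date", "email", "precise_location"}

    def sanitize_event(properties: dict) -> dict:
        """Return a copy of the event properties with sensitive keys dropped."""
        return {k: v for k, v in properties.items() if k not in SENSITIVE_FIELDS}

    def track(analytics_client, name: str, properties: dict) -> None:
        # Only the sanitized payload ever leaves the app.
        analytics_client.send_event(name, sanitize_event(properties))

You still get the product signal ("user edited the status field") without the value itself ever being handed to a third party.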
I think it depends on your reasoning for not sharing data with third parties.
It seems like you're arguing that sharing data is wrong because, in the wrong hands, the data could be used to personally identify someone. In my mind, these are the ways that can happen:
1. The data is sent to an advertiser who can target based on that data. Seems possible, so it's relevant that this data isn't being shared with an ad firm.
2. The data is sent to a third party, whose employees can access and leak the data.
3. The data is sent to a third party, whose data gets compromised.
So the trade-off is: what is the value of having user information in a tool for analytics purposes, versus the chance that (2) or (3) (or some unknown) happens? My argument is that analytics firms are not in the business of leaking or selling data; their business hinges on their clients' data privacy. So to me, this seems like a reasonable trade-off for certain types of data.
As for whether HIV status is the type of data that's unreasonable... I can buy that argument either way. I've never used Grindr but I can imagine it being extremely relevant to its users. And any data that has product impact is useful in an analytics setting. For example, if Grindr has some features that make it easier for HIV-positive or negative people to filter, then they'd be interested in understanding whether it's being used in the product. Then again, I can equally see them deciding it's not worth the risk, and removing it.
If you think sharing sensitive data is wrong under all circumstances, on principle, then you're entitled to your beliefs, but that would seem to me awfully close to religion.
(Copy-pasting the message I already posted in this thread. Seems more relevant here)
I think that adoption of privacy preserving data aggregation/analysis will become the norm. The most immediate applications are 1) telemetry data that is used for monitoring (for example, Google Chrome uses differential privacy for collecting this data), and, 2) services like Google Maps and Tinder-like dating apps. In these applications, essential user information can be represented as integer/boolean values (is user present in location X? True or False. how old is the user? device CPU usage right now? ...)
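For the boolean case, the classic randomized-response trick (which the Chrome/RAPPOR work builds on) fits in a few lines. This is only a toy sketch of the basic mechanism, not RAPPOR itself:

    import random

    def randomized_response(truth: bool, p_honest: float = 0.5) -> bool:
        """With probability p_honest report the truth, otherwise answer at random.
        Any single report is deniable, but the aggregate rate is still recoverable."""
        if random.random() < p_honest:
            return truth
        return random.random() < 0.5

    def estimate_true_fraction(reports, p_honest: float = 0.5) -> float:
        # observed_rate = p_honest * true_rate + (1 - p_honest) * 0.5, so invert:
        observed = sum(reports) / len(reports)
        return (observed - (1 - p_honest) * 0.5) / p_honest

The estimate only becomes accurate with lots of reports, which is part of why this works well for telemetry but, as noted next, falls short when you need exact answers or protection against lying clients.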
Based on my limited understanding* of differential privacy, it falls short on exactness (of aggregate values) and robustness (against malicious clients). I've lately been studying the literature on function secret sharing and I think it is a better alternative to DP. Take this paper: https://www.henrycg.com/files/academic/pres/nsdi17prio-slide....
Prio: Private, Robust and Scalable Computation of Aggregate Statistics
Data collection and aggregation is performed by multiple servers. Every user splits up her response into multiple shares and sends one share to each server. I've understood how private sums can be computed. Let me explain it with a straw-man scheme.
Example (slide 26):
x_1 (user 1 is on Bay Bridge):- true == 1 == 15 + (-12) + (-2)
x_2 (user 2 is on Bay Bridge):- false == 0 == (-10) + 7 + 3 ...
If all users send shares of their data to the servers in this manner, AND as long as at least one server keeps the shares it received private, the servers can safely exchange the sums of the shares they hold. Adding those per-server sums together lets the servers learn how many users are on Bay Bridge without revealing any individual's response.
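A toy version of that straw-man scheme (plain additive sharing over the integers; the real Prio protocol works over a finite field and adds the SNIP proofs mentioned below):

    import random

    NUM_SERVERS = 3

    def split_into_shares(value: int, num_servers: int = NUM_SERVERS) -> list:
        """Split a 0/1 response into random shares that sum back to the value."""
        shares = [random.randint(-100, 100) for _ in range(num_servers - 1)]
        shares.append(value - sum(shares))
        return shares

    # Each user sends one share to each server (e.g. 1 == 15 + (-12) + (-2)).
    users_on_bridge = [True, False, True]          # private inputs
    received = [[] for _ in range(NUM_SERVERS)]    # shares held by each server
    for truth in users_on_bridge:
        for server_shares, share in zip(received, split_into_shares(int(truth))):
            server_shares.append(share)

    # Each server publishes only the sum of its shares; adding those sums
    # recovers the total count without exposing any individual answer.
    print(sum(sum(shares) for shares in received))  # -> 2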
This system can be made robust by using Secret-shared non-interactive proofs (SNIPs)**. This allows servers to test if Valid(X) holds without leaking X.
The authors also bring up the literature on computing interesting aggregates using private sums: average, variance, most popular (approx.), min and max (approx.), quality of regression model R^2, least-squares regression, stochastic gradient descent.
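To make one of those concrete: once you can compute private sums, the average and variance come straight out of basic identities (assuming each client submits shares of both x and x^2):

    def mean_and_variance(sum_x: float, sum_x2: float, n: int):
        """mean = sum(x) / n, population variance = sum(x^2) / n - mean^2."""
        mean = sum_x / n
        return mean, sum_x2 / n - mean ** 2

    # e.g. hidden values 2, 4, 6 -> sum(x) = 12, sum(x^2) = 56
    print(mean_and_variance(12, 56, 3))  # -> (4.0, 2.666...)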
Bottom line: I found the discussion on deployment scenarios very interesting. Data servers with jurisdictional/geographical diversity, app store-app developer collaborations for eliminating risk in telemetry data analysis, enterprises contracting with external auditors for analyzing customer data, etc.
* - I understand the randomized response and, to some extent, the RAPPOR technique (used for collecting Chrome telemetry data) but the other literature in that community goes over my head.
** - This technique is a black box to me at the moment.
If one is HIV positive, it would probably be a draw of the app to find only others who are also positive. Turning that feature off might even push some users toward decisions that are actually illegal, like failing to disclose their status.
Grindr wouldn’t be the only way to declare one’s STD status though. It could be omitted from one’s profile, but declared during chat, for example, or prior to hooking up.
Methinks you don't understand the problem space. Putting it in your profile is intended to save the afflicted from wasting tons of time and energy on (a) talking with people who will immediately nope out when they learn your status, and (b) dealing with a lot of emotional BS from people who want to see themselves as nice but who aren't really ready to deal with you and your situation.
That can be a hard enough conversation to have even if they know. Being straight up rejected by some high percentage of people who started to chat you up and are done the minute you mention HIV would be a dreadful experience. It's possible they are on the app precisely for the ability to pre-screen people for their willingness to hook up with someone HIV positive.
I imagine a lot of folks declared it in their profile without knowing it would be shared, and they probably declared it to try to have a more positive experience over having one dreaded discussion after another. I think those folks have reason to be upset.
I believe in certain places that if you withhold the information of being HIV positive from someone you have sex with and infect them, it's a pretty serious crime. I think it might even be a crime if you don't infect them.
Depends on the state. The majority of states either have laws relating to disclosing known STDs or laws specifically about disclosing known HIV status, or both.