
> with the only privacy guarantees being that the data is encrypted during transport, and a "promise" that they will run internal audits to make sure private data isn't released from their servers.

There's much more than that, including: privacy and security review before a study launches, a data minimization requirement, a sandboxed data analysis environment with strict access controls, and IRB oversight for academic studies.

> IMO this seems to provide worse privacy than even Google and Micro$oft's telemetry, which at least use differential privacy to make sure that each individual's privacy is somewhat protected (the data you send is randomised so even if the aggregator is compromised by a malicious third party (e.g. NSA) individuals have some degree of plausible deniability).

The vast majority of Google and Microsoft telemetry does not involve local differential privacy. Google, in fact, has almost entirely removed local differential privacy (RAPPOR) from Chrome telemetry [1].

We've been examining the feasibility of local differential privacy for Rally. The challenge for us—and why local differential privacy has limited deployment—is that the level of noise makes answering most (often all) research questions impossible.
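
To make that noise concrete: with classic randomized response (the simplest local DP mechanism, and the building block RAPPOR is based on), each client reports its true answer only with probability e^ε/(e^ε+1) and the analyst has to debias the aggregate afterwards. Here's a rough back-of-the-envelope sketch, not anything from Rally's codebase, with purely illustrative parameters:

    import math
    import random

    def randomized_response(true_bit: bool, epsilon: float) -> bool:
        """Report the true bit with probability p = e^eps / (e^eps + 1), else flip it."""
        p = math.exp(epsilon) / (math.exp(epsilon) + 1)
        return true_bit if random.random() < p else (not true_bit)

    def debiased_estimate(reports, epsilon: float) -> float:
        """Recover an unbiased estimate of the true proportion from the noisy reports."""
        p = math.exp(epsilon) / (math.exp(epsilon) + 1)
        observed = sum(reports) / len(reports)
        return (observed - (1 - p)) / (2 * p - 1)

    # Illustrative numbers only: 5,000 participants, 10% true rate, eps = 1.
    n, true_rate, eps = 5_000, 0.10, 1.0
    truth = [random.random() < true_rate for _ in range(n)]
    reports = [randomized_response(b, eps) for b in truth]
    print(debiased_estimate(reports, eps))  # typically off by 1-2 percentage points on a 10% signal

The standard error of the debiased estimate scales roughly like 1/((2p-1)·√n), so for small cohorts or rare behaviors the noise swamps the signal, which is exactly the problem for fine-grained research questions.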

[1] https://bugs.chromium.org/p/chromium/issues/detail?id=101690...




Have you thought about using central/global differential privacy (which tends to have much less noise) on the "high level aggregates" or "aggregated datasets" that persist after the research study ends?

E.g. from the FAQ: "We do intend to release aggregated data sets in the public good to foster an open web. When we do this, we will remove your personal information and try to disclose it in a way that minimizes the risk of you being re-identified."

It's a little worrying to think that this disclosure process might be done with no formal privacy protection. See the Netflix Prize, the AOL search dataset, Victoria's public transport data release, etc. for case studies of how informal attempts at anonymization can fail users.
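
To put a number on the "much less noise" point: in the central model a counting query has sensitivity 1 (adding or removing one person changes the count by at most 1), so Laplace(1/ε) noise is enough no matter how many participants there are. A rough sketch, with a hypothetical query and ε that Rally has not committed to:

    import numpy as np

    def dp_count(true_count: int, epsilon: float) -> float:
        """Central DP via the Laplace mechanism: a counting query has
        sensitivity 1, so Laplace(1/epsilon) noise suffices."""
        return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

    # Hypothetical aggregate: participants who visited a flagged news site during the study.
    print(dp_count(true_count=4_217, epsilon=1.0))  # noise typically within a few counts either way

The noise here is a constant handful of counts rather than something that grows with √n the way per-client randomization does, which is why the central model can answer questions the local model can't.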


> Have you thought about using central/global differential privacy (which tends to have much less noise) on the "high level aggregates" or "aggregated datasets" that persist after the research study ends?

Yes. Central differential privacy is a very promising direction for datasets that result from studies on Rally.

> It's a little worrying to think that this disclosure process might be done with no formal privacy protection. See the Netflix competition, AOL search dataset, Public Transportation in Victoria, etc. case studies of how non-formal attempts at anonymization can fail users.

I've done a little re-identification research, and my faculty neighbor at Princeton CITP wrote the seminal Netflix paper, so we take this quite seriously.


Interesting. I can see that RAPPOR seems to be deprecated in favor of something called UKM (URL-keyed metrics), but not why the change was made. Is there somewhere I can read more about it?


I am not aware of any public announcement or explanation. Which is... probably intentional, since Google is removing a headline privacy feature from Chrome.


How did you learn about it? By studying the code?


Our team looked closely at the Google, Microsoft, and Apple local differential privacy implementations when building Rally. It helped that we have friends who worked on RAPPOR.


Did you end up using differential privacy in Rally? What was the thinking behind that decision?



