"In this directory, we give a simple example of how to use the C++ Differential Privacy library.
Zoo Animals
There are around 200 animals at Farmer Fred's zoo. Every day, Farmer Fred feeds the animals as many carrots as they desire. The animals record how many carrots they have eaten per day. For this particular day, the number of carrots eaten can be seen in animals_and_carrots.csv.
At the end of each day, Farmer Fred often asks aggregate questions about how many carrots everyone ate. For example, he wants to know how many carrots are eaten each day, so he knows how many to order the next day. The animals are fearful that Fred will use the data against their best interest. For example, Fred could get rid of the animals who eat the most carrots!
To protect themselves, the animals decide to use the C++ Differential Privacy library to aggregate their data before reporting it to Fred. This way, the animals can control the risk that Fred will identify individuals' data while maintaining an adequate level of accuracy so that Fred can continue to run the zoo effectively.
The animals have implemented a CarrotReporter tool in animals_and_carrots.h to obtain DP aggregate data to report to Fred. We document one of these reports in report_the_carrots.cc."
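For readers who have not seen what "DP aggregate data" means in practice, here is a rough sketch of the idea using only the C++ standard library. It is not the library's CarrotReporter and does not use the library's API; the names `SampleLaplace` and `NoisyCarrotSum`, the per-animal cap, and the assumption that each animal contributes exactly one record are all mine. It clamps each animal's count to a cap, sums, and adds Laplace noise scaled to cap/epsilon. A plain floating-point sampler like this is fine for illustration but is not considered safe for real deployments.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical helper, not part of the C++ Differential Privacy library.
// Returns one Laplace(0, scale) sample as the difference of two independent
// exponential samples that each have mean `scale`.
double SampleLaplace(double scale, std::mt19937_64& rng) {
  std::exponential_distribution<double> exponential(1.0 / scale);
  return exponential(rng) - exponential(rng);
}

// Hypothetical sketch of a noisy sum of carrots eaten.
// Assumes each animal contributes exactly one count, so that after clamping
// to [0, max_per_animal] one animal can change the sum by at most
// max_per_animal. The noise scale is that bound divided by epsilon.
double NoisyCarrotSum(const std::vector<int>& carrots, int max_per_animal,
                      double epsilon, std::mt19937_64& rng) {
  double sum = 0.0;
  for (int count : carrots) {
    // Clamp so no single animal can move the total by more than the cap.
    sum += std::clamp(count, 0, max_per_animal);
  }
  const double scale = static_cast<double>(max_per_animal) / epsilon;
  return sum + SampleLaplace(scale, rng);
}
```

A smaller epsilon means more noise and less for Fred to learn about any one animal; a larger epsilon means a more accurate total. Calling `NoisyCarrotSum(counts, 100, 1.0, rng)` with a seeded `std::mt19937_64` would give a total that is typically off by an amount on the order of 100 carrots.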
Tech companies love to use that line, "We take privacy seriously."
That seriousness is certainly reflected in this example, which appears to compare users with zoo animals, tended to by a "farmer".
If Fred is anything like Google, he wants this per-animal carrot consumption data for reasons other than simply ordering more carrots.
This example makes privacy sound like some sort of resource allocation problem: what is the minimum number of carrots we must provide in exchange for animal data?
What if the animals are not "fearful that Fred will use the data against their best interest" but instead know that Fred is using the data for reasons other than ordering more carrots, profiting from that use, and not sharing any of the profits?
I was not making a general argument for profit-sharing; I was calling attention to the idea, presumed in the example, that users worry "they will use my data against my best interest". Obviously they will not use your data against you in a way that causes measurable injury (damages). If they did, you could sue them and potentially win. They are not that stupid.
However, they may use your data for purposes other than the one for which you allowed them to collect it. They will likely use the data to further their own best interest; they will not tell you exactly how they use it, nor will they cause you any injury. The only claim you potentially have is to the value of your data, which they utilise in their pursuit of profits.
You might not get a "share of profits", but you could claim the value of the data they obtained from you. If many users make the same claim, in the aggregate, that could be a substantial amount of data that carries a substantial amount of value.
The example also presents the animals as implementing DP to protect themselves from the farmer, but in reality the farmer collects the raw data and applies DP at his own discretion.
If users really did have control over the generation of their data and came together to aggregate it, then they would be able to do more than just apply DP. They would probably not just give the aggregate data away for free (or "exchange" it for something they already get for free, e.g., carrots).