That doesn’t need to be the case though with just a little bit of effort and minimal cost. Use your own domain for email and set your account to be a catchall. Then use facebook.com@yourdomain.tld and your email address is no longer a cross site unique identifier.
Isn't this it, though? The engineers designing the ad targeting system at Facebook are linking the random emails you use as "catch all" addresses to your main identity, so you can be targeted specifically even though neither party has full knowledge of the linkage between your catchall email and your main identity email. This is facilitated by information that is not under your control.
If Facebook was able to design and build this system, you can bet that other companies are doing this too.
Check the TOS and/or implementations for many of the tracking providers and you’ll see they use hashed emails. Show me a way to extract the common domain name from the below:
The simple way would be to use part of the hash for the domain and part for the user. If you alternated bits it wouldn't be obvious.
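A rough sketch of what that could look like, purely as an illustration. The scheme, the function names, and the choice of SHA-256 are all my assumptions, and it interleaves hex characters rather than individual bits to keep it short. Nothing here is taken from a real tracking provider.

```python
import hashlib

def split_hash(email: str) -> str:
    """Hypothetical scheme: half the token comes from the local part, half from the domain."""
    local, _, domain = email.strip().lower().rpartition("@")
    user_half = hashlib.sha256(local.encode("utf-8")).hexdigest()[:32]
    domain_half = hashlib.sha256(domain.encode("utf-8")).hexdigest()[:32]
    # Interleave the hex characters so the two halves are not visibly separate.
    return "".join(u + d for u, d in zip(user_half, domain_half))

def domain_part(token: str) -> str:
    """Recover the domain-derived characters (every second character)."""
    return token[1::2]

a = split_hash("facebook.com@yourdomain.tld")
b = split_hash("pizzaplace@yourdomain.tld")
print(a != b)                            # the tokens differ as a whole
print(domain_part(a) == domain_part(b))  # but the domain-derived half matches
```

Anyone who knows the interleaving can group tokens by domain, which is exactly the property a plain hash of the full address does not have.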
I doubt it'd be worth spending the effort to target people with personal domains though, and it would have some negative effects, so your point is well taken.
If the hashing algorithm is known (and my guess is that it is at least possible to reverse engineer it, if it isn't documented), then cracking a hash with a GPU may be quite feasible.
The hashing algorithm is well known: it’s unsalted md5/sha1/sha256. That doesn’t necessarily make it possible (in some cases yes, but not even most), let alone feasible, to rainbow table them.
It's pretty simple to crack hashes using rainbow tables unless each one is salted with a random, distinct salt, and if that is the case then these hashes seem pretty useless. So how do tracking providers use these hashes? What other info is sent along with the hash?
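To make the disagreement concrete, here is a minimal sketch of a precomputed lookup against unsalted SHA-256 hashes. It is a plain dictionary of guessed addresses rather than a true rainbow table (no hash chains), and the candidate names and domains are made up for illustration.

```python
import hashlib

def sha256_hex(email: str) -> str:
    # Assumes addresses are trimmed and lowercased before hashing.
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

def build_lookup(local_parts, domains):
    """Precompute hash -> address for every guessed local@domain combination."""
    table = {}
    for local in local_parts:
        for domain in domains:
            address = f"{local}@{domain}"
            table[sha256_hex(address)] = address
    return table

# Hypothetical guess lists: common names on common providers.
lookup = build_lookup(["john.doe", "jane.doe"], ["gmail.com", "hotmail.com"])

guessable = sha256_hex("jane.doe@gmail.com")
personal = sha256_hex("facebook.com@yourdomain.tld")
print(lookup.get(guessable))  # found, because the address was in the guess list
print(lookup.get(personal))   # None, because nobody guessed that domain
```

The lookup only works for addresses someone thought to generate, which is why a per-site address on a personal domain is much harder to reverse than a common name at a big provider.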
> Isn't this it though, the engineers designing the ad targeting system at Facebook is linking the random emails you use as "catch all" to your main identity so you can be targeted specifically even though neither party has full knowledge of the linkage between your catchall email and your main identity email.
If you use the method described in the grandparent, you use a unique email address for every site (e.g. site1@yourdomain.tld, site2@yourdomain.tld, etc.). The domain will be the common part, which would be very hard for a company to use because most domains are shared between many separate users.
This is no longer "just a little bit of effort and minimal cost" - most likely no one will use unique emails for every site as well as use private browsing mode permanently in order to avoid cross cookie / cross site contamination via 3rd party (non-Facebook) tracking. That tracking is cited as a "feature": it allows clients to bring their own ad tracking database and integrate it into the FB one in order to make ad targeting more specific.
> This is no longer "just a little bit of effort and minimal cost" - most likely no one will use unique emails for every site
It takes a tiny amount of effort: you set up your domain with a wildcard, so all you need to do to create a new email address is to use it. You could send mail to barkingcat@real.domain.for.394549.net right now, and it will be delivered to my inbox with no setup required.
It's also great in case you start spamming me. I don't have to struggle with your unsubscribe links, I can just blacklist all mail sent to barkingcat@real.domain.for.394549.net, and be done with it without any collateral damage.
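As a sketch of the idea, not of any particular mail server's configuration: a catch-all accepts every local part on the domain, and blocking one sender means blocking the one recipient address you gave them. The function and set names are hypothetical.

```python
CATCHALL_DOMAIN = "real.domain.for.394549.net"

# Recipient addresses that have been burned; everything else is delivered.
BLOCKED_RECIPIENTS = {"barkingcat@real.domain.for.394549.net"}

def should_deliver(recipient: str) -> bool:
    """Accept any address on the catch-all domain unless it was blocked."""
    recipient = recipient.strip().lower()
    local, sep, domain = recipient.rpartition("@")
    if not sep or not local or domain != CATCHALL_DOMAIN:
        return False
    return recipient not in BLOCKED_RECIPIENTS

print(should_deliver("newsite@real.domain.for.394549.net"))      # never set up, still accepted
print(should_deliver("barkingcat@real.domain.for.394549.net"))   # blocked after spam
```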
You mean a very small percentage of FB users do this?
The point being, as the parent comment said, it’s not “a little effort and minimal cost”. Figure a $10-15 overhead cost for the domain and maybe $5/month/e-mail account? Effectively, to minimize tracking on Facebook one would have to spend a minimum of $70/year?
It doesn’t seem like a great solution: go with a “free product” like Facebook in exchange for allowing them to collect and monetize your data, only to pay to combat their business model? May as well offer a competing service that doesn’t track you or collect/monetize your data, and pay, say, half the cost of a domain and email.
Sort of like how at first people thought paying for cable TV would mean that there would be lots of channels without ads. Didn't happen. Only a few where you get to pay even more for no ads. Now Netflix begins the cycle anew.
It is completely possible to fingerprint a browser and then group all the email accounts used on it and treat them as a single user. When was the last time you lent your device to someone so they could check their email?
>Then use facebook.com@yourdomain.tld and your email address is no longer a cross site unique identifier.
unless sites smarten up and realize facebook@johndoe.com is the same person as pizzaplace@johndoe.com, especially when johndoe.com isn't a "common" email domain like hotmail.com
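For what it's worth, the grouping itself would be trivial for anyone holding raw addresses. A minimal sketch, assuming a hand-picked (and far from exhaustive) list of shared providers and a hypothetical function name:

```python
from collections import defaultdict

# Illustrative list of shared providers; a real one would be much longer.
COMMON_DOMAINS = {"gmail.com", "hotmail.com", "yahoo.com", "outlook.com"}

def group_by_private_domain(addresses):
    """Group addresses by domain, skipping domains shared by many unrelated users."""
    groups = defaultdict(list)
    for address in addresses:
        _, sep, domain = address.strip().lower().rpartition("@")
        if sep and domain not in COMMON_DOMAINS:
            groups[domain].append(address)
    # Only domains seen more than once suggest a single person behind them.
    return {d: found for d, found in groups.items() if len(found) > 1}

seen = [
    "facebook@johndoe.com",
    "pizzaplace@johndoe.com",
    "alice@hotmail.com",
    "bob@hotmail.com",
]
print(group_by_private_domain(seen))
```

This only works on plain-text addresses. It says nothing about what is possible once only hashes are exchanged.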
I created a Facebook account with an unused email, without using my name or any other information, and they still recommended my friends, family and interests. Instagram did the same thing with my interests.
There's a lot more going on than linking email addresses.
Most marketing companies don’t share raw email addresses (they share md5/sha1/sha256 hashes of the emails instead). In that scenario, linking the common domain name currently ranges from very difficult to nearly impossible.
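A quick way to see why, assuming the address is simply trimmed, lowercased and hashed whole (the normalization step is my assumption; providers may differ):

```python
import hashlib

def hashed(email: str, algorithm: str = "sha256") -> str:
    """Hash the whole normalized address; no salt, as discussed in the thread."""
    normalized = email.strip().lower()
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()

first = hashed("facebook.com@yourdomain.tld")
second = hashed("pizzaplace@yourdomain.tld")

# Same domain, but the digests share no recoverable structure.
print(first)
print(second)
```

Because the digest covers the entire string, changing only the local part produces an unrelated-looking value, so there is no domain portion to compare.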
You can do it with Gmail to some extent already. E.g. instead of using myemail@gmail.com I would use myemail+facebook@gmail.com. Gmail ignores anything after the plus. As someone mentioned, marketing companies usually share just the hash of the email. The trick is not too popular, and I haven't yet seen a company handle it.
The vast majority of companies either don't accept the plus because they are too lazy to implement proper email validation, or they strip the pluses from Gmail addresses because they're strictly useless to them.
The "trick" is both popular and commonly made to be moot by programmers. Source: I know programmers at multiple companies that have written production code to strip the +suffix from the username portion of gmail addresses.
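The stripping code tends to be only a few lines. A minimal sketch of the kind of thing described, with a hypothetical function name and the assumption that only gmail.com and googlemail.com are treated this way:

```python
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}  # assumed scope for this sketch

def normalize_gmail(email: str) -> str:
    """Drop the +suffix from the local part of Gmail addresses only."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep:
        return email  # not an address; leave it untouched
    if domain in GMAIL_DOMAINS:
        local = local.split("+", 1)[0]
    return f"{local}@{domain}"

print(normalize_gmail("myemail+facebook@gmail.com"))  # myemail@gmail.com
print(normalize_gmail("name+website@domain.tld"))     # unchanged: not a Gmail domain
```

Once addresses are normalized like this before hashing, the +suffix no longer produces a distinct identifier.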
Agreed. This isn't done just for ad targeting, either. If a user invokes a GDPR right to be forgotten, it's useful to make sure you've found all the instances of that user's email address in your system regardless of the +additions.
I've typically been using name+website@domain.tld to distinguish email origins (and leakage). Ironically, I've already set up otherdomain.tld@privacy.domain.tld to hide registrar information, but hadn't thought of using it for day-to-day signups until now.
I think I'll extend the latter (and reduce the required Spam score) before it gets sent to my inbox.