Privacy implications of email tracking [pdf] (senglehardt.com)
157 points by mpweiher on Sept 29, 2017 | 69 comments



Hello, I'm one of the authors of this work.

The code and data for the study are available here: https://github.com/citp/email_tracking (measurement platform here: https://github.com/citp/OpenWPM).

We also just released a blog post that highlights the main results (and is a quicker read): https://freedom-to-tinker.com/2017/09/28/i-never-signed-up-f...


Thank you very much for doing this work! I have recommended that my employer, who makes a popular email client, read this and consider adding these sorts of features.


Folks, every email client and service has an option to not auto-download images. Plenty have it on by default (MS Outlook, notably). All of them ought to (IMO), but it's better than nothing.

I warmly recommend you turn it on. There is no need to switch to text-only email clients just because of tracking.


That doesn't work for all clients. It usually will, but I have seen some make HEAD requests for inline images to try to read the expected download size and display it to the user. If this happens, the server obviously gets the full URL including tracking codes, so it has a fair idea the mail hit a valid mailbox.


Similarly, Gmail screws up its UX (as usual). If you ever make the mistake of clicking "Always display images from <...>", there is no turning back for that sender, as far as I can see at least. No "block images from this sender" button, no list in settings of all approved email senders that you can change your mind about, nothing.

The only option at that point is using the PixelBlock extension:

https://chrome.google.com/webstore/detail/pixelblock/jmpmfcj...


Apparently you can, but it's a bit hidden: https://webapps.stackexchange.com/a/103470


I could swear I checked that spot!

Still, would be nice if there was an overview in the settings page, and a possibility to wipe the entire whitelist at once


Does that help though?

I've used Airmail for a long time with "Autoload Remote Images" off and thought it worked. Then a couple of months ago I installed Little Snitch and saw the client contacting all kinds of places. I had to explicitly deny connections to all domains except my email provider's.


Last time I checked, Google's Inbox by Gmail did not include a way to disable image loading, which is why I won't use it.


All Gmail (and presumably Inbox) images are transcoded/downloaded via Google's image proxies. https://support.google.com/mail/answer/145919?co=GENIE.Platf...


Does that mean that if you even receive an email from a list, they get notified that you opened and read it, even if you didn't? If so, that seems bad to me.


Bad for whom? If spammers actually wanted to use that as an indication of a live account, they'd need to just give up and ignore all gmail.com addresses, which would be win for ordinary users.


No. From the link:

> senders may be able to know whether you've opened an email that has an image attached to a unique link

My limited understanding is that the load happens on first open, from Google's servers.


Yeah, without the ability to cookie them, which I guess is the gain from that.


Gmelius (https://gmelius.com) detects and blocks pixel trackers in Inbox.


The gmail app on the iPhone does not


Even if it did block them in the app, we have no way of knowing whether tracking pixel links are followed by Google when it scans/sorts your mail server-side.


You could test the principle by including a graphic in an email, send it to an account with images turned off, and then see if any HTTP requests come in, perhaps.
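A minimal sketch of that test in Python, under stated assumptions: the beacon URL, the addresses, and the helper names build_probe_email and token_was_fetched are all hypothetical, and you would need your own web server whose access log you can read. It only builds the message and checks log lines; sending it is left to whatever SMTP setup you have.

```python
import uuid
from email.message import EmailMessage


def build_probe_email(sender, recipient, beacon_base_url):
    """Build an HTML email whose only remote image URL embeds a unique token.

    Returns (message, token). beacon_base_url is assumed to point at a
    server you control, so its access log shows who fetched the image.
    """
    token = uuid.uuid4().hex
    beacon_url = f"{beacon_base_url}?id={token}"

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Remote image probe"
    # Plain-text part: contains nothing that can be fetched remotely.
    msg.set_content("Plain-text part: no remote content here.")
    # HTML part: a single 1x1 image carrying the token.
    msg.add_alternative(
        "<html><body><p>Probe</p>"
        f'<img src="{beacon_url}" width="1" height="1" alt="">'
        "</body></html>",
        subtype="html",
    )
    return msg, token


def token_was_fetched(token, access_log_lines):
    """Return True if any access-log line mentions the token."""
    return any(token in line for line in access_log_lines)
```

Send the message to the account with images turned off, open (or don't open) the mail, then feed the server's access log to token_was_fetched. A hit while images are supposedly disabled means something fetched the URL anyway, whether that is the client or a server-side scanner.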


https://www.emailprivacytester.com/

https://news.ycombinator.com/item?id=9237550

> mike-cardwell: a web app I wrote [...] Sends an email to you which checks to see how much stuff your email client is leaking


Interesting (to me and hopefully to others!) side note: this is why eve-online added the option to not autoload images in the forum.

As eve-online has a shit ton of spies (and some people take this game very seriously), a large alliance was using self-hosted images and logging the IPs that were viewing those images. Through matching posts they were able to scout out alternate accounts and catch spies.

That's how we all heard about it at the time anyway. While it's technically feasible, I don't have first-hand knowledge either way.

Although it did make me start to think about those emails with images and convinced me to turn them off by default ;-)


I simply stopped reading HTML emails and read everything in plain text. Works fine for most of the emails I receive, the ones that don't are almost always marketing emails anyway.

I do realise clicking tracking links still works but they are way more visible in plain text mode and it's not that hard to copy & paste the relevant part of the link.

Too bad most email clients have removed plain text rendering these days. I also haven't found a good mobile plain text email client yet.

MailMate has a really nice feature where you can read your mails in plain text by default and toggle the HTML version on a per-mail basis with a shortcut.


> Too bad most email clients have removed plain text rendering these days. I also haven't found a good mobile plain text email client yet.

eul [1] only has plain text rendering. It only supports Gmail right now, but a full-fledged email client is coming soon. A mobile app will be released in early 2018.

[1] https://eul.im


Sounds interesting. It looks like a one man show though. Is it open source/free software?


I use Thunderbird with automatic downloads disabled.

Same on GMail app.


Thunderbird with Simple HTML view is my favorite. It strips the remote content and fluff, while allowing simple things like bold and tables.


The only safe email is text-only email | https://news.ycombinator.com/item?id=15224199 (Sep 2017)

Not too many mentions of email clients except mutt and alpine.


It is a bit funny how some think they can track whether I open emails from them or not. Several times I have gotten an email saying something like "we see you are not reading the emails you get from us, so we will remove you from our list". I did read the emails using Thunderbird, so it was only their tracking that didn't work.

But this is maybe what we get when most people read email in web apps from companies that want to track everything everyone does and thinks so they can show the ads that are least relevant.


oh god I wish spammers did this. Imagine: "we notice you don't care about our spam advertising, we'll remove you from our list"

heaven.


I've gotten that but it is an unfulfilled promise. It was just another cheap trick to attempt to get me to click their link.


That's impressive. Using FOMO to get you to read something you don't want


We do get that in the UK


1) Removing inactive subscribers is a good thing. You stop bothering those who've stopped caring (but haven't bothered to unsubscribe) and pay less for your CRM/email software (if it charges by number of contacts).

2) Marketers should not rely on opens alone to determine who is inactive. Also look at clicks, and site activity from the past 6 months.


Gmail actually takes steps to protect privacy when loading images. Apple's mail apps are the ones that load images indiscriminately. Pretty sure that's for UX reasons not ads.


Not sure what you mean by, "Apple's mail apps are the ones that load images indiscriminately"? I've been using Apple's Mail.app for years on many different machines with many different accounts from different providers, and I have image loading turned off. I've never seen it load an image when I didn't ask it to. Furthermore, it sounds like Gmail will download all images from a mail and store them on Google's servers, which means they're triggering the tracking from those email providers!


Sure, but most people never change any settings. By default it loads all images from all senders


Google takes some steps, but still reports opens back to senders.


This is why I still love and use mutt (NeoMutt, technically) even after all these years.

I occasionally do fire up Thunderbird but only for those extremely rare cases where I actually do need to be able to read an HTML e-mail.


Surely there's a way to pipe an html email to $BROWSER from [neo]mutt?


Put this in your ~/.mailcap (all on one line)

text/html; elinks -no-connect -dump -dump-charset UTF-8 -dump-width 140 -default-mime-type text/html %s; needsterminal; copiousoutput;


Yeah, as jasonjayr pointed out, and it's done automatically for pretty much every HTML e-mail I receive. Occasionally, however, due to formatting or whatever, I need to be able to view the actual HTML e-mail "as it was intended" (so I fire up Thunderbird).


We still have a serious problem with mail client behavior. There is so much that clients could still do to add basic security, even though E-mail protocols are terrible.

For instance, why do we not see in every client a big warning at the top saying something like: “NOTE: YOU HAVE NEVER RECEIVED E-MAIL FROM THIS INTERNET LOCATION BEFORE.”? Heck, such messages should even be auto-quarantined to specific folders. It would go a long way to protect people from constantly opening spam.

And, why by default do they insist on making everything look “simple” and “clean” at the expense of helping users to do even the most basic validation? They show senders as short names like “Facebook” when CLEARLY the message is coming from facebook.spammer.com or whatever when you do even the slightest digging into the original message.

Why are “rules” so complex, since damn near everybody needs them for basic sanity? There ought to be a button in every message saying something like “Mark Every Future Message From This Sender as Junk”, and similar short-cuts.


The 'simplicity' is the same as Microsoft hiding things like file extensions, ostensibly to help less experienced users. It ended up making users have even less of a concept of file types, and made it easier for evildoers to disguise executables as photos and such.


> For instance, why do we not see in every client a big warning at the top saying something like: “NOTE: YOU HAVE NEVER RECEIVED E-MAIL FROM THIS INTERNET LOCATION BEFORE.”?

Because that is way too dangerous a policy. Recently, I moved, and in creating online accounts for online bill pay, I got confirmation emails from each of my utilities. Saying that they're spam just because you've never received email from them would cause most people to be unable to find these confirmation messages.

> And, why by default do they insist on making everything look “simple” and “clean” at the expense of helping users to do even the most basic validation? They show senders as short names like “Facebook” when CLEARLY the message is coming from facebook.spammer.com or whatever when you do even the slightest digging into the original message.

Uh, my email client doesn't do that. If the email address isn't priorly known, it shows the email address instead of the display name.

> Why are “rules” so complex, since damn near everybody needs them for basic sanity? There ought to be a button in every message saying something like “Mark Every Future Message From This Sender as Junk”, and similar short-cuts.

Most spammers don't reuse the same email addresses. You end up with a lot of useless rules. Bayesian spam filtering is much more effective, for example, and requires very little user action.
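For anyone unfamiliar with the idea, here is a toy sketch of Bayesian filtering in Python. It is an illustration only, not how any particular mail client or provider implements it: the class name NaiveBayesSpamFilter, the whitespace tokenizer, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy word-level naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message; label must be 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_log_odds(self, text):
        """Return log P(spam | text) - log P(ham | text), up to a constant."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        score = 0.0
        for label, sign in (("spam", 1), ("ham", -1)):
            total_words = sum(self.word_counts[label].values())
            # Smoothed prior for this class.
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            for word in text.lower().split():
                # Smoothed likelihood; the extra +1 reserves mass for unseen words.
                log_p += math.log(
                    (self.word_counts[label][word] + 1)
                    / (total_words + len(vocab) + 1)
                )
            score += sign * log_p
        return score

    def is_spam(self, text):
        return self.spam_log_odds(text) > 0
```

The "very little user action" part is the train call: each time the user marks a message as junk or not junk, the word counts shift, and no per-sender rule is ever created.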


> Saying that they're spam just because you've never received email from them would cause most people to be unable to find these confirmation messages.

It's not saying that they are spam. It's just saying that you never received a message from them. That account confirmation email you are expecting will be obviously marked, but that phishing email claiming to be from your bank will be marked too. You look at the mark and decide what to do.

Email clients probably don't do it because it is not as useful as it sounds. Impersonating email senders is not hard, so phishers will just do it.

> If the email address isn't priorly known, it shows the email address instead of the display name.

The only email client that I have ever seen doing that is the roundcube instance I configured on my VPS. I use several clients, nearly all of them either hide the sender address or decrease its relevance enough so that nobody sees them.

I'm in complete agreement with your comment about spam filtering. The only thing is that somehow, it feels like it worked better in the early 00s. Nowadays the training for your account will be dissolved in a huge set of unrelated data, so that anything specific to the spam you are receiving will never be reflected in the filter. That goes both for marking things as spam and as not spam.


I get where both of you are coming from, but there is one UX part of this I've found that is hard to solve.

Inexperienced users want to be told what to do. You can't just throw information or warnings at them without giving them a way to act on it.

Combine that with the fact that if the users even read the warnings they are going to only read a sentence at most, or just the first option.

So when you show a warning like "you have never received email from this address before" users are going to ask what they should do. Is this dangerous? Did it come from my bank? I've had this bank for years! Does this mean the email is a hacker!?

If you say "it can be dangerous, but it also can be just a new email" that will be read as "yes this is dangerous" and now they will learn the hard way that it is safe, and your warnings will have less weight in the future (they were wrong about this being "a hacker" once, they might be wrong this time too!)

It's a really hard problem to solve, and the "easy way out" is to not show the information at all (no confusion if you just don't show it!) But that kind of just kicks the can to the user leaving them to determine if an email is "good" or "bad".


If impersonating senders was so easy, phishers would be doing it. Yes anyone can lie in the From header, but any sane mail system would reject it as unverified


Have never really thought about this, but it's a great point. At the very least, mail clients should have options to enable this kind of behavior.


Yes you did - we all did. By not paying attention, by not speaking up, and by deferring to those who don’t give a shit.


At least we have some control over the exposures discussed in the paper. Very many companies are exposing (sensitive) information to third-party email delivery services, customer feedback and polling services, etc. Thus creating a privacy issue (and in some cases, a serious one) even before the email arrives at the recipients' server.


As others have pointed out, the best solution is to configure your email client(s) to your security preference but it's worth noting that some companies will give you the choice of plain text emails over HTML. I was pleasantly surprised to find Amazon (at least in the UK) offer this.


For those reading https://news.ycombinator.com/item?id=15354114 OSX mail client does not help.


OS X mail has an explicit option in the "Viewing" section of its preferences saying "Load remote content in messages. Email messages may contain images or content stored on remote servers." I think this option was there since Lion or even earlier.


The average user enables them; otherwise, newsletters or other mails are unreadable. It would be smarter if Apple could store certain content via IMAP.


Thunderbird blocks remote content in emails.


We found that Apple Mail clients typically load remote resources by default, unless the message is spam. That content can't set or retrieve cookies, which is an improvement over other standalone clients.

I'd still recommend disabling remote content by default since the tracking identifiers (the hash of your email address, etc) are present in the image URL. That's enough to continue to track the read and serve targeted content. See: https://web.archive.org/web/20170922213846/https://support.l...


What do you mean? I have blocked external content, is that not enough?


Actually, the paper says it does. It doesn't display inline content, I think by default.

The paper goes on to say that then you don't get to look at those e-mails. Yes. That's OK.


Doesn't Google immediately download any external images/assets in an email and cache them to prevent exactly this?


It's my understanding that they only download the assets when you first read the message. So it protects your IP address, but it doesn't protect the fact that you read the message.

[edit] Yes, I have just confirmed this by using emailprivacytester.com


So what you should do is have some sort of script running 24/7 that "reads" your emails the moment it's received.


You're right, I was misremembering the change.


It's not for your benefit - you are Google's product, and only they get to milk their own cows.


Gmail's image proxying actually won't help much with this style of tracking. In the paper we found email addresses (or hashes of them) leaking to third parties via a query string parameter in the request URL. So the proxied URL still contains your email address and the third parties can still learn you opened and read the email. As another commenter mentions, we found these requests to occur the moment you first open the email in the Gmail web interface. Since the request URL is unique to you, it can still be used to serve you targeted content. See: https://web.archive.org/web/20170922213846/https://support.l...

Actually I suspect image proxying will also interfere with request blockers like ABP or uBlock Origin, which may have otherwise blocked all requests to that third-party domain.
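To make the query-string leak concrete, here is a rough Python sketch of how one might check a single image URL for an email address or a hash of it. This is not the paper's detection code (that lives in the repository linked at the top of the thread); the function names, the normalization, and the choice of MD5/SHA-1/SHA-256 are assumptions made for illustration.

```python
import hashlib
from urllib.parse import parse_qsl, urlsplit


def email_fingerprints(email):
    """Return the normalized address plus a few common hex digests of it."""
    normalized = email.strip().lower()
    prints = {"plaintext": normalized}
    for algo in ("md5", "sha1", "sha256"):
        prints[algo] = hashlib.new(algo, normalized.encode("utf-8")).hexdigest()
    return prints


def find_email_leaks(url, email):
    """List (parameter_name, fingerprint_kind) pairs found in the URL's query string."""
    prints = email_fingerprints(email)
    leaks = []
    # parse_qsl already percent-decodes names and values.
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        value = value.lower()
        for kind, fingerprint in prints.items():
            if fingerprint in value:
                leaks.append((name, kind))
    return leaks
```

Because the identifier sits in the URL itself, it survives an image proxy unchanged: the proxy hides your IP address and cookies, but the third party still sees the same query string.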


This is particularly annoying, because if they just cached the image immediately upon receiving the e-mail, there would be no privacy impact AND you would get to see the image. (Yes, it might still send your e-mail address to third parties, but someone who has your e-mail address is free to send it to anyone they want without your help; there is no way to stop that.)

I guess we didn't need further evidence that Google cares more about third party marketers than users' privacy.


You can in any case disable automatic downloads and only see what you actually wish.


Gmelius (https://gmelius.com) detects and blocks pixel trackers within Gmail and Inbox. We'll soon release a way to prevent link tracking. You can learn more about this privacy feature at https://gmelius.com/gmail-block-trackers/


https://gmelius.com seems to block email trackers in Gmail.


Similarly, I've used the PixelBlock Chrome Extension and it works well. Very much a "set it and forget it" kind of tool.


I'm surprised there isn't an equivalent for Firefox. At least no obvious one I can find so quickly.



