Mixpanel analytics accidentally slurped up passwords

tekkk · on Feb 6, 2018

One more reason to use adblocker. The beauty of 3rd party scripts running on your website, hijacking people's passwords like it's no big deal.

I've used Mixpanel as a developer and have also been disturbed by how they combine analytics data with user profiles, creating really detailed profiles of what users do on a website. Even that wouldn't be such a big deal if it was obvious to the user, but really there is no law stopping companies from doing it. I hope GDPR will clear this up in the future, so that companies can't operate without the user's direct consent and opt-in becomes the standard instead of opt-out.

ryanwaggoner · on Feb 6, 2018

That seems crazy to me. It should be illegal for a website to track what you do on that site? We’re not talking about tracking activity across many sites, and you don’t need a third-party tool to do this at all. If it were illegal, a huge number of sites wouldn’t even function without fundamentally changing the way they work. You know that the HN database has records of all your votes, comments, etc, right?

I get the outrage at being tracked by third parties across the web, but that’s different from making it illegal for any website to track any of your activity on their site.

toomuchtodo · on Feb 6, 2018

It should be illegal for a site to store your personal data without your consent, which is what the GDPR dictates for EU residents. What is considered personal data is for governments, regulators, and courts to decide; not website owners.

The days of the Internet being the Wild West are coming to a close, and not a moment too soon.

Side note: it’s disturbing it took Mixpanel nine months to notice they were ingesting sensitive data. Turn on All The Ad Blocking.

ryanwaggoner · on Feb 6, 2018

I think your view of the GDPR is naive. 99% of people will just consent and that’ll be that. The second they want something the site offers and they have to register (the point at which they’re handing over PII), they’ll be forced to consent, and just like everyone does for cookies and TOS and privacy policy and whatever else, they’ll quickly be trained to consent and move on. I could be wrong, but that’s what I expect. The asymmetry here is incredible. That data and the consent have a perceived value of approximately zero to the consumer, and millions or billions in aggregate to these companies. They’ll figure out a way to get people to consent. Or most of the web just won’t be usable to you.

joshuakarjala · on Feb 6, 2018

According to GDPR you do not need consent to gather personal data which is reasonable with regards to the service you are providing.

What you do need consent for is gathering unneeded personal data or sending personal data to 3rd party providers for processing that is not essential to your service.

You are not allowed to deny people access to your site based on lack of this kind of consent.

Tharkun · on Feb 6, 2018

What are you basing that last paragraph on? It's my website, I'll damn well deny access to anyone I please.

mattmanser · on Feb 6, 2018

Perhaps you should read the parable of King Canute?

Here's roughly what you must comply with, if you're not blocking the whole of the EU.

https://ico.org.uk/for-organisations/guide-to-the-general-da...

ithkuil · on Feb 8, 2018

I'm not sure I understand what this means:

> Avoid making consent to processing a precondition of a service.

Does it mean I have to ensure my users can use the service even if I'm not allowed to "process" their data? I assume this must mean "processing" data for reasons not directly connected to the actual service. (E.g. using the data to gather business intelligence or sell it to third parties)

jacquesm · on Feb 6, 2018

It's your website but they are not your users. You can deny access to anyone you please but for those that you do allow access you're going to have to abide by the law.

TeMPOraL · on Feb 6, 2018

> What are you basing that last paragraph on? It's my website, I'll damn well deny access to anyone I please.

If you collect and process PIIs of EU citizens, the EU will do whatever it goddamn likes with you, which currently means some pretty high fines.

ryanwaggoner · on Feb 6, 2018

Only if they can enforce and collect them. I’m extremely skeptical that they can do so for companies with no ties to the EU. I suspect other countries will take a dim view of the EU attacking their sovereignty like that, and will probably just ignore it.

TeMPOraL · on Feb 6, 2018

> I think your view of the GDPR is naive. 99% of people will just consent and that’ll be that. The second they want something the site offers and they have to register (the point at which they’re handing over PII), they’ll be forced to consent, and just like everyone does for cookies and TOS and privacy policy and whatever else, they’ll quickly be trained to consent and move on

You're wrong.

You see, the "Cookie Law" was a test. Can a gentle legal nudge make the web self-regulate into respecting its users? It turned out that no, it can't. Remember, you only have to show a cookie warning if you're doing something inherently user-hostile. That almost every site has one only shows how little anyone cares. Well, the gloves are off, GDPR is the next iteration, designed to make people care. If you're doing something shady, GDPR will make you explain, in points, how exactly are you going to fuck the user over, and make you request explicit consent for each single fuckery you want to do - and does not let you make the service conditional on the user bending over. The user is supposed to be able to tell "no" to everything and still use the service.

Whatever pains this creates for businesses on-line, frankly, it's deserved.

wastedhours · on Feb 6, 2018

So, you put those points into an interstitial. All sites create a similar looking interstitial with the same "Accept" checkbox. It becomes user behaviour (a la invasive cookie notices), cycle continues.

I don't see how the informed consent bit won't become the same mindless box-ticking exercise web users already undertake?

You can "use" the service, but it'll be positioned that it's not an option but to accept.

TeMPOraL · on Feb 6, 2018

> You can "use" the service, but it'll be positioned that it's not an option but to accept.

I feel this time around, such a construct will open the company to legal liability, and there's plenty of people (myself included) who will gladly report shady practices like that. EU seems really into making consent actually meaningful, so while I can't be sure, I'm convinced the "mindless box-ticking exercise" scenario won't play out.

wastedhours · on Feb 6, 2018

Really? To be honest, although I'm a marketer, even as a consumer, I believe if I don't accept the parameters and terms of the application, I shouldn't use it. Obviously that philosophy is only really eligible for private sector organisations, but GDPR holds public ones to a higher bar in any case.

IANAL, but I also think a lot of companies will have a good argument for using "Legitimate interests" as a basis for processing instead for a lot of things.

Then others will argue on "Consent should not be a precondition of signing up to a service unless necessary for that service." and use "processing is necessary for the performance of a contract" as the basis (especially if there's money involved).

Obviously, the large companies will be able to handle this better and employ decent legal teams to argue for using those bases - not because they're "good actors", just they've got the cash and knowledge to argue better. It's the smaller business owners who (as with VAT MOSS) are going to be screwed over the most.

ryanwaggoner · on Feb 6, 2018

Again, I think you’re naive. The big tech companies and anyone with EU operations will comply, as they should since they are operating from those jurisdictions.

But other small and medium websites all over the world that don’t have EU operations will mostly just ignore this, as they should. It’s going to be nearly impossible to enforce for those companies, it’s a violation of national sovereignty, and half the regulations don’t even make much sense. How exactly are you supposed to avoid storing the personal information of EU residents when personal information includes IP address and that EU resident can be anywhere in the world? Or you have to delete the info if asked, but other laws require you to keep it? Now every medical facility in the world has to have entire new compliance procedure just in case an EU resident stumbles through their door?

Yeah, color me skeptical. You can pass any law you want, but it’s only as good as your ability to enforce it. If you think the US is going to help the EU collect multiple-million-dollar fines from some small business in Oklahoma, well, you’re going to be very disappointed.

We’ll see, but I doubt that in a couple years things will have changed all that much.

jacquesm · on Feb 6, 2018

The real punch of the GDPR will hit those that did not obtain consent and that end up being compromised losing data they shouldn't have had in the first place. That's a chair you really do not want to be sitting on when TSHTF.

I sincerely hope the fines will put a couple of companies out of business as a warning to the remainder.

WA · on Feb 6, 2018

> I think your view of the GDPR is naive. 99% of people will just consent and that’ll be that.

Maybe, but the GDPR also has much higher fines when it comes to a data breach. So there's quite an incentive for companies to care a little more about user privacy.

Angostura · on Feb 6, 2018

Have a quick look at this for an idea of how the GDPR changes what consent means https://www.hallaminternet.com/how-to-make-your-website-gdpr...

toomuchtodo · on Feb 6, 2018

I think you’re underestimating the intent of the EU and the aggressive enforcement available to them.

If we have to break the web, we break the web. Digital rights supersede tech profits.

stryk · on Feb 6, 2018

I have not much more than a cursory understanding of this new law, but holy shit do I hope somebody breaks the web. Fast. This is out of control with the tracking and the monitoring and every other website seems to pull in 14 or so different javascript analytics with who-knows how many 3rd party companies that you had no idea you were interacting with when you went to the original site.

I still don't understand how they plan to enforce this law worldwide, though. The EU is powerful, sure, but if I was China I might just tell them to piss off when they come knocking.

beojan · on Feb 6, 2018

> The EU is powerful, sure, but if I was China I might just tell them to piss off when they come knocking.

It's not really Chinese companies doing the tracking, it's American ones.

The EU is simply too big a market for them to ignore.

stryk · on Feb 6, 2018

I don't disagree, but China was just an example of a country that could feasibly tell them to get bent. What would they do about it? Trade sanctions? Most of the shit in Europe is made in China, too.

TeMPOraL · on Feb 6, 2018

They won't go against China per se. They'll go against individual companies through whatever trade agreements connect Europe and China. Maybe Baidu or Alibaba would be strong enough to use the Chinese government to turn this into a stalemate. But it probably wouldn't be worth the effort. I doubt they make that much money off wanton analytics abuse anyway. For every other company though, they either comply with GDPR or get out of the European market.

(The trick depends in part on companies wanting access to EU more than EU wanting a particular company to sell to its people.)

kangoo1707 · on Feb 6, 2018

How would startups gather BI information then?

tzahola · on Feb 6, 2018

Not my problem.

jacquesm · on Feb 6, 2018

By asking politely?

bluesign · on Feb 6, 2018

‘One more reason to use adblocker. The beauty of 3rd party scripts running on your website, hijacking people's passwords like it's no big deal.’

Yeah, but soon adblock adoption will reach critical mass and 3rd parties will adapt: we will start to see 3rd party tools integrated as 1st party (with subdomains etc.), and then adblockers will not be very effective.

TeMPOraL · on Feb 6, 2018

We can't win this war with technology alone.

It's good the GDPR is coming on the data collection front, because the only way you can stop this fuckery is through out-of-band threat of legal problems.

chopin · on Feb 6, 2018

This could backfire badly. I don't use an explicit adblocker but NoScript, which blocks almost as efficiently (and makes tracking more difficult). For the occasional site I allow 1st party script. If sites start to serve me malware this way, I will block them right away, too. I can't do business with you without Javascript enabled? I just move on to the next one. Or I go back to brick and mortar.

As well, serving malware 1st party may have some legal consequences if caught.

RoboTeddy · on Feb 6, 2018

Looks like Mixpanel is handling this well: they deleted the data, notified affected customers, and then are making sure it can't happen again. +1 for admitting responsibility instead of deflecting or minimizing.

spenvo · on Feb 6, 2018

Ok, they took responsibility... But they took about a month to notify clients via email and didn't notify the public until Techcrunch inquired for the purposes of writing this article. So -1 for not being upfront with end users and clients.

ajeet_dhaliwal · on Feb 6, 2018

Take away a few more for not noticing the problem for nine months.

snissn · on Feb 6, 2018

Would be a good move to put together a common password list and regularly check their data against it.
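A rough sketch of what such a check could look like, assuming the tracked events are available as plain dictionaries of property names and values. The password sample, the event shape, and the function name are all made up for illustration; none of this is Mixpanel's actual data model or tooling.

```python
# Hypothetical sample; a real check would load a much larger public list.
COMMON_PASSWORDS = {"123456", "password", "qwerty", "letmein", "hunter2"}


def find_suspect_properties(events):
    """Yield (event index, property name) pairs whose value is on the list."""
    for index, event in enumerate(events):
        for name, value in event.items():
            # Only string values can be compared against the password list.
            if isinstance(value, str) and value.lower() in COMMON_PASSWORDS:
                yield index, name


# Illustrative events: the second one carries a value that looks like a password.
events = [
    {"page": "/login", "button": "Sign in"},
    {"page": "/login", "attr__value": "letmein"},
]
suspects = list(find_suspect_properties(events))
```

A hit would only be a signal worth investigating, since someone can legitimately type "password" into a search box, and unique passwords would never match the list at all.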

perfectstorm · on Feb 6, 2018

They were notified on Jan 5th. They destroyed the stored passwords on the 9th and informed the customers on the 1st of February. It seems reasonable.

Why would they inform the end user though ? The end user doesn't even know what Mixpanel is and would be confused if they emailed them directly. Informing the clients (BMW, Samsung etc.) is the right thing to do which they did. I'm sure they had to do a postmortem and make sure a fix is in place before informing the clients and urging them to update the SDK.

anc84 · on Feb 6, 2018

> Why would they inform the end user though ? The end user doesn't even know what Mixpanel is and would be confused if they emailed them directly.

So that the end users can grasp to which extent their privacy was violated? If a third-party I never heard of contacted me and told me they got my login details, I would be bloody furious. And it is a good thing if people are enabled to do this.

givehimagun · on Feb 6, 2018

Weird that the title says 'accidentally' when other companies aren't given that kind of credit. Real talk, do we give internet companies (Google/Facebook) more leniency than traditional companies (Equifax/Target) in data leaks/hacks?

mbesto · on Feb 6, 2018

No, but if you've ever done security before, you would know that it's a matter of "not if I get hacked but when". Judging from their response, they seem to have pretty good procedures in place. Conversely, Equifax took months to respond and pulled a "it's not our fault".

IntronExon · on Feb 6, 2018

Google and Facebook have proven themselves about as trustworthy as a pair of starving animals, and frankly it took time and a lot of benefit of the doubt before plenty of people arrived at that conclusion.

In the words of Nixon they “Earned everything [they] got.”

MichaelRenor · on Feb 6, 2018

Am I missing something? Among the companies I would trust not to leak or expose my password (or incidentally collect it plaintext) google would be high on that list.

bigiain · on Feb 6, 2018

It wouldn't surprise me greatly to notice Google serving me targeted advertising linked to key words in my password...

"What're all these batteries and horses and staples doing coming up in my advertising???"

AFNobody · on Feb 6, 2018

They notified their customers, not the people whose passwords they collected.

They did minimize how many people they told. Lol.

BuildTheRobots · on Feb 6, 2018

Call me thick and naive if you will, but how does an advert manage to hoover up all typed inputs on a page including (somehow accidentally) passwords?

Isn't the underlying problem with the web browser that allows this, or are browsers just fundamentally broken when it comes to protecting users?

TeMPOraL · on Feb 6, 2018

Analytics analyze. They read data. Mixpanel decided to read more than they should, and they unintentionally slurped some passwords.

The core technical problem is that scripts can read data. But that's kind of their job. "Solving" that would require reducing the functionality of webpages.

The core social problem is that people include third-party analytics and advertisements. Solving that would require destroying the entire adtech industry.

I'm all for the latter solution.

bluesign · on Feb 6, 2018

Where will you draw the line? The main ad companies will then start to serve ads as 1st party from custom subdomains.

There is no way to win this fight except regulations and oversight.

chopin · on Feb 6, 2018

In this case, they can be made fully responsible. You'd better vet the Javascript you're going to deliver.

TeMPOraL · on Feb 6, 2018

Yes, I strongly agree. That's why I can't wait for GDPR to come to life this May.

nvr219 · on Feb 6, 2018

Always do ublock origin + privacy badger (and pi hole when you're at home)

tylersmith · on Feb 6, 2018

I set up a network-wide pi hole recently and it took less than 15 minutes once I had the rpi and sd card in hand. It's a fun and easy experience that I highly recommend to anybody. I check the dashboard every morning to see how much I've blocked, and to what. And my wife, who doesn't care about privacy for some insane reason, now has more privacy than she expected without any work.

bonestamp2 · on Feb 6, 2018

uMatrix is well worth the time it takes to use too. It shows you a matrix of exactly what each site is requesting. I've become so much more aware of what is happening on websites since installing the plugin about a year ago.

The great thing is that it's so quick and easy to enable/disable/customize too. For example, I've identified two scripts that block websites from loading when they don't get loaded... so I allow those to load on every website -- no ads, but the site still loads.

Mixpanel is blocked by default in the blacklist that I subscribe to.

faitswulff · on Feb 6, 2018

Are those two scripts widespread or specific to a site? And if they're quite widespread, would you be able to share how to identify them?

bonestamp2 · on Feb 6, 2018

Just saw one a few minutes ago. It's actually not even the script you need to allow, it's just an XHR request to adiode.com. I can't remember the other one, I'll report back here when I see it again (usually every couple days).

I saw the adiode.com one on https://www.merriam-webster.com if you want to have a look for yourself.

TeMPOraL · on Feb 6, 2018

You mean the "Something interfered with this website loading" fake error screen? Yeah, that "something" is their bloody script. Bastards.

I hit that one pretty much every other day somewhere; I wonder, how did it spread so quickly? It must be linked to by some popular ad network or analytics package.

bonestamp2 · on Feb 6, 2018

Yup. So if you allow the script to make that XHR request then it thinks the ad blocker is off and it will show the page but not the ads.

NicoJuicy · on Feb 6, 2018

A lot of people talk here about GDPR, but does anyone know how to "handle" this in my ecommerce?

I haven't found any indication on how to do it "legally" or a small "howto". I'm also not interested in an IBM solution that would cost me 10 times my earnings

jmickey · on Feb 6, 2018

In essence the GDPR requires you to be accountable about all personal data you collect and use in your daily operations.

Compiling a list of all personal data currently located in your systems and making sure you have a legal basis for each item, goes a long way towards compliance (though of course this is not all you need to do)

NicoJuicy · on Feb 6, 2018

So, let's say:

- WooCommerce on url xxx: phone number, email, name & address. Google Analytics & Facebook Pixel. Requirement for e-commerce fulfillment and analysing website performance / ad performance.

- Mailchimp : email and name, when accepting WooCommerce "Terms and conditions" n°2. Requirement for recurring ecommerce updates/changes of new products.

- OpenERP: ( invoicing - local network) - Firstname, lastname, address, email, phone, orders. Requirement for invoicing

Somehow I can't believe that would be sufficient.

jmickey · on Feb 6, 2018

It can get tricky. Analysing web performance is not a direct requirement to fulfill orders, so you would need an explicit opt-in from your customers that lets them agree to their personal data being used in this way.

With Mailchimp you probably need to let your customers separately opt into their e-mail being used for marketing purposes, as again that use is not strictly required to fulfil their order.

Same with any other information - your customers need to be aware of all the ways you will use their data. If any uses are not covered by a legal agreement, there needs to be an option to opt-in.
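The inventory idea discussed above could be sketched roughly like this. It assumes a deliberately simplified model in which the only question asked is whether a use is strictly needed to fulfil an order; the class, the field names, and the listed data fields are hypothetical, and the real regulation recognises more legal bases than this one flag.

```python
from dataclasses import dataclass


@dataclass
class DataUse:
    system: str             # where the data lives
    fields: tuple           # which personal data it holds (illustrative)
    purpose: str            # why it is collected
    needed_for_order: bool  # simplifying assumption: the only basis modelled


inventory = [
    DataUse("WooCommerce", ("name", "email", "phone", "address"), "order fulfilment", True),
    DataUse("Google Analytics", ("usage data",), "website performance analysis", False),
    DataUse("Mailchimp", ("name", "email"), "product update emails", False),
    DataUse("OpenERP", ("name", "address", "email", "phone", "orders"), "invoicing", True),
]

# Uses not strictly needed for the order are the ones that would need an opt-in.
needs_opt_in = [use.system for use in inventory if not use.needed_for_order]
```

In this toy inventory, the analytics and mailing-list entries end up on the opt-in list while order handling and invoicing do not, which matches the split described above.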

NicoJuicy · on Feb 6, 2018

Analysing performance is a requirement for more sales.

ryanwaggoner · on Feb 6, 2018

It’s not. GDPR is so overly broad that it’s almost impossible to be in full compliance. Not only does it count IP addresses as personal information, but it covers all EU residents, apparently no matter where they are in the world, even if you have no way of knowing they’re EU residents.

I really doubt most small-medium businesses without ties to the EU are going to pay any attention. Just like VAT actually.

jmickey · on Feb 6, 2018

Mind you an IP address is personal data only if it identifies an individual. Same goes with any other information.

ryanwaggoner · on Feb 6, 2018

That’s not what I’ve seen from my admittedly cursory research, but I don’t see how that matters anyway: how would you know if the IP address could personally identify an individual, so it seems like you would have to assume that it could?

btown · on Feb 6, 2018

> We immediately began investigating further and learned that the behavior the customer was observing was due to a change to the React JavaScript library made in March 2017. This change placed copies of the values of hidden and password fields into the input elements’ attributes, which Autotrack then inadvertently received. Upon investigating further, we realized that, because of the way we had implemented Autotrack when it launched in August 2016, this could happen in other scenarios where browser plugins (such as the 1Password password manager) and website frameworks place sensitive data into form element attributes.

Here's the commit in question: https://github.com/mixpanel/mixpanel-js/commit/98a1845c5c55f... - as referenced here: https://github.com/mixpanel/mixpanel-js/issues/164 .

The meat of it is that the check of the type of the <input> node to ensure it's not "password" or "hidden," now wraps the iteration over attributes as well as inclusion of the value.

That heuristic, though, is far from perfect: see, for instance, https://www.troyhunt.com/bypassing-browser-security-warnings... . And even if you're not doing something funky like that, you're not out of the woods. For instance, it's highly likely that a site collecting "secret answers" in plaintext input fields (for password resets) would leak that information to Mixpanel if Autotrack were turned on. This commit does nothing to change that scenario. And while Autotrack is now opt-in, it can be done entirely by a business team with zero interactions with engineering, if (for example) a tag management solution is set up: see https://mixpanel.com/blog/2015/03/27/community-tip-implement... and https://help.mixpanel.com/hc/en-us/articles/115004613366-Wha...
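To make the shape of that fix and its blind spot concrete, here is a minimal sketch. It is not Mixpanel's code (the real library is JavaScript working on DOM nodes); the function, the property names, and the dictionary model of an input element are all assumptions made for illustration.

```python
# Input types treated as sensitive by the heuristic described above.
SENSITIVE_TYPES = {"password", "hidden"}


def collect_properties(input_type, attributes, value):
    """Return the properties a hypothetical autotracker records for one input."""
    props = {"type": input_type}
    # The type check wraps BOTH the attribute loop and the value,
    # so nothing from a sensitive field is copied.
    if input_type.lower() not in SENSITIVE_TYPES:
        for name, attr_value in attributes.items():
            props["attr__" + name] = attr_value
        props["value"] = value
    return props


# A password field whose framework mirrored the value into an attribute.
password_props = collect_properties(
    "password", {"name": "pw", "value": "hunter2"}, "hunter2"
)

# A plaintext "secret answer" field passes the type check untouched.
answer_props = collect_properties(
    "text", {"name": "secret_answer"}, "my first dog"
)
```

With the check around the whole block, the password field contributes only its type, while the secret answer is still recorded in full, because the heuristic keys on the declared input type and not on what the field actually contains.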

SaaS analytics companies, especially those able to slurp up everything into a managed data lake and provide perfect retroactive analysis capabilities, provide a great experience compared to self-hosted solutions, but it's inevitable that you'll end up giving them more information than you'd originally planned. For many businesses, that's the right decision, but it should be one made with eyes wide open.

colmvp · on Feb 6, 2018

On a side note, Mixpanel used to host conferences but stopped after 2014. Anyone know why?