The real source of Apple device IDs leaked by Anonymous last week (nbcnews.com)
192 points by ssclafani on Sept 10, 2012 | 64 comments



Matt Blaze, on Twitter:

@mattblaze: So, instead of being tightly held data given to the FBI by Apple, UDIDs are widely available to random app developers you've never heard of.

@mattblaze: And thanks to Anonymous, if the FBI didn't have that list of UDIDs before, they do now.


Matt always has a great perspective on these things. At one of his talks I attended he commented on the difference between the 'security' business and the 'intelligence' business and noted that they both depended heavily on obfuscation and misdirection. Prior to that I had never really connected them in that way, but in hindsight it seemed amazingly obvious. Interesting times indeed.


Matt Blaze is probably the most important computer scientist at the intersection of security & privacy; look at the relatively recent work his group did on wiretaps for a good example of why. He's not a grandstander.


I'm laughing my head off at all of those Apple haters and conspiracy theorists right now over the collective brain explosions when facts come out that Apple wasn't colluding with the government.


That's just what they want you to believe!!!

Typical government cover-up!


Make sure to read the actual details from David Schuetz's – @DarthNull – blog post (the dude who did the digging):

http://intrepidusgroup.com/insight/2012/09/tracking-udid-src...


Nice story.

It's a reminder how powerful the combination of simple tools and a little reasoning can be.

I once found myself by chance with a customer just as they got smacked with a DDoS attack that they were completely unprepared for.

The security folks threw up their hands claiming that they couldn't do anything to stop the attack due to certain random elements. The executives panicked and everyone started pointing fingers while their site went offline. It was chaos.

I asked to have a look at logs on a hunch that the "random" element wasn't entirely random. One line of awk, grep and uniq later it was revealed that roughly 85% of the attack could be mitigated with a trivial change at the edge.


Ah, exploratory data analysis to the rescue! Those "experts" can only use confirmatory tools.


Nice sneer, there. High horse much?

I'm not an expert in all the areas I'm expected to cover by any means. Some days, it's all I can do to not say "act of God, maybe?" I try to learn new things when I am afforded the time, but since we don't live in a CSI world, it's just not possible to always have the right tools, the right knowledge or the right answer at all times, especially when it comes to computer security.

Bruce Schneier ain't cheap and not everyone can afford the services of specialists when it isn't a common occurrence.


I'm sorry you were personally insulted. Were you one of the people involved in that incident?


... and the BlueToad blog post. Nothing of substance, mostly just a "sorry" and "we've fixed it".

http://blog.bluetoad.com/2012/09/10/statement-from-bluetoad-...


I wish they also said they were not collecting "other personal data as, full names, cell numbers, addresses, zipcodes", which the pastebin posters claimed was in the original file.

Unfortunately, BlueToad's statement leaves some wiggle room.

"BlueToad does not collect, nor have we ever collected, highly sensitive personal information like credit cards, social security numbers or medical information. The illegally obtained information primarily consisted of Apple device names and UDIDs – information that was reported and stored pursuant to commercial industry development practices."

Edit: The "98% correlation" leads me to believe the publicly posted info is the full extent of the leak.


Why? If 98% of the posted UDIDs are in their database then I could see them say "98% correlation"
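To make the distinction concrete, here is a minimal sketch in Python of one way a "98% correlation" could be computed. It assumes the figure means "share of posted UDIDs also found in the database"; BlueToad did not say how they measured it, and the function name is made up for illustration.

```python
def overlap_percent(posted, database):
    """Percentage of distinct posted UDIDs that also appear in the database.

    Assumption: "correlation" here just means set overlap, measured
    relative to the posted list.
    """
    posted_set = set(posted)
    if not posted_set:
        return 0.0
    return 100.0 * len(posted_set & set(database)) / len(posted_set)
```

Note that the measure is not symmetric: a database much larger than the posted list can still give a 98% match in this direction, so the number alone says little about how much was taken.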


Good to see they don't forget the obligatory, "we take information security very seriously," line.


They always seem to say that in retrospect, even though the evidence shows they obviously don't take their security very seriously. Otherwise, they wouldn't have had such a leak.


Incredible contrast to the following:

Another theory on the “FBI” UDID leak

http://news.ycombinator.com/item?id=4484547

http://www.marco.org/2012/09/06/udid-theory


After reading through that blog post it seems that David himself still has some serious doubts as to whether Bluetoad was the source of the breach.

Reading through his analysis, it almost seems that he may have fallen victim to log file pareidolia as he doesn't make it clear how a device named "Hutch" or one named "Paul’s gift to Brad" are anything more than coincidences in a very large data set.

Doing some quick analysis of the file shows that there is a UDID that has the alternate names "Hutch Hicken" (Bluetoad CTO), "Bluetoad Support" and "Customer Service iPad" among others, but could this also be representative of an older iPad that has been a hand-me-down through the company?

Somewhat more interesting and possibly more revealing are the UDIDs 'ffffffffffffffffffffffffffffffffffffffff' (occurring three times) and the small number of records not conforming to the field size and format of other records (UDIDs > 42 characters, no APNS, device/iOS version number as fourth field).

For anyone interested the following ugly and slow one-liner will print out a summary of non-unique UDIDs along with their APNS and names.

  perl -F, -lane '$a{$F[0]}{$F[1]}=$F[2]; END { foreach $k (keys %a) {next unless ~~ keys %{$a{$k}} > 2; print "\nUDID : $k"; foreach $d (keys %{$a{$k}}){print "\t-> $d : $a{$k}{$d}"} } }' data
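For anyone who doesn't read Perl, roughly the same grouping can be sketched in Python. This assumes, as the one-liner does, that records are comma-separated with the UDID, APNS token and device name as the first three fields. The function names and the `min_tokens` parameter are made up for illustration (the default of 3 mirrors the one-liner's "more than 2 distinct tokens" test), and unlike the Perl it skips records with fewer than three fields.

```python
from collections import defaultdict


def repeated_udids(lines, min_tokens=3):
    """Group records by UDID and keep UDIDs seen with several APNS tokens.

    Returns {udid: {apns_token: device_name}} for every UDID that appears
    with at least `min_tokens` distinct tokens.
    """
    by_udid = defaultdict(dict)
    for line in lines:
        fields = line.rstrip("\n").split(",")
        if len(fields) < 3:
            continue  # malformed record; the Perl version would not skip it
        udid, token, name = fields[0], fields[1], fields[2]
        by_udid[udid][token] = name
    return {u: t for u, t in by_udid.items() if len(t) >= min_tokens}


def print_summary(groups):
    """Print each UDID followed by its token/name pairs."""
    for udid, tokens in groups.items():
        print(f"\nUDID : {udid}")
        for token, name in tokens.items():
            print(f"\t-> {token} : {name}")
```

Something like `print_summary(repeated_udids(open("data")))` would then give output similar in shape to the one-liner's.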


My reading is that a single device appeared four times in the logs, twice named "Hutch" and twice named "Paul's gift to Brad".

If that is the case, it would be a pretty striking coincidence for it to be someone else.


Wow -- great read


Can we agree to stop using Anonymous as a collective noun? It's like saying "In other news, a site was vandalized by The Hackers."

All it means is that someone published something anonymously, with an intent to associate themselves with this larger collective. Maybe for those in the know, they can say that this particular hack was discussed in Anonymous' IRC channels or something. Better to say "an Anonymous" or "a hacker claiming allegiance with Anonymous".

Of course, this is how the media is being hacked. Unlike the devil, Anonymous' greatest trick was convincing the world that it did exist.


> Can we agree to stop using Anonymous as a collective noun?

Then, can we agree to stop using: "Scientists have just..." "Al Qaeda destroyed..." "Economists think that..." "Experts believe..."

Tabloid-level writing at best. This is wrong in so many ways since in each case, you assume:

- the group is a consistent entity

- the group has no face and all members think alike

- your ignorance about how that entity is actually working.

We should never use such expressions - being specific is the right way to write about opinions and facts.


IIRC, wasn't this leak made by the group AntiSec? IIRC, they consider themselves separate from Anonymous and have their own very odd manifesto having to do with exploit disclosure.


> a hacker claiming allegiance with Anonymous

But then we'd have to argue about what the word "hacker" means.

Seriously though, I'm actually quite OK with using "Anonymous" as a collective noun. Granted, it is a less accurate shorthand, but we all know what it means. For those who don't, I'm not sure using "a hacker claiming allegiance with Anonymous" would be any more enlightening. In fact, it may actually reinforce the mistaken idea that so many in the non-tech media still seem to have, which is that Anonymous is some kind of more or less traditional, hierarchical, online terror organisation.

"An Anonymous", yeah, maybe. Still seems needlessly confusing to the less knowledgeable. But I'm open to being convinced.


It's kind of cute when news sources do it, as if they're too shy to admit "so, we found this on 4chan the other day..."


Agreed. It seems that everyone thinks Anonymous is a contiguous group, as opposed to a bunch of loosely, if at all, connected basement dwellers doing things under the same banner.


> Can we agree to stop using Anonymous as a collective noun

You know Anonymous is a group, right?

Maybe you are thinking of anonymous? (lower case "a")


This is still somewhat consistent with Anonymous' story....

The chain of events could have been:

1. Blue Toad either gets hacked, or gives their data to the FBI or someone else.

2. Somehow this data ends up on an FBI agent's laptop.

3. Anonymous breaches the laptop and gets the data.

4. Anonymous sees all the UDIDs and mistakenly thinks, "Apple and the FBI must be in cahoots!", and publishes it.


My guess is somewhat blander.

1. Anon breaches laptop, finds UDIDs.

2. Anon tells other anons he got the UDIDs from a laptop.

3. Other anons tell more anons it was a government laptop.

4. Release group writes "FBI laptop" in their pastebin.

(5. ??? --> 6. Profit!)

The heterogeneity and disorder in Anonymous (at least it was like this back in the day) means that the chain from leaker to releaser -- usually passing through several people and IRC channels -- plays out a bit like a game of telephone. This serves to protect the leakers, but it can mess with some of the details.


The releaser of the data mentioned the name of a specific FBI agent and claimed the data had a specific file name containing the acronym of a non-profit organization set up to share data between private industry and intelligence organizations.

Details like that don't emerge over the course of a game of telephone. If this story is correct, and the data was not in the possession of the FBI, someone deliberately decided to make up an elaborate lie.


The FBI Agent (Christopher K. Stangl) appears in a recruiting video for cybersecurity experts. It is no sign of secret knowledge when his name is used.


Wasn't implying that it was.

It's entirely plausible that someone, disliking the FBI, made up an elaborate lie to discredit them.


It's entirely plausible that you work for the FBI and are trying to discredit talk of the FBI's involvement by proposing relatively poor arguments to the contrary.

See how ridiculous you sound?


Ha, yes, possibly more plausible.


This is assuming that you believe that Anonymous got a hold of an FBI agent's laptop


It's assuming they did what they said they did:

During the second week of March 2012, a Dell Vostro notebook, used by Supervisor Special Agent Christopher K. Stangl from FBI Regional Cyber Action Team and New York FBI Office Evidence Response Team was breached using the AtomicReferenceArray vulnerability on Java, during the shell session some files were downloaded from his Desktop folder

From here http://pastebin.com/nfVT7b0Z


> The chain of events could have been:

I don't think it is helpful to make excuses for Anonymous. If you'll allow me to abuse the ethical alignment terms: They appear to be a chaotic neutral, not a chaotic good. I'm not sure it makes much difference if they lied or are incompetent. Especially since there really isn't a specific 'they'.


  The analysis found a 98 percent correlation between the two datasets. 
  "That's 100 percent confidence level, it's our data," DeHart said. 
The numbers don't quite add up. Having said that, the hackers may have removed their own device data, which might account for (some of) the missing 2%.


That would be insanely stupid [#] for the attackers to do - especially as they claim the FBI have the original 2%.

  diff ours theirs | xargs fbi_arrest_warrent_generator.py
I cannot come up with a convincing reason why the 2% is missing, however, or whether the 2% is extra data on top of BlueToad's, which would raise even more weird questions.

edit: [#] that seems a bit aggressive, but I am not aiming at the parent post here, apologies if it reads badly. I think I mean these guys would make it onto America's Dumbest Hackers TV special if that were the case.


The md5 sum of the UDIDs list contains 1337 on purpose.

It seems to me that the easier way to achieve this is either by randomly modifying the order of lines, which they didn't do, or to add/remove some bytes.

It could be the 2% diff between the two files: brute-forcing the MD5 hash by removing some random lines until they got an MD5 digest containing 1337.
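A rough sketch of what that brute force could look like, in Python. Nothing here comes from the leak itself: the function names, the choice to drop a fixed number of random lines per attempt, and the seeded random generator are all assumptions for illustration.

```python
import hashlib
import random


def md5_hex(lines):
    """MD5 hex digest of the lines joined with newlines."""
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()


def drop_lines_until_marker(lines, marker="1337", n_drop=1,
                            max_tries=100_000, seed=0):
    """Try random subsets with `n_drop` lines removed until the MD5
    hex digest of what remains contains `marker`.

    Line order is preserved. Returns (kept_lines, digest), or None if
    no match is found within `max_tries` attempts.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        dropped = set(rng.sample(range(len(lines)), n_drop))
        kept = [line for i, line in enumerate(lines) if i not in dropped]
        digest = md5_hex(kept)
        if marker in digest:
            return kept, digest
    return None
```

A four-character marker can start at 29 positions in a 32-character hex digest, so each attempt hits with probability of roughly 29/65536 and a few thousand attempts would typically be enough. That is cheap, which is why a vanity digest on its own proves very little.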


They also said the data was taken "in the past 2 weeks". Assuming that Blue Toad's data changes daily or so, getting a 98% match to their "current" data seems reasonable. If the data changes "real time" then it might be impossible to find the exact time when the data matches 100%.


Or the 2% is the data added/removed from their database since the break in.


Why is my mind going to a strange reference about chimp dna when compared to human dna..?


As well as the usual "that's not how statistics works"


So Blue Toad doesn't feel it's their responsibility to contact the people they exposed? The individual publishers assumed the risk of working with Blue Toad so they are partially responsible, but Blue Toad isn't going out of their way to make people feel sorry for them.


It seems like Blue Toad's customers were intermediaries between Blue Toad and the final end-users of Blue Toad's apps. Much like the RSA break-in a few years back, it makes sense for Blue Toad to make a general announcement and leave the direct customer communication up to Blue Toad's customers (who own the direct relationship with the customers).

Of course, this brings up the question of why Blue Toad should have Personally Identifying Information about its customers' customers.


It provides an SaaS solution, and it had a security breach. If we compare their solution to Salesforce, is it Salesforce's (legal) responsibility to keep all of your customers' data encrypted and inaccessible to anyone except your company? Only if they provide an SLA saying that, or otherwise advertise that feature. Blue Toad never advertised that feature, and it wants to be invisible to its end users.


Legal responsibility? Probably not. Ethical responsibility? I think so. Think about if, say, Conde Nast got hacked and login information was leaked from all of their servers. What would I expect from Ars Technica, Wired, Reddit, etc? A link to a Conde Nast page with one unified statement, closing with something like "We at Ars and Conde Nast apologize...", "We at Wired and Conde Nast..." etc.

Even if Blue Toad wants to be invisible to end users, it's gone a little beyond that. Legally they only have to follow the fairly open-ended PII laws that are in place (I believe California is the only state to require they notify users of a breach), but ethically I believe their responsibility falls a little beyond that line.


What if AWS security was breached? Do you think they should email all Dropbox customers to inform them of the breach or leave that responsibility to Dropbox? (Dropbox is hosted on AWS.) I think that analogy is more accurate because Conde Nast actually owns the web properties you mentioned whereas BlueToad is an independent service provider.


The Conde Nast analogy breaks down since Conde Nast owns all those publishing properties while BlueToad is just a service provider to multiple, separately owned publishers. I think it's fairly reasonable to let the B2C entity be responsible for contacting the consumers.


BlueToad said they were able to confirm several of their own devices in the dataset which they go on to use as evidence that the dataset is their own. If antisec took this database from BlueToad why would they not trim out the BlueToad devices that could help confirm that this leak didn't come from the FBI? They trimmed out 11 million rows but left in the 19 used by the developer itself?


How would they do that? The data contained "Apple Device UDID, Apple Push Notification Service DevToken, Device Name, Device Type." None of those fields necessarily indicate who owns any of the devices.


This doesn't necessarily rule out the FBI as Anonymous's source, but it does cast a lot of doubt on their story.


It also doesn't rule out the aliens from Zarvox who may secretly control our government from the highest levels.


I'm glad someone brought this up! I've been concerned about this for years!


[deleted]


I may have said too much already.


[copied from dupe thread]

much of the argument against anonymous seems to rest on the march date. from the article:

  The discovery of the theft casts serious doubt on Anonymous’ claims that the data 
  came from the FBI, and was pilfered in March.

  "Timing-wise, (their) story doesn't make sense," he [the CIO] said.
but when you read the blog of the guy who tracked things down, he says:

  While searching, I stumbled on a partial password dump for the company! And it was 
  dated March 14, the same week that the hackers claimed they’d hacked into the FBI 
  computer. Suddenly, I felt a lot more confident again, and I mentioned this 
  connection in the email.

  He [the CIO] didn’t think the March leak (which they’d already been aware of) was 
  related, but that the rest of my findings were concerning.
so the only evidence against it being march is that blue toad are saying otherwise. against that, there's the original claim plus the known password leak. and blue toad are going to look bad if this was from march and they didn't notice (not that they look so great right now in any case).

again from the blog:

  I’m still not completely clear on all the technical details.
something doesn't seem right. the fact that the password dump comes from march seems like a huge coincidence, with no evidence against it, except the word of someone who may be motivated to deny that.


I don't see where it does either - until Blue Toad publishes information about the breach. But even then, as part of the breach investigation, that data may well have been provided to the FBI. That's one purpose of the NCFTA, AFAIK, to serve as a conduit during breach. I think it's irresponsible for journalists to say conclusively that one organization or other never touched the file when they don't have proof either way.

My original theory was that some ad service provider, analytics company, or app developer was working with the FBI on an investigation/attack and overzealously shared their customer's details (bad, bad--also horrific the amount of personal data collected, if Antisec's description of the original data set is true); but it could also follow that they were breached, hackers were circulating their database dump, and it was part of evidence in their investigation.


My point is to clarify the main thrust of the article. TLDR, if you will.

It sure looks like Anonymous lied. But they could still be telling the truth, if an FBI agent just happened to have Blue Toad's UDID list on his laptop. Which frankly doesn't sound as far fetched as alien overlords.


Do you have a point to make? What is it?


Now that's another story!

Kudos to Mr. Schuetz who went through all these UDIDs to find out what seems to be the truth, for once.


> The analysis found a 98 percent correlation between the two datasets.

Hate to say "I told you so" http://news.ycombinator.com/item?id=4473971 :)


So which apps were to blame?


Sorry for the OT, but for crying out loud, a professional journalist has no excuse at all for using "pouring" in place of "poring".


Blue Toad could be a front company for the FBI ...



