
  The analysis found a 98 percent correlation between the two datasets. 
  "That's 100 percent confidence level, it's our data," DeHart said. 
The numbers don't quite add up. That said, the hackers may have removed their own device data, which might account for (some of) the missing 2%.



That would be insanely stupid [#] for the attackers to do - especially as they claim the FBI have the original 2%.

  diff ours theirs | xargs fbi_arrest_warrant_generator.py
However, I cannot come up with a convincing reason why the 2% would be missing, or whether the 2% is extra data instead, which would raise even weirder questions.
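To make the "missing vs. in addition" distinction concrete, here is a minimal sketch of a two-way comparison. It assumes both lists are plain collections of UDID strings; the function name and the placeholder data are mine, and I have no idea how the "98 percent" in the article was actually computed.

```python
def compare_udid_sets(ours, theirs):
    """Compare two UDID collections in both directions.

    Hypothetical helper: 'ours' stands in for the company's records,
    'theirs' for the leaked list.
    """
    ours, theirs = set(ours), set(theirs)
    shared = ours & theirs
    return {
        "shared": len(shared),
        "only_ours": len(ours - theirs),      # records absent from the leak
        "only_theirs": len(theirs - ours),    # leaked records we do not have
        "pct_of_ours_in_theirs": 100.0 * len(shared) / len(ours) if ours else 0.0,
        "pct_of_theirs_in_ours": 100.0 * len(shared) / len(theirs) if theirs else 0.0,
    }


# Placeholder data: the leak is our list with two entries removed.
ours = [f"udid-{i:03d}" for i in range(100)]
theirs = ours[2:]
report = compare_udid_sets(ours, theirs)
```

With that placeholder data, 98% of our records appear in the leak, but 100% of the leak appears in our records. The same "98%" headline could just as well come from the opposite situation, which is why it matters which side the 2% sits on.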

edit: [#] that seems a bit aggressive, but I am not aiming at the parent post here; apologies if it reads badly. I think I mean these guys would make it onto an America's Dumbest Hackers TV special if that were the case.


The MD5 sum of the UDID list contains 1337 on purpose.

It seems to me that the easiest way to achieve this is either to randomly reorder the lines, which they didn't do, or to add/remove some bytes.

That could explain the 2% diff between the two files: brute-forcing the MD5 hash by removing random lines until they got a digest containing 1337.
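A rough sketch of that brute-force idea, purely as an illustration and not a claim about what the attackers did: drop a few random lines, hash what remains, and repeat until the hex digest contains the marker. The function name, the `drop_count` parameter and the synthetic input lines are all made up for the example.

```python
import hashlib
import random


def find_marked_subset(lines, marker="1337", drop_count=4,
                       max_tries=50_000, seed=0):
    """Remove random lines until the MD5 hex digest contains `marker`.

    Returns (kept_lines, digest, attempts), or None if no match is
    found within max_tries. Line order is preserved; only removal is used.
    """
    rng = random.Random(seed)
    for attempt in range(1, max_tries + 1):
        # Pick which line indices to drop on this attempt.
        dropped = set(rng.sample(range(len(lines)), drop_count))
        kept = [line for i, line in enumerate(lines) if i not in dropped]
        digest = hashlib.md5("\n".join(kept).encode("utf-8")).hexdigest()
        if marker in digest:
            return kept, digest, attempt
    return None


# Synthetic stand-in for a UDID list (not real data).
lines = [f"udid-{i:04d}" for i in range(200)]
result = find_marked_subset(lines)
```

A 4-character hex marker can start at any of 29 positions in a 32-character digest, so the chance per attempt is roughly 29/65536. That works out to a couple of thousand attempts on average, which is trivial, so removing lines would have been a cheap way to get a vanity digest.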


They also said the data was taken "in the past 2 weeks". Assuming that BlueToad's data changes daily or so, getting a 98% match to their "current" data seems reasonable. If the data changes in real time, then it might be impossible to find the exact moment at which the data matches 100%.


Or the 2% is the data added to or removed from their database since the break-in.


Why is my mind going to a strange reference about chimp DNA compared to human DNA?


As well as the usual "that's not how statistics works"



