
> exact matches

Not exact matches. Hashes. Hashes that were quickly shown to have collisions, which the company brushed off.




That's why they require that you reach a certain threshold number of matches before it's sent for human review. The threshold lets them take the per-image probability of a false collision, which they can estimate from data, and cap the probability of an overall false flag by requiring a certain number of these collisions. They've stated that the threshold, to start, would have been 30 (page 10 of https://www.apple.com/child-safety/pdf/Security_Threat_Model...). They claim that, given the probability of a false collision and the threshold they've set, the probability of your photos being falsely sent for human review is 1 in a trillion.


They mention a “very conservative false positive rate” - doesn’t 1/trillion imply that they used 1 / (1e12 ^ (1/30)) = ~40% as the false positive rate? If so, that does seem extremely conservative to me!
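(Quick back-of-the-envelope in Python, assuming all 30 matches would have to be independent false collisions:)

    # If the overall target is 1 in a trillion and all 30 matches had to be
    # false collisions, the implied per-image false positive rate would be:
    print((1e-12) ** (1 / 30))   # ~0.398, i.e. roughly 40%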


A 40% false collision probability would give an overall false flag probability of 1/trillion only if you had exactly 30 photos in your library, and thus all 30 had to be false collisions. The calculation gets a little more complicated if you have more, because you have to account for all the possible combinations of 30+ false collisions among N photos, for N > 30. I wrote out the calculation in a comment from when this was being discussed a few months back: https://news.ycombinator.com/item?id=28174822.

On page 10 of the paper I linked, though, they state that they assume a false collision probability of 1 in a million, which is more conservative than the 3 in 100 million false collisions they saw in their tests. The way they chose 30 as the threshold is based on the conservative assumption that every user's photo library is larger than the actual largest library. This is conservative because the more photos you have, the more likely you are to have collisions. Copying from my previous comment, we can back out their assumed photo library size by solving for N in this equation: 1/trillion = 1 - sum_{k=0}^{29} (N choose k) p^k (1 - p)^(N - k), where p is 1/million (the probability of a false collision).
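Here's a small Python sketch of that calculation (my own back-of-the-envelope, not Apple's code), assuming p = 1 in a million and a threshold of 30 as above; it scans for the largest library size that keeps the false-flag risk at or below 1 in a trillion:

    from math import lgamma, log, log1p, exp

    P = 1e-6         # assumed per-photo false collision probability
    THRESHOLD = 30   # matches required before human review
    TARGET = 1e-12   # "one in a trillion" overall false-flag probability

    def false_flag_prob(n_photos, p=P, t=THRESHOLD):
        # P(at least t false collisions among n_photos), i.e. the upper tail
        # 1 - sum_{k=0}^{t-1} C(n, k) p^k (1-p)^(n-k), summed in log space
        # for numerical stability.
        total = 0.0
        for k in range(t, t + 200):  # terms beyond ~t+200 are negligible here
            log_term = (lgamma(n_photos + 1) - lgamma(k + 1)
                        - lgamma(n_photos - k + 1)
                        + k * log(p) + (n_photos - k) * log1p(-p))
            total += exp(log_term)
        return total

    n = 1_000_000
    while false_flag_prob(n + 100_000) < TARGET:
        n += 100_000
    print(n)  # lands at a few million photos under these assumptions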


You are incorrectly assuming a non-adversarial environment. Swatting 2.0.


The problem with this argument is that the "adversarial environment" concern applies to an even worse degree to all cloud storage services that do the scanning in the cloud, since they have no threshold mechanism and no transparency about whether there is any human review whatsoever. You would still be reported to the police if someone hacked your Google Photos account and uploaded CSAM to it.


Accurate, but note that intent, as OP referred to it, is not the same as implementation. Fucking up doesn't mean you intended to fuck up.

With Google you can be absolutely sure that their intent is to eat all your personal information and data for short-term profit. With Apple it was "just" a stupid attempt at legal (over?) compliance.


That's the narrative that Apple's marketing department is selling, but I'm not buying it. The fact is that Apple devices send more data to Apple that you cannot turn off (without making your phone essentially useless) than Google devices send to Google.


Google's toggles are largely useless - you can "choose" to disable web and app tracking, but doing so intentionally disables or breaks most app features.

Want to update your Google Maps home/work addresses? Too bad, that requires web/app tracking to be enabled.


Unlike iOS, Android lets you use whatever maps app you like and set it to be the default handler for opening addresses. This includes maps apps that store the map data fully locally. Even better, when you get your location on Android, you do not have to send that location to Google. On iOS, no application can get your location without your location also being sent to Apple.

That "web and app tracking" applies to apps both on iOS and Android. The difference is that Android gives you more choice about what services you use.


They probably brushed them off because a malicious/accidental hash collision would lead to a human reviewing the images and then not going to law enforcement.


Or they will, depending on the reviewer, photo clarity, the current political climate, potentially location, and so on. You have no say in this process, nor does anybody else on this forum or elsewhere.

It's not law enforcement that's the main issue, but the various greedy three-letter agencies that are already well known to have the ambition to keep a profile on every person in this world (not unlike Facebook, but for different purposes).

This is not privacy anymore, no matter how you bend it; it has been cancelled, and Apple realizes this very well. And it still doesn't care. Privacy was literally the only serious selling point for many new buyers not invested in an ecosystem, and they're blowing it off with a nice double-barreled shotgun blast.


My understanding was that the reviewer gets an extremely compressed version of the image, not full resolution, likely because of privacy concerns given the potentially large rate of false positives.

I don't trust them not to jump to conclusions with a 256x256 (the exact quoted resolution escapes me at the moment) image at their disposal.


Thus the manual review. No one's going to be going to prison over a hash collision here.


But a manual reviewer in Cupertino or elsewhere still gets access to your personal (possibly very intimate or otherwise private) photos. Privacy from law enforcement is hardly the only privacy that people value.


If you desire privacy, never upload your images to any cloud service that doesn't offer true end-to-end encryption of the data (that is, one where they do not have the key). Use a service where data is only decryptable on your own devices or devices that you personally authorize. Which is, presently, none of the popular services that I'm aware of.
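For what it's worth, "only decryptable on your own devices" is straightforward to do yourself before uploading anywhere. A minimal sketch using the Python cryptography package (upload_to_cloud is a hypothetical placeholder for whatever storage API you use):

    from cryptography.fernet import Fernet

    key = Fernet.generate_key()   # keep this only on devices you control
    cipher = Fernet(key)

    with open("photo.jpg", "rb") as f:
        ciphertext = cipher.encrypt(f.read())
    # upload_to_cloud(ciphertext)  # hypothetical: the provider only ever sees ciphertext

    original = cipher.decrypt(ciphertext)  # only possible with the local key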


It's probably even the right choice for a popular service to have made.

Full E2E encryption is going to trigger nightmare "I lost all my photos" customer-service stories when people forget their passwords... which is acceptable when you deliberately signed up for a service where security was the selling point, but not great for someone who bought a mass-market phone.


Yep. See the perennial complaint about Signal as a demonstration of that. They don't persist your messages across devices on privacy/security grounds. That's fine, it's why I use it (or one motivation for me to use it). But it's contrary to what many people expect from that kind of service.


That's the issue with local scanning: even if you used an E2E-encrypted cloud for your photos, the encryption would be bypassed by the scanning on the device.


They would only have access to the photos that are being reviewed.

And you can choose between (a) someone having to see your photos or (b) relying on an automated but imperfect process. You have to pick one.


Uh, can't I choose not to have my private images scanned? I think that's still a choice, right?


It is, but it's perhaps incompatible with uploading your private images to a cloud service.


Of course. But the second you enable iCloud Photo Library and want to upload your private photos to Apple's servers, then you need to comply with their Terms & Conditions.

Which includes them scanning your photos for CSAM.


Not when using a commercial cloud service, no.


I used to work in the same building as a department with legal authorities (purposefully vague here), and the burnout rate was astronomical.

Good, decent people, waking up screaming, cold shakes, permanently damaged from what they could not unsee.

You couldn't pay me enough to go through images of such sickness.

Outside of all the yes/no, on/off phone stuff, how are they going to hire, and keep staffed, a department of people who have to look at this stuff?

How are they going to insure it?!


Right. Requiring exact matches for this kind of material is absurd, as a single-pixel change would foil any detection. So, practically speaking, everyone trying to detect it is going to use some form of hash algorithm. And every hash algorithm, by its nature, permits potential collisions and false positives. Which is why any sensible program will use a manual review process before pushing anything forward to law enforcement. Apple's system, requiring ~30 matches, means you'd have to have 30 or so false positives that also happen to look like CSAM to manual reviewers before a false case gets sent off to law enforcement.
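To make the hashing point concrete, here is a minimal sketch of the simplest perceptual hash (an "average hash") in Python. This is not Apple's NeuralHash, just an illustration of why a one-pixel edit barely moves the hash, and why unrelated images can also collide:

    def average_hash(gray):
        # gray: an 8x8 grid of 0-255 grayscale values (assume already downscaled)
        flat = [px for row in gray for px in row]
        mean = sum(flat) / len(flat)
        bits = 0
        for px in flat:
            bits = (bits << 1) | (1 if px > mean else 0)
        return bits  # 64-bit fingerprint

    def hamming(a, b):
        return bin(a ^ b).count("1")

    img = [[(17 * (r * 8 + c)) % 256 for c in range(8)] for r in range(8)]
    edited = [row[:] for row in img]
    edited[0][0] = (edited[0][0] + 3) % 256   # a "single pixel change"
    print(hamming(average_hash(img), average_hash(edited)))  # tiny Hamming distance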



