Hey pling, I work on the OneDrive team. We definitely don't do the type of content scanning that you're describing. I wouldn't be comfortable with that. The only time we currently use file hashes for automated takedown is when known child pornography is re-uploaded to the service after being reported.
For copyrighted content, we have to respond to DMCA notices like other services. Sharing content to the public and getting reported by a third party is the only path for that. And in those cases, you definitely get a specific notice about the takedown. The web UI would also show you exactly which file was affected, and prevent you from sharing it again. It doesn't just delete files. (That would be unacceptable.)
Note that there's currently a 2 GB file size limit. It seems like the most likely explanation is that you put a large ISO in your SkyDrive folder, and it never succeeded in uploading because it exceeded the limit.
The file was 600 MB-ish. It uploaded successfully via the desktop client, then was later downloaded on another machine via the desktop sync thing. Later that evening it was gone. I didn't delete it (it wasn't in the recycle bin) and I confirmed I was signed in with the correct Live account. To be clear, it was still on the source machine but not the destination. No antivirus had quarantined it at either end.
That by elimination would suggest that either:
1. There is an unknown, unpublished reliability limit somewhere, or a synchronisation bug.
2. You're unaware of a process or a false positive.
I should have opened another account to test this against with the same file, but to be honest, if I found a bug, Windows Live support has been abysmal. Hell, they couldn't even work out how to close my account when the close account page refused to work...
Yuck. Sorry that you had a bad experience. It sounds like you hit a nasty sync bug on the destination machine. We've been patching a bunch of issues with client sync reliability over the last year. It's hard to diagnose at this point, but please reach out if you can reproduce it on the current version.
FWIW, I can say with confidence that your issue had nothing to do with the fact that the file was an Office ISO.
There are 3 processes: PhotoDNA hashing [1], automated flesh tone detection, and manual review.
1. PhotoDNA runs on every upload. It's only used to identify known child pornography that has already been reported, to make sure it can't be re-uploaded.
2. Automated flesh tone detection only runs when a photo is shared. (This is a change in policy; it used to run on upload.) There are heuristics that try to measure whether it's personal sharing or broad sharing, and we're continually improving those. The goal is to make flesh tone detection only run during broad sharing.
3. If the broad sharing criteria are met and automated flesh tone detection triggers a positive result, that is the only case in which an item is anonymously sent to manual review. It's some highly controlled clean-room environment where a dedicated team tries to determine whether the content is a legal risk or not. Clear cases of shared child exploitation porn are reported. (A parent's "baby in bathtub" type of photos are not the target here.) In most cases, it's adult pornography or family photos. In those cases, the folder is marked as porn and simply can't be shared again. (There's a user-visible message on the web UI.) It's not deleted, and it continues to be fully accessible to the owner across all machines.
The scanning policy used to be more aggressive and didn't exclude content that was unshared or only shared to a small set of people. None of us liked that policy to begin with, and then some high-profile false positives helped force the policy to be revised.
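To make the three steps above concrete, here is a rough sketch of the decision flow as described in this comment. It is only an illustration, not how the service is actually implemented: the `Photo` fields, the `Action` values, and the `on_upload` / `on_share` functions are all hypothetical names, and the real hash matching, sharing heuristics, and detector are reduced to plain booleans.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK_UPLOAD = "block_upload"    # hash matched previously reported content
    NO_ACTION = "no_action"          # nothing happens to the file
    MANUAL_REVIEW = "manual_review"  # item is sent anonymously to human review


@dataclass
class Photo:
    # All three fields are hypothetical stand-ins for the real checks.
    hash_matches_known_report: bool  # result of the hash lookup (step 1)
    broadly_shared: bool             # result of the sharing heuristics (step 2)
    flesh_tone_positive: bool        # result of the automated detector (step 2)


def on_upload(photo: Photo) -> Action:
    """Step 1: the hash check runs on every upload."""
    if photo.hash_matches_known_report:
        return Action.BLOCK_UPLOAD
    return Action.NO_ACTION


def on_share(photo: Photo) -> Action:
    """Steps 2 and 3: detection matters only for broad sharing."""
    if not photo.broadly_shared:
        # Unshared or personally shared content is not escalated.
        return Action.NO_ACTION
    if photo.flesh_tone_positive:
        # The only path that leads to manual review.
        return Action.MANUAL_REVIEW
    return Action.NO_ACTION
```

In this sketch, neither path deletes anything: the worst outcome on the sharing path is a review, which matches the claim that files stay accessible to the owner.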
I keep reading, either here or in Reddit AMAs, etc., that OneDrive lets users upload adult porn.
However, the terms that are linked to me at the bottom of OneDrive.com specifically tell me that uploading porn is not allowed and presumably (haven't double checked) tell me that if I do, my MSA will be deactivated.
It's nice to have you and co. tell me that you allow porn, but the fact that the terms I legally agree to contradict what you say sort of puts me in an uncomfortable position.
Have you thought about changing the terms of use to accurately reflect your policies?
Why don't you encrypt those pictures before letting them out of your control in the "cloud"? I would extend this suggestion to every other file, but those in particular kind of scare me, being available somewhere else without encryption.
Might be a dumb question ("have you tried restarting your computer?") but does the ISO come up through the web browser at onedrive.com? You only mentioned checking the sync folders.
They really need to increase the maximum file size to be competitive. As a comparison, Dropbox files can be up to 10 GB each[0], and Google Drive files can be up to 1 TB each[1].
It's disingenuous to say that Dropbox has a file size limit of 10 GB, when the first thing written on that page you cited says otherwise... The 10 GB limit only applies to files uploaded through the web interface. Files uploaded through the OS/mobile client have no file size restriction. [0]