I think it's smart to start trying things here. This has infinite flaws, but from a business and learnings standpoint it's a step in the right direction. Over time we're going to both learn and decide what is and isn't important to designate as "AI" - Google's approach here at least breaks this down into rules for which "AI" things are important to label:
• Makes a real person appear to say or do something they didn't say or do
• Alters footage of a real event or place
• Generates a realistic-looking scene that didn't actually occur
At the very least this will test each of these hypotheses, which we'll learn from and iterate on. I am curious to see the legal arguments that will inevitably kick up from each of these - is color correction altering footage of a real event or place? They explicitly say it isn't in the wider description, but what about beauty filters? If I have 16 video angles, and use photogrammetry / Gaussian splatting / AI to generate a 17th, is that a realistic-looking scene that didn't actually occur? Do I need to have actually captured the photons themselves if I can be 99% sure my predictions of them are accurate?
So many flaws, but all early steps have flaws. At least it is a step.
One black hat thing I'm curious about though is whether or not this tag can be weaponized. If I upload a real event and tag it as AI, will it reduce user trust that the real event ever happened?
The AI tags are fundamentally useless. The premise is that it would prevent someone from misleading you by thinking that something happened when it didn't, but someone who wants to do that would just not tag it then.
Which is where the real abuse comes in: You post footage of a real event and they say it was AI, and ban you for it etc., because what actually happened is politically inconvenient.
And the only way to prevent that would be a reliable way to detect AI-generated content which, if it existed, would obviate any need to tag anything because then it could be automated.
I think you have it a bit backwards. If you want to publish pixels on a screen, there should be no assumption that they represent real events.
If you want to publish proof of an event, you should have some pixels on a screen along with some cryptographic signature from a device sensor that would necessitate at least a big corporation like Nikon / Sony / etc. being "in on it" to fake.
Also, since no one likes RAW footage, it should probably just be that you post your edited version, which may have "AI" upscaling / de-noising / motion-blur fixing etc., AND you can post a link to your cryptographically signed, verifiable RAW footage.
Of course there are still ways around that, like your footage could just be a camera pointed at an 8K screen or something, but at least you raise some serious hurdles and have a reasonable argument that the video is the result of photons bouncing off real objects hitting your camera sensor.
> If you want to publish proof of an event, you should have some pixels on a screen along with some cryptographic signature from a device sensor that would necessitate at least a big corporation like Nikon / Sony / etc. being "in on it" to fake.
At which point nobody could verify anything that happened with any existing camera, including all past events as of today and all future events captured with any existing camera.
Then someone will publish a way to extract the key from some new camera model, both allowing anyone to forge anything by extracting a key and using it to sign whatever they want, and calling into question everything actually taken with that camera model/manufacturer.
Meanwhile cheap cameras will continue to be made that don't even support RAW, and people will capture real events with them because they were in hand when the events unexpectedly happened. Which is the most important use case because footage taken by a staff photographer at a large media company with a professional camera can already be authenticated by a big corporation, specifically the large media company.
also the three letter agencies (not just from the US) will have access to private keys of at least some manufacturers, allowing them to authenticate fake events and sow chaos by strategically leaking keys for cameras that recorded something they really don't like.
For all the folks that bash the United States for "reasons", this one gave me a chuckle. Our handling of privacy and data and such is absolute ass, but at least we *can* hide our data from big government with little repercussion in most cases (translation: as long as you aren't actively being investigated for a crime that a judge isn't aware of).
Of course that says nothing about the issue of corruption of judges in the court system, but that is a "relatively" new issue that DOES absolutely need to be addressed.
(Shoot one could argue that the way certain folks are behaving right now is in itself unconstitutional and those folks should be booted)
Countries all over the world (EVEN IN EUROPE WITH THE GDPR) are a lot less "gracious" with anonymous communication. The UK, as an example, has been trying to outlaw private encryption for a while now, but there are worse examples from certain other countries. You can find them by examining their political systems; most (all? I did quite a bit of research, but also wasn't interested in spending a ton of time on this topic) are "conservative leaning".
Note that I'm not talking just about existing policy, but countries that are continually trying to enact new policy.
Just like the US has "guarantees" on free speech, the right to vote, etc., the world needs guaranteed access to freedom of speech, religion, the right to vote, healthcare, food, water, shelter, and electricity. I don't know of a single country in the world, including the US, that comes anywhere close to doing a good job of that.
I'm actually hoping that Ukraine is given both the motive and opportunity to push the boundaries in that regard. If you've been following some of the policy stuff, it is a step in the right direction. I 100% know they won't even come close to getting the job done, but they are definitely moving in the right direction. I definitely do not support this war, but with all of the death and destruction, at least there is a tiny little pinprick of light...
...Even if a single country in the world got everything right, we still need to find a way to unite everyone.
Our time in this universe is limited and our time on earth more so. We should have been working together 60 years ago for a viable off-planet colony and related stuff. If the world ended tomorrow, humanity would cease to exist. You need over 100,000 people to sustain the human race in the event a catastrophic event wipes almost everyone out. Even if we had 1,000 people in space, our species would be doomed.
I am really super surprised that basic survival needs are NOT on the table when we are all arguing about religion, abortion, guns, etc. Like really?
> We should have been working together 60 years ago for a viable off-planet colony and related stuff. If the world ended tomorrow, humanity would cease to exist. You need over 100,000 people to sustain the human race in the event a catastrophic event wipes almost everyone out.
We are hundreds of years away from the kind of technology you would need for a viable fully self-sustainable off-world colony that houses 100k or more humans. We couldn't even build something close to one in Antarctica.
This kind of colony would need to span half of Mars to actually have access to all the resources it needs to build all of the high-tech gear required just to not die of asphyxiation. And it would need top-tier universities to actually have people capable of designing and building those high-tech systems, and media companies, and gigantic farms to make not just food but bioplastics, and on and on.
Starting 60 years earlier on a project that would take a millennium is ultimately irrelevant.
Not to mention, nothing we could possibly do on Earth would make it even a tenth as hard to live here as on Mars. Nuclear wars, the worst bio-engineered weapons, super volcanoes - it's much, much easier to create tech that would allow us to survive and thrive after all of these than it is to create tech for humans to survive on a frozen, irradiated, dusty planet with next to no atmosphere. And Mars is still the most hospitable other celestial body in the solar system.
> Nuclear wars, the worst bio-engineered weapons, super volcanoes - it's much, much easier to create tech that would allow us to survive and thrive after all of these than it is to create tech for humans to survive on a frozen, irradiated, dusty planet with next to no atmosphere.
This is the best argument I've heard for why we should do it. Once you can survive on Mars you've created the technology to survive whatever happens on Earth.
> I am really super surprised that basic survival needs are NOT on the table when we are all arguing about religion, abortion, guns, etc. Like really?
Most people in the world struggle to feed themselves and their families. This is the basic survival need. Do you think they fucking care what happens to humanity in 100k years? Stop drinking that transhumanism kool-aid, give your windows a good cleaning and look at what's happening in the real world, every day.
I think doing this right goes the other direction. What we're going to end up with is a focus on provenance.
We already understand that with text. We know that to verify words, we have to trace them back to the source, and then we evaluate the credibility of the source.
There have been periods where recording technology ran ahead of faking technology, so we tended to just trust photos, audio, and video (even though they could always be used to paint misleading pictures). But that era is over. New technological tricks may push back the tide a little here and there, but mostly we're going to end up relying on, "Who says this is real, and why should we believe them?"
> If you want to publish proof of an event, you should have some pixels on a screen along with some cryptographic signature from a device sensor that would necessitate atleast a big corporation like Nikon / Sony / etc. being "in on it" to fake.
That idea doesn't work, at all.
Even assuming a perfect technical implementation, all you'd have to do to defeat it is launder your fake image through a camera's image sensor. And there's even a term for doing that: telecine.
With the right jig, a HiDPI display, and typical photo editing (no one shows you raw, full-res images), I don't think such a signature forgery would be detectable by a layman, or maybe even by an expert.
> I worked in device attestation at Android. It’s not robust enough to put our understanding of reality in.
I don't follow. Isn't software backward compatibility a big reason why Android device attestation is so hard? For cameras, why can't the camera sensor output a digital signature of the sensor data along with the actual sensor data?
I am not sure how useful verifying that a photo was unaltered after capture from a camera is, though. You could just take a photo of a high-resolution display with an edited photo on it.
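(The signing step itself is the easy part, for what it's worth. A minimal sketch of the idea, assuming an Ed25519 key provisioned into the sensor at manufacture; key protection, provisioning, and revocation are the hard parts, not the math:)

```python
# Minimal sketch: a sensor signs the raw frame it captured (hypothetical names).
# In a real device the private key would live in secure hardware, never in software.
from cryptography.hazmat.primitives.asymmetric import ed25519

device_key = ed25519.Ed25519PrivateKey.generate()  # provisioned at manufacture
device_pub = device_key.public_key()               # published by the manufacturer

def capture_and_sign(raw_sensor_bytes: bytes) -> tuple[bytes, bytes]:
    """Return the raw frame plus a signature produced inside the sensor module."""
    return raw_sensor_bytes, device_key.sign(raw_sensor_bytes)

def verify(raw_sensor_bytes: bytes, signature: bytes) -> bool:
    """Anyone holding the manufacturer's public key can check the pair."""
    try:
        device_pub.verify(signature, raw_sensor_bytes)
        return True
    except Exception:
        return False

frame, sig = capture_and_sign(b"...raw bayer data...")  # stand-in for real RAW output
assert verify(frame, sig)
```

None of that addresses the point-a-camera-at-a-screen problem, of course; it only ties the bytes to a device key.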
It's true that 1990s pirated videos where someone snuck a handheld camera into the cinema were often very low quality.
But did you know large portions of The Mandalorian were produced with the actors acting in front of an enormous, high-resolution LED screen [1] instead of building a set, or using greenscreen?
It turns out pointing a camera at a screen can actually be pretty realistic, if you know what you're doing.
And I suspect the PR agencies interested in flooding the internet with images of Politician A kicking a puppy and Politician B rescuing flood victims do, in fact, know what they're doing.
That's a freaking massive LED wall... with professional cinematography on top. If you believed my comment was intended to imply that I believed that's somehow impossible, well... you and I have a very different understanding of what it means to "just take a picture of a high-resolution display"...
There's been a slow march to requiring hardware-backed security. I believe all new devices from the last couple of years need a TEE or a dedicated security chip.
At least with Android there are too many OEMs and they screw up too often. Bad actors will specifically seek out these devices, even if they're not very technically skilled. The skilled bad actors will 0-day the devices with the weakest security. For political reasons, even if a batch of a million devices are compromised it's hard to quickly ban them because that means those phones can no longer watch Netflix etc.
But you don't have to ban them for this use case? You just need something opportunistic, not ironclad. An entity like Google could publish those devices' certificates as "we can't verify the integrity of these devices' cameras", and let the public deal with that information (or not) as they wish. Customers who care about proving integrity (e.g., the media) will seek the verifiable devices. Those who don't, won't. I can't tell if I'm missing something here, but this seems much more straightforward than the software attestation problem Android has been dealing with so far.
Cryptography also has answers for some of this sort of thing. For example, you could use STARKs (Scalable Transparent Arguments of Knowledge) to create a proof that there exists a raw image I, and a signature S_I of I corresponding to the public key K (public input), and that H_O (public input) is a hash of an image O, and that O is the output of applying a specified transformation (cropping, JPEG compression) to I.
Then you give me O, I already know K (you tell me which manufacturer key to use, and I decide if I trust it), and the STARK proof. I validate the proof (including the public inputs K and H_O, which I recalculate from O myself), and if it validates I know that you have access to a signed image I that O is derived from in a well-defined way. You never have to disclose I to me. And with the advent of zkVMs, it isn't even necessarily that hard to do as long as you can tolerate the overhead of running the compression / cropping algorithm on a zkVM instead of real hardware, and don't mind the proof size (which is probably in the tens of megabytes at least).
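For the curious, the relation being proven is small when you spell it out. A sketch in plain Python of the statement the prover has to satisfy (roughly what you'd compile into a zkVM guest program; the names and the crop-only transform are placeholders):

```python
# The statement proven in zero knowledge, written as ordinary Python for clarity.
# Public inputs: K (manufacturer public key) and H_O (hash of the published image O).
# Private witness: I (the raw image), S_I (the camera's signature over I), crop bounds.
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

def declared_transform(raw: bytes, start: int, end: int) -> bytes:
    # Placeholder for the agreed transformation (crop, JPEG compression, ...).
    return raw[start:end]

def relation_holds(K: ed25519.Ed25519PublicKey, H_O: bytes,
                   I: bytes, S_I: bytes, start: int, end: int) -> bool:
    try:
        K.verify(S_I, I)                      # I really was signed by the camera key
    except Exception:
        return False
    O = declared_transform(I, start, end)     # O is derived from I as declared
    return hashlib.sha256(O).digest() == H_O  # and O hashes to the public H_O
```

The verifier never sees I or S_I; they recompute H_O from the O they were handed and check the proof against K.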
Not if you do it, only if the chip also gives you a signed JPEG. Cropping and other simple transformations aren't an issue, though, since you could just specify them in unsigned metadata, and people would be able to inspect what they're doing. Either way, just having a signed image from the sensor ought to be adequate for any case where authenticity is more important than aesthetics. You share both the processed version and the original, as proof that there's no misleading alteration.
> You share both the processed version and the original, as proof that there's no misleading alteration
So you cannot share the original if you intend to black out something from the original that you don't want revealed (e.g., a face or name or something).
The way you specced out how a signed jpeg works means the raw data _must_ remain visible. There's gonna be unintended consequences from such a system.
And it ain't even that trustworthy - the signing key could potentially be stolen or coerced out, and fakes made. It's not a rock-solid proof - my benchmark for proof needs to be on par with blockchains'.
> The way you specced out how a signed jpeg works means the raw data _must_ remain visible. There's gonna be unintended consequences from such a system.
You can obviously extend this if you want to add bells and whistles like cropping or whatever. Like signing every NxN sub-block separately, or more fancy stuff if you really care. It should be obvious I'm not going to design in every feature you could possibly dream of in an HN comment...
And regardless, like I said: this whole thing is intended to be opportunistic. You use it when you can. When you can't, well, you explain why, or you don't. Ultimately it's always up to the beholder to decide whether to believe you, with or without proof.
> And it ain't even that trustworthy - the signing key could potentially be stolen or coerced out, and fakes made.
I already addressed this: once you determine a particular camera model's signature ain't trustworthy, you publish it for the rest of the world to know.
> It's not a rock-solid proof - my benchmark for proof needs to be on par with blockchains'.
It's rock-solid enough for enough people. I can't guarantee I'll personally satisfy you, but you're going to be sorely disappointed when you realize what benchmarks courts currently use for assessing evidence tampering...
It also occurs to me that the camera chips -- or even separately-sold chips -- could be augmented to perform transformations (like black-out) on already-signed images. You could even make this work with arbitrary transformations - just sign the new image along with a description (e.g., bytecode) of the sequence of transformations applied to it so far. This would let you post-process authentic images while maintaining authenticity.
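Roughly, each trusted step would sign the new image hash, the previous record, and a description of what it did, forming a chain. A toy sketch (hypothetical structure; the transform description is a plain string here rather than bytecode):

```python
# Toy provenance chain: each trusted processing step signs the new image hash,
# the previous record, and a description of the transform it applied.
import hashlib, json
from cryptography.hazmat.primitives.asymmetric import ed25519

def sign_step(step_key, prev_record, image: bytes, transform: str) -> dict:
    record = {
        "image_sha256": hashlib.sha256(image).hexdigest(),
        "prev_sha256": prev_record["image_sha256"] if prev_record else None,
        "transform": transform,  # e.g. "capture" or "blackout rect (10,10)-(50,50)"
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = step_key.sign(payload).hex()
    return record

camera_key = ed25519.Ed25519PrivateKey.generate()  # lives in the sensor
editor_key = ed25519.Ed25519PrivateKey.generate()  # lives in the editing chip

raw = b"raw sensor data with a face in it"
redacted = raw.replace(b"a face", b"XXXXXX")       # stand-in for a blackout

step1 = sign_step(camera_key, None, raw, "capture")
step2 = sign_step(editor_key, step1, redacted, "blackout region")
# A verifier walks the chain: check each signature, check each prev_sha256 link,
# and decide whether they trust the keys that performed each step.
```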
AI tags are to cover issues in the other direction: you publish an event as real, but they can prove it wasn't. If you didn't put the tag on it, malice can be inferred from your post (and further legal proceeding/moderation can happen)
It's the same as paid reviews: tags and disclaimers exist to make it easier to handle cases where you intentionally didn't put them.
It's not perfect and can be abused in other ways, but at least it's something.
> The premise is that it would prevent someone from misleading you by thinking that something happened when it didn't, but someone who wants to do that would just not tag it then.
And when they do that, the video is now against Google's policy and can be removed. That's the point of this policy.
Not convinced by this. Camera sensors have measurable individual noise; if you record RAW, that won't be fakeable without prior access to the device. You'd have a straightforward case for defamation if your real footage were falsely labeled, and it would be easy to demonstrate in court.
> Camera sensors have measurable individual noise; if you record RAW, that won't be fakeable without prior access to the device.
Which doesn't help you unless non-AI images are all required to be RAW. Moreover, someone who is trying to fabricate something could obviously obtain access to a real camera to emulate.
> You'd have a straightforward case for defamation if your real footage were falsely labeled, and it would be easy to demonstrate in court.
Defamation typically requires you to prove that the person making the claim knew it was false. They'll, of course, claim that they thought it was actually fake. Also, most people don't have the resources to sue YouTube for their screw ups.
> Moreover, someone who is trying to fabricate something could obviously obtain access to a real camera to emulate.
Yes, but not to your camera. Sorry for not phrasing it more clearly: individual cameras have measurable noise signatures distinct from otherwise identical models.
On the lawsuit side, you just need to aver that you are the author of the original footage and are willing to prove it. As long as you are in possession of both the device and the footage, you have two pieces of solid evidence vs. someone else's feels / half-assed AI detection algorithm. There will be no shortage of tech-savvy media lawyers willing to take this case on contingency.
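For anyone unfamiliar, the noise-signature claim refers to PRNU-style sensor fingerprinting. A toy sketch of the idea (real forensic pipelines use wavelet denoising and proper statistical tests, not a Gaussian blur and a correlation threshold):

```python
# Toy PRNU-style sensor fingerprinting with numpy/scipy.
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img: np.ndarray) -> np.ndarray:
    img = img.astype(np.float64)
    return img - gaussian_filter(img, sigma=2)   # keep the high-frequency residual

def camera_fingerprint(frames: list) -> np.ndarray:
    # Average residuals over many frames: scene content averages out,
    # the fixed per-pixel sensor non-uniformity does not.
    return np.mean([noise_residual(f) for f in frames], axis=0)

def taken_by_same_sensor(fingerprint: np.ndarray, query: np.ndarray,
                         threshold: float = 0.05) -> bool:
    corr = np.corrcoef(fingerprint.ravel(), noise_residual(query).ravel())[0, 1]
    return corr > threshold
```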
But who is the "you" in this case? There can be footage of you that wasn't taken with your camera. The person falsifying it would just claim they used their own camera. Which they would have access to ahead of time in order to incorporate its fingerprint into the video before publishing it.
Most consumer cameras require digging through menus to enable RAW because dealing with RAW is a truly terrible user experience. The vast majority of image/video sensors out there don't even support RAW recordings out of the box.
Anyone with a mid-to-upper-range phone or a better-than-entry-level DSLR/bridge camera has access to this, and anyone who uses that camera to make a living (e.g., shooting footage of protests) understands how to use RAW. I have friends who are complete technophobes but have figured this out because they want to be able to sell their footage from time to time.
They do have some use. Take for example the AI images of the pope wearing luxury brands that someone made about a year ago. They clearly made it as a joke, not to purposefully misinform people, and as long as everybody is in on the joke then I see no issue with that. But some people who weren't aware of current AI-gen capabilities took it as real, and an AI tag would have avoided the discussion of "has AI art gone too far" while still allowing that person to make their joke.
I mean they’re building the labeled dataset right now by having creators label it for them.
I would suspect this helps make moderation models better at estimating confidence levels for AI-generated content that isn't labeled as such (i.e., for deception).
Surprised we aren’t seeing more of this in labeling datasets for this new world (outside of captchas)
And this isn't new. A fad in films in the '90s was hyper-realistic masks on the one side, and make-up and prosthetics artists on the other, making people look like other people.
Faking things is not new, and you've always been right to mistrust what you see on the internet. "AI" technology has made it easy, convenient, accessible and affordable to more people, though; before, you needed image/video editing skills and software, a good voice mod, to be a good (voice) actor, etc.
> you've always been right to mistrust what you see on the internet.
But these tools make deception easier and cheaper, meaning it will become much more common. Also, it's not just "on the internet". The trust problem this brings up applies to everything.
This deeply worries me. A post-truth society loses its ability to participate in democracy, becomes a low-trust society, and the population falls into learned helplessness and apathy ("who can even know what's true any more?").
Look at Russian society for a sneak preview if we don't get this right.
It just goes back to trusting the source. If 5 media orgs post different recordings of the same political speech, you can be reasonably sure it actually happened, or at least several orders of magnitude more sure than if it's one blurry video from a no name account.
This bodes well for autocracies and would-be autocrats. It's the logical extreme of what they've been trying to do on social media over the last decade or so.
I was immediately thinking that the #AI labels are going to give people a false sense of trust, so that when someone posts a good-enough fake without the #AI label, it can do damage if it goes viral before it gets taken down for the mislabeling. (Kudos for the effort, though, YouTube.)
Behind the scenes, I'm 99% confident that Google has deployed AI detection tools and will monitor for it.
That said, unless all the AI generators agree on a way to add an unalterable marker that something is generated, at one point it may become undetectable. May.
I'm not aware of any AI detection tools that are actually effective enough to be interesting. Perhaps Google has some super-secret method that works, but I rather doubt it. If they did, I think they'd be trumpeting it from the hilltops.
We have to expect people to think for themselves. People are flawed and will be deceived, but trying to centralize critical thinking will have far more disastrous results. It's always been that way.
I'm not saying YouTube shouldn't have AI labels. I'm saying we shouldn't assume they're reliable.
>but trying to centralize critical thinking will have far more disastrous results
No. Having sources of trust is the basis of managing complexity. When you turned the tap water on and bought a piece of meat at the butcher, you didn't yourself verify whether it's healthy, right? You trust that the medicine you buy contains exactly what it says on the label, and you didn't take a chemistry class. That's centralized trust. You rely on it ten thousand times a day implicitly.
There need to be measures to make sure media content is trustworthy, because the smartest person on the earth doesn't have enough resources to critically judge 1% of what they're exposed to every day. It is simply a question of information processing.
It's a mathematical necessity. Information that is collectively processed constantly goes up, individual bandwidth does not; therefore you need more division of labor, efficiency and higher forms of social organisation.
> Having sources of trust is the basis of managing complexity.
This is a false equivalence that I’ve already addressed.
> When you turned the tap water on and bought a piece of meat at the butcher you didn't yourself verify whether its healthy right?
To a degree, yeah, you do check. Especially when you get it from somewhere with prior problems. And if you see something off you check further and adjust accordingly.
Why resort to analogy? Should we blindly trust YouTube to judge what's true or not? I stated that labeling videos is fine, but what's not fine is blindly trusting it.
Additionally, comparing to meat dispenses with all the controversy because food safety is a comparatively objective standard.
Compare “is this steak safe to eat or not?” to “is this speech safe to hear or not?”
I'm probably paraphrasing Schneier (and getting it wrong), but getting water from the tap and having it polluted or poisonous, has legal and criminal consequences. Similarly getting meat from a butcher and having it tainted.
Right now, getting videos which are completely AI/deepfaked to misrepresent, are not subject to the same consequences, simply because either #1 people can't be bothered, #2 are too busy spreading it via social media, or #3 have no idea how to sue the party on the other side.
And therein lies the danger, as with social media, of the lack of consequences (and hence the popularity of swatting, pretexting etc)
I suspect we're headed into a world of attestation via cryptographically signed videos. If you're the sole witness, then you can reduce trust in the event; however, if it's a major event, we can fall back on existing news-gathering machinery to validate it and counter your false tagging (e.g. if a BBC camera captured the event, or there is some other corroboration & fact checking).
How does the signature help? It only proves that the video hasn't been altered since [timestamp]. It doesn't prove that it wasn't AI-generated or manipulated.
Signatures are also able to (mostly) signal that a specific device (and/or application on that device) captured the video. It would be possible to check if a video was encoded by a specific instance of an iOS Camera app or AfterEffects on PC.
Everything else - corroboration, interviews, fact checking - will remain as it is today and can't be replaced by technology. So I imagine a journalist would reach out to the person who recorded the video, ask them to show their device's fingerprint and ask about their experience when (event) occurred, and then corroborate all that information from other sources.
When the news org publishes the video, they may sign it with their own key and/or vouch for the original one so viewers of clips on social media will know that Fox News (TM) is putting their name and reputation behind the video, and it hasn't been altered from the version Fox News chose to share, even though the "ModernMilitiaMan97" account that reshared it seems dubious.
Currently, there's no way to detect alterations or fabrications of both the "citizen-journalist" footage and post-broadcast footage.
If I have a CCTV camera that is in a known location and a TPM that signs its footage, I could probably convince a jury that it’s legit in the face of a deepfake defense.
That’s the bar- it’s not going to be infallible but if you don’t find evidence of tampering with the hardware then it’s probably going to be fine.
This might be worse than nothing. It's exactly the same tech as DRM, which is good enough to stop the average person, but where tons of people have private exploits stashed away to crack it. So the judge and general public trust the system to be basically foolproof, while criminals can forge fake signatures using keys they extracted from the hardware.
This isn’t strictly some blackhat thing, people will attempt to hand wave inconvenient evidence against them as AI generated and build reasonable doubt.
Porn classification / regulation boils down to: "I'll know it when I see it." Implying the existence of some hyper-vigilant seer who can heroically determine what we should keep behind the video store curtain of decency, as if no grey areas exist. This also has the problem of requiring actual unbiased humans to view and accurately assess everything, which of course does not scale.
Perhaps AI classification is the mirror opposite to porn, using the test: "I'll know it when I don't see it", ie, if an average user would mistake AI generated content for reality, it should be clearly labeled as AI. But how do we enforce this? Does such enforcement scale? What about malicious actors?
We could conceivably use good AI to spot the bad AI, an endless AI cat and AI mouse game. Without strong AI regulation and norms a large portion of the internet will devolve into AI responding to AI generated content, seems like a gigantic waste of resources and the internet's potential.
> We could conceivably use good AI to spot the bad AI
I suspect this is Google's actual goal with the tagging system. It's not so much about helping users, rather it's a way to collect labeled data which they can later use to train their own "AI detection" algorithms
The great thing about AI, is that it's exactly optimized for discriminating in gray areas with difficult-to-articulate rules like "I'll know it when I see it."
Given the regular stories posted on HN about folks who've had some aspect of their social or other media canceled by some SaaS company, are these companies having many (legal) qualms as it is about canceling people without providing a good reason for it? Would be nice if they did, though...
I'd much prefer Google cancel capriciously with solid TOS backing to it than without, but I'll complain about their double standards about what they choose to censor... Not regardless, but without a doubt, because Google will choose to selectively enforce this rule.
But this gives room to also abuse the uncertainty to censor anyone without recourse - by arguing such and such video is "AI" (true or not), they have a plausibly deniable reason to remove a video.
Power is power - can be used for good or bad. This labelling is a form of power.
This is no different from their existing power. They can already claim that a video contained copyright infringement, and you can only appeal that claim once, or try to sue Google.
I think the real benefit of this is probably that it establishes trust as the default, and acts as a discriminator for good-faith uses of "AI". If most non-malicious uses of ML are transparently disclosed, and that's normalized, then it should be easier to identify and focus on bad-faith uses.
> I am curious to see the legal arguments that will inevitably kick up from each of these
Google policy isn't law; there's no court judging legal arguments. It is enforced at Google's whim and with effectively no recourse, at least not one which is focused on parsing arguments about the details of the policy.
So there won’t be “legal arguments” over what exactly it applies to.
Can’t they be sued for breach of contract if they aren’t following their own ToS? Having a rule like this gives them leeway to remove what they consider harmful.
No, because the ToS says they can do anything they want to. The purpose of a ToS is to set an expectation on which things the company will tolerate users doing. It's (almost always) not legally binding for either party.
As someone who studied video production two decades ago, regarding the criteria you mentioned for AI:
- Makes a real person appear to say or do something they didn't say or do
- Alters footage of a real event or place
- Generates a realistic-looking scene that didn't actually occur
These are things that have been true of edited video since even before AI was a thing. People can lie about reality with videos, and AI is just one of many tools to do so. So, as you said, there are many flaws with this approach, but I agree that requiring labels is at least a step in the right direction.
For videos, that requires pretty significant effort even today to do by hand. Humans don't scale. AI does.
I think anonymity is completely dead now. Not due to social networks, but AI will definitely kill it. With enough resources it is possible to engage in an AI arms race against whatever detector they put up.
It is also possible to remove any watermarking. So the only way to prove that content is made by humans will be to require extensive proof of their complete identity.
If you're from a minority or a fringe group, all of your hopes about spreading awareness anonymously on popular social media will be gone.
I wonder if this will make all forms of surveillance, video or otherwise, inadmissible in court in the near future. It doesn’t seem like much of a stretch for a lawyer to make an argument for reasonable doubt, with any electronic media now.
> I wonder if this will make all forms of surveillance, video or otherwise, inadmissible in court in the near future.
No, it won't. Just as it does now, video evidence (like any other evidence that isn't testimony) will need to be supported by associated evidence (including, ultimately, testimony) as to its provenance.
> It doesn’t seem like much of a stretch for a lawyer to make an argument for reasonable doubt,
“Beyond a reasonable doubt” is only the standard for criminal convictions, and even then it is based on the totality of evidence tending to support or refute guilt; it’s not a standard each individual piece of evidence must clear for admissibility.
Bad evidence is not the same thing as inadmissible evidence. Evidence is admitted, and then the fact finder determines whether to consider it, and how much weight to give it. It is likely that surveillance video will be slightly less credible now, but can still be part of a large, convincing body of evidence.
Video evidence already requires attestation to be admissible evidence. You need a witness to claim that the footage comes from a camera that was placed there, that it was collected from the night of, etc. It's not like the prosecutor gets a tape in the mail and they can present it as evidence.
By that rationale, all witness testimony and written evidence should already be inadmissible.
This website focuses on the technical with little regard for the social a bit too often. Though in general, videos being easily fakable is still scary.
Don't worry Google has become incompetent - it is in the "Fading AOL" portion of its life cycle. Do you remember how incontinent AOL became in the final minutes before it became completely irrelevant? Sure it's taking longer with Google. But it's not a process that they can reverse.
That means the system will be really really awful. So challengers can arise - maybe a challenger that YOU build, and open source!
That open source can replace corporate centralization? Since centralized platforms started extracting more profits (including manipulation) things like Fediverse are on the rise. For mindless browsing, centralized is still king for now (Fediverse also works to an extent) but if your site has something better than what's on the centralized corporate platform, people will go there once they learn about it. We're on Hacker News instead of Reddit because?
1. Google lost the battle for LLMs, and cannot win that battle without putting a nail in the coffin of its own search monetization strategy
2. Google search is now no better than DDG, which it also cannot recover from without putting a nail in the coffin of its own search monetization strategy
If you want to randomly accuse me of living in a bubble, that's fine. It won't bother me one bit. Me living in a bubble doesn't change the fact that Google is backed into a corner when it comes to search quality and competing with LLMs.
DuckDuckGo is now more effective than Google, and I've been using it instead of Google for 5 years. Now I mainly ask GPT-4 questions, and eliminate the need for search altogether in most cases. I run ad blockers on YouTube. And they're not getting cash from me via Android.
You may be right that they can survive more readily than AOL did.. but I certainly won't help them with anything more than a kick in the pants! ;)
I know people on HN love to hate on Google, but at least they're a major platform that's TRYING. Mistakes will be made, but let's at least attempt at moving forward.
Because right now AI is an issue the public and policymakers are concerned about, and this is to show that private industry can take adequate steps to control it to stave up government regulation while the attention is high.
keep holding ourselves back with poorly written legislation designed to garner votes while rival companies take strides in the technology at rapid rates
Examples of content that creators don’t have to disclose:
* Someone riding a unicorn through a fantastical world
* Green screen used to depict someone floating in space
* Color adjustment or lighting filters
* Special effects filters, like adding background blur or vintage effects
* Production assistance, like using generative AI tools to create or improve a video outline, script, thumbnail, title, or infographic
* Caption creation
* Video sharpening, upscaling or repair and voice or audio repair
* Idea generation
Examples of content that creators need to disclose:
* Synthetically generating music (including music generated using Creator Music)
* Voice cloning someone else’s voice to use it for voiceover
* Synthetically generating extra footage of a real place, like a video of a surfer in Maui for a promotional travel video
* Synthetically generating a realistic video of a match between two real professional tennis players
* Making it appear as if someone gave advice that they did not actually give
* Digitally altering audio to make it sound as if a popular singer missed a note in their live performance
* Showing a realistic depiction of a tornado or other weather events moving toward a real city that didn’t actually happen
* Making it appear as if hospital workers turned away sick or wounded patients
* Depicting a public figure stealing something they did not steal, or admitting to stealing something when they did not make that admission
* Making it look like a real person has been arrested or imprisoned
> * Voice cloning someone else’s voice to use it for voiceover
This is interesting because I was considering cloning my own voice as a way to record things without the inevitable hesitations, ums, errs, and stumbling over my words. By this standard I am allowed to do so.
But then I thought: what does "someone else's" even mean when multiple people can make a video? If my wife and I make a video together, can we not then use my recorded voice, because to her my voice is someone else's?
I suspect all of these rules will have similar edge cases and a wide penumbra where arbitrary rulings will be autocratically applied.
A lot different? The equivalent would be applying makeup to your mannequin replacement. All the things you mention are decoration. Replacing your voice is more than a surface alteration. I guess if some clever AI decides to take issue with what I say, and uses some enforcement tactic to arm-twist my opinion, I could change my mind.
I think the suggestion being discussed was AI-cloning your voice, and then using that for text-to-speech. Audio generation, rather than automating the cuts and tweaks to the recorded audio.
Yes exactly. Thanks. Voice cloning was indeed the suggestion made above.
The ethical challenge in my opinion, is that your status as living human narrator on a video is now irrelevant, when you're replaced by voice cloning. Perhaps we'll see a new book by "George Orwell" soon. We don't need the real man, his clone will do.
Did you replace the real you with "AI you" to ask me "why"?
I presume you wouldn't do that? How about replacing your own voice on a video that you make, allowing your viewers to believe it's your natural voice? Are you comfortable with that deception?
Everyone has evolving opinions about this subject. For me, I don't want to converse with stand-in replacements for living people.
There are many cases where such content is perfectly fine. After all, YouTube doesn't claim to be a place devoted to non-fiction only. The first one is an especially common thing in fiction.
The third one could easily be satire. Imagine that a politician is accused of stealing from the public purse, and issues a meme-worthy press statement denying it, and someone generates AI content of that politician claiming not to have stolen a car or something using a similar script.
Valid satire, fair use of the original content: parody is considered transformative. But it should be labeled as AI generated, or it's going to escape onto social media and cause havoc.
It might anyway, obviously. But that isn't a good reason to ban free expression here imho.
Respectfully disagree. Satire should not be labelled as satire. Onus is on the reader to be awake and thinking critically—not for the entire planet to be made into a safe space for the unthinking.
It was never historically the case that satire was expected to be labelled, or instantly recognized by anyone who stumbled across it. Satire is rude. It's meant to mock people—it is intended to muddle and provoke confused reactions. That's free expression nonetheless!
So when we have perfect deep fakes that are indistinguishable from real videos and people are using it for satire, people shouldn’t be required to inform people of that?
How is one to figure out what is real and what is a satire? Times and technologies change. What was once reasonable won’t always be.
- "How is one to figure out what is real and what is a satire?"
Context, source, tone of speech, and reasonability.
- "Times and technologies change."
And so do people! We adapt to times and technology; we don't need to be insulated from them. The only response needed to a new type of artificial medium, is, that people learn to be marginally more skeptical about that medium.
Nah. Satire was always safe when it's not pretending to have documented evidence of the thing actually happening.
Two recent headlines:
* Biden Urges Americans Not To Let Dangerous Online Rhetoric Humanize Palestinians [1]
* Trump says he would encourage Russia to attack Nato allies who pay too little [2]
Do you really think, if you jumped back a few years, you could have known which was satire and which wasn't?
The fact that we have video evidence of the second is (part) of how we know it's true. Sure, we could also trust the reporters who were there, but that doesn't lend itself to immediate verification by someone who sees the headline on their Facebook feed.
If the first had an accompanying AI video, do you think it would be believed by some people who are willing to believe the worst of Biden? Sure, especially in a timeline where the second headline is true.
In one of the examples, they refer to something called "Dream Track"
> Dream Track in Shorts is an experimental song creation tool that allows creators to create a unique 30-second soundtrack with the voices of opted-in artists. It brings together the expertise of Google DeepMind and YouTube’s most innovative researchers with the expertise of our music industry partners, to open up new ways for creators on Shorts to create and engage with artists.
> Once a soundtrack is published, anyone can use the AI-generated soundtrack as-is to remix it into their own Shorts. These AI-generated soundtracks will have a text label indicating that they were created with Dream Track. We’re starting with a limited set of creators in the United States and opted-in artists. Based on the feedback from these experiments, we hope to expand this.
So my impression is they're talking about labeling music which is derived from a real source (like a singer or a band) and might conceivably be mistaken for coming from that source.
Even if it is fully AI-generated, this requirement seems off compared to the other ones.
In all of the other cases, it can be deceiving, but what is deceptive about synthetic music? There may be some cases where it is relevant, like when imitating the voice of a famous singer, but other than that, music is not "real"; it is work coming from the imagination of its creator. That kind of thing is already dealt with by copyright, and attribution is a common requirement, and one that YouTube already enforces (how it does that is a different matter).
From a Google/Alphabet perspective it could also be valuable to distinguish between "original" and "AI generated" music for the purpose of a cleaner database to train their own music generation models?
> If you manually did enough work to have the copyright it is fine.
Amount of work is not a basis for copyright. (Kind of work is, though the basis for the “kind” distinction used isn't actually a real objective category, so it's ultimately almost entirely arbitrary.)
That could get tricky. A lot of hardware and software MIDI sequencers these days have probabilistic triggering built in, to introduce variation in drum loops, basslines, and so forth. An argument could be made that even if you programmed the sequence and all the sounds yourself, having any randomization or algorithmic elements would make the resulting work ineligible for copyright.
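(For anyone who hasn't used one: probabilistic triggering just means each step in the pattern fires only some of the time, so every bar comes out slightly different. A toy version:)

```python
# Toy probabilistic step sequencer: each step has a trigger probability,
# so every pass through the same programmed pattern varies slightly.
import random

hat_pattern = [1.0, 0.0, 0.6, 0.0, 1.0, 0.0, 0.6, 0.3] * 2  # 16 steps

def render_bar(pattern):
    return [random.random() < p for p in pattern]

print(render_bar(hat_pattern))  # a different variation on every run
```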
If someone else uses the same AI generator software and makes the same piece of music should Google go after them for it? I don't think that would hold in court.
Hopefully this means that AI-generated music gets skipped by Google's DRM checks.
I hope there is some kind of middle ground, legally, here? Like say you use a piano that uses AI to generate artificial piano sounds, but you create and play the melody yourself: can you get copyright or not?
I think there's a clear difference between synthesizing music and synthetically generating music. One term has been around for decades and the other one is being confused with that.
To someone who is doing one or the other, there is a clear difference. I don't trust the EU or YouTube to be able to tell the difference from the other end, by the end product alone.
If AI writes MIDI input for a synthesizer, rather than producing the actual waveform, where does that land?
>Showing a realistic depiction of a tornado or other weather events moving toward a real city that didn’t actually happen
A bit funny considering a realistic warning and "live" radar map of an impending, major, natural disaster occurring in your city apparently doesn't violate their ad policy on YouTube. Probably the only time an ad gave me a genuine fright.
There’s a whole genre of videos on YouTube simulating the PSAs of large scale disasters. Nuclear war, meteors, etc. My 12 year old is really into them.
Interestingly they only say you have to disclose it if it's a singer missing a note. Seems like it's fair game to fix a note that was off key in real life and not disclose that.
Under the current guidelines, aren't all music performances that make use of some sort of pitch-correction assist technically "digitally altered"?
Those voiceovers on TikTok that are computer-generated but sound quite real and are often reading some script - do they have to disclose that those voices are artificially produced?
They don't bother to mention it, but this is actually to comply with the new EU AI Act.
> Providers will also have to ensure that AI-generated content is identifiable. Besides, AI-generated text published with the purpose to inform the public on matters of public interest must be labelled as artificially generated. This also applies to audio and video content constituting deep fakes
Is anyone else worried about how naive this policy is?
The solution here is for important institutions to get onboard with the public key infrastructure, and start signing anything they want to certify as authentic.
The culture needs to shift from assuming video and pictures are real, to assuming they are made the easiest way possible. A signature means the signer wants you to know the content is theirs, nothing else.
It doesn't help to train people to live in a pretend world where fake content always has a warning sticker.
I see a lot of confusing authenticity with accuracy. Someone can sign the statement "Obama is white" but that doesn't make it a true statement. The use of PKI as part of showing provenance/chain of trust doesn't make any claims about the accuracy of what is signed. All it does is assert that a given identity signed something.
It's not about what is being signed, it's about who signed it and whether you trust that source. I want credible news outlets to start signing their content with a key I can verify as theirs. In that future all unsigned content is by definition fishy. PKI is the only way to implement trust in a digital realm.
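Concretely, that could be as small as a detached signature published next to the video file and checked against the outlet's pinned public key. A sketch, assuming Ed25519 and a key you already obtained and trust out of band:

```python
# Sketch: verify a news outlet's detached signature over a published video file.
# Assumes you already fetched and pinned the outlet's public key out of band.
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

def verify_clip(outlet_pub: ed25519.Ed25519PublicKey,
                video_bytes: bytes, detached_sig: bytes) -> bool:
    try:
        outlet_pub.verify(detached_sig, video_bytes)
        return True          # the pinned key signed exactly these bytes
    except InvalidSignature:
        return False         # altered, or not from this outlet

# Note: this only establishes who published the bytes,
# not whether the footage itself depicts something true.
```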
> It's not about what is being signed, it's about who signed it
Yeah, that's what I said about PKI. I also said there is confusion between provenance of a statement and its accuracy. Just because it's signed doesn't mean anything about its accuracy, but those that confuse the two will think that just because it is signed, or because it was signed by a certain "trustworthy" party, that indicates accuracy. PKI does not establish the trustworthiness of the other party, it only gives you confidence in the identity of the party who signed something.
George Santos could sign his resume. We know it was signed by George Santos. And yet nothing in the resume could be considered accurate (or even a falsehood) purely because it is signed. That it was proven to be signed by George Santos via PKI is independent of the fact that George Santos is a known liar.
Why do you need a whole PKI for that, rather than just, say, a link to the news outlet's website where the content is hosted? People have already been doing that pretty much since the web was created.
PKI has been around for, what, 30 years? Image authentication is just not going to happen at this point, because everyone's got too used to post-processing and it's a massive hassle for something that ultimately doesn't matter because real people use other processes to determine whether things are true or not.
Example: a video shows a group of police beating up a man for a minor crime (say littering). The video is signed by Michael Smith (the random passerby who filmed it on his phone). The video is published to Instagram and shared widely.
How do you expect people to take the authenticity of this video?
This is about as realistic as the next generation of congress people ending up 40 years younger.
We literally have politicians talking about pouring acid on hardware, and we expect these same bumbleheads to keep their signing keys safe at the same time? The average person is far too technologically illiterate to do that. Next time you go to grandma's house you'll learn she traded her signing key for chocolate chip cookies.
I imagine it would be something handled pretty automatically for everyone.
If Apple wanted to sign every photo and document on the iPhone, they could probably make the whole user experience simple enough for most grandmas.
Some people will certainly give away their keys, just like bank accounts and social security numbers today, but those people probably aren't terribly concerned with proving the ownership of their online documents.
>I imagine it would be something handled pretty automatically for everyone.
Then your imagination fails you.
If it is automatic/easy, then you have the 'easy key' problem, where the key is easy to steal or copy. For example, is it based on your Apple account? Then what happens when an account is stolen? Is it based on a device? What happens when the device is stolen?
Who's doing the PKI? Is it going to be like HTTPS, but for individuals (this has never really worked at this scale and with revocation)? And most social media is posting content taken by randos on the internet.
When your account is stolen someone can create "official" documents in your name and impersonate you. There could be a system for invalidating your key after a certain date to help out with those situations.
For prominent people who actually have to worry about being impersonated they could provide their own keys.
The infrastructure could be managed by multiple groups or a singular one like the government. The point isn't to be a perfect system, it's to generate enough trust that what you're looking at is genuine and not a total fraud.
In a world where AI bots are generating fake information about everyone in the world, that kind of system could certainly be built and be useful.
> The culture needs to shift from assuming video and pictures are real, to assuming they are made the easiest way possible.
That sounds like a dystopia, but I guess we're going into that direction. I expect that a lot of fringe groups like flat-earthers, lizard people conspiracy, war in Ukraine is fake, will become way more mainstream.
Usually when a big corporation gleefully announces a change like this it's worth checking whether there's any regulations on that topic taking effect in the near future.
On a local level, I recall how various brands started making a big deal of replacing disposable plastic bags with canvas or paper alternatives "for the environment" just coincidentally a few months before disposable plastic bags were banned in the entire country.
Seems like this is sort of a manufactured argument. I mean, should every product everywhere have to cite every regulation it complies with? Your ibuprofen bottle doesn't bother to cite the FDA rules under which it was tested. Your car doesn't list the DOT as the reason it's got ABS brakes.
The EU made a rule. YouTube complied. That changes the user experience. They documented it.
+1. In France at least, food products must not suggest that mandatory properties like "preservative free" are unique. When they advertise this on the package, they must disclose that it's per regulation. Source: https://www.economie.gouv.fr/particuliers/denrees-alimentair...
Doesn't seem that out of place for a blog post on the exact change they made to comply though.
I mean you'd expect a pharmaceutical company to mention which rules they comply with at some point, even if not on the actual product (though in the case of medicine, probably also on the actual product).
So you making good pay by enabling a scammer makes it totally okay for the scammer to operate? By extension of that logic, hitmen should no longer be persecuted provided they make good pay from it.
You'd think they're evil too if they let a bunch of middlemen and parasitic companies dictate how the software you invested untold sums and hours developing and marketing should work.
Sorry, I wasn't entirely clear that I was specifically responding to the GP comment referencing the EU AI act (as opposed to creating a new top-level comment responding to the original blog post and Google's specific policy) which pointed out:
> Besides, AI-generated text published with the purpose to inform the public on matters of public interest must be labelled as artificially generated. This also applies to audio and video content constituting deep fakes
Clearly "AI-generated text" doesn't apply to YouTube videos.
But, it is interesting that if you use an LLM to generate text and present that text to users, you need to inform them it was AI-generated (per the act). But if a real person reads it out, apparently you don't (per the policy)?
This seems like a weird distinction to me. Should the audience be informed if a series of words were LLM-generated or not? If so, why does it matter if they're delivered as text, or if they're read out?
Most interesting example to me: "Digitally altering audio to make it sound as if a popular singer missed a note in their live performance".
This seems oddly specific to the inverse of what happened recently with Alicia Keys from the recent Superbowl. As Robert Komaniecki pointed out on X [1], Alicia Keys hit a "sour note" which was silently edited by the NFL to fix it.
Correct, it's the inverse that requires disclosure by Youtube.
Still, I find it interesting. If you can't synthetically alter someone's performance to be "worse", is it OK that the NFL synthetically altered Alicia Keys' performance to be "better"?
For a more consequential example, imagine Biden's marketing team "cleaning up" his speech after he has mumbled or trailed off a word, misleading the US public during an election year. Should that be disclosed?
I don't understand the distinction. If the intent is to protect the user, then what if I make the sound better for rival contestants on American Idol and don't do it for singers of a certain race?
This is a great example as a discussion point, thank you for sharing.
I will be coming back to this video in several months time to check whether the "Altered or synthetic content" tag has actually been applied to it or not. If not, I will report it to YouTube.
However, autotune has existed for decades. Would it have been better if artists were required to label when they used autotune to correct their singing? I say yes, but reasonable people can disagree!
I wonder if we are going to settle on an AI regime where it’s OK to use AI to deceptively make someone seem “better” but not to deceptively make someone seem “worse.” We are entering a wild decade.
A lot of people do! Tone correction [1] is a normal fact of life in the music industry, especially in recordings. Using it well takes both some degree of vocal skill and production skill.
You'll often find that it's incredibly obvious when done poorly, but nearly unnoticeable when done well.
>Some examples of content that require disclosure include: [...] Generating realistic scenes: Showing a realistic depiction of fictional major events, like a tornado moving toward a real town.
This sounds like every thumbnail on youtube these days. It's good that this is not limited to AI, but it also means this will be a nightmare to police.
Exactly, and many have done exactly the same kind of video using VFX. What's the difference?
These kinds of reactions remind me of the stories of the backlash following the introduction of calculators in schools...
Using VFX for realistic scenes is more involved. VFX requires more expertise to do convincingly and realistically, on the order of thousands of hours of experience. More involved scenes require multiple professionals, and the tooling and assets cost more. An inexperienced person, in a hundred hours of effort, can put out ten-ish realistic scenes with leading-edge AI tools, when previously they could do zero.
This is like regulating handguns differently from compound bows. Both are lethal weapons, but the bow requires hours of training to use effectively, and is more difficult to carry discreetly. The combination of ease, convenience, and accessibility necessitates new regulation.
This being said, AI for video is an incredibly promising technology, and I look forward to watching the TV shows and movies generated with AI-powered tooling.
What if new AI tools negate the thousands of hours experience to generate realistic VFX scenes, so now realistic scenes can be made by both non-AI VFX experts and AI-assisted VFX laymen?
Do we make all usages of VFX now require a warning, just in case the VFX was generated by AI?
I think this is different from the bow vs. gun metaphor, as I can tell an arrow from a bullet. But I can foresee a future where no human could tell the difference between AI-assisted and non-AI-assisted VFX / art.
I believe this is evidenced by the fact that people can go around accusing any art piece of being AI art, and the burden of proving them wrong falls on the artist. Essentially, I believe we are rapidly approaching the point where it doesn't matter if someone uses AI in their art, because people won't be able to tell anyway.
> Using VFX for realistic scenes is more involved.
This really depends on what you're doing. There are some great Cinema 4d plugins out there. As the plethora of YouTube tutorials out there clearly demonstrates, multiple professionals and vast experience are not required for some of the things they have listed. Tooling and asset costs are zero, on the high seas.
Until Sora is widely available, or the open source models catch up, at this moment it's easier to use something like Cinema 4d than AI.
I'm sorry, but using a calculator to get around having to learn arithmetic is not even close to being the same thing. Prove to me that you can do basic arithmetic, and then we can move on to using calculators for the more complex stuff, where, if you had to, you could at least arrive at the same value as the calculator.
People using VFX aren't trying to create images in likeness of another existing person to get people to buy crypto or other scams. Comparing the two is disingenuous at best.
I’m reminded of how banks require people to fill out forms explaining what they’re doing, where it’s expected that criminals will lie, but this is an easy thing to prosecute later after they’re caught.
Could a similar argument be applied here? It doesn’t seem like there is much in the way of consequences for lying to Google. But I suppose they have other ways of checking for it, and catching someone lying is a signal that makes the account more suspicious.
It’s a compliance checkbox for the most part I think. They can stay on top of new legislation by claiming they are providing tools to deal with misinformation, whereas it’d be easier to say that they are encouraging the proliferation of misinformation by not doing anything about it. It certainly shifts the legal question in the way you described it would seem.
Yeah, I think it's a very similar approach to what you've described. At the scale of YouTube, I don't think you can just start banning content you don't like. Instead you have to have a policy, clearly documented, and then you can start enforcing based on that policy.
The other thing is that they don’t necessarily want to ban all of this content. For example a video demonstrating how AI can be used to create misinformation and showing examples, would be fairly clearly “morally” ok. The policy being that you have to declare it allows for this sort of content to live on the platform, but allows you to filter it out in certain contexts where it may be inappropriate (searches for election coverage?) and allows you to badge it for users (like Covid information tags).
> Altering footage of real events or places: Such as making it appear as if a real building caught fire, or altering a real cityscape to make it appear different than in reality.
What about the picture you see before clicking on the actual video? This article of course is addressing the content of the videos, but I can't help but look at the comically cartoonish, overly dramatic -- clickbait -- picture preview of the video.
For example, there is a video about a tornado that passed close to a content author and the author posts video captured by their phone. In the preview image, you see the author "literally getting sucked into a tornado". Is that "altered and synthetic content"?
Without enforceability it'll go the same way as it has on Pixiv, the good actors will properly label their AI utilizing work, while the bad actors will continue to lie to try to maximize their audience until they get caught, then rinse and repeat. Kind of like crypto-scammers.
For context, Pixiv had to deal with a massive wave of AI content being dumped onto the site by wannabe artists basically right as the initial diffusion models became accessible. They responded by making 'AI-generated' a checkbox to go with the options to mark NSFW and adding an option for users to disable AI-generated content from being recommended to them. Then, after an incident of someone using their Patreon style service to pretend to be a popular artist, selling commissions generated by AI to copy the artist's style, they banned AI-generated content from being offered through that service.
I think that the idea is mostly to dictate culture. And I like the idea, not only for preventing fraud. Ever since the first Starship launches, the reality looks more incredible than the fiction. Go look up the SN-8 landing video, tell me that does not look generated. I just want to know what is real and what is generated, by AI or not.
I think that this policy is not perfect, but it is a step in the right direction.
I think that for now they're just going to use it as a means of figuring out what kind of AI-involved content people are ok with and what kind they react negatively to.
Personally, I've developed a strong aversion to content that is primarily done by AI with very little human effort on top. After how things went with Pixiv I've come to hold the belief that our societies don't help people develop 'cultural maturity'. People want the clout/respect of being a popular artist/creator, without having to go through the journey they all go through which leads to them becoming popular. It's like wanting to use the title of Doctor without putting in the effort to earn a doctorate, the difference just being that we do have a culture of thinking that it's bad to do that.
I think one of the bigger issues will be false positives. You'll do an upload, and youtube will take it down claiming that some element was AI generated. You can appeal, but it'll get automatically rejected. So you have to rework your video and figure out what it thought might be AI generated and re-upload.
Rather than tagging what’s made up, why not tag what’s genuine? There’s gonna be less of it than the endless mountain of generated stuff.
I’m thinking something as simple as a digital signature that certifies e.g. a photo was made with my phone if I want to prove it, or if someone edits my file there should be a way of keeping track of the chain of trust.
This would, I think, be the ideal if it's possible. I'd love videos to have signatures that prove when they were recorded, that they were recorded from such-and-such a phone, that they haven't had any modification, and maybe even optionally the GPS location (for news organisations, say, to even more reliably prove the validity of their media). And then have a video format that can allow certain modifications (e.g. colour grading), but encode that some aesthetic changes were made. And, more importantly, a way to denote that a region of video is a clip of another video, and provide a backing signature for the validity of the clip.
That would allow much stronger verifiability of media. But I'm not sure if that would be possible...
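For what it's worth, here's a minimal sketch of how such a backing signature could work, assuming an Ed25519 device key (via the Python `cryptography` package) and treating the camera, clip, and field names below as hypothetical stand-ins rather than any real standard:

```python
# Sketch only: a hypothetical device key signs the raw footage at capture time;
# an edited clip carries the original's hash + signature so a verifier can check
# the backing without the clip itself needing to be signed.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

device_key = Ed25519PrivateKey.generate()    # would live inside the camera hardware
device_pub = device_key.public_key()         # published / attested by the manufacturer

raw_footage = b"...raw sensor bytes..."      # placeholder for the RAW file
raw_digest = hashlib.sha256(raw_footage).digest()
raw_signature = device_key.sign(raw_digest)  # attached as capture-time metadata

# The edited clip references its parent instead of being signed itself.
clip = {
    "edited_bytes": b"...colour graded, denoised...",
    "parent_digest": raw_digest,
    "parent_signature": raw_signature,
}

def verify_backing(clip: dict, device_pub, original_bytes: bytes) -> bool:
    """Check that the clip's claimed parent really is device-signed footage."""
    if hashlib.sha256(original_bytes).digest() != clip["parent_digest"]:
        return False
    try:
        device_pub.verify(clip["parent_signature"], clip["parent_digest"])
        return True
    except InvalidSignature:
        return False

print(verify_backing(clip, device_pub, raw_footage))  # True while nothing is tampered with
```

The sketch glosses over the genuinely hard parts: distributing and attesting device_pub, and keeping the device's private key from being extracted.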
Yeah, expect this to flip. I'm guessing this will go the same way as the "https" path: first we saw a green lock for https-enabled sites, later we saw "not secure" for http sites.
Google of yore would have offered a 'not AI' type of filter in their advanced search.
Present-day Google is too busy selling AI shovels to quell Wall St's grumbling to even consider what AI video will do to the already bad 'needle in a haystack' nature of search.
The cynic in me thinks this is just Google protecting their precious training data from getting tainted but I’m glad their goals align with what’s better for consumers for once.
I am not envious of the policy folks at Youtube who will have to parse out all the edge cases over the next few years. They are up against a nearly impossible task.
This is interesting because it highlights the trust we've always placed in real-looking images. It brings real-looking images down to the same level as text.
It's always been possible to write fake news. We've never had to add disclaimers at the top of textual content, e.g. "This text is made to sound like it describes real events, but contains invented and/or inaccurate facts." We feel the need to add this to video because until now, if it looked real, it probably was (of course, "creative" editing has existed for a long time, but that's still comparatively easy to spot).
>We've never had to add disclaimers at the top of textual content, e.g. "This text is made to sound like it describes real events, but contains invented and/or inaccurate facts."
The title was editorialized, which people do far more often than they should. The original title, with the domain name next to it, would have been fine.
I hope this allows me to filter them entirely. If it wasn't worth your time creating it, its not worth my time looking at it.
I am generally very skeptical of these tags though, I suspect a lot of them are in place to stop an AI consuming its own output rather than any concern for the end user.
Once something like Sora is available to the public, it's going to be game over. A new bunch of creators will use it to "create" videos, and I am sure you will change your mind then.
This is somewhat expected, to be honest. I am rather pessimistic about future solutions to such issues, though. I can see only one possibility going forward: camera sensor manufacturers will either voluntarily or forcibly implement hardware that injects cryptographic "watermarks" into the videos produced by their cameras. Any videos that do not bear valid watermarks are considered potentially "compromised" by GenAI.
Going to be a long road with this kinda thing but forums and places I visit often already have "no AI submissions" type rules and they have been received pretty well that I've seen.
Are they capable of enforcing it? I don't know, but it's clear users understand / don't like the idea of being awash in a sea of AI content at this point.
> Creators must disclose content that [...] Generates a realistic-looking scene that didn't actually occur
This may spoil the fun in some 3D rendered scenes. For example, I remember there was much discussion on whether a robot throwing a bowling ball was real or not[1].
Part of the problem has to do with all the original tags (e.g. "#rendering3d") being lost when the video spread through various platforms. The same problem will happen with Youtube -- creators may disclose everything, but after a few rounds through reddit and back, whatever disclosure and credit that was in the original video will be lost.
>in the future we’ll look at enforcement measures for creators who consistently choose not to disclose this information.
Nothing of use here. As per the usual MO of tech companies they throw the responsibility back on the user. Sounds like yet another bullshit clause that they can invoke when they want to cancel you.
Call me a cynic, but I share the same thought. Plus, unless we figure out a way to detect it, which we can't reliably do now at scale, this will be pretty useless. The ones who want to use it for profit will do whatever it takes; only the honest people will label it. I believe this is even worse than assuming that everything is AI-generated, as people without technical knowledge will trust that the labeling works.
This is a pointless, nearly unenforceable rule to make people feel better. Sure, if you generate something that seems like a real event that is provably false you can be caught, but anything mundane is not enforceable. Once models reach something like a Sora 1.5 level of ability, we are kind of doomed on knowing what's real in video.
>This is a pointless, nearly unenforceable rule to make people feel better.
Pretty much. If Google says "Swiper no swiping" they can point at their policy when lobbying against regulations or pushing back against criticism.
Before surveillance capitalism became the norm, web services told users to not share personal information, and to not trust other users they had not met in real life.
Nah, there will still be certain patterns, and they will be recognisable.
Once something with Sora 1.5-level ability is out there, a reverse-Sora model that can recognise AI-made videos should definitely be possible to train as well.
Look at "realistic" photos , it is easy for someone with experience to spot issues, the hangs/fingers are wrong, shadows and light are wrong, hair is weird, eyes have issues. In a video there are much more information so much more places to get things wrong, making it pass this kind of test will be a huge job so many will not put the effort.
"Requires". It will rely on the honor system, which sleazy assholes won't honor; or the report system, which people will abuse. AI detectors aren't reliable; GenAI is basically tautologically defined as hard/impossible to detect and we keep getting reminded that "it will only get better".
Everyone calls this a problem, but it's a predicament because it has NO solution, and I have nothing but contempt for everyone who made it reality.
I’m wondering whether another motivation for this could be trying to keep the data set as clean as possible for future model training.
Creating videos takes quite a bit of time. If AI video generation becomes widely available, pretty soon, there could be more AI content being uploaded to YouTube than human-made stuff.
Presumably, training on AI generated stuff magnifies any artefacts/hallucinations present in the training set, reducing the quality of the model.
While their intentions are good, the solution isn't. There's a lot that they have left to the subjectivity of the creators, especially for what is "clearly unrealistic".
No mention of clearly labeling ads made using AI. The deepfake Youtube ads are so annoying. Elon wants to recruit me to his new investment plot? Yeah right.
I've said before that we're entering an age where no online material is truly verifiable without some kind of hardware signing (and even that has its flaws). Public figures will have to sort out this quagmire before things get even uglier than they are. And I really hope that's the biggest problem of the next decade or so, rather than that we achieved AGI and it decided we were inferior.
After reading it I think it's a good approach, whilst not perfect it's a good step.
Interestingly it isn't just referring to AI but also "other tools", used to make content that is "altered, synthetic and seems real".
Fair amount of ambiguity in there but I see what they're getting at when it comes to the bigger fish like the president being altered to say something they didn't.
This label will be mostly misleading. Absence of the tag will give a false sense of veracity, and presence of it on non-AI-generated materials will discredit them.
A fact-checking box like on Twitter would be better, and if you can't provide it, don't pretend you know anything about the content.
I'd like a content ID system for AI generated media. If someone tries to pass an image to me as authentic I can check its hash against a database that will say "this was generated by such-and-such LLM on 18 Mar 2024." Maybe even add a country of origin.
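As a rough illustration of that idea (the registry here is just a dictionary standing in for some shared database or public log; every name is hypothetical):

```python
# Sketch: generators register a content hash at creation time; anyone can query it later.
import hashlib

# Hypothetical registry: digest -> provenance record.
ai_registry: dict[str, dict] = {}

def register_generated(content: bytes, model: str, created: str, country: str) -> str:
    digest = hashlib.sha256(content).hexdigest()
    ai_registry[digest] = {"model": model, "created": created, "country": country}
    return digest

def lookup(content: bytes) -> dict | None:
    """Return the provenance record if this exact file was registered, else None."""
    return ai_registry.get(hashlib.sha256(content).hexdigest())

image = b"...png bytes..."
register_generated(image, model="such-and-such model", created="2024-03-18", country="??")
print(lookup(image))            # the record above
print(lookup(image + b"\x00"))  # None: any re-encode, crop, or screenshot changes the hash
```

The last line is the obvious weakness: an exact-hash registry only catches byte-identical copies.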
Well, it is software, so one can remove such patterns and generate a completely different hash, or even train a model specifically to do that. It doesn't scale.
This will just result in a pop-up before every video, like the cookie warnings, “Viewers should be aware that this video may contain AI-generated or AI-enhanced images.” And it’ll be so annoying…
I suspect it’ll get me downvoted but this newish trend of using this grammar syntax drives me nuts. It’s “YouTube now requires YOU to” not “YouTube now requires to”. It’s lazy, it’s grammatically incorrect and it doesn’t scan.
This is great. Really well-thought out policy, in my opinion. Sure, some people will try to get around the restrictions, especially nefarious actors, but the more popular the channel, the faster they'll get caught. It also doesn't try to distinguish between regular special effects and AI-generated special effects, which is wise.
Scammers have been making fake content on youtube since its founding. And youtube has never so much as pretended to care about doing anything about it.
While in the waiting room of my doctor's office, I noticed a few elderly patients playing videos of an obviously AI Joe Biden promising them free money and promoting a familiarly Trump platform, and they were /discussing/ it, as if it was real.
Who manufactured the problem here? It isn't just that these systems exist to select individuals by vulnerability, but that the people best poised to protect them are so effectively excluded from any sign of it.
What would it look like if we could see every ad meant for every audience from the same companies? What if we could just look up who was trying to target us? Is responsible, auditable advertisement impossible, or just not able to print money?
This isn't fair. If I make a video using AI, who is to say whether it's anymore real than a video taken with a camera? You think what a camera captures is reality?
And what if someone doesn't label the video? What if someone has drawn a fake video in Photoshop? The whole requirement to label AI-generated videos is dumb. Typical decision from some old politician who doesn't understand anything in AI or video editing and who should have retired 20 years ago instead of making such dumb rules.
Why are movies not labeled while AI video must be labeled? What about comedians impersonating politicians?
If Google or govt is afraid that someone will use AI-generated videos for bad purposes (e.g. to display a candidate saying things that he never said) then they should display a warning above every video to educate people. And popular messengers like Telegram or video players must do the same.
At least add a warning above every political and news video.
ELI5: what would be the difference between using AI and a new release of Star Wars? I understand that AI does not need proof of work, and that is the difference?
Great, now bad actors can label child pornography as AI-generated to avoid legal consequences.
This label is worthless, IMO. We're close to, or already at, a point where it's impossible to distinguish between real and AI-generated content. Any company offering "AI detection" is a scam. It's just plain not possible.
They're all garbage. I'm tired of falling asleep and waking up to some ripped-off youtube video that was altered with AI, often of large science channels, or the Elon Musk videos.
It might take 50 years to awaken to the abuse of power going on here.
Forget individual videos for a second and look at youtube-the-experience as a whole. The recommendation stream is the single most important "generative AI" going on ever, using the sense of authenticity, curiosity and salience that comes from the individual videos themselves, but stitching them together in a very particular way. All the while, the experience of being recommended videos remains almost completely invisible. Of course this is psychologically "satisfying" to the users, in the shortest term, because they keep coming back, to the point of addiction. (Especially as features like shorts creep in.)
Allowing the well of "interesting, warm, authentic audio & videos having the secondary gain of working on your psychological needs" to be tainted with the question of generated content is a game changer, because it breaks the wall of authenticity for the entire app. It brings the whole youtube-the-experience into question and reduces its psychological stand-in function for human voice & likeness, which band-aids the hyper-individualized lonely person's suffering-based content consumption habits. I know this is a bit dramatic, and for sure videos can be genuinely informative, but let's be honest: neither is that the entirety of your stream, nor is that the experience for the vast majority of users. It will get worse as long as there is mathematical headroom for making more money out of making it worse; that's what shareholder duty is about.
When gen-AI came about, I was naively happy about the fake "authenticity" wall of the recommended streams breaking down thanks to the garbage of generated sophistry overtaking and grossing out the users. Kind of like super-delicious-looking cakes turning out to be made of kitchen sponges, turning people off of cakes altogether. I was wrong to think the AI oligopoly would pass up the opportunity of having a chokehold on the entire "content" business, and here we are. (Also, this voluntary tagging will give them the perfect live training set, on top of what they have.)
Once the tech is good enough to generate video streams on the fly, so that all you need is a single livestream, you won't even have a recommendation engine of videos but instead a team of virtual personas doing everything you could ever desire on screen. Then it is game over. It might already be game over.
To get out of this the single most important legislative maneuver is being able to accept and enforce the facts that a) recommendation is speech b) recommendation is also gen-AI, and should be subject to same level of regulatory scrutiny. I don't care if it generates pixels or characters at a time, or slaps together the most "interesting" subset of videos/posts/users/reels/shorts out of the vast sea of the collective content-consciousness, they are just one level of abstraction apart but functionally one and the same: look at me; look at my ads; come back to me; keep looking at me.
Society is simply revisiting a conversation about doctored photographs, videos, and audio recordings.
The last word on this subject was not written in the 1920s, it's good to revisit old assumptions every century or so, when new forms of media and media manipulation become developed.
The first pass on it is unlikely to be the best, or even the last one.
This is a very inaccurate depiction of copyright. It originally only lasted around 20 years with the option to double it. Then it was reformed over and over across history to create the monster we have today.
Copyright has been revised, overhauled and redefined multiple times over the past few centuries. You couldn't have picked a worse example.
Here's an obvious question that came up (and was resolved differently in different jurisdictions): can photographs be copyrighted? What about photographs made in public? Of a market street? Of the Eiffel Tower? Of street art? Can an artist forbid photography of their art? An actor of their performance? A celebrity of their likeness? A private individual of their face? Does the purpose for which the photograph will be used matter?
At what point does a photograph have sufficient creative input to be copyrightable? Is pressing a button on a camera creative input? What about a machine that presses that button? Only humans can create copyrightable works under most jurisdictions. Is arranging the scene to be photographed a creative input? Can I arrange a scene just like yours and take a photo of it? Am I violating your copyright by doing it?
There's tens of thousands of pages of law and legal precedent that answer that question. As a conversation, it went on for decades, with no simple first-version solution sticking.
I wouldn't be surprised if this ends up like prop 65 cancer warnings, or cookie banners. The intention might be to separate believable but low quality hallucinated AI content spam from high quality manual content. But it will backfire like prop 65. You'll see notices everywhere because increasingly AI will be used in all parts of the content creation pipeline.
I see YouTube's own guidelines in the article and they seem reasonable. But I think over time the line will move, be unclear and we'll end up like prop 65 anyways.
The Prop 65 warnings are probably unhelpful even when accurate because they don't show anything about the level of risk or how typical or atypical it is for a given context. (I'm thinking especially about warnings on buildings more than on food products, although the same problem exists to some degree for food.)
It's very possible that Prop 65 has motivated some businesses to avoid using toxic chemicals, but it doesn't often help individuals make effective health decisions.
While you may think it didn't have an effect, a recent 99pi episode covered it, and it sounds like it has definitely motivated many companies to remove chemicals from their products.
If it's something you've bought recently the offending ingredient should be listed. Otherwise, my money would be on lead being used as a plasticizer. Either way at least you have the tools to find out now.
Like, is it one of those things that remove a one-in-a-billion chance of cancer, and now have a product that wears out twice as fast, leading to a doubling of sales?
Prop 65 is also way too broad. It needs to be specific about what carcinogens you’re being exposed to and not just “it’s a parking garage and this is our legally mandated sign”
Seems to still be pretty pointless considering that roads and parking lots and garages are all to be avoided if you want to avoid exposure… just stay away from any of those
The "sponsored content" tag on youtube seems to work very well though. Most content creators don't want to label their videos sponsored unless they are, I assume the same goes for AI generated content flags. Why would a manual content creator want to add that?
The "Sponsored Content" tag on a channel should link to a video of face / voice of the channel talking about what sponsored content means in a way that's FTC compliant.
That would be either poor understanding or poor enforcement of the rule, since they specifically list things like special effects, beauty filters, etc. as allowed.
A more plausible scenario would be if you aren't sure if all your stock footage is real. Though with youtube creators being one of the biggest groups of customers for stock footage I expect most providers will put very clear labeling in place.
That's a much clearer line though, it's much simpler to know if you were paid to create this content or not. Use of AI isn't, especially if it's deep in some tool you used.
Does blurring part of the image with Photoshop count? What if Photoshop used AI behind the scene for whatever filter you applied? What about some video editor feature that helps with audio/video synchronization or background removal?
This is a problem of provenance (as it's known in the art world), and being certain of the provenance is a difficult thing to do. It's like converting a cowboy-coded C++ project to consistently using const: you need to dig deep into every corner and prefer dependencies that obey proper const usage. Doing that as an individual content creator would be extremely daunting, but this isn't about individuals. If Getty has a policy against AI and guarantees no AI generation on their platform while Shutterstock doesn't [1], then creators may end up preferring Getty so that they can label their otherwise AI-free content as such on Youtube. Maybe it gets incorporated into the algorithm and gets them more views, maybe it's just a moral thing... if there's market pressure, then the down-the-chain people will start getting stricter and, especially if one of those intermediary stock providers violates an agreement and gets hit with a lawsuit, we might see a more concerted movement to crack down on AI generation.
At the end of the day it's going to be drenched in contracts and obscure proofs of trust, i.e. some signing cert you can attach to an image if it was generated in an entirely controlled environment that prohibits known AI generation techniques. That technical side is going to be an arms race, and I don't know if we can win it (which may just result in small creators being bullied out of the market)... but above the technical level I think we've already got all the tools we need.
You may be interested in the Content Authenticity Initiative’s Content Credentials. The idea seems to be to keep a more-or-less-tamperproof provenance of changes to an image from the moment the light hits the camera’s sensor.
It sounds like the idea is to normalize the use of such an attribution trail in the media industry, so that eventually audiences could start to be suspicious of images lacking attribution.
Adobe in particular seems to be interested in making GenAI-enabled features of its tools automatically apply a Content Credential indicating their use, and in making it easier to keep the content attribution metadata than to strip it out.
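As a toy illustration of the general shape (this is not the actual Content Credentials / C2PA format, just a sketch of a tamper-evident edit trail), each record can commit to the hash of the asset state it produced and to the previous record:

```python
# Sketch: an append-only edit trail; stripping, editing, or reordering records breaks it.
import hashlib
import json

def record(asset_bytes: bytes, action: str, tool: str, prev_record_hash: str) -> dict:
    entry = {
        "asset_sha256": hashlib.sha256(asset_bytes).hexdigest(),
        "action": action,          # e.g. "captured", "cropped", "genai_fill"
        "tool": tool,
        "prev": prev_record_hash,  # "" for the first record
    }
    entry["record_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    return entry

def verify_trail(trail: list[dict], final_asset: bytes) -> bool:
    """Recompute every link and check the trail ends at the delivered file."""
    prev = ""
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "record_hash"}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["record_hash"] != expected:
            return False
        prev = entry["record_hash"]
    return trail[-1]["asset_sha256"] == hashlib.sha256(final_asset).hexdigest()

original = b"raw photo bytes"
cropped = b"cropped photo bytes"
trail = [record(original, "captured", "camera-firmware", "")]
trail.append(record(cropped, "cropped", "photo-editor", trail[-1]["record_hash"]))
print(verify_trail(trail, cropped))  # True only if the trail and the final file line up
```

A real scheme also signs each record, which brings back the question of who holds and attests the keys.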
Maybe this could motivate toolmakers to label their own products as “Uses AI” or “AI Free” allowing content creators verify their entire toolchain to be AI Free.
As opposed to today, where companies are doing everything they can, stretching the truth, just so they can market their tools as “Using AI.”
You can't use them. Other tools that match most of the functionality without including AI features will emerge and take over the market if this is an important thing to people... alternatively, Adobe wises up and rolls back the AI stuff, or isolates it into consumer-level-only things that mark images as tainted.
This is a great point and I don’t know. We are entering a strange and seemingly totally untrustworthy world. I wouldn’t want to have to litigate all this.
This is depressing, we’re going to intentionally use worse tools to avoid some idiotic scare label. Basically the entire GMO or “artificial flavor” debates all over again.
If you edit this image by hand you're good, but if you use a tool that "uses AI" to do it, you need to put the scare label on, even if pixel-for-pixel both methods output the identical image! Just as GMO/not-GMO has no correlation to harmful compounds being in the food, and artificial flavors are generally more pure than those extracted by some wacky and more expensive means from a "natural" item.
> To be effective, warnings like this have to be MANDATED on the item in question, and FORBIDDEN when not present.
I think for it to be effective you'd have to require them to provide an itemized list of WHAT is AI-generated. Otherwise, what if a content creator has a GenAI logo or feature that's in every video and just puts up a lazy blanket disclaimer?
> (This post may have been generated by AI; this notice in compliance with AI notification complications.)
For something like YouTube, you could have the video's progress bar be a different color for the AI sections. Maybe three: real, unknown, AI. Without an "unknown" type tag, you wouldn't be able to safely use clips.
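A sketch of what that could look like as data, assuming uploader-declared spans and defaulting anything undeclared to "unknown" (all names are made up for illustration):

```python
# Sketch: per-segment labels for a video timeline; gaps the uploader didn't mark
# default to the cautious "unknown" state so unlabelled clips aren't presented as real.
REAL, UNKNOWN, AI = "real", "unknown", "ai"

def timeline(duration_s: float, marked: list[tuple[float, float, str]]) -> list[tuple[float, float, str]]:
    """marked: (start, end, label) spans declared by the uploader."""
    segments, cursor = [], 0.0
    for start, end, label in sorted(marked):
        if start > cursor:
            segments.append((cursor, start, UNKNOWN))
        segments.append((start, end, label))
        cursor = end
    if cursor < duration_s:
        segments.append((cursor, duration_s, UNKNOWN))
    return segments

# A 60-second video where 10-25s is declared AI-generated and 25-60s declared real:
print(timeline(60.0, [(10.0, 25.0, AI), (25.0, 60.0, REAL)]))
# [(0.0, 10.0, 'unknown'), (10.0, 25.0, 'ai'), (25.0, 60.0, 'real')]
```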
This will make AI the new sesame allergen [1] — if you aren't 100% certain every asset you use isn't AI-generated, then it makes sense to stick some AI-generated content in and label the video accordingly, out of compliance.
Wow. This is an awesome education on why you can’t just regulate the world into what you want it to be without regard to feasibility. I’m sure the few who are allergic are mad, but it would also be messed up to just ban all “allergens” across the board - which is the only effective and fair way to guarantee that this approach couldn’t ever be used to comply with these laws. There isn’t much out there that somebody isn’t allergic to or intolerant of.
>would also be messed up to just ban all “allergens” across the board -
Lol, this sounds like one of those fables where an idiot king bans all allergens, then a week later everyone is starving to death in the kingdom, because it turns out that in a large enough population there will be enough different allergies that everything gets banned.
> To be effective, warnings like this have to be MANDATED on the item in question, and FORBIDDEN when not present.
That already happens for foods.
The solution for suppliers is to intentionally add small quantities of allergens (sesame). [1] By having that as an actual ingredient, manufacturers don't have to worry about whether or not there is cross contamination while processing.
How much AI is enough to warrant it, though? Like, is human motion-capture-based content AI or human? How about automatic touch-up makeup? At what point does touch-up become face swap?
I’ve found Prop 65 warnings to be useful. They’re not pervasively everywhere; but when I see a Prop 65 warning, I consciously try to pick a product without it.
> You'll see notices everywhere because increasingly AI will be used in all parts of the content creation pipeline.
Which would be OK with me, personally. Right now, those cookie banners do serve a valuable function for me -- when I see them, I know to treat the site with caution and skepticism. If AI warnings end up similar, they too will serve a similar purpose. It's all better than nothing.
You put prop 65 as backfiring, but it looks to me like the original intent was reducing toxic products in tap water for instance and it largely achieved that goal.
From there warnings proliferated on so many more products, but getting told that chocolate bars can cause cancer is still a reasonable tradeoff. Especially as nothing is stopping the law from getting tweaked from there.
Comparing it to Prop 65 or GDPR makes it look like a probably deeply effective, if slightly annoying, rule... I sure hope that's what we end up with.
The ePrivacy directive and GDPR don't literally require cookie banners, but the former requires disclosure of specific information and the latter requires consent for most forms of data collection and processing. Even the 2002 directive actually required an option to refuse cookies, which many cookie banners still fail to implement properly post-GDPR.
The problem is that most websites want to start collecting, tracking and processing data that requires consent before any interaction takes place that would allow for a contextual opt-in. This means they have to get that consent somehow and the "cookie banner" or consent dialog serves that purpose.
Of course many (especially American) implementations get this hilariously wrong by:
a) collecting and processing data even before consent is established,
b) not making opt-out as trivial as opt-in despite the ePrivacy directive explicitly requiring this (e.g. hiding "refuse" behind a "more info" button or not giving it the same weight as "accept all"),
c) not actually specifying the details on what data is collected etc. to the level required by the directive,
d) not providing any way to revise/change the selections (especially withdrawing consent previously given), and
e) trying to trick users with a manual opt-out checkbox per advertiser/service labeled "legitimate interest", which is an alternative to consent and thus is not something you can opt out of because it does not require consent (but of course in these cases the use never actually qualifies as "legitimate interest" to begin with and the opt-out is a poorly constructed CYA).
In a different world, consent dialogs could work entirely like mobile app permissions: if you haven't given consent for something you'll be prompted when it becomes relevant. But apparently most sites bank on users pressing "accept all" to get rid of the annoying banner - although of course legally they probably don't even have data to determine if this gamble works for them because most analytics requires consent (i.e. your analytics will show a near 100% acceptance rate because you only see the data of users who opted into analytics and they likely just pressed "accept all").
Am I the only one who is bothered by calling this phenomenon "hallucinating"?
It's marketing-speak and corporate buzzwords to cover for the fact that their LLMs often produced wrong information because they aren't capable of understanding your request, nuance, or the training data it used is wrong, or the model just plain sucks.
Would we tolerate such doublespeak if it were anything else? "Well, you ordered a side of fries with your burger, but because our wait staff made a mistake... sorry, hallucinated... they brought you a peanut butter sandwich that's growing mold instead."
It gets more concerning when the stakes are raised. When LLMs (inevitably) start getting used in more important contexts, like healthcare. "I know your file says you're allergic to penicillin and you repeated when talking to our ai-doctor but it hallucinated that you weren't."
Human beings regularly hallucinate details that aren't real when asked to provide their memories of an event, and often don't realize they're doing it at all. So while AI is definitely lacking in the "can assess fact versus fiction" department, that's an overlapping problem with "invents things that aren't actually real". It can, today, hallucinate accurate and inaccurate information, but it can't determine validity at all, so it's sometimes wrong even when not hallucinating.
I can't stand it being called "hallucinating" because it anthropomorphizes the technology. This isn't a consciousness that is "seeing" things that don't exist: it's a word generator that is generating words that don't make sense (not in a syntactic sense, but in a semantic sense).
Calling it "hallucination" implies that there are (other) moments when it is understanding the world correctly -- and that itself is not true. At those moments, it is a word generator that is generating words that DO make sense.
At no point is this a consciousness, and anthropomorphizing it gives the impression that it is one.
It isn't an error, either. It's doing exactly what it's intended to, exactly as it's intended to do it. The error is in the human assumption that the ability to construct syntactically coherent language signals self-awareness or sentience. That it should be capable of understanding the semantics correctly, because humans obviously can.
There really is no correct word to describe what's happening, because LLMs are effectively philosophical zombies. We have no metaphors for an entity that can appear to hold a coherent conversation, do useful work and respond to commands but not think. All we have is metaphors from human behavior which presume the connection between language and intellect, because that's all we know. Unfortunately we also have nearly a century of pop culture telling us "AI" is like Data from Star Trek, perfectly logical, superintelligent and always correct.
And "hallucination" is good enough. It gets the point across, that these things can't be trusted. "Confabulation" would be better, but fewer people know it, and it's more important to communicate the untrustworthy nature of LLMs to the masses than it is to be technically precise.
Calling it an error implies the model should be expected to be correct, the way a calculator should be expected to be correct. It generates syntactically correct language, and that's all it does. There is no "calculation" involved, so the concept of an "error" is meaningless - the sentences it creates either only happen to correlate to truth, or not, but it's coincidence either way.
> Calling it an error implies the model should be expected to be correct
To a degree, people do expect the output to be correct. But in my view, that's orthogonal to the use of the term "error" in this sense.
If an LLM says something that's not true, that's an erroneous statement. Whether or not the LLM is intended or expected to produce accurate output isn't relevant to that at all. It's in error nonetheless, and calling it that rather than "hallucination" is much more accurate.
After all, when people say things that are in error, we don't say they're "hallucinating". We say they're wrong.
> It generates syntactically correct language, and that's all it does.
Yes indeed. I think where we're misunderstanding each other is that I'm not talking about whether or not the LLM is functioning correctly (that's why I wouldn't call it a "bug"), I'm talking about whether or not factual statements it produces are correct.
It's a language model, trained on syntactically correct code, with a data set which presumably contains more correct examples of code than not, so it isn't surprising that it can generate syntactically correct code, or even code which correlates to valid solutions.
But if it actually had insight and knowledge about the code it generated, it would never generate random, useless (but syntactically correct) code, nor would it copy code verbatim, including comments and license text.
It's a hell of a trick, but a trick is what it is. The fact that you can adjust the randomness in a query should give it away. It's de rigueur around here to equate everything a human does with everything an LLM does, including mistakes, but human programmers don't make mistakes the way LLMs do, and human programmers don't come with temperature sliders.
It's not surprising if it generated syntactically correct code that does random things.
The fact that it instead generates syntactically correct code that, more often than not, solves - or at least tries to solve - the problem that is posited, indicates that there is a "there" there, however much one talks about stochastic parrots and such.
As for temperature sliders for humans, that's what drugs are in many ways.
> Would we tolerate such doublespeak if it were anything else?
Yes: identity theft. My identity wasn't "stolen", what really happened was a company gave a bad loan.
But calling it identity theft shifts the blame. Now it's my job to keep my data "safe", not their job to make sure they're giving the right person the loan.
I don't get this at all. "Hallucinate" to me can only mean "produce false information". I've only ever seen it used pejoratively re: AI, and I don't understand what it covers up: how else are people interpreting it? I could see the point if you were saying that it implies sentience that isn't there, but your analogy to a restaurant implies that's not what you're getting at.
I think people are much more conservative with their health than text generation. If the text looks funky, you can just try regenerating it, or write it yourself and have only lost a few minutes. If your health starts looking funky, you're kind of screwed.
To me it sounds pretty damning. "The tool hallucinates" makes me think it's completely out of touch with reality, spouting nonsense. While "It has made a mistake, it is factually incorrect" would apply to many of my comments if taken very literally.
Webster definition: "a sensory perception (such as a visual image or a sound) that occurs in the absence of an actual external stimulus and usually arises from neurological disturbance (such as that associated with delirium tremens, schizophrenia, Parkinson's disease, or narcolepsy) or in response to drugs (such as LSD or phencyclidine)".
I would fire with prejudice any marketing department that associated our product with "delirium tremens, schizophrenia, [...] LSD or phencyclidine".
Nonsense. It isn't marketing speak to cover for anything. It's a pretty good description of what is happening.
The reason models hallucinate is because we train them to produce linguistically plausible output, which usually overlaps well with factually correct output (because it wouldn't be plausible to say e.g. "Barack Obama is white"). But when there isn't much data to show that something that is totally made up is implausible then there's no penalty to the model for it.
It's nothing to do with not being able to understand your request, and it's rarely because the training data is wrong.
So if I replied to your comment with "you are incorrect", would I be putting you in a worse light than saying "you are hallucinating"? Does the second make it sound better? Doesn't feel that way to me.
My problem with "hallucination" isn't that it makes error sound better or worse, it's that it makes it sound like there's a consciousness involved when there isn't.
> Also those two statements are not mutually exclusive.
> Errors in statistical models being called hallucinations in the past does not mean that term is not marketing speak for what I said earlier.
The implicit claim was that they call this hallucination because it sounds better. In other words that some marketing people thought "what's a nicer word for 'mistakes'?" That is categorically untrue.
I don't think there's any point arguing about whether or not the marketers like the use of the word "hallucinate", because neither of us has any evidence either way. Though I would also say the null hypothesis is that they're just using the standard word for it. So the onus is on you to provide some evidence that marketers came in and said "guys, make sure you say 'hallucinate'". Which I'm 99% sure has never happened.
It's a term of art from the days of image recognition AI that would confidently report seeing a giraffe while looking at a picture of an ambulance.
It doesn't feel right to me either, to use it in the context of generative AI, and I'd support renaming this behaviour in GenAI (text and images both) — though myself I'd call this behaviour "mis-remembering".
Edit: apparently some have suggested "delusion". That also works for me.
AI can already create photo-realistic images, and the old "look at the hands" rule doesn't really work on images generated with modern models.
There may be a few tells still, but those won't last long, and the moment someone can find a new pattern you can make that a negative prompt for new images to avoid repeating the same mistake.
I think we are already there, and it only seems like we aren't because many people are using free, low-quality models with a low number of steps, because they're more accessible.
Yes. Nearly all EU regulations are going to end up like that. Over-regulate and people develop blindness to regulations. Our best hope right now is that the EU becomes more and more irrelevant as the gap between the US and the EU grows to the point where American companies can simply bankroll EU leaders.
"made using AI" is such a fuzzy all-encompassing term that this feels like it will turn into another California Prop 65 warning scenario. Pretty soon every video will have a disclaimer like:
WARNING: This video contains content known to the State of Google to be generated by AI algorithms and/or tools.
Ok, beauty face filters are not included. How about character motion animations? How detailed does the after effects plugin need to be before it's considered AI? Can we generate just a background? Just a minor subject in the foreground? Or is it like pornography, where we'll recognize it once we see it?
I fear AI tools will soon become so embedded in normal workflows that it's going to become a question of "how much" not "contains", and "how much" is such a blurry, subjective line that it's going to make any binary disclaimer meaningless.
You might be interested in Adobe’s “Content Credentials” [1] which seemingly aim to clarify exactly what processing has applied to an image. I don’t like the idea of Adobe being the gatekeepers of image-fidelity-verification but the idea is intriguing and it seems like we’ll need something like this (that camera makers sign onto) to deal with AI.
EDIT: I think these should also include whatever built-in processing is applied to the raw sensor data within the camera itself.
AI that is indistinguishable from reality is a certainty for the not-so-distant future.
That future will come, and it will come sooner than anyone's expecting.
Yet all I see is society trying to prevent the inevitable from installing itself (because it's "scary", "dangerous", "undermines the very pillars of society" etc.), instead of preparing itself for when the inevitable occurs.
People seem to have finally accepted we can't put the genie back in the bottle, so now we're at the stage where governments and institutions are all trying to look busy and pass the image of "hey, we're doing something about it, ok? You can feel safe".
Soon we will be forced to accept that all that wasted effort was but a futile attempt at catching a falling knife.
Maybe the next idiom in line will be "crying over spilled milk", because could someone point me to what is being done in terms of "hey, let's start by directly assuming a world in which anyone can produce unrestricted, genuine-looking content will soon come and there's no way around it -- what then?"
All I see is a meteor approaching and everyone trying to divert it, but no one actually preparing for when it does hit. Each day that passes I'm more certain that we will look at each other like fools, asking ourselves "why didn't we focus on preparing for change, instead of trying to prevent change?"
We've been preparing for a while? It's all that work people have been doing for years with asymmetric cryptography, ECC, and tech like what happens during heavy rain downpours and that coin with a bit in front of it.
These are all the proper preparation for AI. AI can't generate a private key given a public key. AI can't generate the appropriate text given a hash.
So we build a society upon these things AI can't do.
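The hash half of that is easy to illustrate with nothing but the standard library: publishing a digest commits you to exact bytes, and no generator can feasibly produce different content that still verifies (a sketch with made-up example text):

```python
# Publishing the digest of a document commits to its exact bytes; a plausible-sounding
# forgery still fails the check.
import hashlib

original = b"Minutes of the 2024-03-18 meeting ..."      # hypothetical document
published_digest = hashlib.sha256(original).hexdigest()  # shared out-of-band

def matches(claimed: bytes) -> bool:
    """Anyone can check a claimed copy against the published digest."""
    return hashlib.sha256(claimed).hexdigest() == published_digest

print(matches(original))                                  # True
print(matches(b"Minutes of the 2024-03-19 meeting ..."))  # False: the forgery doesn't verify
```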
It has been a good run. We have done things like the tried and true ink stamping to verify documents. We have a labyrinth of bureaucracy for every little activity, mostly because it is the way that has always worked. It has surely been nice for the "administration" to sit around and sip lemonade in their archaic jobs. It has been nice to have incompetent people with no vision being appointed to high places for being born into the right families connected with the right people. That gravy train was surely a joy for those who were a part of it.
Sadly, it won't work anymore. We will need competent people now that actually care.
We need everything to be authenticated now with digital signatures.
It is not even that difficult a problem to solve. The existing systems are far more complex, far more prone to error, far more expensive, and far more difficult to navigate.
AI is giving us an opportunity to evolve. It is a time for celebration. Society will be faster, more efficient, more secure, and much more fun with generative content. AIs will produce official AI-signed content, and unsigned content. Humans will produce official human signed content, and unsigned content. Some AIs will use humans to sign content to subvert systems. But all of this pales in comparison to the fraud, waste, and total abuse of the current system.
Most nefarious AI content is going to be posted by humans misusing the AI tools, as opposed to some kind of AI gone rogue.
These humans would simply generate a public key from their private key, then post it under their human identity. The main threat from AI in the future, IMO, is not rogue AI, but bad human actors using it for their own nefarious agendas. This is how the first "evil" AI will probably come about.
There are some interesting hardware solutions from camera makers that provide provably authentic metadata and watermarks for videos and images. They're mostly useful for journalists, but soon consumers will be expecting this to be exposed on social media platforms and by those they follow on social media. There really are genuinely valuable things happening in this space.
This will always be spoofable by projecting the AI content onto the sensor and playing it to the microphone, which will give the spurious content a veneer of authenticity. This is within reach of a talented malicious amateur, and would be trivial for nation-state actors to do at scale.
Thank you for pointing that out. I want to reply to everyone here, but I don't think I have it in me to fight this battle. It seems my initial message of "have we asked ourselves what we'll do should the countermeasures fail?" fell on deaf ears. I asked a very simple question: "what will we do, or should we do, when faced with a world in which no content can be trusted as true?", and most replies just went on to list the countermeasures being worked on. I will follow my own advice and simply accept that this is how the band plays.
Of course. I don’t think anyone is going to be arguing that content captured by these cameras is real, it’s that the content is captured by the owner of that specific camera. There always needs to be some aspect of trust, and the value comes in connecting that with a trusted identity. Eg one couldn’t embed the CSPAN watermarks from a non-CSPAN camera.
> All I see is a meteor approaching and everyone trying to divert it, but no one actually preparing for when it does hit. Each day that passes I'm more certain that we will look at each other like fools, asking ourselves "why didn't we focus on preparing for change, instead of trying to prevent change?"
I don't know, we've done a pretty good job at preventing nuclear war so far. We didn't just say "oh well, the genie is out of the bottle now. Everyone will have nuclear weapons soon and there's nothing we can do about it. All wars from now on are going to be nuclear. Might as well start preparing for nuclear winter." We signed treaties and made laws and used force to prevent us all from killing each other.
Forgive me on an initial reading; it is hard to have a nuanced discussion on this stuff without coming off like an uncaring caricature of one of two stereotypes, or looking like you're attacking your interlocutor. When I write these out, it's free association, like I'm writing a diary entry, not a critique of your well-reasoned and 100% accurate take.
Personal thoughts:
- we're already a year past the point where it was widely known you can generate whatever you want, and get it to a reasonable "real" threshold with less than a day worth of work.
- the impact is likely to be significantly muted relative to, rather than an exponential increase upon, a 2020 baseline. Professionals were capable of accomplishing this with a couple orders of magnitude more manual work for at least a decade.
- in general, we've suffered more societally from histrionics/over-reactions to being bombarded with the same messaging
- it thus should end up being _net good_, in that a skeptic has a 100% accurate argument for requiring more explanation than "wow look at this!"
- I expect that being able to justify / source / explain things will gain significant value relative to scaled up distributors giving out media someone else gave them without any review.
- something I've noticed the last couple years is people __hate__ looking stupid. __Hate__. They learn extremely quickly to refactor knowledge they think they have once confronted in public, even by the outgroup, as long as they're a non-extremist.
After writing that out, I guess my tl;dr as of this moment and mood is: there will be negligible negative effects, we already reached a nadir of unquestioned BS sometime between 2010 and 2024, and a baseline where _anyone_ can easily BS will lead to wide acceptance of skeptical reactions, even within ingroups.
I like the outlook you build through your observations, and I acknowledge the possible conclusion you arrive at as plausible. I do, however, put a heavier weight on your first point because I see what we have today in terms of image/video generation as very rudimentary compared to what we'll have in a couple years. A day's worth of work for a 100% convincing, AI-generated video immune to the most advanced forensics? We'll soon have it instantaneously.
Thank you for the preface you wrote. I completely understand your point about how easy it is to sound like a contrarian online - I'm sure my writing style doesn't help much on that front, I'm afraid to admit.
There's a saying in my local language that people usually say to someone who's going through a breakup or going through an unfair situation:
"Accept it, it hurts less".
I'm not saying it makes the actual situation any better; it obviously doesn't. But anyone can feel the rarefied AI panic in the air growing thicker by the minute, and panic will only make the situation worse both before and after absolute change takes place.
When we don't accept incoming change before it arrives, we surely are forced to accept it after it arrives, at a much higher price.
You asked about preparations: prepare yourself to see governments try (and fail) to regulate what processing power can be acquired by consumers. Prepare yourself for the serious proposal of "truth-checking agencies" with certified signatures that ensure "this content had its chain of custody verified as authentic from its CMOS capture up to its encoded video stream", on which a lot of time and effort will be wasted (there are already people replying about this, saying metadata and/or encryption will come to the rescue via private/public keys. Supposedly no one would ever film a screen!).
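For what it's worth, the "chain of custody from CMOS capture to encoded stream" scheme those replies imagine would look roughly like the sketch below (hypothetical names, Ed25519 via Python's `cryptography` package assumed). Note that it only proves each step was signed by someone holding a key, not that the photons were real:

    # Illustrative hash chain over processing steps; not any real standard's format.
    import hashlib
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def append_step(chain, step_name, output_bytes, signer_key):
        # Each record signs the previous hash plus this step's output, so any
        # unsigned edit anywhere breaks the chain from that point on.
        prev_hash = chain[-1]["hash"] if chain else b""
        digest = hashlib.sha256(prev_hash + step_name.encode() + output_bytes).digest()
        chain.append({"step": step_name, "hash": digest,
                      "signature": signer_key.sign(digest),
                      "signer_pub": signer_key.public_key()})

    camera_key = Ed25519PrivateKey.generate()   # held by the camera hardware
    editor_key = Ed25519PrivateKey.generate()   # held by whoever edits and encodes

    chain = []
    append_step(chain, "cmos_capture", b"<raw sensor dump>", camera_key)
    append_step(chain, "denoise_upscale", b"<processed frames>", editor_key)
    append_step(chain, "encode_h264", b"<final bitstream>", editor_key)

    # A verifier replays the hashes and checks each signature against keys it trusts.
    for entry in chain:
        entry["signer_pub"].verify(entry["signature"], entry["hash"])
    print("chain verifies, if you trust both keys and the original capture")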
The above might seem an exaggeration, but ask yourself: the YouTube guidelines this post is about, the recent EU regulation... do you think those are enough? Of course they're not. They will keep trying to solve the problem from the wrong end until they are (we are) forced to accept there's nothing that can be done about it, and that it is us who need to adapt to live in such a world.
From your comment's tone, it seems like this is supposed to be a bad thing. The only people who would be upset about this are folks who are trying to pass generated content off as real. I'm sorry if I don't have much sympathy for them.
"You have to tell people if you're lying" isn't a stupid rule because lying is good, it's a stupid rule because liars can lie about lying and proving it was the original problem.
Of course, YouTube is well-known for its methodical approach to video removal, strictly adhering to transparent guidelines, rather than deciding based on the "computer says no" principle.
It sounds like it could also be used to take down a video someone thinks is fake. Proving it may be easy in some cases, but in others it may be quite difficult.
From the report button on a youtube video: Misinformation - Content that is misleading or deceptive with serious risk of egregious harm.
That sounds like a quagmire of subjectivity to enforce. You can argue whether generated content was created to mislead or whether it would cause _egregious_ harm.
Now there's no more arguing. Is it generated or is it real - and is it marked if generated? I still fail to see the downside here.
You're acting like this is a court with a judge. The company makes up subjective rules and then subjectively enforces them. There was never any arguing to begin with, they just ban you if they don't like you, or at random for no apparent reason, and you have no recourse.
> Now there's no more arguing. Is it generated or is it real - and is it marked if generated? I still fail to see the downside here.
How is this supposed to lead to less arguing? If there was an easy way to tell if something is AI-generated then you wouldn't need the user to tag it. When there isn't, now you have to argue about whether it is or not -- or if it obviously is, whether it then has to be tagged, because it obviously is and they've given that as an exception.
Youtube isn't some mom n' pop operation - they do have a review process and make an attempt at following the rules they set. They can ban you for any reason...but they generally don't unless you're clearly breaking a rule.
I'm not getting into the rabbit hole of finding out if it was actually generated - once again, that's a different problem. One that I already mentioned 2 comments back. My point was that there is less subjectivity with this rule. If the content is found to have been generated and isn't marked, then there are clear grounds to remove the video.
What youtube does when there is doubt is not known yet. I don't deal in "well this _could_ lead to this".
> Youtube isn't some mom n' pop operation - they do have a review process and make an attempt at following the rules they set. They can ban you for any reason...but they generally don't unless you're clearly breaking a rule.
They use (with some irony) AI and other algorithms to determine if you're breaking the rules, often leading to arbitrary or nonsensical results, which the review process frequently fails to address.
> I'm not getting into the rabbit hole of finding out if it was actually generated - once again, that's a different problem.
It isn't a different problem, it's the problem. You want people to label things because otherwise you're not sure, but because of that problem exactly, you have no way of reliably or objectively enforcing the labeling requirement. And you specifically have no way to do it in the cases where it most matters because it's hard to tell.
At a mom and pop you could at least talk to a person and figure out what happened.
>I don't deal in "well this _could_ lead to this"
Did you not learn from the entire DMCA thing? Remember when piles of tech people warned "Wow, this is going to be used as a weapon to cause problems" and then it was used as a weapon to cause problems?
Well, welcome to the next weapon that is going to be used to cause problems.
The DMCA implementation is the only thing that saved youtube from getting sued out of existence. And people on the internet don't know what fair use actually means, so they complain/exaggerate about DMCA takedowns when, surprise, it wasn't actually covered by fair use.
There's a handful of cases where yt actually messed up w/ DMCA and considering the sheer volume of videos they process, I'd say it's actually pretty damn good.
So no, DMCA is not a valid reason to assume youtube will handle this improperly.
> The DMCA implementation implemented is the only thing that saved youtube from getting sued out of existence.
The alternative to the DMCA is Section 230 of the CDA, which could have just as easily been applied to copyright as it is to anything else if it weren't for the DMCA providing a more abuse-prone alternative.
> And people on the internet don't know what fair-use actually means, so they complain/exaggerate about DMCA takedowns when, surprise, it wasn't actually covered by fair-use.
Abusive takedowns are a huge problem, actually. It's common in cases of businesses sending takedowns for their competitors' websites or videos, for example. They're often completely fraudulent with no merit whatsoever, but the company receiving the takedown has no information on which to base a decision (who created this content? how would they know?), so they just mechanistically execute all of them with no validation.
> they just ban you if they don't like you, or at random for no apparent reason, and you have no recourse.
The ban recourse problem is the opposite.
This is the "keep recourse": "this video is obviously bad, but google doesn't feel like taking it down, and there is nothing I can do about it". Now there is, and it can actually go to a court with a judge in the end, if Google is obstinate.
You didn't have a right to be hosted on Google before, and you don't have now. Of course they can ban you as they like. The thing is, they can't host you as they like, if you're breaking this rule.
> The thing is, they can't host you as they like, if you're breaking this rule.
Except that the rule can be satisfied just by labeling it, and if there are penalties for not labeling but no penalties for labeling then the obvious incentive is to stick the label on everything just in case, causing it to become meaningless.
To prevent that would require prohibiting the label from being applied to things that aren't AI-generated, which is impracticable because now you need 100% accuracy and there is no way to err on the side of caution, but nobody has 100% accuracy. So then the solution would be to actually make everything AI-generated, e.g. by systematically running it through some subtle AI filter, and then you can get back to labeling everything to avoid liability.
You could be right. But I wonder if truly nobody will want to claim their video is not AI-generated. Seems like some people will, and they would get an advantage out of it. Yes, Fox News claims they're entertainment and nobody reasonable would believe them. But not all news channels do this.
Did the California proposition 65 really result in cancer labels on everything? Or is it just hyperbole? I suppose having a lot of labels is still bad, even if they're not technically on everything.
They can, but it's often possible to prove it later. If you have a rule against lying and it's retroactively discovered to have been broken, then you already have the enforcement mechanism in place.
Really, your argument can be generalized to 'why have laws at all, because people will break them and lie about it'.
> They can, but it's often possible to prove it later. If you have a rule against lying and it's retroactively discovered to have been broken, then you already have the enforcement mechanism in place.
It isn't a rule against lying, it's a rule requiring lies to be labeled. From which you get nothing useful that you couldn't get from a rule against lying, because you'd need the same proof for either one.
Meanwhile it becomes a trap for the unwary because innocent people who don't understand the complicated labeling rules get stomped by the system without intending any malice.
> Really, your argument can be generalized to 'why have laws at all, because people will break them and lie about it'.
The generalization is that laws against not disclosing crimes are pointless because the penalty for the crime is already at least as severe as the penalty for not disclosing it and you'd need to prove the crime to prove the omission. This is, for example, why it makes sense to have a right against self-incrimination.
There are already various situations where lying is banned: depending on the circumstances, lying might count as perjury, fraud, false advertising, etc. It seems silly to suggest that these laws serve no purpose.
The same reason you have to check a box saying you're not a terrorist when entering the USA. It gives them a legal basis to actually do something about it when found out from other means.
That's like saying speed limit signs have really nothing to do with cars, they're trying to impose a ban on collision velocity. Which is true, but only speciously, as the rule exists only because motor vehicles made it so easy to go fast.
They don't have anything specifically to do with cars. They apply equally to motorcycles, trucks and anything else that could go that fast. Get pulled over in a tank or a hovercraft and try to tell the officer that you can't have been speeding because it isn't a car.
Should we like deepfakes any better if they're created by a nation state using pre-AI Hollywood production technology, or "by hand" with Photoshop etc.? If 3D printers get better so that anybody can 3D print masks you can wear to convincingly look like someone else and then record yourself on camera, would you expect a different set of rules for that or are we talking about the same kind of problem?
You missed the analogy, so I'll spell it out: before we had cars[1], we couldn't go fast on roads, and there were no speed limit signs. Before we had AI, we couldn't deceive people with easy fakes, and so there was no need to regulate it. Now we do, and there is, and YouTube did.
Trying to characterize this as not related to AI just isn't adding to the discussion. Clearly it is a response to the emergence of AI fakes.
Trying to shovel in "and all of the other stuff" breaks the analogy though. Misinformation isn't new. Image gen is hardly the first time you could create a fictional depiction of something. It's not even the first time you could do it with commonly available tools. It's just the moral panic du jour.
YouTube did this because the EU passed a law about it. The EU passed a law about it because of the moral panic, not because the abstract concept of deception was only recently invented.
It's like having cars already, and speed limits, and then someone invents an electric car that can accelerate from 0 to 200 MPH in 5 seconds, so the government passes a new law with some arbitrary registration requirements to satisfy Something Must Be Done.
Yes, but it also creates a straightforward method for triggering a TOS violation if an unlabelled AI video is detected, allowing YT to remove deceptive videos and ban bad actors without adding more subjective editorial decisions into the review process.
It would be hard, for example, for a political party to claim that this policy selectively discriminates against their viewpoint.
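As a toy illustration of why that enforcement is mechanical rather than editorial, the escalation described in the policy text quoted further down the thread reduces to something like this. This is obviously not YouTube's real code, the thresholds are made up, and the `detected_synthetic` input is the hard, unsolved part everyone upthread is arguing about:

    # Toy policy sketch only; detection quality is the real bottleneck.
    def moderation_action(detected_synthetic, creator_disclosed, prior_violations):
        if not detected_synthetic or creator_disclosed:
            return "no_action"                 # disclosed, or nothing was found
        if prior_violations == 0:
            return "apply_label_proactively"   # platform adds the label the creator omitted
        if prior_violations < 3:               # hypothetical threshold
            return "remove_content"            # undisclosed synthetic content breaks the rule
        return "suspend_from_partner_program"  # consistent non-disclosure escalates

    print(moderation_action(True, False, 0))   # -> "apply_label_proactively"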
Nope, not pointless: you remove bullshit excuses. "Everyone knows it's AI! I didn't mean to deceive anyone, honest!"
This is also why we have to do the insufferable corporate training on how not to do bribery, sexual harassment, etc. It's not that HR thinks you don't know; it's that HR knows that if they don't make everyone take the training, then bad actors can successfully avoid or delay consequences by pretending not to have known. I think of these things like jury duty: civic duties that are slightly obnoxious in and of themselves but very important for the functioning of the overall system.
It gives YouTube the justification to remove videos that may not be technically rule-breaking otherwise. Though, I do imagine proving that a video is AI generated will quickly become functionally impossible.
Still, I believe you are wrong despite your statement ringing true. You are conflating different reasons why people may want to generate AI videos. The nefarious motive may be nothing more than profit ('cheaper than paying an actor') as opposed to malice ('we want to defraud people'). There are all sorts of reasons why self-disclosure is not a bad start, including the fact that if it turns out you lied, you can be removed without it becoming a question of freedom of speech and so on.
We need to fix the title. It's not just AI -- it's any realistic scene generated by VFX, animation, or AI. The title of the blogpost is "How we're helping creators disclose altered or synthetic content" -- it shouldn't say AI on the Hacker News title.
> Generating realistic scenes: Showing a realistic depiction of fictional major events, like a tornado moving toward a real town.
Does the Wizard of Oz tornado scene need a warning now? [0] (Of course not, but it may be hard to draw the line in some places.)
> It can be misleading if viewers think a video is real, when it's actually been meaningfully altered or synthetically generated to seem realistic.
> When content is undisclosed, in some cases YouTube may take action to reduce the risk of harm to viewers by proactively applying a label that creators will not have the option to remove. Additionally, creators who consistently choose not to disclose this information may be subject to penalties from YouTube, including removal of content or suspension from the YouTube Partner Program.