Tree-Ring Watermark: Invisible Robust Fingerprints of Diffusion Images (arxiv.org)
97 points by anjel on June 21, 2023 | 70 comments



And, once again, people seek to add additional, unnecessary steps and abstractions to perfectly fine processes, because of contrived bullshit ("copyright!")

Stop abstracting input away from output. Attribution here is a fool's errand built on overbearing, unjust, and misinterpreted concerns over copyright, the relevance of which has yet to be proven in a legal sense. This is going to give The Man yet another way to suppress actors they don't like.

Repeat it with me, folks: YOU CANNOT CONTROL THAT WHICH HAS BEEN RELEASED TO THE PUBLIC. If you do not want things (people/their agents) to perceive your works, then keep them private.


On the contrary — the copyrighted-data industry(?) is quickly converging on copyright by appearance, not by provenance or by means. The dataset problem is kind of forgiven, but the regurgitation problem continues to exist.

I think copyright as understood in IT is a bit of a misconception: that copyright is a property inherited only by exact copies and intentional reconstructions. The criteria used in the real world are similar, but slightly different.

Also, this is going to be tone policing, but raging doesn't make you right. If anything, it'll do the opposite.


> On the contrary — copyrighted data industry(?) is quickly converging into copyright by appearance and not by provenance or by means.

I mean, it's always kind of been that way. You can copyright the appearance, not the idea. Provenance doesn't really matter.


I don't care about copyright one iota, but this may be useful in distinguishing fake images from real photos.


People generating fake photos will just use an AI that doesn't leave a watermark. We are quickly approaching the point at which AI-generated and real images just cannot be distinguished, and we're going to have to adapt to that.


The other perspective is traceable cameras, such that you can have some guarantee that an image came from a particular device and wasn't AI-generated (e.g. your camera signs each image after capture).

There have been attempts to cryptographically sign images, which have largely been broken, but the principle is there: https://www.dpreview.com/opinion/9955873926/why-cryptographi...

See https://contentauthenticity.org/how-it-works
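For the curious, here is a toy sketch of what "your camera signs each image after capture" might look like. It uses HMAC as a stand-in for the asymmetric signature (e.g. Ed25519 in secure hardware) that a real scheme like C2PA would use; the key name and functions are hypothetical.

```python
import hashlib
import hmac

# Hypothetical device key; a real camera would keep an asymmetric private
# key in tamper-resistant hardware rather than a shared secret.
DEVICE_KEY = b"secret-key-burned-into-camera"

def sign_image(image_bytes: bytes) -> bytes:
    """Sign the image bytes at capture time."""
    digest = hashlib.sha256(image_bytes).digest()
    return hmac.new(DEVICE_KEY, digest, hashlib.sha256).digest()

def verify_image(image_bytes: bytes, signature: bytes) -> bool:
    """Check that the image hasn't been altered since capture."""
    expected = sign_image(image_bytes)
    return hmac.compare_digest(expected, signature)

photo = b"...raw sensor data..."
sig = sign_image(photo)
assert verify_image(photo, sig)
assert not verify_image(photo + b"edited", sig)
```

The weak point the comments below raise is exactly this: the whole scheme is only as trustworthy as the key storage in every shipped device.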


If the plan is to trust every phone and camera manufacturer to keep their key safe then the plan is stupid.

If the plan is to have a few trusted manufacturers for CCTV and police cameras, then that might work, but it will do little to combat misinformation.


There's value in raising the bar of difficulty, even if you don't make it completely impossible.


I see a future where training a model without a watermark is illegal.


>We are quickly approaching the point at which AI-generated and real images just cannot be distinguished

Eh, there are so many ways a large photo can fail. I'm obsessed with AI art, but I think there will be human (or AI) detectors that will be paid to determine validity.


Or just use a watermark removing AI.


> we're going to have to adapt to that.

I don't see how we can without great harm. The only adaptation that I can see is to stop trusting in anything at all, and we're already dangerously far down that road as it is.


Yeah, a zero-trust society totally won't catastrophically implode, trust me bro.


Not in general. You can't detect this watermark without access to the original model (the paper authors bill this as a feature). So the model owner could do that test to check if their particular model produced some image, but a third party trying to work out whether a given image was fake (a) would not have the data they need, and (b) even if they did, would need to repeat the test for every possible image generating model in existence.


All photos are edited in their taking. As such, all images represent a "fake" reality. The only difference between an AI "fake" image and photograph is the camera.


That's a lonely hill to die on.


No, it's a point people forget. Even 20 years ago people were asking "is this photo photoshopped?", but if you're viewing it on a webpage, it has absolutely been photoshopped. The problem is we aren't good at specifying what we mean by "photoshopped". Any photo you view has been digitally altered, getting more specific gets into the alterer's intent which gets really messy really quick.


What difference does it make if it was digitally altered or in an analogue fashion? A slightly longer exposure is also not showing reality as seen by human eyes. Besides, there is no single reality even among humans.


It's a rather extreme view but there is a lot that can be done with photography that can be very misleading. If you see a photo that seems incredible, it might be. People's general skepticism of photos is too low, AI might bring that skepticism a bit higher which I think would be a good thing.

You can think of a photo like a news article, the author can be completely truthful while omitting certain facts to give a false impression. The same can be done with how a photo is framed, cropped and processed.

The intent of a photographer can definitely be translated into the photograph or it wouldn't be an artform.


<< As such, all images represent a "fake" reality.

Parent may not be expressing it right, but he does have a point. There is a level of 'interpretation' that is being done during the taking of a picture. If you add to this the recent attempts to incorporate AI into picture-taking to add/remove parts thereof, all of a sudden parent's proposition does not sound so far-fetched.

Granted, it is still a different level than just generating an image from a prompt, but I would not dismiss the thought outright.


There's a difference between what happens when you point a regular digital camera at the moon and take a photo and what happens when you do that with a Samsung Space Zoom phone. Most people believe that difference is an important one.


If “sight” is perceptually completed in the visual cortex, then everything we see is “fake”. - Mr. Magoo


The actual question to ask isn't "is this fake?" it's "how is this record likely to have been manipulated or constructed and whom does it serve?"


Calm down. It’s a research paper. Ideas are not dangerous.


I see responses expressing disbelief, but I think I agree with you. There's a quote whose source I can't find, but paraphrasing: "there is no idea so dangerous it cannot be considered."

Belief in an idea can be dangerous, because of the actions it may lead to. But the mere reflection, consideration, or phrasing of an idea is not evil; cannot be evil.


> Ideas are not dangerous.

Wow.


this surely deserves an /s

right?

Like, did you forget the /s, or are you new to the internet and assumed people would see the sarcasm?


I'm completely serious. It pains me to have to spell it out: a metaphysical entity cannot do physical harm. The wrong idea in the hands of the wrong person, sure, that can be a dangerous combination. Does that mean we should ban ideas? Of course not.


> If you do not want things (people/their agents) to perceive your works, then keep them private.

That's the problem that copyright is intended to address. Without it, too many people do exactly that and society suffers.

Modern copyright is insufferable and desperately needs fixing. But the concept exists in order to address a different real problem. Without copyright, we still need a solution to that problem.


Does it though? With the sheer volume of creatives nowadays, combined with an industry that seems to love nothing more than whoring itself out for maximum accessibility/customer base/sales, we live in a world where creative people create for the sheer sake and joy of creating. I would like to think that many of these folks would still create even if copyright didn't exist. Maybe in the old world, where something as simple as literacy was at a premium, but in this new world? I'd bet the lack of copyright would filter out so much mediocrity and lowest-common-denominator filth.


You might be right, but I'm dubious.

> but in this new world?

This "new world" isn't actually all that new, though. I guess it depends on when you count it as starting; you brought up literacy being at a premium in the "old world", which was largely a very long time ago, before the concept of copyright existed.


It won't even work for copyright, which is a legal process, not a technical one; rights and limitations exist independently of the medium.


This reminds me of an early-2000s product (maybe earlier? Not sure) which used a frequency-domain manipulation of some sort to overlay a watermark on physical print. The idea was you could point your webcam at a magazine page and be whisked off to some website or other. It pre-dated the ubiquity of QR codes, and it was diffuse across the entire page rather than being a discrete visual element. It probably died off because not enough people had webcams and nobody had the custom software installed.

It came to mind because the `Watermarked x_t` image on page two reminds me of the overlaid noise pattern, which you could just make out on the page. I think they must have been doing something similar in frequency space.

Is this ringing any bells for anyone? I've tried googling but I just can't find evidence of it ever existing.
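For what it's worth, a diffuse frequency-domain watermark of the kind described can be sketched roughly as below. This is a toy illustration, not the actual product's scheme or the Tree-Ring method: a keyed pattern is added to a mid-frequency annulus of the image's FFT, then detected by correlating those coefficients with the key.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))            # stand-in for a grayscale page scan
key = rng.standard_normal((64, 64))   # secret watermark key

# Mid-frequency annulus in the centered FFT, where the mark is diffuse
# across the whole image rather than a discrete visual element.
yy, xx = np.ogrid[:64, :64]
radius = np.hypot(yy - 32, xx - 32)
mask = (radius > 10) & (radius < 20)

def embed(image, strength=10.0):
    """Add the keyed pattern to the masked FFT coefficients."""
    F = np.fft.fftshift(np.fft.fft2(image))
    F[mask] += strength * key[mask]
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))

def detect(image):
    """Correlate the masked FFT coefficients against the key."""
    F = np.fft.fftshift(np.fft.fft2(image))
    return np.corrcoef(np.real(F[mask]), key[mask])[0, 1]

watermarked = embed(img)
assert detect(watermarked) > detect(img)
```

A pattern like this survives modest cropping and blurring better than a spatial-domain mark, which may be why it could be "just made out" on the printed page.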


> the watermark signal is detected by inverting the diffusion process

Which, by definition, requires you to know in advance the exact process and parameters that were used? That seems untenable.


That seems to be the point. Sure, anyone can inject their own noise into the initial image, but the people best placed to do that are the ones hosting the model. A host could watermark their service and be able to identify images it had produced down the line, in a way that the end user can't remove and can't perceive.


"I need to protect the copyright of my AI generated image" is some big lol thought process. If that's all this is then it's not nearly as useful as claimed.


Yeah, it seems like exactly the wrong thing to be watermarking. I'd much prefer to be able to verify that something was human-generated, but of course that ship has sailed.


It's to trace down abuse.

People using products for fraud, defamation, illegal categories of porn, etc.


I would be very surprised if a company running a generative AI app wanted to be able to prove someone made illegal porn with it. I'd have thought they'd want people to believe that wasn't possible on their platform.


In a world where other startups can attribute outputs to particular tools, it'd be better to be able to show the database record, payment details, and IP address of those creating bad content.

You can internally investigate and see what users are doing to escape the guardrails. There's probably lots of legal but definitely juvenile and borderline offensive content that could also be studied.

There are even non-abuse analytical reasons for wanting this. If you sample the social media deluge for your watermarks, you could see how far your tool spreads and in which user clusters.


Couldn’t you just hash your image and share your proven ownership?
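A hash only proves you possessed those exact bytes; any re-encoding, resize, or crop produces a completely different digest, which is exactly the gap robust watermarks try to fill. A minimal sketch (the function name is hypothetical):

```python
import hashlib

def image_fingerprint(data: bytes) -> str:
    """SHA-256 digest of the exact file bytes."""
    return hashlib.sha256(data).hexdigest()

original = b"...png bytes..."
fp = image_fingerprint(original)

# Identical bytes match; a single-byte change (or any re-encoding,
# even lossless) yields an unrelated fingerprint.
assert image_fingerprint(original) == fp
assert image_fingerprint(original + b"\x00") != fp
```

So a hash can timestamp a claim (e.g. published alongside the image), but it can't trace a derived or re-compressed copy back to you.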


I didn't say anything about copyright; in the US, at least, it's already been ruled that the raw output of one of these models is not copyrightable. Being able to trace something back to a service can be useful anyway. Imagine stuffing metadata in there, such as the prompt or the identity of the user who generated it. It'd be a powerful tool to combat everything from generated CSAM to political disinformation. If nothing else, a nice ghost story to tell the AI kiddies around the campfire.



Nice. If the work holds up to scrutiny, it's immediately useful for many.


also like... why would I voluntarily watermark my images that I want to pass off as "real art" (my personal stance is that AI art is absolutely real art)


I would argue that AI works are expressions like a poem or a song. The word “art” has a diluted meaning; it means everything and nothing in common parlance. “Real” art has to have certain attributes, like a certain timelessness and character, something that can only really be ascertained after a certain time has passed.

I feel that if you never studied art, never really put in the effort of studying it, and never went through the processes of art creation, you probably aren't making art. It can still be beautiful, creative, and meaningful, but “real” art, something that adds to the cultural lexicon, probably not.

I can recommend the book “But Is It Art?” by Cynthia Freeland.


I've done some thousands of hours of practice, my fingers are covered in watercolor paint as I type this. I don't know how to tell if I'm an artist or not, maybe I haven't done any art? I rarely try to say anything at all, I just like to see if I can make a thing I imagine real. I do wonder about it, I'm not being snarky.

AI seems like a tool that can be used to make art, I think all that takes is a human trying to elicit a response in another human.

Typing into a prompt box and setting some sliders, then picking the output I like... is it different from setting my shutter speed and framing a shot? What if I shoot on a phone where that's all automated, is the art the environment I put myself in? What if I take a hundred random abstract snaps and pick the best 3 and arrange them together in a triptych? Is the argument that the arrangement is the art, and the photography isn't?

It just seems so absurd to me that AI can't make art. Sure, maybe not on its own, but a paintbrush or a camera can't make art on its own either; the human wielding it can.


You are touching on a big subject, far too big to really cover in a comment (hence the book recommendation).

I studied art formally. I had the chance to meet and be tutored by some really incredible artists. At some point I had to choose another path, which led to my foray into frontend development and UX.

It is indeed the human context and character that great artists come from. Some artists can read the zeitgeist and extract a meaningful and masterfully done work that defies the status quo. One of my teachers framed it: “It’s like the Olympics: you try to jump just that bit higher and raise the bar. And you don’t know if you are doing that unless you try.”

So yeah, the human, their process, their ability to discern what is visually needed, their cultural context, their ability to be autonomous when it is called for, those are hallmarks of significance.

Now the AI generator can help you iterate ideas (like you have to choose photos after a shoot), but you are still the director; you still have to know what to pass on and what to embrace.

Some artists are not known until long after. For instance there was a street photographer who did street photography long before it was known as such. Now those works are seen as very significant. This is someone who tapped into something and showed a new way of looking.


HCB is the reason I almost always shoot with a manual fast 50. :)


Street photography can be great fun and interesting! And culturally significant. Day-to-day portraiture is a boon for anthropology.


Who?


People who want to treat AI-generated works differently, even though by objective examination they are not differentiable from man-made ones.


The sister thread notes that this requires inverting the process, so it is likely only useful to those hosting the particular AI model; not an outside person who wants to check if a specific image is AI generated.


That seems silly though, as presumably that is a different set of people than those generating the images.

Any security system that requires the adversary to be on board is doomed to fail. If the adversary was willing to play by the rules you wouldn't need a security system.


That doesn't work for that purpose because nothing mandates that AI-generated works be watermarked, and latent diffusion models are easily run on a desktop GPU now.


I wonder how deep watermarks would perform against attacks using Image2Image. Running an image through I2I as-is would be able to destroy the watermark with minimal/minor changes, and out-painting may significantly impact certain watermarking methods by wrapping images with random contents.


With ControlNet guidance it's going to be even easier to maintain the image while extracting the idea from the noise.


It’s fascinating that it’s robust to noise injections, especially lossy compression and blurring. I suppose it makes sense, given the latent-space watermark presumably makes global changes to the significant features, even if imperceptible ones. It feels like this might be usable as an upscaling technique to inject the watermark into existing photographs without directly manipulating the photographic content.


I'm skeptical this survives very long: these diffusion systems are noise-removal systems, and this is just inventing a noise injection they currently haven't had to deal with.

Particularly if the idea is it leads to a big change in the latent space representation as the mode of action, the question to ask is how hard is it for another system to learn what that latent space anomaly looks like and remove it?


They have a strange idea of "invisible": the watermarked image is totally different from the other algorithms' examples. So the watermark is not visible by itself, but the images it is on are different from non-watermarked ones. Huh.


So just blurring the image and re-sharpening it with AI can remove the watermark, right? Asking just in case I need to steal some AI-generated artwork a few years down the line.


No need, AI generated imagery isn't copyrightable.


Incorrect; AI-generated imagery has weaker copyright protections as of right now, but there is limited precedent. This stuff will need to be argued up and down the courts before we know exactly how copyrightable it is (but it's absolutely not zero).

fwiw I think copyright should be greatly weakened in general; just correcting you on the facts.


> (but it's absolutely not zero)

I agree that nobody knows for sure until a court rules, but it absolutely could be zero, and there are reasonable arguments for it being zero, especially in the US [the UK may be a different story] (e.g. non-human agent; is a prompt really a creative endeavor?).


As of right now it isn't. Any legal fact is subject to change in the future pursuant to litigation or legislation.


? Anything might happen in the future, but the current guidance is not ambiguous.

https://www.copyright.gov/ai/ai_policy_guidance.pdf

(It’s not copyrightable in the US)

> fwiw i think

Doesn’t really seem relevant, legally speaking, and unless there’s a specific precedent you’d care to share, you are not correcting the parent post, you are simply incorrect.


> It’s not copyrightable in the US

That document is guidance about works applying to be registered at the Copyright Office, a process made largely redundant in 1978 by automatic copyright assignment.

Now the Copyright Office may be of the opinion that copyright doesn't apply, but apart from registrations (or passing new regulations, which they didn't do) they can only give an expert's opinion. To decide how existing laws and regulations apply is the job of the courts.


IANAL, but I’m confident that any assumption that the copyright office is “largely redundant” is not correct.

Of course, you’re welcome to your own opinion, but I urge anyone who believes this to be the case and is, you know, doing commercial work, to seek legal advice rather than depending on their assumptions about it.

See also -> https://www.copyright.gov/help/faq/faq-general.html#register


That's not at all what it says. Did you read it? It specifically says that the Copyright Office would have to judge whether the art is a mechanical reproduction or whether it originated from a human's "own original mental conception", and thus "is necessarily a case-by-case inquiry." That's the definition of ambiguous.


Is the watermark carried forward through — that is, expressed in the subsequent output of — a diffusion kernel trained on watermarked images?



