Can someone who understands this work explain why using a range (de)coder to uncompress your ciphertext into the GPT2 generated distribution (in other words the most obvious construction for the case when the sender and receiver have the model) is insufficient to achieve perfect security by their definition?
If I’m not mistaken, you still need the key itself to undo the steganography. They provide a way to autoregressively rewrite the decoded message with the aid of the key. I think the perfect secrecy part means that embedding the plaintext doesn’t cause any divergence from the model’s output distribution, so it’s undetectable. You just adapt the covertext in a way that keeps about the same information content but, e.g., uses different words.
I’m a co-author on one recent (cited) work, Meteor. The challenge in LLM/stego is that the model outputs a probability-weighted vector at each step. Ordinarily you just sample a token from this vector using a random number generator. For stego the idea is to sample from that vector using a string of (pseudorandom) input bits that represent your message, in a way that (1) gives you a valid sampling distribution and (2) is recoverable, meaning that given a second copy of the model plus each token it outputs you can recover the bits used to sample. The first part is relatively easy, it’s the second part that’s “hard” in the sense that you either waste a lot of bits (meaning an inefficient covertext) or you end up with sampling that isn’t statistically identical to a random sampling approach. In Meteor we used a practical approach that’s fairly efficient but trades a bit of statistical perfection. This paper focuses more on the theory of how to do this optimally, which is nice! I don’t fully understand the result right now (it’s a tough paper to read) but this kind of theoretical improvement inspired by previous work is nice.
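To make the sampling step concrete, here is a toy sketch in Python (emphatically not Meteor's actual construction): message bits are read as a binary fraction and used to pick the token whose cumulative-probability slice contains that fraction. The names (sample_token, the fake probs table) are made up for illustration, and the recoverability question -- how many of the consumed bits the receiver can unambiguously get back from the chosen token -- is exactly the part glossed over here.

    def bits_to_fraction(bits):
        """Interpret a list of 0/1 bits as a binary fraction in [0, 1)."""
        return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))

    def sample_token(probs, message_bits):
        """probs: list of (token, probability) pairs from the model at this step.
        message_bits: bits of the (already encrypted, pseudorandom-looking) message."""
        x = bits_to_fraction(message_bits)
        cum = 0.0
        for token, p in probs:
            if x < cum + p:
                # The receiver, running the same model on the same prefix, sees this
                # token and learns only that x lies in [cum, cum + p); the bits shared
                # by every fraction in that interval are the recoverable ones.
                return token, (cum, cum + p)
            cum += p
        return probs[-1][0], (1.0 - probs[-1][1], 1.0)  # guard against float rounding

    # Fake next-token distribution plus eight message bits:
    probs = [("the", 0.5), ("a", 0.3), ("cat", 0.2)]
    print(sample_token(probs, [1, 0, 1, 1, 0, 0, 1, 0]))  # picks 'a', whose slice is [0.5, 0.8)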
Right, so in total ignorance what I would do is just make a GPT-powered text compressor and decompressor using a range coder with its probabilities at each token set by the model (a rough sketch is at the end of this comment). Care would need to be taken with precision and end termination so a bias wouldn't be introduced (more care than typical in compressors, since no one normally cares much about 0.01% inefficiencies -- things like end termination have been addressed by bijective range coders).
Then to use it for stego just take your headerless ciphertext and decompress it. Tada: you get model output. To decode the stego just compress it. Assuming everything was built right and that your ciphertext is uniform, the output should be indistinguishable from the model just sampling using an RNG.
As a bonus, you don't even have a stego tool on your computer; you just have a particularly carefully constructed text compressor and decompressor that is perfectly usable (and even state of the art in compression rate, given a big enough model) for the compression application.
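Here is roughly what that compressor/decompressor pair could look like, as a minimal sketch: exact rationals (fractions.Fraction) are used to dodge the precision issues discussed below, model_probs is a made-up stand-in for querying GPT-2, and all of the real engineering (renormalization, termination, avoiding bias) is omitted. Decompressing uniform ciphertext bits yields model-distributed tokens; compressing those tokens back yields only the bits shared by the whole final interval, which is where the wasted bits mentioned upthread come from. This is an illustration of arithmetic-coding-style stego, not the paper's minimum-entropy-coupling construction.

    from fractions import Fraction

    def model_probs(prefix):
        # Hypothetical stand-in for the model: next-token distribution given a prefix.
        return [("the", Fraction(1, 2)), ("a", Fraction(3, 10)), ("cat", Fraction(1, 5))]

    def embed(cipher_bits, n_tokens):
        """'Decompress' uniform ciphertext bits into covertext tokens."""
        x = sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(cipher_bits))
        lo, hi, out = Fraction(0), Fraction(1), []
        for _ in range(n_tokens):
            cum, width = lo, hi - lo
            for token, p in model_probs(out):
                nxt = cum + p * width
                if x < nxt:              # the ciphertext falls in this token's slice
                    out.append(token)
                    lo, hi = cum, nxt    # zoom into that slice and continue
                    break
                cum = nxt
        return out

    def extract(tokens):
        """'Compress' the covertext: re-derive the same interval with the same model,
        then emit the binary prefix shared by every point of [lo, hi) -- the only
        bits that are guaranteed recoverable."""
        lo, hi = Fraction(0), Fraction(1)
        for i, token in enumerate(tokens):
            cum, width = lo, hi - lo
            for t, p in model_probs(tokens[:i]):
                nxt = cum + p * width
                if t == token:
                    lo, hi = cum, nxt
                    break
                cum = nxt
        bits, half = [], Fraction(1, 2)
        while True:
            if hi <= half:               # interval entirely in [0, 1/2): next bit is 0
                bits.append(0)
            elif lo >= half:             # interval entirely in [1/2, 1): next bit is 1
                bits.append(1)
                lo, hi = lo - half, hi - half
            else:                        # interval straddles 1/2: no more shared bits
                return bits
            lo, hi = lo * 2, hi * 2

    cover = embed([1, 0, 1, 1, 0, 1, 0, 0], n_tokens=6)
    print(cover)           # reads like ordinary samples from the (toy) model
    print(extract(cover))  # [1, 0, 1, 1, 0]: a prefix of the embedded bits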
Hi, I am one of the authors on the PSSuMEC paper. Thanks a lot for your interest in our work. If I understand correctly, range coding would have the same lack of perfect security properties as arithmetic coding (cf. our paper). Having said this, we are investigating the utility of iMEC in compression settings.
Ah, yeah, I was trying to get a lay explanation of how arithmetic coding, assuming it was done with enough precision, wouldn't achieve the perfect security properties.
It might just be that the precision rapidly becomes unmanageable, because I guess you need to support the least probable symbol from every token without ever losing precision (normally arithmetic coders will re-normalize after each symbol to keep the accumulator at a reasonable precision, though they could be designed to do so as infrequently as you like at a performance cost)... If no renormalization is possible, I guess an arithmetic coder accumulator would need to handle values like (1/least_prob_token)^n_tokens, which get extremely big extremely fast -- and would at the very least need an unconventional construction.
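A quick back-of-the-envelope, with assumed numbers, of how fast that blows up: without renormalization the interval can shrink to roughly least_prob_token^n_tokens, so the accumulator needs on the order of n_tokens * -log2(least_prob_token) bits.

    import math

    least_prob_token = 1e-5   # assumed smallest probability the model ever assigns
    n_tokens = 1000           # assumed covertext length in tokens
    bits_needed = n_tokens * -math.log2(least_prob_token)
    print(f"~{bits_needed:,.0f} bits of accumulator precision")  # ~16,610 bits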
I think it's the LLM (GPT2) link that makes this interesting to most, since it allows relatively high-data-rate hidden/deniable communication through text interfaces.
It's been known for a long time that the bandwidth of steganography is limited to the amount of randomness in the open data, e.g. Anderson and Petitcolas, "On the limits of steganography", '98 (I hope that's the right one anyway - a while since I read it). Which implies that if someone came up with a good reason to publish a bunch of high-entropy randomness, the world would have a much greater bandwidth of potential steganographic channels.
Now AI has done this for us. I suppose that under authoritarian regimes you will soon have to cryptographically prove that you generated your random bits deterministically from specific keys.
> I suppose that under authoritarian regimes you will soon have to cryptographically prove that you generated your random bits deterministically from specific keys
I guess that's sarcasm but I don't really get it.
Just in case you mean it, that doesn't seem even remotely technically doable to me. And even if you managed to make people generate all randomness from a fixed PRNG with key escrow, you would have to check what they did with that randomness. If you're willing to go that far, the obvious way to go is to just ban the internet completely (and perhaps also people talking to each other).
While I'm sure it is sarcastic, it's worth noting that most (all?) blockchain wallets already do this.
Technically it's in reverse: the seed is random and then serialized into a human-readable seed phrase, but deterministic key generation is already widely deployed.
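As a simplified illustration of what "deterministic key generation" means here (this is not actual BIP32/BIP39, just a toy derivation): everything is derived from one seed, so the same seed always reproduces the same keys.

    import hashlib
    import hmac

    def derive_key(seed: bytes, index: int) -> bytes:
        # Toy child-key derivation: HMAC-SHA512 keyed by the seed over the child index.
        return hmac.new(seed, index.to_bytes(4, "big"), hashlib.sha512).digest()[:32]

    seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
    print(derive_key(seed, 0).hex())  # same seed + index always yields the same key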
Don't forget that an authoritarian state can use punishment as leverage - they don't need to do all the diligence themselves, they can just punish you if you can't prove that your random numbers were generated deterministically.
But I was being a bit facetious. Most likely they would just punish people who are caught using steganography.
I can see how steganography applied to images can result in hard-to-detect watermarks or provenance identifiers.
But I don't see how these can be effectively used in text content. Yes, an AI program can encode provenance identifiers by length of words, starting letters of sentences, use of specific suffixes, and other linguistic constructs.
However, say that I am a student with an AI-generated essay and want to make sure my essay passes the professor's plagiarism checker. Isn't it pretty easy to re-order clauses, substitute synonyms, and add new content? In fact, I think there is even a Chrome extension that does something like that.
Or maybe that is too much work for the lazy student who wants to completely rely on ChatGPT or doesn't know any better.
The point of steganography (as discussed in the paper) is not unerasable watermarks but undetectable (to the adversary) messages in innocent-looking communication.
I'm confused why you focus on plagiarism detection. That being said, your scenario is very briefly mentioned in the conclusion and requires augmenting the approach (entropy coding) with error correction.
The result would be that as long as your modifications (reordering clauses, etc.) reasonably closely follow a known distribution with limited entropy (which I think they clearly do, although specifying this distribution and dealing with the induced noisy channel might be very hard), there will be a way to do covert communication despite it, though probably only a very small amount of information can be transmitted reliably. For plagiarism detection, you only need a number of bits that scales like -log[your desired false positive rate], so it would seem theoretically possible. Theoretically it also doesn't matter if you use text or images, though in practice increasing the amount of transmitted data should make the task a lot easier. However, I'm not sure if something like this can be practically implemented using existing methods.
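To put rough numbers on that -log[false positive rate] scaling (these rates are just illustrative):

    import math

    for fpr in (1e-3, 1e-6, 1e-9):
        print(f"false positive rate {fpr:g}: ~{math.ceil(-math.log2(fpr))} watermark bits")
    # ~10, ~20 and ~30 bits respectively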
Instead of manually reordering clauses, etc, you could just run the original essay through another LLM without watermarking capability and ask it to write a new essay based on the original.
Then test the result against your own plagiarism detector and iterate through the watermark-less LLM until the resulting essay passes.
Or just proactively run it through a bunch of times.
Or just use the watermark-less LLM to begin with... personal, unshackled, powerful LLMs are definitely where we're headed.
This is fairly theoretical work. It assumes that the parties (and the adversary) know the distribution precisely.
Its direct practical applicability may be somewhat limited because people aren't going around communicating randomly selected LLM outputs... and if you use LLM output in a context where human-written text would be expected, it could be distinguished.
It's not useful for watermarking as the first change will destroy all the rest of the embedding.
I can make a contrived example where it's directly useful: Imagine you have agents in the field; you could send out LLM-generated spam to communicate with them. Everyone expects the spam to be LLM generated, so it's not revealing that it's detectable as such. This work discusses how you can make the spam carry secret messages to the agents in a way that is impossible to detect (without the key, of course) even by an attacker that has the exact spam LLM.
Less contrived, a sufficiently short message from a sufficiently advanced LLM is probably indistinguishable from a real post in practice -- but that's outside of the scope of the paper. It's hard (impossible?) to rigorously analyze that security model because we can't say much about the distribution of "real text", so we can't say how far from it LLM output is. The best models we have of the distribution of real text are these LLMs; if you take them to BE the distribution of real text then that approach is perfectly secure by definition.
But really, even if the situation it solves is too contrived to be a gain over alternatives, the context provides an opportunity to explore the boundary of what is possible.
A higher level of abstraction that generative AI will soon be able to apply is the design of algorithms that humans cannot possibly comprehend, going well beyond rudimentary newspaper want-ad spycraft.
Is this using steganography to create un-tamperable watermarks (which would allow downstream users of possibly-AI-created material to prove that it is indeed created by an AI), or is this for something different?
This is to transmit confidential information without using encryption. Something I predict will be more and more needed, as democratic governments succumb to their autocratic temptations and attempt to ban encryption.
The secret message must be encrypted (ideally with just a one-time pad to preserve the guarantees) or everyone could simply read it -- the paper assumes the model is public.
The technique lets you turn an encrypted message into the output of a sampled auto-regressive model in a way that is impossible to distinguish from just running the model normally unless you have the key, even if the attacker has the exact model you're using.
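A minimal sketch of that pipeline under the assumptions above (the commented-out embed/extract/bits_of/bytes_of calls are hypothetical placeholders for the paper's coupling-based encoder, which isn't reproduced here): encrypt with a shared one-time pad so the bits look uniform, then let the embedding step spend those bits on token choices.

    import os

    def otp_xor(data: bytes, pad: bytes) -> bytes:
        assert len(pad) >= len(data)
        return bytes(d ^ p for d, p in zip(data, pad))

    message = b"meet at dawn"
    pad = os.urandom(len(message))        # the shared secret key (one-time pad)
    ciphertext = otp_xor(message, pad)    # uniformly random bytes to anyone without the pad
    # cover_tokens = embed(bits_of(ciphertext))                     # looks like normal model output
    # recovered    = otp_xor(bytes_of(extract(cover_tokens)), pad)  # == message
    print(ciphertext.hex())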