Since an arbitrarily tall stack of combining characters still counts as one grapheme cluster, if some application limits string length by counting grapheme clusters then you can stuff an unlimited amount of data in there, with "only" 2x overhead in the byte representation.
Unfortunately HN filters some of the codepoints so I can't demonstrate here. Since I chose "A" as the base character which the diacritics are stacked on, it has a similar aesthetic to the SCREAM cipher although a little more zalgo-y.
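For anyone who wants to see the stacking idea concretely, here is a minimal sketch. It is not the code from the gist: the nibble-per-mark scheme, the `FIRST_MARK` constant, and the `encode`/`decode` names are all assumptions made for illustration. Because it only uses 16 marks, each input byte becomes two 2-byte code points, so this version has 4x overhead rather than 2x.

```python
import unicodedata

BASE = "A"           # base character the marks are stacked on
FIRST_MARK = 0x0300  # U+0300..U+030F are all combining diacritical marks

def encode(data: bytes) -> str:
    """Hypothetical scheme: one combining mark per 4-bit nibble."""
    marks = []
    for byte in data:
        marks.append(chr(FIRST_MARK + (byte >> 4)))    # high nibble
        marks.append(chr(FIRST_MARK + (byte & 0x0F)))  # low nibble
    return BASE + "".join(marks)

def decode(text: str) -> bytes:
    """Inverse of encode: drop the base, pair the nibbles back into bytes."""
    nibbles = [ord(ch) - FIRST_MARK for ch in text[1:]]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

stacked = encode(b"hello")
assert decode(stacked) == b"hello"
# Every character after the base has a nonzero combining class.
assert all(unicodedata.combining(ch) for ch in stacked[1:])
print(len(stacked), len(stacked.encode("utf-8")))
```

A grapheme-aware length check should see the whole result as a single cluster, although the standard library has no segmenter to confirm that here.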
Interesting, I actually expected it to encode a single letter with infinitely long combining marks such that 'highlighting' it was just highlighting one character.
That's curious, because the only character is just the letter A. But I suppose if the font doesn't support a particular combining mark, it gives up on the whole grapheme?
HN filters some combining characters? That's weird, compared to the symbol/emoji blocking.
Also I'm reminded that the Unicode normalization annex suggests that legitimate grapheme clusters will be 31 code points or fewer, i.e. a base character plus at most 30 combining marks. "The value of 30 is chosen to be significantly beyond what is required for any linguistic or technical usage."
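A rough way to check a string against that limit, assuming it is enough to count consecutive characters with a nonzero canonical combining class after NFKD normalization. The function names and the `limit` parameter are made up for this sketch, and it is a simplification of what the annex actually specifies.

```python
import unicodedata

def longest_nonstarter_run(text: str) -> int:
    """Length of the longest run of characters with a nonzero combining class."""
    longest = current = 0
    for ch in text:
        if unicodedata.combining(ch):  # nonzero class means a non-starter
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

def looks_stream_safe(text: str, limit: int = 30) -> bool:
    """Approximate check: no more than `limit` non-starters in a row."""
    decomposed = unicodedata.normalize("NFKD", text)
    return longest_nonstarter_run(decomposed) <= limit

print(looks_stream_safe("A" + "\u0301" * 30))  # True
print(looks_stream_safe("A" + "\u0301" * 31))  # False
```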
I have been using ROT13, but I’ve been looking for a post-quantum replacement so definitely I’m going to convert to SCREAM. It’s generally understood that qubits are unable to represent or even discern the little squiggly bits above normal Latin letters.
Thank you for this important contribution to cryptography!
The important part about applying ROT13 is the number of iterative applications. The security of even-numbered applications is undeniable. Odd-numbered is even better than that.
I’m currently building an implementation with fractional rotation. Of course I will post a Show HN when it’s ready.
Not perfectly. I grabbed a random encoded line from these comments, and asked ChatGPT to decode it[1]. It determined the plaintext was:
> Immediately thought of Moby, infact a quick search for this title... coincidental, but I would mention it in the page if I were you.
and noted that it had "preserved punctuation and capitalization from the ciphertext". The actual plaintext should be:
> Immediately thought of XKCD, infact a quick search for this title gives me XKCD, it could be coincidental, but I would mention it in the page if I were you.
I've hit my free usage limit so can't currently prompt it further about its mistake.
It's truly an honour to have been able to teach the PSF's Security Developer-in-Residence something about the implementation of a simple substitution cipher in Python. ;) (In all seriousness, thanks for all your excellent work. The many projects you help out with — and advocate for — in the Python ecosystem, including CPython itself, are all far better off for it.)
Threading is done with the wave character ~ in Racket? I can't decide if I hate it or not (am used to Clojure's ->). I think my pinky finger doesn't like ~.
In this case ~> is a macro from a widely used package (https://docs.racket-lang.org/threading/index.html) so if you defined an alias for it (or forked the package) you could use any valid identifier.
I was very confused why this would be useful for Telegram messages, but the Why? part of the readme makes perfect sense. Great workaround for a stupid limitation!
I got nerd sniped by this xkcd and was happily working my way through an implementation and realized that the accent combiners work with any character. It is trivial to add bad steganography to your bad encryption.
It's hilarious that Stream Ciphers are the closest thing to the One-Time-Pad (which provides "Perfect Secrecy") and this thing is a Monoalphabetic Substitution Cipher which provides no security whatsoever.
It feels like you're trying to express that stream ciphers are especially secure compared to block ciphers (which is what most of them are built out of), which isn't the case.
I was rather confused by the dictionary comprehension syntax used there, because I wasn't aware that you could write one without the ":" to delineate the key: value pair. Turns out you can, but it just creates a dict with no values stored, just the keys! This works here because the returned dict is an iterable that returns the keys on iteration, and "update" accepts an iterable of (key, value) tuples - and the keys are just that in this case. So the effect is the same as if it was a list comprehension! Just slightly more confusing
Ah, my bad! I did not know these were a thing, but that makes more sense! Teaches me a thing about only quickly trying things in an online REPL on mobile and jumping to conclusions - I forgot curly braces were also a way to denote a set
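To make the distinction concrete: curly braces with a comprehension and no ":" give a set, and `dict.update` happily takes any iterable of (key, value) pairs, a set of tuples included. The variable names below are just for illustration and are not from the code under discussion.

```python
# Braces around a comprehension without "key: value" build a set, not a dict.
pairs = {(letter, letter.upper()) for letter in "abc"}
print(type(pairs).__name__)  # set

# Empty braces are the one case that is a dict.
print(type({}).__name__)  # dict

# dict.update accepts any iterable of 2-tuples, so a set of pairs works.
table = {}
table.update(pairs)
print(table == {"a": "A", "b": "B", "c": "C"})  # True
```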
For fun there are other variants of base64 in a similar spirit [1][2] full unicode. [3] see other links in that repo... Not a stream cipher, just encoding but could be used in conjunction with a stream cipher to add compression. It could go turtles all the way down.
And what do you think is the algorithm from the article? Looks awfully similar to base64 to me, except it's lacking the bit-shifts. Both use a lookup table like that.
I think a lot of this depends on if you read the article as the scream cipher being specifically the exact listed substitutions or just any substitution with forms of As. Also depends on how you define encoding, cipher and the overlaps between the two. Plus questions on the relevance of intent, transformation of data, plus changing of meaning and definitions over the years. Some people say morse code is a cipher, but braille isn't - definitions can depend on way more than the black and white logical "but it does this" you're using.
You'd do better debating this with a real life friend over a pint, rather than wasting your time trying to argue with multiple people here.
You will find that the pigpen cipher has a 1:1 mapping between its input alphabet and its output alphabet, and that a 1:1 mapping is a necessity for full invertibility.
What people in this thread call a "key" is not auxiliary input data, as a real key would be, but is hard-coded into the program. We are looking at encodings.
Maybe this differentiation is not popular or well accepted, but it was surely part of my cryptography curriculum and the following exam. I'd rather believe my prof than strangers on the internet.
Key can mean different things in different contexts. In a substitution cipher, the key is the mapping. In modern ciphers, the key would be some set of secret bytes. Everyone agrees that this cipher would be a bad way to encrypt/encode something. But using the word cipher like this has real historical meaning, and that is the meaning that is being used in the project.
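A small sketch of what "the key is the mapping" looks like in code. The table here is hypothetical (each letter becomes "A" plus one of the combining marks U+0300..U+0319) and is not the actual SCREAM table from the comic or the project. Swap in a different table and you have a different key; the 1:1 property is what makes it reversible.

```python
import string

FIRST_MARK = 0x0300  # assumed starting point; U+0300..U+0319 are combining marks

# The "key": a 1:1 table from each letter to "A" plus a distinct combining mark.
ENCODE = {
    letter: "A" + chr(FIRST_MARK + i)
    for i, letter in enumerate(string.ascii_uppercase)
}
DECODE = {cipher: letter for letter, cipher in ENCODE.items()}

def scream(plaintext: str) -> str:
    """Substitute every letter; anything else passes through unchanged."""
    return "".join(ENCODE.get(ch, ch) for ch in plaintext.upper())

def unscream(ciphertext: str) -> str:
    """Invert the table by reading the ciphertext two code points at a time."""
    out = []
    i = 0
    while i < len(ciphertext):
        pair = ciphertext[i:i + 2]
        if pair in DECODE:
            out.append(DECODE[pair])
            i += 2
        else:
            out.append(ciphertext[i])
            i += 1
    return "".join(out)

assert unscream(scream("Hello, world!")) == "HELLO, WORLD!"
```

Case is lost on the way through, which is one more reason not to use this for anything serious.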
https://gist.github.com/DavidBuchanan314/07da147445a90f7a049...