Scream cipher

Retr0id · 2025-09-20T10:36:33 1758364593

There are a little over 256 Unicode combining marks that have a 2-byte UTF-8 encoding. I picked a set of them, defining an encoding I call zalgo256:

https://gist.github.com/DavidBuchanan314/07da147445a90f7a049...

Since an arbitrarily tall stack of combining characters still counts as one grapheme cluster, if some application limits string length by counting grapheme clusters then you can stuff an unlimited amount of data in there, with "only" 2x overhead in the byte representation.

Unfortunately HN filters some of the codepoints so I can't demonstrate here. Since I chose "A" as the base character which the diacritics are stacked on, it has a similar aesthetic to the SCREAM cipher although a little more zalgo-y.
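For readers who want to see the shape of the idea, here is a minimal Python sketch. It is not the code from the gist: the alphabet below is an assumption (simply the first 256 combining marks in the 2-byte UTF-8 range, as reported by Python's `unicodedata`), and the names `MARKS`, `encode`, and `decode` are made up for illustration. Only the base character "A" and the `STACK_HEIGHT` knob are taken from the thread.

```python
import unicodedata

# Hypothetical alphabet, NOT the one from the gist: the first 256 code points
# in U+0080..U+07FF (the 2-byte UTF-8 range) whose general category is a Mark.
MARKS = [chr(cp) for cp in range(0x80, 0x800)
         if unicodedata.category(chr(cp)).startswith("M")][:256]
assert len(MARKS) == 256, "this Python's Unicode data has too few 2-byte marks"
MARK_TO_BYTE = {mark: value for value, mark in enumerate(MARKS)}

BASE = "A"        # base character the marks are stacked on
STACK_HEIGHT = 4  # how many marks (bytes) go on each base character


def encode(data: bytes, stack_height: int = STACK_HEIGHT) -> str:
    """Emit one base character per chunk, followed by one mark per byte."""
    out = []
    for i in range(0, len(data), stack_height):
        chunk = data[i:i + stack_height]
        out.append(BASE + "".join(MARKS[b] for b in chunk))
    return "".join(out)


def decode(text: str) -> bytes:
    """Ignore base characters and read the marks back as bytes."""
    return bytes(MARK_TO_BYTE[ch] for ch in text if ch in MARK_TO_BYTE)
```

Because the decoder skips anything that is not a mark, the stack height only changes how the text looks, not how it decodes. Each byte costs one 2-byte mark, plus one byte per base character.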

junon · 2025-09-20T11:32:56 1758367976

A demonstration as a comment on the gist would probably work! I'd love to see that

Retr0id · 2025-09-20T11:55:17 1758369317

Good point, added

junon · 2025-09-20T12:10:54 1758370254

Interesting, I actually expected it to encode a single letter with infinitely long combining marks such that 'highlighting' it was just highlighting one character.

Retr0id · 2025-09-20T12:24:09 1758371049

You can do that too, if you increase the STACK_HEIGHT constant (btw, the decoder still works the same, so changing this doesn't break compatibility)

junon · 2025-09-20T13:28:16 1758374896

Oh neat! Thanks :)

all2 · 2025-09-20T19:55:07 1758398107

Most of the characters appear as boxes on my phone.

Retr0id · 2025-09-20T19:56:11 1758398171

That's curious, because the only character is just the letter A. But I suppose if the font doesn't support a particular combining mark, it gives up on the whole grapheme?

Dylan16807 · 2025-09-20T17:19:20 1758388760

HN filters some combining characters? That's weird, compared to the symbol/emoji blocking.

Also I'm reminded that the unicode normalization annex suggests that legitimate grapheme clusters will be 31 code points or less. "The value of 30 is chosen to be significantly beyond what is required for any linguistic or technical usage."
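A rough way to check a string against that limit in Python is sketched below. It is a simplification: it decomposes with NFKD and counts the longest run of characters with a non-zero canonical combining class, which is my reading of "non-starters"; the function name and the constant are illustrative, not from any library.

```python
import unicodedata

STREAM_SAFE_LIMIT = 30  # the figure quoted from the normalization annex


def longest_nonstarter_run(text: str) -> int:
    """Length of the longest run of non-starters after NFKD decomposition."""
    longest = run = 0
    for ch in unicodedata.normalize("NFKD", text):
        if unicodedata.combining(ch) != 0:  # non-zero combining class
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest


# A letter with 40 stacked acute accents goes past the limit.
zalgo = "A" + "\u0301" * 40
too_tall = longest_nonstarter_run(zalgo) > STREAM_SAFE_LIMIT
```

Note that marks with combining class 0 (some enclosing marks, for example) are not counted by this check, so it is only an approximation of "how tall is the stack".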

Retr0id · 2025-09-20T17:25:45 1758389145

If I had to guess, they probably filtered the ones that could be used to break page layouts by creating very-tall glyphs.

Dylan16807 · 2025-09-20T17:29:00 1758389340

I guess that's one way to do it. Pretty far from ideal though.

RGamma · 2025-09-20T16:30:27 1758385827

Are you sure this doesn't summon The One by accident?

efitz · 2025-09-20T13:32:09 1758375129

I have been using ROT13, but I’ve been looking for a post-quantum replacement so definitely I’m going to convert to SCREAM. It’s generally understood that qubits are unable to represent or even discern the little squiggly bits above normal Latin letters.

Thank you for this important contribution to cryptography!

jagged-chisel · 2025-09-20T14:55:25 1758380125

The important part about applying ROT13 is the number of iterative applications. The security of even-numbered applications is undeniable. Odd-numbered is even better than that.

I’m currently building an implementation with fractional rotation. Of course I will post a Show HN when it’s ready.

fragmede · 2025-09-20T16:09:00 1758384540

oh so that's where æ comes from!

eastbound · 2025-09-20T14:12:14 1758377534

I wonder if ChatGPT can decrypt all of them just by analyzing vowel frequency, and then trying to find the algo on the internet.

ameliaquining · 2025-09-20T19:27:15 1758396435

In my tests, ChatGPT 5 Thinking can handle a monoalphabetic substitution cipher if you prompt it a couple times to keep going.

stordoff · 2025-09-20T21:16:26 1758402986

Not perfectly. I grabbed a random encoded line from these comments, and asked ChatGPT to decode it[1]. It determined the plaintext was:

> Immediately thought of Moby, infact a quick search for this title... coincidental, but I would mention it in the page if I were you.

and noted that it had "preserved punctuation and capitalization from the ciphertext". The actual plaintext should be:

> Immediately thought of XKCD, infact a quick search for this title gives me XKCD, it could be coincidental, but I would mention it in the page if I were you.

I've hit my free usage limit so can't currently prompt it further about its mistake.

[1] https://chatgpt.com/share/68cf17a6-8478-8011-a44e-64d43ad8a4...

thenewwazoo · 2025-09-20T23:41:57 1758411717

I pushed it a bit and it didn’t do so hot.

https://chatgpt.com/share/68cf3b9f-decc-8007-8a5d-cc7b583d0e...

zahlman · 2025-09-20T14:59:54 1758380394

It's not necessary to write the ciphering logic.

  CIPHER, UNCIPHER = str.maketrans(CIPHER), str.maketrans(UNCIPHER)

  print(s := 'STREAM CIPHER'.translate(CIPHER))
  print(s.translate(UNCIPHER))
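For anyone who wants to run this without the article open, here is a self-contained sketch of the same `str.maketrans` approach. The letter pairs are copied from the lookup string in the JS one-liner elsewhere in this thread; the names `PAIRS`, `scream`, and `unscream` are mine, and the NFC normalization is only a precaution in case the pasted characters arrive decomposed.

```python
import unicodedata

# Plain letter followed by its scream letter, taken from the JS one-liner in
# this thread. "A" is not listed, so it maps to itself.
PAIRS = unicodedata.normalize(
    "NFC", "BÁGẲLẬQǞVÀCĂHẴMẦRȦWẢDẮIǍNẨSǠXȂEẶJÂOẪTẠYĀFẰKẤPÄUȀZĄ")
assert len(PAIRS) == 50  # 25 pairs, one code point per character

CIPHER = dict(zip(PAIRS[0::2], PAIRS[1::2]))
UNCIPHER = {scream_ch: plain_ch for plain_ch, scream_ch in CIPHER.items()}

# str.maketrans turns each dict into a translation table for str.translate.
CIPHER_TABLE = str.maketrans(CIPHER)
UNCIPHER_TABLE = str.maketrans(UNCIPHER)


def scream(text: str) -> str:
    return text.upper().translate(CIPHER_TABLE)


def unscream(text: str) -> str:
    return text.translate(UNCIPHER_TABLE)


print(s := scream("stream cipher"))
print(unscream(s))
```

With this table, `scream("FIRST POST")` gives the ẰǍȦǠẠ ÄẪǠẠ seen elsewhere in the thread.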

SethMLarson · 2025-09-20T17:56:37 1758390997

NICE!!

zahlman · 2025-09-20T22:17:46 1758406666

It's truly an honour to have been able to teach the PSF's Security Developer-in-Residence something about the implementation of a simple substitution cipher in Python. ;) (In all seriousness, thanks for all your excellent work. The many projects you help out with — and advocate for — in the Python ecosystem, including CPython itself, are all far better off for it.)

SethMLarson · 2025-09-21T12:58:09 1758459489

<3 Thanks for the kind words!! :)

sixhobbits · 2025-09-20T11:41:58 1758368518

I did something similar a while back but using all the invisible characters to encode extra data into telegram messages for metadata storage

https://github.com/sixhobbits/unisteg

velcrovan · 2025-09-20T15:23:45 1758381825

I had fun writing a Racket version:

    #lang racket/base

    (require net/base64
             threading)
    
    (define FIRST-INVISIBLE-CHAR 917760)
    
    (define (invis-encode str)
      (list->string
       (for/list ([c (in-list (string->list str))]
                  #:do [(define cnum (char->integer c))]
                  #:when (<= cnum 127))
         (integer->char (+ cnum FIRST-INVISIBLE-CHAR)))))

    (define (invis-decode str)
      (list->string
       (for/list ([c (in-list (string->list str))]
                  #:do [(define plaintxt-c (- (char->integer c) FIRST-INVISIBLE-CHAR))]
                  #:when (> plaintxt-c 0))
         (integer->char plaintxt-c))))

    (define (hide secret plain)
      (~> (string->bytes/utf-8 secret)
          (base64-encode #"")         ; use #"" vs #"\r\n" to prevent line-wrapping
          (bytes->string/utf-8)
          (invis-encode)
          (string-append plain _)))

    (define (unhide ciphertext)
      (~> (invis-decode ciphertext)
          (string->bytes/utf-8)
          (base64-decode)
          (bytes->string/utf-8)))

    (module+ test
      (require rackunit)
      (define secret "this is a s3cret message. ssh")
      (define plaintext "Hey you, nothing to see here.")
      (define to-share (hide secret plaintext))
    
      (check-equal? (string-length to-share) 69)         ; count of characters (code points), not bytes
      (check-equal? (string-grapheme-count to-share) 29) ; 29 actually-visible graphemes
      (check-equal? secret (unhide to-share)))
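The same idea ports to Python fairly directly. This is a sketch rather than a line-for-line translation: the function names mirror the Racket ones, and the decoder here only accepts code points in the 128-wide window starting at 917760 (U+E0100), which is a slightly stricter filter than the `(> plaintxt-c 0)` test above.

```python
import base64

FIRST_INVISIBLE_CHAR = 0xE0100  # 917760, same constant as the Racket version


def invis_encode(text: str) -> str:
    """Shift each ASCII character into the invisible range."""
    return "".join(chr(ord(c) + FIRST_INVISIBLE_CHAR)
                   for c in text if ord(c) <= 127)


def invis_decode(text: str) -> str:
    """Keep only characters in the invisible window and shift them back."""
    lo, hi = FIRST_INVISIBLE_CHAR, FIRST_INVISIBLE_CHAR + 127
    return "".join(chr(ord(c) - FIRST_INVISIBLE_CHAR)
                   for c in text if lo <= ord(c) <= hi)


def hide(secret: str, plain: str) -> str:
    """Append the base64 of the secret, made invisible, to the plain text."""
    b64 = base64.b64encode(secret.encode("utf-8")).decode("ascii")
    return plain + invis_encode(b64)


def unhide(text: str) -> str:
    return base64.b64decode(invis_decode(text)).decode("utf-8")


secret = "this is a s3cret message. ssh"
plaintext = "Hey you, nothing to see here."
to_share = hide(secret, plaintext)
```

Here `len(to_share)` is 69 code points: 29 visible characters plus 40 invisible ones for the base64 of a 29-byte secret.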

askonomm · 2025-09-20T16:48:07 1758386887

Threading is done with the wave character ~ in Racket? I can't decide if I hate it or not (am used to Clojure's ->). I think my pinky finger doesn't like ~.

velcrovan · 2025-09-22T12:10:53 1758543053

In this case ~> is a macro from a widely used package (https://docs.racket-lang.org/threading/index.html) so if you defined an alias for it (or forked the package) you could use any valid identifier.

soegaard · 2025-09-23T12:27:39 1758630459

The identifier `->` is already used for type annotations.

franga2000 · 2025-09-20T11:52:07 1758369127

I was very confused why this would be useful for Telegram messages, but the Why? part of the readme makes perfect sense. Great workaround for a stupid limitation!

RealCodingOtaku · 2025-09-20T12:52:48 1758372768

Ǎầầặắǎaạặậā ạẵẫȁẳẵạ ẫằ ȂẤĂẮ, ǎẩằaăạ a ǟȁǎăấ ǡặaȧăẵ ằẫȧ ạẵǎǡ ạǎạậặ ẳǎàặǡ ầặ ȂẤĂẮ, ǎạ ăẫȁậắ áặ ăẫǎẩăǎắặẩạaậ, áȁạ Ǎ ảẫȁậắ ầặẩạǎẫẩ ǎạ ǎẩ ạẵặ äaẳặ ǎằ Ǎ ảặȧặ āẫȁ.

rawling · 2025-09-20T11:20:15 1758367215

https://xkcd.com/3054/

codeulike · 2025-09-20T11:34:54 1758368094

Ah, it's from XKCD (Feb 2025); a bit odd of the OP not to mention that

hnlmorg · 2025-09-20T12:33:26 1758371606

I think this might be more coincidental than derivative.

sim7c00 · 2025-09-20T16:32:52 1758385972

scream ciphers. a bit like back when we invented fire :p

SethMLarson · 2025-09-20T13:22:17 1758374537

OP here, I either didn't know or completely forgot this XKCD existed and it resurfaced as a good idea haha! Time to update the post lol

PenguinRevolver · 2025-09-20T11:34:33 1758368073

Oh god, now we're gonna have two different standards for a scream cypher https://xkcd.com/927/

RealCodingOtaku · 2025-09-20T12:56:02 1758372962

https://www.dcode.fr/scream-cipher-xkcd

somat · 2025-09-20T21:24:52 1758403492

I got nerd sniped by this xkcd and was happily working my way through an implementation and realized that the accent combiners work with any character. It is trivial to add bad steganography to your bad encryption.

r̊e̝q̝ůěs̔t͞ p̊e̝a͞c̍e̊ t̠a̗lks
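One way such a scheme could work is sketched below in Python. To be clear, this is a made-up encoding, not necessarily the one used for the line above: each hidden letter A to Z is assigned one of the 26 combining marks U+0300 to U+0319 and attached to the next letter of the cover text. `embed` and `extract` are hypothetical names.

```python
# Hypothetical scheme: hidden letter N (A=0 .. Z=25) becomes combining mark
# U+0300 + N, attached to successive letters of the cover text.
MARKS = [chr(0x0300 + i) for i in range(26)]


def embed(secret: str, cover: str) -> str:
    letters = iter(c for c in secret.upper() if "A" <= c <= "Z")
    out = []
    for ch in cover:
        out.append(ch)
        if ch.isalpha():
            hidden = next(letters, None)
            if hidden is not None:
                out.append(MARKS[ord(hidden) - ord("A")])
    if next(letters, None) is not None:
        raise ValueError("cover text has too few letters for the secret")
    return "".join(out)


def extract(text: str) -> str:
    """Read the marks back in order, ignoring the cover text."""
    return "".join(chr(ord("A") + MARKS.index(c)) for c in text if c in MARKS)


def strip_marks(text: str) -> str:
    """Recover the cover text by dropping the marks."""
    return "".join(c for c in text if c not in MARKS)
```

As the comment says, this is bad steganography: the marks are plainly visible, and anyone who strips them gets the cover text back.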

s20n · 2025-09-20T13:40:40 1758375640

It's hilarious that Stream Ciphers are the closest thing to the One-Time-Pad (which provides "Perfect Secrecy") and this thing is a Monoalphabetic Substitution Cipher which provides no security whatsoever.

tptacek · 2025-09-20T17:24:53 1758389093

It feels like you're trying to express that stream ciphers are especially secure compared to block ciphers (which is what most of them are built out of), which isn't the case.

timonoko · 2025-09-20T14:08:08 1758377288

Most of these are short/long vowel markings, except last one which is (probably) implosion. And rest are frontal/back and wide/narrow A.

But Swedish "Å" is just stupidity "O", because they started pronouncing "O" as "U" and "U" as "Y".

-- Can you pronounce these screams?

pezezin · 2025-09-21T20:00:35 1758484835

The last one is the ogonek, it usually indicates nasalization: https://en.m.wikipedia.org/wiki/Ogonek

codeulike · 2025-09-20T10:34:23 1758364463

Ảặậậ ạẵaạǡ ȧặaậậā ǎẩạặȧặǡạǎẩẳ, a áǎạ ầẫȧặ ằȁẩ ạẫ ȁǡặ ạẵaẩ ȦẪẠ13

BubbleRings · 2025-09-20T12:10:50 1758370250

I bet that last word is ROT13! We can crack it now! And maybe the second to last is “like”.

sim7c00 · 2025-09-20T16:33:42 1758386022

wait is it rot13 on the screamphabet or the alphabet?

dsjoerg · 2025-09-20T11:37:08 1758368228

Ǎ ăẫẩǡǎắặȧ ǎạ a ăẵaậậặẩẳặ áặằẫȧặ ạẵặ ảẵẫậặ ẵȁầaẩ ȧaăặ! Aẩắ Ǎ aǎẩ'ạ ẳẫẩẩa ậẫǡặ.....

tetris11 · 2025-09-20T12:04:16 1758369856

> > "Hope remains strewn asunder, I weep holy tears oh great one, Paul:16"

> "I belong to a secret group of panda bear hunters! Eat a meaty flesh chunk...."

For anyone wondering..

cluckindan · 2025-09-20T16:11:51 1758384711

Do this with variants of O and the ghosts will be happy.

pbsd · 2025-09-20T19:46:15 1758397575

I thought this was gonna be about the actual Scream stream cipher: https://eprint.iacr.org/2002/019

yencabulator · 2025-09-21T01:46:49 1758419209

Previous attempt at cracking this cipher:

https://www.youtube.com/watch?v=ZlIz0q8aWpA

jdranczewski · 2025-09-20T19:16:56 1758395816

I was rather confused by the dictionary comprehension syntax used there, because I wasn't aware that you could write one without the ":" to delineate the key: value pair. Turns out you can, but it just creates a dict with no values stored, just the keys! This works here because the returned dict is an iterable that returns the keys on iteration, and "update" accepts an iterable of (key, value) tuples - and the keys are just that in this case. So the effect is the same as if it was a list comprehension! Just slightly more confusing

duskwuff · 2025-09-20T19:50:20 1758397820

Not precisely. {x} is a set literal; {x for y in z} is a set comprehension.
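A quick Python check makes the distinction concrete. The two letter pairs are just sample data; the point is that braces without a colon build a set, and `dict.update` is happy to take any iterable of (key, value) pairs, a set of tuples included.

```python
letters = "BC"
screams = "\u00c1\u0102"  # Á and Ă, two sample scream letters

# No colon: this is a set comprehension producing a set of 2-tuples.
as_set = {(plain, scream) for plain, scream in zip(letters, screams)}

# With a colon: this is a dict comprehension.
as_dict = {plain: scream for plain, scream in zip(letters, screams)}

# dict.update accepts an iterable of (key, value) pairs, so the set works too.
table = {}
table.update(as_set)

print(type(as_set).__name__, type(as_dict).__name__, table == as_dict)
```

So the end result is the same mapping either way; the set is just an intermediate container of pairs.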

jdranczewski · 2025-09-20T20:28:16 1758400096

Ah, my bad! I did not know these were a thing, but that makes more sense! Teaches me a thing about only quickly trying things in an online REPL on mobile and jumping to conclusions - I forgot curly braces were also a way to denote a set

vehementi · 2025-09-20T15:52:14 1758383534

Finally we can talk to the bomb dudes in Serious Sam

cluckindan · 2025-09-20T16:10:06 1758384606

Or teach it as the only righteous alphabet to our children.

fainpul · 2025-09-20T11:15:28 1758366928

One could use emojis instead, then the message could be hidden in plain sight in places where emoji-spam is common.

Bender · 2025-09-20T13:56:25 1758376585

For fun, there are other variants of base64 in a similar spirit [1][2], up to full Unicode [3]; see the other links in that repo. They're not stream ciphers, just encodings, but they could be used in conjunction with a stream cipher to add compression. It could go turtles all the way down.

[1] - https://github.com/qntm/base2048

[2] - https://github.com/qntm/base32768

[3] - https://github.com/qntm/base65536

cluckindan · 2025-09-20T11:28:51 1758367731

Emojis have a high overhead: a single emoji is typically 4 bytes but may be up to 35 bytes.
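The byte counts are easy to check in Python. The two emoji below are my own picks for illustration: a single-code-point emoji and a four-person family joined with zero-width joiners. I have not tried to construct a 35-byte example.

```python
def utf8_len(text: str) -> int:
    """Number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


grinning = "\U0001F600"  # one code point outside the BMP: 4 bytes

# Man, woman, girl, boy joined by three zero-width joiners (U+200D).
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"

print(utf8_len(grinning))            # 4
print(len(family), utf8_len(family)) # 7 code points, 25 bytes
```

The family emoji is four 4-byte code points plus three 3-byte joiners, so 25 bytes for what renders as one glyph where the font supports it.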

foofoo12 · 2025-09-20T11:44:17 1758368657

Yikes! Imagine if people were to start sending photos and videos to each other!

fainpul · 2025-09-20T12:39:05 1758371945

Oh, my bad - I wasn't aware that we're doing serious engineering here :p

kirjavascript · 2025-09-20T18:24:02 1758392642

here's a JS one liner that handles scream and unscream in one function

  transform=s=>[...s.toUpperCase()].map(s=>(l="BÁGẲLẬQǞVÀCĂHẴMẦRȦWẢDẮIǍNẨSǠXȂEẶJÂOẪTẠYĀFẰKẤPÄUȀZĄ")[l.indexOf(s)^1]||s).join``

adv0r · 2025-09-20T11:55:01 1758369301

it had to be done https://chatgpt.com/g/g-68ce9419c7d4819190f82744d6e2741e-url...

vman512 · 2025-09-20T16:55:24 1758387324

How sand people talk

slig · 2025-09-20T14:49:15 1758379755

Ạẵặǡặ Äȧặạąặậǡ Aȧặ Ầaấǎẩẳ Ầặ Ạẵǎȧǡạā!

ethmarks · 2025-09-20T13:09:11 1758373751

Here's another implementation I made a few months ago:

https://ethmarks.github.io/posts/screamcipher

DonHopkins · 2025-09-20T10:25:26 1758363926

ẰǍȦǠẠ ÄẪǠẠ

codeulike · 2025-09-20T10:36:53 1758364613

Ằậaẳẳặắ

ginko · 2025-09-20T10:34:35 1758364475

Now pack even more info in each character with Zalgo text.

Retr0id · 2025-09-20T10:44:30 1758365070

zalgo256: https://gist.github.com/DavidBuchanan314/07da147445a90f7a049...

BubbleRings · 2025-09-20T12:28:39 1758371319

Hey, cool little rabbit hole there. I had totally missed all that.

https://en.m.wikipedia.org/wiki/Zalgo_text

permo-w · 2025-09-20T10:37:53 1758364673

am I unusual in not really seeing the "creepiness" of zalgo text?

faeyanpiraat · 2025-09-20T11:56:45 1758369405

Maybe you missed this piece of the internet history: https://stackoverflow.com/a/1732454

lambdaone · 2025-09-20T10:42:00 1758364920

Think of it as representing something like the letters actively 'creeping' and giving off tendrils of darkness. Does this help?

permo-w · 2025-09-20T12:45:32 1758372332

I understand the idea, it just doesn't impact me

Retr0id · 2025-09-20T10:58:46 1758365926

It's not inherently creepy but often symbolic of corruption or someone talking in a raspy/synthetic "evil overlord" kind of voice.

fuzzy_biscuit · 2025-09-20T13:48:18 1758376098

Now I need a TTS to read a scream.

sigseg1v · 2025-09-20T21:14:58 1758402898

Artosis' channel on Twitch has got that one covered.

blueflow · 2025-09-20T10:30:03 1758364203

... in the same sense that ROT13 or base64 would be a cipher.

hennell · 2025-09-20T11:13:40 1758366820

ROT13 is a cipher. It's a substitution cipher, and more specifically a shift cipher or Caesar cipher. It's not a secure cipher but it is one.

Base64 is an encoding. It's an algorithm, no attempt at secrecy, thus not a cipher.
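To see the mechanical difference between the two, here is a small Python comparison using only the standard library (`codecs` ships a `rot13` text transform, and `base64` does the encoding). The helper name `rot13` is just a convenience wrapper.

```python
import base64
import codecs


def rot13(text: str) -> str:
    """Letter-for-letter substitution: each letter shifts 13 places."""
    return codecs.encode(text, "rot13")


message = "Hello"

# ROT13 keeps length and alphabet, and applying it twice is the identity.
print(rot13(message))         # Uryyb
print(rot13(rot13(message)))  # Hello

# Base64 regroups the bits of the bytes into 6-bit symbols, so the output
# is longer and no input letter maps to a fixed output letter.
print(base64.b64encode(message.encode("ascii")))  # b'SGVsbG8='
```

Whether that difference is what separates a "cipher" from an "encoding" is the argument in this subthread; the code only shows what each transform does.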

cluckindan · 2025-09-20T11:31:33 1758367893

And thus we arrive at SCREAM64 encoding, base64 in scream cipher.

KPGv2 · 2025-09-20T12:39:40 1758371980

such a great idea that we ought to call it based64 encoding

foofoo12 · 2025-09-20T11:49:41 1758368981

Sweet Lord Jesus.

mistercow · 2025-09-21T08:23:47 1758443027

If you use base64 with the intention of hiding the encoded information, surely it’s as much a cipher as rot13 is, right?

blueflow · 2025-09-20T11:24:24 1758367464

And what do you think is the algorithm from the article? Looks awfully similar to base64 to me, except it's lacking the bit-shifts. Both use a lookup table like that.

hennell · 2025-09-20T12:13:10 1758370390

I think a lot of this depends on if you read the article as the scream cipher being specifically the exact listed substitutions or just any substitution with forms of As. Also depends on how you define encoding, cipher and the overlaps between the two. Plus questions on the relevance of intent, transformation of data, plus changing of meaning and definitions over the years. Some people say morse code is a cipher, but braille isn't - definitions can depend on way more than the black and white logical "but it does this" you're using.

You'd do better debating this with a real life friend over a pint, rather than wasting your time trying to argue with multiple people here.

jeroenhd · 2025-09-20T11:05:06 1758366306

The original Caesar cipher supposedly also had a constant offset, yet it's still considered a cipher.

A bad substitution cipher is still a cipher. Just one you shouldn't use for anything important.

andy99 · 2025-09-20T10:35:11 1758364511

Yes https://en.m.wikipedia.org/wiki/Substitution_cipher

JdeBP · 2025-09-20T10:44:42 1758365082

… and no, since neither the enciphering nor the deciphering does a 1:1 mapping for all possible input code points.

amenhotep · 2025-09-20T10:51:41 1758365501

That's not a requirement. Pigpen is a substitution cipher.

JdeBP · 2025-09-21T11:36:18 1758454578

You will find that the pigpen cipher has a 1:1 mapping between its input alphabet and its output alphabet, and that a 1:1 mapping is a necessity for full invertibility.
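The two positions can be illustrated in Python with a two-letter fragment of the table (the fragment and the helper names are for illustration only). The table itself is 1:1 over its own alphabet, but once inputs outside that alphabet are passed through unchanged, two different inputs can land on the same output.

```python
def is_injective(mapping: dict) -> bool:
    """True if no two keys share a value, i.e. the mapping can be inverted."""
    return len(set(mapping.values())) == len(mapping)


# Fragment of a scream-style table: B -> Á (U+00C1), C -> Ă (U+0102).
CIPHER_FRAGMENT = {"B": "\u00c1", "C": "\u0102"}
TABLE = str.maketrans(CIPHER_FRAGMENT)


def encipher(text: str) -> str:
    # Characters not in the table are passed through unchanged.
    return text.translate(TABLE)


# Over its own input alphabet the table is invertible...
print(is_injective(CIPHER_FRAGMENT))          # True

# ...but over all code points it is not: "B" and a literal "Á" collide.
print(encipher("B") == encipher("\u00c1"))    # True
```

So the cipher is fully invertible only if the plaintext is restricted to the unaccented alphabet.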

blueflow · 2025-09-20T10:39:19 1758364759

First sentence:

> with the help of a key

So, where is the key?

bradrn · 2025-09-20T10:41:09 1758364869

In the code in this article, the key is the mapping stored in ‘CIPHER’.

DonHopkins · 2025-09-20T13:53:06 1758376386

Ha ha ha ha ha ha! You want the key?

https://www.youtube.com/watch?v=vsb9-wPYpxI

shakna · 2025-09-20T10:40:46 1758364846

The key is the data table, representing what each character encodes to or from.

JdeBP · 2025-09-20T10:41:57 1758364917

First, second, and third statements of the provided source code.

blueflow · 2025-09-20T10:47:32 1758365252

Like I said, by these measurements, base64 would also be a cipher.

omnicognate · 2025-09-20T10:57:54 1758365874

And people are telling you yes, they (rot13 and base64) are indeed ciphers. What's the confusion?

blueflow · 2025-09-20T11:18:50 1758367130

What people in this thread call a "key" is not auxiliary input data, as a real key would be, but hard-coded into the program. We are looking at encodings.

Maybe this differentiation is not popular or well accepted, but it was surely part of my cryptography curriculum and the following exam. I'd rather believe my prof than strangers on the internet.

greysonp · 2025-09-20T11:42:27 1758368547

Key can mean different things in different contexts. In a substitution cipher, the key is the mapping. In modern ciphers, the key would be some set of secret bytes. Everyone agrees that this cipher would be a bad way to encrypt/encode something. But using the word cipher like this has real historical meaning, and that is the meaning that is being used in the project.
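The "key as input" versus "key baked in" distinction can be shown with a Caesar shift in Python. This is a generic sketch, not code from the article: `caesar` takes the shift as a parameter, and ROT13 falls out as the special case where the key is fixed at 13.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift uppercase A-Z by `shift` places; the shift acts as the key."""
    upper = string.ascii_uppercase
    k = shift % 26
    rotated = upper[k:] + upper[:k]
    return text.translate(str.maketrans(upper, rotated))


def rot13(text: str) -> str:
    # Same algorithm with the key hard-coded.
    return caesar(text, 13)


print(caesar("HELLO", 3))              # KHOOR
print(caesar(caesar("HELLO", 3), -3))  # HELLO
print(rot13("HELLO"))                  # URYYB
```

In the same way, a general substitution cipher takes the whole mapping as its key, and the scream cipher is the instance where one particular mapping has been fixed in advance.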

personalcompute · 2025-09-20T10:38:22 1758364702

Ăặȧạaǎẩậā ȧẫạ13, áaǡặ64, aẩắ ạẵǎǡ ẩặả ǡăȧặaầ ăǎäẵặȧ aȧặ aậậ ǎẩǡặăȁȧặ, áȁạ ạẵặā ắẫ ầặặạ ạẵặ ạặăẵẩǎăaậ ắặằǎẩǎạǎẫẩ ẫằ a ăǎäẵặȧ.

lambdaone · 2025-09-24T15:52:07 1758729127

Yog-Sothoth has heard your plea, and will be along to help you shortly.