I have to admit this broke my brain. This is the first time I'm hearing about content-addressable languages, and once you get over that barrier, a distributed language doesn't seem far-fetched.
As a big fan of functional programming, is this something that is going to just end up being an esoteric language? Don't get me wrong, I absolutely love the vision of the authors, but after being bitten by the Elm bug and that crashing and burning, I'm just cautious of getting invested in new languages and tools.
The example above is followed by the explanation "{IO, Exception} indicates which abilities the program needs to do I/O and throw exceptions." Well, which abilities does it need, then? No idea.
> I am sure you are familiar with effect systems and algebraic effects, right?
Probably not, based on their question!
These are pretty esoteric concepts. I think it was one of a few bullet points on "other interesting ideas" in the functional programming portion of my programming languages course, and I doubt that most working programmers have taken an academic PL course like that at all.
But effects are indeed an awesome concept, and thanks for the excellent links! The parent is one of today's lucky 10,000: https://xkcd.com/1053/
As a junior who did not do CS at uni, a lot of the stuff around here goes right over my head. I often feel that I might never catch up. I just about understand what functional programming is in terms of a one-line definition, let alone any concepts that fall under it. To be fair, I only really use object-oriented programming.
I think, even though I can absolutely see the arguments for doing so, locking down the compiler to no longer accept external native extensions was a huge mistake community-wise. A lot of the people advocating for Elm were the sort of early adopters who really, really want an escape hatch, because they're in the habit of getting into situations where they need one no matter what tools they're using.
Certainly that describes me, and when all the people who seemed to be like me got told to go sit on a cactus by the Elm core developers and bailed out to work with something else, my experiments got pretty much immediately shelved and Elm moved into the "interesting place to steal ideas from, actively hostile to my actually using it" category.
This may be unfair, but I'm pretty sure it's a reasonable description of what -did- happen, fair or not.
Except for the compiler bugs, lack of self-hosted package management, improvements to tooling, etc. I use it regularly too, but it is frustrating to see it in a state of decay.
They can definitely claim the above points and more are not goals, and Evan absolutely has every right to do so. But don’t be surprised when devs see it as dead.
Modules and libraries are addressable based on their names or URIs.
"Unison eliminates name conflicts. Many dependency conflicts are caused by different versions of a library "competing" for the same names. Unison references defintions by hash, not by name, and multiple versions of the same library can be used within a project."
https://www.unison-lang.org/docs/what-problems-does-unison-s...
"Here's the big idea behind Unison, which we'll explain along with some of its benefits:
Each Unison definition is identified by a hash of its syntax tree.
Put another way, Unison code is content-addressed.
Here's an example, the increment function on Nat:
increment : Nat -> Nat
increment n = n + 1
While we've given this function a human-readable name (and the function Nat.+ also has a human-readable name), names are just separately stored metadata that don't affect the function's hash. The syntax tree of increment that Unison hashes looks something like:
increment = (#arg1 -> #a8s6df921a8 #arg1 1)
Unison uses 512-bit SHA3 hashes, which have unimaginably small chances of collision.
If we generated one million unique Unison definitions every second, we should expect our first hash collision after roughly 100 quadrillion years!"
https://www.unison-lang.org/docs/the-big-idea/
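To make the hashing concrete, here is a minimal Python sketch; the canonical string below is just the illustrative encoding from the Unison docs, not the real wire format:

import hashlib

# Hash a name-free, canonicalized encoding of a definition. Renaming the
# function or its arguments leaves this string, and thus the hash, unchanged.
canonical = "(#arg1 -> #a8s6df921a8 #arg1 1)"  # 'increment', per the docs above
digest = hashlib.sha3_512(canonical.encode()).hexdigest()
print(digest[:16])  # a prefix of the definition's name-independent identity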
Seems like identifying your library with a git tag would drop that risk to zero.
I guess what I'm not understanding here is the utility. Why is it useful to include multiple versions of a library in a project? Is this a limitation I've been coding around without knowing it?
Have you ever had problem where two of your dependencies are each using a different version of the same library? Or have you ever wanted to incrementally upgrade an API so that you don’t have to change your entire code base in one fell swoop? That is where things like Unison or scrapscript can make it very easy.
One reason for multiple versions of a library in a project is that the project wants to use 2 different dependencies, which themselves depend on incompatible versions of a third library.
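As a minimal sketch of how hash-addressed definitions sidestep that diamond (the hashes and functions below are invented):

# Definitions live in a store keyed by hash, not by name.
defs = {
    "hash_v1": lambda n: n + 1,  # the definition lib_a was built against
    "hash_v2": lambda n: n + 2,  # an incompatible later version used by lib_b
}

# Each dependency references the exact definition it compiled against, so
# both versions coexist in one program with no name conflict:
print(defs["hash_v1"](41))  # lib_a's call -> 42
print(defs["hash_v2"](40))  # lib_b's call -> 42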
I think it is something like Hoogle for Haskell, but instead of searching by the types of functions, you search by a hash of some canonical encoding of the definition. So it is like an encoded knowledge graph, except you would need rules for constructing that graph in a canonical way.
Edit: what I thought was wrong; still, the idea above could be useful for something like Copilot to complete definitions.
Content is by definition content-addressable. x = 42 is a hardlink to every other instance of x = 42, if you will. What this adds is more compact and practical content addressing, like Nix or Git. But realizing that there is always more than one way of expressing the same logic (each with a different hash, no matter how you canonicalize) makes me doubt it is a killer feature.
I should probably write a longer post about this, but scrapscript is an attempt to fix a lot of the "in-between" problems in software engineering.
Instead of working on "real" problems, I find myself battling untyped/undocumented YAML/JSON configurations, syncing JSON encoders/decoders, massaging incompatible dependencies, writing unholy SQL, etc.
I obviously don't have all the answers, but a system with the following properties seems like a worthwhile pursuit: (1) small enough to be used like JSON yet powerful enough to be used like Javascript, (2) cryptographic guarantees that code is compatible over time, (3) a compiler that checks live servers for compatibility before deploying, (4) a simple but expressive type system, (5) a package manager that facilitates all of this at a granular level... and so on.
On top of all that, I think these properties lend themselves to some grand ambitions like "a new internet" and a "google-docs live coding editor experience". Maybe I'm just full of myself though haha
scrapscript.py is the first real attempt at making scrapscript a reality, so some folks who feel these pains are getting excited to see some movement on the project.
I feel your pain on having to manage so many dependencies. I write primarily in Python, and the various pip / Pipenv / pipx / PDM / Poetry dependency managers drive me pretty crazy. That's not even accounting for the multiple Python versions I need!
That said, I'm surprised that you're trying to _alleviate_ this by implementing your FP language in Python. The Python ecosystem is full of half-documented config files, incompatible dependency trees, etc.
Have you considered implementing it in any other languages after the Python one proves its worth? For example, if the language becomes strong enough, would you consider writing a scrapscript compiler in scrapscript, itself?
One thing I think we all agree on is that the implementations should be simple enough to easily port themselves to other languages. For example, one could probably port the existing scrapscript.py to Rust or Javascript using GPT in a single weekend.
Some languages like Rust and Go put a lot of weight on the "official" implementation. I think scrapscript can be more like Lisp/JSON, where the spec guides parallel implementations. There are obvious downsides to this in general, but I think that content-addressability makes some of those problems moot.
None of these config/dependency problems are present in scrapscript.py because it has no external dependencies and is written in one file. This is intentional!
Still not sure I fully understand, but that is more than likely down to my ignorance. I really appreciate your effort in explaining here. I should mention I'm not a full time developer and certainly not a webdev so this might be why I'm not grokking this. Thanks.
I'd been kind of interested by https://yglu.io/ and now ingy's new piece of insanity https://yamlscript.org/ - helm appears to let you inject your own script to template charts, and I was wondering about trying a wrapper around one of those (because text-templating an indentation-sensitive language like YAML makes me itch).
I think scrapscript is a really interesting idea, mind, this isn't a "here's an alternative" type comment, it's a "here's things that I think are neat in a similar way to how I think scrapscript is neat" :)
Edit: I forgot something! https://trout.me.uk/lisp/termite-r7rs.pdf is a paper on adding library support to the cross-network (kinda erlangish) termite scheme extensions - and leans heavily on content addressable-ness. Termite itself has gone the way of small lisp projects but I kept this around specifically for the content addressable stuff having been solidly worked out in a language I understood; maybe that'll come in handy for ideas for you as well.
Yes, but everything is hashed at the expression level rather than at the file level, which prevents a few classes of errors.
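A toy illustration of the difference, assuming each definition is hashed on its own (expr_hash here is made up, not scrapscript's real scheme):

import hashlib

def expr_hash(expr: str) -> str:
    # hypothetical: hash each definition independently of the file around it
    return hashlib.sha3_512(expr.encode()).hexdigest()[:12]

exprs = ["inc n = n + 1", "dec n = n - 1"]
before = [expr_hash(e) for e in exprs]
exprs[1] = "dec n = n - 2"    # edit only one definition
after = [expr_hash(e) for e in exprs]
assert before[0] == after[0]  # untouched code keeps its identity
assert before[1] != after[1]  # only the edited expression's hash changes

With file-level hashing, touching any line would change the whole file's hash and spuriously invalidate everything else in it.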
> Why does a compiler need to talk to a server? Why should it? Seems like a huge step backwards in what a compiler is and expecting it to work later on.
Imagine if Javascript tooling could throw an error when a client implementation diverges from the server's expected input/output types:
> const res = await fetch("https://example.com/api", { method: "POST", body: JSON.stringify([1, 2, 3]) });
ERROR: You're sending this REST endpoint a list of integers, but it expects a string!
Yes, I'm aware :) I actually built the first scrapscript demo in 2018, drawing on inspiration from Ethereum's Solidity. Somebody pointed me toward Unison when I attended Strange Loop in 2019, and I chatted with Paul Chiusano, and it seemed like Unison and Scrapscript had incompatible design goals. Even now, I don't see much overlap outside of content-addressability. Unison is super cool though, and I wish their team the best!
From reading https://scrapscript.org/ it sounds like its main feature is that things can be split up, put on platforms like IPFS, and distributed, allowing you to access them from wherever.
Not just python scripts, you can package any C code as well. We've used it to compile up a "python.com" APE file with a python 3.11 interpreter that has lots of packages (including C extensions) that we can just drop straight into old airgapped lab instrument computers and get a modern python data analysis suite up and running.
That sounds very interesting! I probably could hack it together myself, but do you happen to have a writeup on that, or maybe some pointers on how to include the numpy-scipy stack into the executable?
Thanks, TIL. Now we can combine this with the xlcalculator package and transpile models built in Excel right down to C code and build it as a portable executable.
Is this the reason why there is only one 5+ KLOC module (which includes the tests)? I personally prefer short / shorter modules with clear responsibilities.
No. This is not a limitation of APE/Cosmopolitan. This is just my personal preference because imports get tricky in Python land unless you either have a single file or go Full Package Mode. There's probably a world where we split the tests out, though.
Since this uses Cosmopolitan and the build script already downloads portable binaries from https://cosmo.zip, has any thought been given to wrapping other portable binaries in scrapscript / downloading them?
Small, pure, functional, content-addressable and network-first sounds a lot like a mini Nix+ca-derivations [1]
The website does lean a little more marketing-heavy than my blog but I think that is just how Taylor writes and presents on the internet. A lot of the fancy-sounding claims are just very boring details.
The dependency fixing thing is mostly a claim based on something like static linking, where you can do the moral equivalent of require(x) and have x downloaded and checksum'd and loaded into memory---and then you keep using that version. Like a Merkle DAG. I don't know that this approach fixes all problems, but the solution is a lot less interesting than it sounds in that description IMO.
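As a hedged Python sketch of that moral equivalent (the function name, URL scheme, and store layout are all invented):

import hashlib
import urllib.request

def require(hash_hex: str, url: str) -> str:
    # Fetch the code, verify it is byte-for-byte the thing its hash names,
    # and only then hand it back; the hash pins the version forever.
    code = urllib.request.urlopen(url).read()
    if hashlib.sha3_512(code).hexdigest() != hash_hex:
        raise ValueError("content does not match its hash; refusing to load")
    return code.decode()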
"scrap" is just an expression but you can reference any expression by its hash since it's all serializable. That's also what he means by the next bullet.
"Optimized for AI" is, IMO, a bad misrepresentation because what he means (I think) is since you put the use before the definition (f x . f = ...), it gives more room for sufficiently advanced autocomplete to fill in the definition.
If you can picture a language where every expression is a Git object and you can pull in objects from other repos at will and you can take arbitrary snapshots of your program (a tree-equivalent), you kind of get an idea of what's going on.
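The Git analogy is fairly literal: Git itself names a blob by hashing a short header plus the content, which you can reproduce in a few lines of Python:

import hashlib

def git_blob_hash(content: bytes) -> str:
    # Git's identity for a blob: SHA-1 of "blob <size>\0" plus the bytes.
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

# Same bytes, same name, no matter which repo or machine they live in:
print(git_blob_hash(b"increment n = n + 1\n"))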
You're going to resist this, but your cynicism here is a lot closer to “ugh why didn’t he create this with me in mind?” than you think.
Just consider for a second that you’re commenting on a blog post about two people who were compelled by the project to contribute some good work to it.
Yet your reaction is to come here and tell us how another website doesn't live up to your standards or something, and I bet you thought the quote-by-quote takedown was quite scathing. ;)
I only said that the poster ("creator"?) of this blog article should explain what, from his point of view, the thing is, because the website for the thing that he links to (whoever made that seems to be another person) is severely misinformative, which I illustrated with vivid examples.
Whoever implemented this library did some possibly good work, but he does it a disservice by offloading the explanation to poorly written marketing copy.
I know you've been flagged but having read the post and a few links posted in this thread I still have no idea what this thing is or does or why it's useful.
In my comment I quoted a few sentences from scrapscript.org, and called it "a dumpster fire of hype or really good satire", which is why it was flagged I guess.
That is the stupidest thing I have heard in a while. Just because a language might be abbreviated with SS will not stop any reasonable person from using it.
Interesting tidbit: the book series that this website is named for is actually spelled berEnstAin bears, with emphasis on the letters that everyone (including myself) remembers being spelled the other way. I literally learned this yesterday.
One high profile archetype of the mandala is the sand mandala, where practitioners painstakingly construct an intricate mandala out of sand over the course of days and then ritually sweep it away once it's complete, leaving no trace, as a meditation on impermanence or something like that.
Much like how in the Mandela effect the original universe is wiped at least partially away, leaving no trace of what was a complex and fully featured aspect of the timeline, other than what remains in your memory. Other people say "no, that's always been a table" while you remember the sand that was on top of it. Or something along those lines! For some people the resonance is strong enough for the mandala imagery to potentially overwrite the Mandela etymology. Especially if you're a person who's never experienced the effect about Mandela himself.
There was a popular Show HN about 9 months ago, about scrapscript itself: https://news.ycombinator.com/item?id=35712163