Pluto.jl [1] is an interesting project around creating reproducible notebooks.
MIT's Intro to Computational Thinking uses it for displaying interactive code cells [2].
They're reactive cells too, so if you modify a variable or function, subsequent cells will automatically update. That's something Jupyter currently lacks.
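To make "reactive" concrete, here is a minimal Python sketch of the idea. It is not how Pluto.jl is implemented; the `ReactiveNotebook` class and its methods are hypothetical, and it assumes each cell defines exactly one name and that document order is a valid execution order.

```python
class ReactiveNotebook:
    """Toy model (hypothetical): each cell defines one name and reads some names."""

    def __init__(self):
        self.cells = {}   # cell id -> (defined name, names read, function)
        self.values = {}  # variable name -> current value
        self.order = []   # cell ids in document order

    def set_cell(self, cell_id, defines, reads, fn):
        """Add a cell, or replace the body of an existing one."""
        if cell_id not in self.cells:
            self.order.append(cell_id)
        self.cells[cell_id] = (defines, tuple(reads), fn)

    def run(self, cell_id):
        """Run a cell, then rerun every cell that depends on it, directly or not."""
        dirty = {cell_id}
        changed = True
        while changed:  # grow the dirty set until no new dependents are found
            changed = False
            for cid, (_, reads, _) in self.cells.items():
                if cid not in dirty and any(self.cells[d][0] in reads for d in dirty):
                    dirty.add(cid)
                    changed = True
        ran = []
        for cid in self.order:  # assumption: document order is a valid run order
            if cid in dirty:
                defines, reads, fn = self.cells[cid]
                self.values[defines] = fn(*(self.values[r] for r in reads))
                ran.append(cid)
        return ran


nb = ReactiveNotebook()
nb.set_cell("a", "x", [], lambda: 2)
nb.set_cell("b", "y", ["x"], lambda x: x * 10)
nb.set_cell("c", "z", ["y"], lambda y: y + 1)
nb.set_cell("d", "w", [], lambda: 0)
for cid in list(nb.order):
    nb.run(cid)

# "Edit" cell a; cells b and c rerun automatically, cell d is left alone.
nb.set_cell("a", "x", [], lambda: 5)
reran = nb.run("a")
print(reran, nb.values)
```

In a classic Jupyter session, the equivalent edit would leave `y` and `z` stale until you reran those cells by hand.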
I develop a similar project for Python, called marimo [1] [2]: marimo is an open-source reactive notebook for Python with built-in UI elements.
Our users use marimo to do traditional notebook work faster and with more confidence (computational experiments, data exploration), and also to build pipelines and apps: marimo notebooks can be run as Python scripts and seamlessly converted to apps.
We've designed marimo with devX in mind (pure Python file format, git-friendly, black formatting built-in, GitHub Copilot built-in, modularity coming soon ...).
I have dreamed about an extremely rich document that actually presents the internal state of the runtime and compiler. The abstract syntax tree, control flow graph, SSA form, and data structures would each be presented in their own section of the document, with animations of the data flow too.
It could double as REPL and you can reach into any part of the compiler pipeline and write your own transformations.
Does anybody remember the documentation for Backbone? The "annotated source" documentation was side by side with the code, and it was really helpful.
I have practically no exposure to Bazel and Buck, but these advanced build systems have sophisticated refresh logic, which feels like a form of advanced caching.
These systems and Makefiles essentially do refresh logic/regeneration from changed sources. Maybe it could be like Greenspun's 10th rule, every system eventually builds its own caching system and refresh logic and dirty-rechecking. (See React's virtual DOM)
This seems within reach of Livebooks. I typically write livebooks attached to my running Elixir dev node, which is just a plain Elixir project. So it’s side-by-side notebook and library development, all in the same runtime. Elixir’s VM is highly observable, so a lot of what you mention should be possible out of the box, but I’m not sure if anyone has glued all those pieces together in a neat prebuilt package.
Nice concept and demonstration! Would argue though that it is the states in Jupyter notebooks that make it work so well for prototyping, and that should be its main purpose always. Using dependencies might break that.
As for the documentation purposes it should be the programmers/authors' duty to ensure that the code works out by executing all cells in order once.
The main problem I see is relying on a server. Running sandboxes server-side takes up resources even when idle (the user went to lunch, or left it running overnight). Google can do it with Colab, but this is expensive to provide to the general public, so often the sandbox gets killed after a timeout.
Observable does a similar thing, but in the browser, so the cells run JavaScript. I suppose a WASM implementation could provide more languages?
Or it might be better done using a VS Code plugin or something like that. (They already exist.) But that means casual readers won’t be able to play with it.
In this case, running sandboxes does not consume resources when idle. There is no state, and the containers exist only for the duration of the "run" operation.
There's a database to pass from one cell to the next, which seems rather stateful to me?
If no state is cached and running a cell automatically reruns all its dependencies from scratch, that's the equivalent of a clean rebuild when using a makefile. It might be too slow to use interactively if you're doing anything heavy. It's how continuous builds work, though.
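A rough Python sketch of that trade-off, under stated assumptions: the `CELLS` table, `cache_key`, and `run` are all hypothetical and have nothing to do with how Codapi works internally. The cache key is a hash of a cell's source plus the keys of its dependencies, which is roughly what makefile-style and content-addressed builds do.

```python
import hashlib

# Hypothetical cells: name -> (source code, names of the cells it depends on)
CELLS = {
    "load": ("data = [0, 1, 2, 3, 4]", []),
    "clean": ("data = data[1:]", ["load"]),
    "total": ("result = sum(data)", ["clean"]),
}

executions = []  # log of cells that were actually executed
cache = {}       # cache key -> namespace snapshot after running that cell


def cache_key(name, cells):
    """Hash a cell's source together with the keys of its dependencies."""
    source, deps = cells[name]
    h = hashlib.sha256(source.encode())
    for dep in deps:
        h.update(cache_key(dep, cells).encode())
    return h.hexdigest()


def run(name, cells, use_cache=True):
    """Run a cell after its dependencies, reusing cached results when allowed."""
    key = cache_key(name, cells)
    if use_cache and key in cache:
        return dict(cache[key])  # shallow copy; fine here since cells rebind names
    source, deps = cells[name]
    namespace = {}
    for dep in deps:
        namespace.update(run(dep, cells, use_cache))
    exec(source, {}, namespace)
    executions.append(name)
    if use_cache:
        cache[key] = dict(namespace)
    return namespace


# Clean rebuild: every run re-executes the whole chain.
run("total", CELLS, use_cache=False)
run("total", CELLS, use_cache=False)
clean_rebuild_count = len(executions)

# Cached: the second run executes nothing.
executions.clear()
run("total", CELLS)
run("total", CELLS)
cached_count = len(executions)

# Editing "clean" invalidates it and "total", but "load" is reused.
executions.clear()
edited = dict(CELLS, clean=("data = data[2:]", ["load"]))
result = run("total", edited)["result"]
after_edit = list(executions)
print(clean_rebuild_count, cached_count, after_edit, result)
```

The catch is that the cache is exactly the kind of server-side state a stateless design tries to avoid, so it is a trade between speed and simplicity.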
I love these kinds of interactive code/sandbox projects! The author's example page [1] gives some more background on the design of Codapi as well [2]. Excited to try this out!
Another similar project is Runno which runs client-side in the browser [3].
This gets close to something really important, but I'd like to forward something that I wrote previously - the SAME project[1]
The difference/complement here is that after you finish writing these cells, I think there's a translation into a backend service for execution at scale.
I'd love someone to pick this up and run with it in collaboration with tools like this!
> Behind the scenes, codapi-snippet calls a codapi server (either a cloud or self-hosted instance) so it can run any programming language, database or software you've configured.
I'd rather have a flavor of notebooks that work by referencing other notebooks comprising human- and machine-readable specifications that describe those other languages' syntax and semantics and how their execution models work. Look at the way IETF RFCs are crosslinked, for example. With your browser pointed at one of these documents, it traverses all the linked documents and resolves a given cell's results based on its working knowledge of the language in that cell. (If you really want to, you can point your server to the same corpus and have it cache some of the processing in order to relieve pressure on the browser's compute requirements, but in principle it should work the same way.)
Off topic: does anyone else here have trouble with queries like this?
Rank the employees according to their salaries in each department
My SQL is “good” I think, but to be honest in my day to day I usually don’t deal with aggregation or windowed queries (my BI colleagues are masters at those, though).
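For reference, that kind of ranking is usually written with a window function. A small self-contained sketch using Python's built-in sqlite3 module; the `employees` table and its rows are made up for illustration, and it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
INSERT INTO employees VALUES
  ('Alice', 'it', 120), ('Bob', 'it', 100), ('Cindy', 'it', 100),
  ('Dave', 'sales', 90), ('Emma', 'sales', 95);
""")

# PARTITION BY restarts the ranking for each department;
# ORDER BY salary DESC puts the highest salary at rank 1.
rows = conn.execute("""
SELECT department, name, salary,
       RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees
ORDER BY department, salary_rank, name
""").fetchall()

for row in rows:
    print(row)
```

Bob and Cindy tie, so RANK() gives them the same rank. DENSE_RANK() and ROW_NUMBER() differ only in how they number rows after a tie.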
What piques my interest in this tool is that I haven't found a good "notebook"-like solution for Postgres yet. I've tried using SQL magic functions in Jupyter to generate some documentation for pgsodium[1], but I'm not entirely pleased with the results; there's too much Python still showing through.
Would this tool work as a general purpose documentation generator for a Postgres extension?
I think it would! Are you interested in trying it out? I can probably prepare a simple example and open an issue in the pgsodium repo to discuss the rest.
Love that it has transactions/states between blocks. It's super annoying if you override a variable / change an object in Jupyter and try to re-run it.
Wonderful! And it's probably not too hard to take this to the next step by baking natural language to code (via genai) into this? Interactive natural-language functional cells. Something an average writer, journalist, etc. can put in any given article. That'll be incredibly helpful.
I didn't mind so much losing the highlighting upon entering edit mode, but I did mind losing it forever, i.e. not having it even after leaving edit mode. Perhaps if there was a way to re-apply the syntax highlighting once the editable element loses focus, then losing syntax highlighting temporarily when actively editing wouldn't be as much of a problem?
While I get that state is frustrating for shareability and communicating your work, it is extremely useful for prototyping. Perhaps there just needs to be an explicit toggle between stateful and non-stateful notebooks.
> The last thing I want as a reader is code examples that fail or behave oddly because they were run out of order. Or because I didn't run some cells. Or because I changed a cell and didn't re-run it.
> Codapi code cells, unlike Jupyter's, have no hidden state. Instead, they execute the whole chain of dependencies as needed to ensure that the reader gets a consistent result.
To the second point, this is always how Jupyter behaves for me because I just "Run to cell"... If the other cells ran, it should be apparent with some sort of feedback. Don't need the output but at least visual feedback that they ran. Otherwise I'm distrustful that anything is actually running behind the scenes because the runtime shouldn't have those inserted tables.
It's confusing because while it does run the previous cells as needed, it does not update the "output" of those cells on the page. So it's not very clear what the dependencies are and what steps were actually executed.
this is so interesting! does it memoize dependencies? if i have several cells that all depend on a central one, will the central one only run once? what about state? can you declare a variable in one cell then use it in another?
[1] - https://plutojl.org/
[2] - https://computationalthinking.mit.edu/Fall23/