The majority of the complaints I hear about notebooks I think come from a misunderstanding of what they're supposed to be. It's a mashup between a scientific paper and a repl. So it's useful for a bit of both:
a) Just like with a paper, you can present scientific or mathematical ideas with accompanying visualizations or simulations. From the REPL side, as a bonus, you get interactivity, and the reader can pause and experiment with the examples you're giving to improve their understanding or test their hypotheses. If I change this variable, how will the system react? You can just try it!
b) Just like with a REPL, you can type in and execute commands step by step, viewing the output of the previous command instead of running the whole thing at once. From the document side, as a bonus, you get nicer presentation (charts, interactivity, nice and wide sortable tables, etc) than you would in a shell, which comes in handy when doing things like data exploration or mathematical simulation.
It's decidedly NOT there for you to type all your code in like an editor and make a huge mess. It's apples and oranges compared to, and a poor substitute for, something like PyCharm or VS Code or vim. It is there for you to a) try things out yourself, with whatever you discover hopefully eventually making it into proper Python modules, and b) make interesting ideas presentable and explorable for others. That's all!
When I see stuff like "out of order execution is confusing", I don't disagree, but it does make me wonder how long and convoluted the notebooks these people work with are - probably a ripe candidate to refactor stuff out into python modules as functions. When I see stuff around notebooks for "reproducibility", I'm a bit confused in that notebooks often don't specify any guidance on installation and dependencies, let alone things like arguments and options that a regular old script would. In that regard I think it's barely an improvement over .py files lying around. When I hear "how do I import a notebook like a python module", I'm very very scared.
Granted, I've seen huge notebooks that are a mess, so I understand the frustration, but it's not like we all haven't seen the single file of code with 5000 lines and 10 nested layers of conditionals at some point in our lives.
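To make the "refactor into modules" suggestion concrete, here's a minimal sketch (all names invented for illustration) of pulling notebook logic into a plain Python module, so the notebook shrinks to an import and a call:

```python
# analysis.py -- a hypothetical module extracted from a sprawling notebook.

def summarize_runs(durations):
    """Return (mean, max) for a list of run durations in seconds."""
    mean = sum(durations) / len(durations)
    return mean, max(durations)

# Back in the notebook, the messy cell becomes one line:
mean, longest = summarize_runs([1.0, 2.0, 3.0])
```

The module gets version control and tests for free, while the notebook keeps only the narrative and the plots.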
> When I see stuff around notebooks for "reproducibility", I'm a bit confused in that notebooks often don't specify any guidance on installation and dependencies, let alone things like arguments and options that a regular old script would.
At the core of this, as some others may have already alluded to already, is that many academic scientists have not been socialized to make a distinction between development and production environments. Jupyter notebooks are clearly beneficial for sandboxing and trying out analyses creatively (with many wrong turns) before running "production" analyses, which ideally should be the ones that are reproducible. For many scientific papers, the analysis stops at "I was messing around in SPSS and MATLAB at 3 AM and got this result" without much consideration for reformulating what the researcher did and rewriting code/scripts so that they can be re-run consistently.
> many academic scientists have not been socialized to make a distinction between development and production environments
Geologist here - definitely true in my field. Nonetheless, while I don't develop in notebooks at all, I do use them for "reproducibility" in a sense -- by putting a bit of dependency info in a github repo along with a .ipynb file, I can do things like this: https://mybinder.org/v2/gh/brenhinkeller/Chron.jl/master?fil...
Which ends up being useful when a lot of folks in my field don't do any computational work at all, so being able to just click on a link and have something work in browser is a big help.
This is kind of a broad observation, but scientists tend to borrow tools from a huge variety of fields, and use them in ways that seem un-disciplined to the practitioners of those fields. For instance, an engineer would be horrified to see me working in the machine shop without a fully dimensioned and toleranced drawing. A project manager would be disturbed to learn that I don't have a pre-written plan for my next task. How do I even know what I'm going to do? If we adopted the most disciplined processes from every field, we'd grind to a halt.
In fact, there might be something about what attracts people to be scientists rather than engineers, that makes us bristle at doing what engineers consider to be "good" engineering.
I agree that science can't be bound by the rigid structures of most applied disciplines, and that the freedom to combine technologies in novel ways is a pre-requisite to novel findings.
What I find objectionable is the inability of scientists to explicitly delegate tasks to domain specialists in their everyday work when it makes sense. I think it's unrealistic of you to believe that engineers always work from "a fully dimensioned and toleranced drawing" before starting a project, or that your work would "grind to a halt". Indeed, there's a reason for the qualifier rapid in the term "rapid prototyping". If you can give an engineer general specifications for what you want and then leave him/her alone, he/she should be able to produce something that mostly fits your needs while avoiding all of the pitfalls that wouldn't have occurred to you. It would also be incorrect to assume that engineering does not involve creativity and is purely bound by rigid processes; if your requirements were strange enough, something fresh would inevitably be built.
This sort of delegation, of course, is actually more efficient, since you can work on other tasks in parallel with the engineer (such as writing your next grant proposal or article or, gasp, teaching). Most scientists also already do this implicitly by choosing to purchase instrumentation from manufacturers like Olympus, Philips, or Siemens rather than building it themselves.
Part of the reason for why I have such strong opinions about this matter, is that I've actually witnessed scientists waste more time messing around in fields where they were clearly out of their depth. As an example, there was a thread on a listserv in my (former) field that lasted for literally months that was solely devoted to the appearance of a website. Everyone wanted to turn the website design into an academic debate, when the website's creation (which had little to do with the substance of the scholarship itself) could have been turned over to a seasoned web developer and finished in less than a week or two.
But in the case of dev and prod distinction it has nothing to do with fitting some over-constrained engineering principle, but about fitting actual science: if you cannot reproduce something, you don't have a result, you have a fluke.
I think GP's is an insightful comment. Reproducing things is indeed important, but re-running code is much too narrow a definition, and possibly distractingly narrow.
Maybe your awful notebook gets the same answer you got the day before on the blackboard. Or the same answer your collaborator got independently, perhaps with different tools. Those might be great checks that you understand what you're doing. Spending time on them might be more valuable for finding errors than spending time on making one approach run without human intervention.
Not to say that there aren't some scientists who would benefit from better engineering. But it's too strong to say that fixing everything that looks wrong to engineer's eyes is automatically a good idea.
I find that with Jupyter, re-running code does serve one useful purpose, which is to make sure that your result isn't affected by out-of-order execution or a global that you declared and forgot about. That is a real pitfall of Jupyter that has to be explained to beginners.
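A minimal sketch of that pitfall (names invented for illustration): a function quietly reads a global defined cells earlier, so the result depends on execution history rather than on the code you can see. Restart-and-run-all exposes it; an explicit parameter fixes it:

```python
# Cell 1 defined a "temporary" global...
threshold = 0.5

# ...and a much later cell silently depends on it:
def flag_outliers(values):
    return [v for v in values if v > threshold]  # reads the hidden global

# The fix: make the dependency explicit so the cell is self-describing.
def flag_outliers_explicit(values, threshold):
    return [v for v in values if v > threshold]
```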
For my work, reproducing a result may involve collecting more data, because a notebook might be a piece of a bigger puzzle that includes hardware and physical data. This is where scripting is a two-edged sword. On the one hand, it's easy to get sloppy in all of the ways that horrify real programmers. On the other hand, scripting an experiment so it runs with little manual intervention means that you can run it several times.
Huge fan of just including an environment.yml for a conda virtual env in the repo where you store your notebooks, but the challenge there is that the reproducibility is OS-specific. I've had no luck creating a single yml for all OSes, and the overhead of maintaining parallel ymls for (say) Mac and Windows is a lot unless you plan on sharing your notebook widely.
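For what it's worth, a minimal environment.yml looks like the following (package names and pins here are purely illustrative):

```yaml
# environment.yml -- illustrative only; pin what your notebook actually uses.
name: notebook-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy
  - pandas
  - jupyterlab
```

One partial workaround for the OS-specific problem is `conda env export --from-history`, which records only the packages you explicitly asked for (not the OS-specific build strings), so the resulting yml tends to travel across platforms better.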
If you have ever used an R Notebook written in R Markdown, then it's pretty easy to see why Jupyter Notebooks putting everything in JSON is just... infuriatingly wrong-headed. In an R Notebook, I can see my code, I can see my text, everything is exceedingly simple to understand, and I can edit it in any of the fantastic text editors out there (Jupyter's editor is not among them).
RStudio is also my favorite editor. All my work is data science / stats related, where I like the workflow of writing/modifying code in a .R (or .py) file, and being able to quickly experiment by running chunks in a REPL with Ctrl + Enter.
R and Python are supported. No Julia, unfortunately.
VS Code and Atom support similar workflows with Julia. However, the Julia Language server in VS Code is extremely unstable and I regularly lose LaTeX completions.
The REPL in Atom is mind boggling laggy and slow to the point that it is much less frustrating to copy and paste code into a REPL running in your favorite terminal emulator.
Is that Atom's fault or just the Julia REPL's fault? I use the REPL directly on Windows and it seems really slow: something like `using JuMP` will precompile the module, which takes time.
I think it is Juno (the Julia package for Atom)'s fault.
Atom is fine on its own, as is the Julia REPL after compilation.
I just looked through Julia's settings tab in Atom, and saw the option "Fallback Renderer" with the note "Enable this if you're experiencing slowdowns in the built-in terminals."
It was disabled by default, so I've just enabled it.
Subjectively, I think it feels fine now. Longer use will tell, but I suspect I was just running into a known issue some setups run into, and they already provided the workaround.
EDIT:
Comparing running some code in Atom's terminal and a REPL running in the GNOME Terminal, the regular REPL still feels notably snappier -- even though I'm `using OhMyREPL`, which makes the REPL a bit less responsive.
I'd say Atom feels acceptable (and definitely not "mind boggling laggy" right now), and shift/ctrl + enter more convenient than switching tabs.
So I will stick with it (for Julia). More time shall tell.
As a result of the serialize-to-JSON approach, Jupyter supports R, Python, Scala, Go, Lua, Bash, Julia, and Haskell, among others. It's accessible to a much wider range of programmers, at the cost of version control being a bit weirder.
That is a complete non sequitur. JSON in no way enables that; just having a defined format enables that.
Emacs org-mode is proof that a simple text format with markup rules is all you really need to support multiple languages in a single file. You lose some of the simplicity of parsing the file, but you gain a ton more.
This isn't true though, right? If you were writing an org mode document about org mode, you now need an escaping mechanism to not mix your structure and text
Multi-language parsing is a much harder problem than simply enforcing some escaping mechanism at the inner protocol level and having tools do the "heavy" lifting (basically a solved problem).
That said, you can define away a large part of the problem.
Edit: For trivial examples of "org-mode" in an org-mode document, you need only look at the documentation of org-mode. That said, I expect there to be limitations, because they make sense. Similar to how you can pretty print json inside a jupyter notebook, but don't expect to have a notebook interpreted in the notebook. (If that makes sense.)
Emacs is, amusingly, a lighter client than most browsers nowadays.
On point: a browser cannot render a notebook, only parse the JSON. It can also parse text/plain, so it could show the org document without styling. The org document is actually readable. JSON... not so much.
To see the notebook, you have to have a Jupyter setup somewhere.
How is a notebook, without proper software to handle it, in any way more useful than any other structured plain-text file? Yes, JSON can be pretty-printed in a browser, but what then? It's still a useless mess you can't work with.
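To illustrate the complaint, here's roughly what the JSON wrapper looks like (drastically simplified; real .ipynb files add per-cell metadata, execution counts, and outputs) and what it takes just to recover the plain code a text format would show directly:

```python
import json

# A drastically simplified stand-in for an .ipynb payload:
nb = json.loads("""
{
  "cells": [
    {"cell_type": "markdown", "source": ["# Title\\n"]},
    {"cell_type": "code", "source": ["print('hi')\\n"]}
  ]
}
""")

# Even reading the code requires a tool; in org-mode or RMarkdown
# this would just be the text of the file.
code = "".join(line for cell in nb["cells"]
               if cell["cell_type"] == "code"
               for line in cell["source"])
```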
It might be the case that serializing to json facilitates support for multiple languages, though I wonder how.
With the reticulate package in R Markdown you can run python chunks, by putting, e.g.
```{python}
for i in range(1, 10):
    print("{}:{}".format(i, i * i))
# etc.
```
And in emacs org-mode you can:
#+begin_src python
for i in range(1, 10):
    print("{}:{}".format(i, i * i))
#+end_src
Language support in org-mode is pretty comprehensive, afaik.
I do not know the details of the implementations behind these, but my own source code is plain and simple unserialized text, and that means a lot to me.
Ah, actually it looks like I'm somewhat mistaken. R Markdown supports other languages as well. I think the real difference is that it doesn't look like R Markdown supports partial evaluation.
By that I mean that to share the R Markdown doc, it appears you need to rerun the whole thing. It does some tricks to do concurrent visualization, but to actually share the doc you have to rerun all the R/Python from scratch.
In jupyter OTOH, if I have a long running ML pipeline as part of my doc, I can render without rerunning the pipeline.
You can cache the results of an rmd cell, and you can also share the rendered version of the doc first. You're right that there's a higher emphasis on "run the whole thing," and I think that that's a conscious (and acceptable) design choice vs not being sure that the shared doc will run as provided.
Yeah I don't disagree about that, but when the question is why is this ugly format more popular, the answer "one has a share button and the other can be edited in emacs" sort of gives you the answer.
The main reason for json, I believe, is that the Jupyter client is separate from the backend. It's actually pretty trivial to run the engine on a beefy box while interacting on a light laptop (on the same subnet). With Jupyter Lab and some fiddling, you can put the server anywhere.
It's also trivial to export notebooks to .py files.
That said, my goodness do notebooks wreak havoc on git. I hope this in particular gets fixed as popularity grows.
Having a proper server available is even more reason to use a proper file format. The client doesn't care what the server handles, and the server doesn't need to send raw data structures straight from storage.
Actually, fixing the file-format mess should be very simple. Just change the file load/save functions. Use a folder structure with every cell being a separate file. Or switch to XML. Or make a generic interface and let people save in whatever format they want. Saving notebooks in MongoDB or some SQL database seems like a good goal for dedicated services.
Your parting "Granted..." is precisely what fills me with dread when I see notebooks. Yes, I have seen poorly done source files. I made more than a few myself. However, many of the practices we have grown into as sound programming advice seem to be largely thrown out the window for these notebooks.
The irony, to me, is that I actually typically argue for the mixing of presentation and content. But to me, notebooks look like an attempt by people to make a WYSIWYG out of JUnit/TestNG/whatever style reports. Only, without the repeatability.
There is also the entire bend where these are taking off in a way that doesn't make sense. Do they do the things you are saying? Well, yeah. But no better than plenty of tools before them. Mathematica and Matlab both had "notebook" like features for a long long time. Complete with optimized libraries. And this is ignoring the interactivity of the old LISP machines. (You can see from my history I have a soft spot for emacs org-mode.)
Jupyter is a lot of things. Bad isn't necessarily one of them, but exceptional isn't, either. Heavily marketed is.
> There is also the entire bend where these are taking off in a way that doesn't make sense.
It makes perfect sense. Just not to a lot of HN readers.
The average HN reader is approaching this from a perspective of "I am a professional programmer who might occasionally dabble in scientific computing, and therefore I hate this thing because it's not a professional programmer's tool designed by and for professional programmers according to the best practices of professional programmers".
The people who are actually using notebooks, meanwhile, are not professional programmers. They're scientists who increasingly have to do programming as part of their science. And notebooks are a godsend for them. We don't need to drag them all the way into our world; we need to pay attention to what they actually want, need, and find useful, and accept that it's going to differ from what we want, need, and find useful.
> They're scientists who increasingly have to do programming as part of their science. And notebooks are a godsend for them.
2/3 of scientific research cannot be reproduced by other scientists. But tell us more about why scientists should ignore best practices from other fields.
Because I don't have four years to get something done, that doesn't do what I want when I finally get it, if it even works at all, and that I can't fix myself.
Okay, that was extreme, and if you think I was talking about programming, it's because you have a guilty conscience. ;-) It actually applies to all interesting fields -- programming, engineering, management, classical music composition, etc. Those fields don't even know what their best practices are, and acknowledge that things take too long and can't be managed. No manager would say: "Our programmers have best practices, so the work will be done next week." Why should scientists have such faith?
Meanwhile, do you trust Maxwell's Equations, Darwinian evolution, quantum mechanics, etc.? How did we establish the physical constants to mostly better than 8 digits of precision? Science has somehow figured out how to make progress despite the messy business of research.
For me, it's not that I "have" to do programming, but that physical science has been computation driven since before the 1940s. Programming is how I think and work. With apologies to Richelieu, "programming is too important to be left to the programmers."
> Those fields don't even know what their best practices are
"Best practices" are a chimera. The issue at hand isn't about what is "best", but whether or not a software engineer's "good enough" practices are more likely to achieve science's goals than a graduate student's "good enough" practices.
It's also disingenuous to claim that classical music composition doesn't have "best practices" when the field of music theory exists as an explicit manifestation of "best practices" in music. Having gone to a school with a conservatory, I also believe that I know several individuals who would disagree with your mindset regarding how the creative process can't be managed. Indeed, if creativity, as it relates to musical composition, couldn't be managed, most orchestras would be brimming with anger at the number of commissions that weren't finished on time for the concert, and most Hollywood studios and Broadway shows would screech to a halt.
Show me the reproducible research in programming about the merits of different type systems (murky at best). Or of different approaches to testing. Or software architecture. Or... well, most of the stuff day-to-day working programmers actually do. There are barely even attempts at rigor in most of our practices, let alone the kind of reviewed and reproduced results we demand from the sciences.
I run my unit and integration tests with every build, and they reproducibly pass if my code is working. If you have code, it doesn't take much to make it able to run again and get the same result, and it's frustrating to see Jupyter users mess it up.
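A sketch of what that looks like in practice (the function under test is invented for illustration): a deterministic check that runs with every build and fails loudly if the result ever drifts:

```python
def normalize(values):
    """Scale values so they sum to 1 -- a stand-in for a real analysis step."""
    total = sum(values)
    return [v / total for v in values]

def test_normalize_is_stable():
    # Inputs chosen so floating-point division is exact and the check is deterministic.
    assert normalize([1, 1, 2]) == [0.25, 0.25, 0.5]
```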
> Mathematica and Matlab both had "notebook" like features for a long long time.
They probably didn't take off to the same extent as Jupyter because they're not free. IIRC MATLAB was quite expensive, particularly if you wanted to do anything specialised.
Yes, both are expensive outside the student licenses, but Mathematica is significantly cheaper and has a lot more built into the language, so you don't have to turn around and buy expensive "toolboxes" for all the functionality missing in Matlab.
Notebooks have been in Mathematica for ages and are really powerful and difficult to describe to those who haven't used them. To give an example, I was building a tool and embedded images as variables in a way reminiscent of being an engineer on the USS Enterprise. You can point to a file in Python as a variable, but you can't just copy-paste an image in as a variable last I checked (don't think Jupyter is there yet).
I work in development of scientific equipment. Jupyter is my lab notebook. I think that to make good use of Jupyter for this purpose, you have to be a good programmer and a good scientist. No tool will turn us into these things against our will.
With that said, Jupyter has greatly improved my ability to find my own mistakes, and to reproduce my own results later on.
I think it speaks to people's desire for a quick, easy-to-set-up basic GUI creator with an editor that allows inline code editing, and no need to deal explicitly with client-server interaction.
I myself, as someone who likes to create really solid and maintainable tools, have fallen into the notebook trap and written things like "change the month in cell 22 then execute cells 1 through 3 and 20 through 27 to update the report".
The notebook format was great for prototyping what was really a small app. You don't really have those problems when you're just generating a document.
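One way the "edit cell 22, rerun cells 1 through 3 and 20 through 27" workflow tends to get promoted once the prototype stabilizes (names here are invented) is a parameterized entry point:

```python
import argparse

def build_report(month):
    """Hypothetical report builder: load data for `month`, compute, render."""
    return f"Report for {month}"

def main(argv=None):
    parser = argparse.ArgumentParser(description="Monthly report")
    parser.add_argument("--month", required=True, help="e.g. 2020-01")
    args = parser.parse_args(argv)
    return build_report(args.month)

# Usable from a shell (`python report.py --month 2020-02`) or from code:
report = main(["--month", "2020-02"])
```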
Yes. It's called Microsoft Excel. Software engineers don't like VB for the same reason they don't like Python-in-a-notebook but you cannot deny its effectiveness.
You're right about what excel is (and the whole VB ecosystem for that matter), but I think the critical difference is that the language and environment are very different. If I know the smallest amount of python (or R) I can leverage Jupyter notebooks and it is intuitive.
To really get something great out of excel you have to learn excel. I think that difference is almost as important as the excel stigma.
I agree completely on your first point - notebooks are a poor substitute for proper software tooling. I wrote this recently [1]
> In the case of an analyst, the domain of "software engineering" lies close to their own domain. Projects in both areas require code which (ideally) exhibits clarity and reproducibility. Obfuscated software is bad [...] and idempotency is good.
> The problem, then, is when the analyst takes a core tool from their domain and applies it to a slightly different domain like software engineering. Things go south fast: your notebook has not-quite-imperative code that is untested and unmonitored. It is, in other words, bad software.
As for the point about "refactoring stuff out into python modules as functions," the problem is that the new crop of data scientists aren't learning how to do this. The role of "machine learning engineer" is emerging to address this shortcoming in SWE skill throughout the data science community. It honestly cannot happen quickly enough.
I fundamentally agree with you, but I have the feeling that some of the major proponents of notebooks belong to the category of people who misunderstand them, simply use them for everything, and write long and convoluted notebooks; I've definitely seen my share of those in my domain (bioinformatics, AI) and elsewhere. By contrast, Joel Grus, for instance, perfectly understands their strengths and weaknesses.
As for being a good REPL, I feel that an actual REPL (+ editor integration) works better than notebooks: you can combine a literate document with a REPL but still get the benefits of a proper editor/IDE and a proper execution environment, rather than a half-hearted mix of both hosted inside an HTML contenteditable (= Jupyter), and you also get "charts, interactivity, nice and wide sortable tables, etc" if you want. RMarkdown inside RStudio or Nvim-R does this well. I just don't want to give up the advantages of a proper editor for the very slight increase in integration that Jupyter gives me.
I think we can all agree some notebooks are shit storms and should not be relied upon at ALL for production. At my job we started using notebooks as an 'in-repo', 'interactive' documentation of sorts: showcase various modules and give simple usage examples of them. It was pretty awesome. I love using notebooks as a more advanced scratch pad, for the times when the IPython shell isn't enough and you want something extra. Also, I had to install the vim bindings ASAP; gotta have that vim.
Is it actually a common skill to write meaningful, non-hello-worldish Python code that yields expected results without a number of iterations of debugging and correcting, and without PyCharm's intelligent completion, hinting, and correction features? I understand the value of Jupyter notebooks for publishing your work results, but I find it almost impossible to use them to actually do the work; it feels a million times more convenient to code in PyCharm and then copy-paste the code into Jupyter once it's ready.
I'd say it's more of a shell than a REPL. For most languages that provide a shell, there isn't a real separation between the reader, evaluator, and the printer. Being able to interact with those components separately is the real advantage of a REPL over a shell.
> The majority of the complaints I hear about notebooks I think come from a misunderstanding of what they're supposed to be
No, the majority of complaints are that notebooks are great, but Jupyter is a bad notebook. I mean, maybe it's impressive to someone who's never seen a notebook before, but to someone used to Mathematica, MathCAD, RMarkdown, org-mode, whatever, it just seems clunky as hell. I wonder how many "data scientists" claiming it as their top choice have ever tried anything else?