Spreadsheets are dreams (2015) (medium.com/hand-brain)
74 points by dhotson on April 29, 2022 | 19 comments



> Every cell can contain text, data or formulae; every cell, row and column may be endlessly multiplied and referenced. These two qualities make spreadsheets an indeterminate material matrix — the textured all-over-ness of a Pollock painting. Or the empty space of a desert landscape in whose expansive lines could be written every story.

> Spreadsheets can render scenarios with total variability, but the complexity needed to turn every product, object, idea or structure in a spreadsheet into a twiddlable dial or live display often suffocates the insight in a sandstorm of choking numbers. …

There seem to be quite a few recent tools which try to solve this problem by replacing the grid paradigm with something a bit more structured. The main ones I’m aware of are https://inflex.io/ and https://www.trymito.io/, but there are many more, and I even had a go at making one myself. I’m not optimistic about their chances in general, though. Traditional spreadsheet UIs are immensely flexible, and great for small calculations and anything involving tables or lists. They also happen to be utterly awful at anything even remotely large-scale, but by the time people figure that out, it’s usually too late to switch as the sunk-cost fallacy kicks in.

On the other hand, what are the alternatives? Programming languages require a fairly significant expenditure of effort to learn, and don’t give nearly the amount of interactivity that spreadsheets do. Even environments like Jupyter notebooks, or the MATLAB IDE, don’t come close. Besides, in the hands of the unskilled — and even the skilled, really — programs for data analysis can become nearly as messy as spreadsheets, especially with popular languages like Python and MATLAB.

For these reasons, though I utterly despise spreadsheets, I am also beginning to despair of ever successfully replacing them with something better: spreadsheets are just too convenient, so why would anyone use anything else? Excel is always going to be more convenient in the moment than any more principled tool, precisely because it is infinitely flexible and has no restrictions. People don’t like friction in their UX when they just want to do a few calculations. There is an avenue to wide usage for tools like Mito (linked above), which give programmers a more spreadsheet-like interface, and so integrate nicely into workflows which already exist. But this approach is in itself limiting; I want a tool I can open and use right now, not one where I have to make a whole new Python environment and notebook and so on just to do a simple calculation. Alas, I see no way to get wide adoption, or perhaps even adoption by myself, for any ‘better spreadsheet’ implementation.


Hey, I'm one of the founders of Mito (https://www.trymito.io/). This is a super interesting perspective. I agree with a lot of your thoughts and wanted to respond to a few in particular.

> They also happen to be utterly awful at anything even remotely large-scale.

I think there are a few reasons why spreadsheets struggle to scale to large datasets and complex analyses.

When it comes to data size, legacy spreadsheets like Excel were just built for an age with different data size expectations, and it's hard to upgrade that monstrous code base. That's why Mito uses Python to make all of the transformations. Python still has limitations, but it works for tens of millions of rows of data.

Complex analyses are the other big cause of pain when using spreadsheets. Specifically, spreadsheets can quickly get super messy when using a mix of tabular data and singular cell results. Once the structure of the spreadsheet loses consistency, it takes a lot more mental effort to untangle the spreadsheet.

These complexities arise because Excel is super un-opinionated about what types of analyses make sense for a spreadsheet and how those analyses should be structured. Because Mito is designed specifically for working with tabular data through pandas dataframes, we're able to make design decisions that enforce a bit more structure into the analysis. 1) All data in Mito must be tabular -- it both preserves the structure of the spreadsheet and fits the ideals of pandas dataframes. 2) Every edit you apply in Mito applies to the entire column (or dataframe for ops like filter, sort, pivot, etc.).

The result of 1 + 2 + the fact that Mito generates the equivalent pandas code for every edit makes it fairly easy to understand what transformations are applied to the data at any given time.
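
To make (1), (2) and the code generation concrete, here is a rough sketch in plain Python. It is not Mito's implementation: `RecordedTable`, `add_column` and `filter_rows` are hypothetical names, and the table is just a dict of equal-length lists. Each edit touches a whole column (or the whole table) and logs the pandas line it would correspond to.

```python
import operator


class RecordedTable:
    """Toy table: every edit is column-wide and is logged as a line of pandas-style code."""

    def __init__(self, columns):
        # Rule (1): the data must be tabular, so all columns share one length.
        if len({len(values) for values in columns.values()}) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = {name: list(values) for name, values in columns.items()}
        self.code = []  # generated pandas-style code, one line per edit

    def add_column(self, name, left, op, right):
        # Rule (2): the edit is computed for every row of the column at once.
        fn = {"+": operator.add, "*": operator.mul}[op]
        self.columns[name] = [
            fn(a, b) for a, b in zip(self.columns[left], self.columns[right])
        ]
        self.code.append(f"df[{name!r}] = df[{left!r}] {op} df[{right!r}]")

    def filter_rows(self, column, minimum):
        # Whole-table edit: keep only rows where `column` is at least `minimum`.
        keep = [value >= minimum for value in self.columns[column]]
        self.columns = {
            name: [v for v, k in zip(values, keep) if k]
            for name, values in self.columns.items()
        }
        self.code.append(f"df = df[df[{column!r}] >= {minimum!r}]")


table = RecordedTable({"price": [2.0, 5.0, 1.5], "qty": [3, 1, 4]})
table.add_column("total", "price", "*", "qty")
table.filter_rows("total", 6.0)
print("\n".join(table.code))
# df['total'] = df['price'] * df['qty']
# df = df[df['total'] >= 6.0]
```

Because there are no single-cell edits, the log reads top to bottom as the full history of the table.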

In practice, we see complexity explosion is the result of combining data exploration and analysis. In the exploration phase users apply temporary filters, column transformations, etc. But they don't want to take those transformations with them. What is exploratory and analysis work is often not known until after the analysis, so it's a hard problem to design for, but it's something we spend a lot of time talking about. Our most recent work to address this area of complexity is optimizing the pandas code that we generate. We can use obvious cues like if the user deleted a column or dataframe that they had previously created to tell us that work was only part of exploratory work that they no longer want. As a result, we can safely delete the Python code used to create those columns/dataframes.
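
The "deleted column" cue can be sketched as a small pruning pass over the recorded steps. This is only an illustration under stated assumptions, not how Mito actually does it: `Step` and `prune_deleted_columns` are hypothetical names, each step declares which columns it creates, deletes and reads, and a column name is assumed to be created at most once.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    code: str                       # the generated pandas line
    creates: Optional[str] = None   # column this step creates, if any
    deletes: Optional[str] = None   # column this step deletes, if any
    reads: set = field(default_factory=set)  # columns this step depends on


def prune_deleted_columns(steps):
    """Drop create/delete pairs for columns that no surviving step reads."""
    pruned = list(steps)
    changed = True
    while changed:  # repeat, since pruning one column can free up another
        changed = False
        for column in [s.deletes for s in pruned if s.deletes]:
            touching = [s for s in pruned if column in (s.creates, s.deletes)]
            rest = [s for s in pruned if column not in (s.creates, s.deletes)]
            created_here = any(s.creates == column for s in touching)
            still_read = any(column in s.reads for s in rest)
            if created_here and not still_read:
                pruned = rest
                changed = True
    return pruned


steps = [
    Step("df['total'] = df['price'] * df['qty']", creates="total", reads={"price", "qty"}),
    Step("df['tmp'] = df['price'] * 2", creates="tmp", reads={"price"}),
    Step("df = df.drop(columns=['tmp'])", deletes="tmp"),
]
for step in prune_deleted_columns(steps):
    print(step.code)
# df['total'] = df['price'] * df['qty']
```

If a filter that reads `tmp` sits between the create and the delete, the pass keeps all of those steps, since removing them would change the result.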

> I want a tool I can open and use right now, not one where I have to make a whole new Python environment and notebook and so on just to do a simple calculation

I totally agree with this! Even as the creator of Mito, if I have to do some quick ad-hoc analysis, I'll end up opening Excel instead of launching Jupyter and then Mito. We're looking into ways of improving this though! One idea is to create a command like mito <file path> that automatically launches your Jupyter server and opens the file in Mito. Another is to add support for JupyterLab Desktop so you can get closer to launching with the click of a button.

Lastly, I'd love to engage with you more about this since you clearly have a lot of interesting thoughts. If you want, reach out to me at aaron <@> sagacollab (dot) com.


I completely agree with your assessment of why spreadsheets fail. Completely unstructured data plus a mixture of exploration and analysis is a recipe for disaster.

> These complexities arise because Excel is super un-opinionated about what types of analyses make sense for a spreadsheet and how those analyses should be structured. Because Mito is designed specifically for working with tabular data through pandas dataframes, we're able to make design decisions that enforce a bit more structure into the analysis. 1) All data in Mito must be tabular -- it both preserves the structure of the spreadsheet and fits the ideals of pandas dataframes. 2) Every edit you apply in Mito applies to the entire column (or dataframe for ops like filter, sort, pivot, etc.).

I tend to agree with this too, though there are cases where either (1) or (2) may need to be relaxed. Personally, I think static type checking will also turn out to be useful for structure enforcement: it’s nice to have things like builtin support for units, or defining enumerations for categorical data, or even just making sure that each column has the same type of data throughout. (This is also why I’m uncomfortable with building a spreadsheet on Python, for all the advantages such an approach has.)
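
As a rough illustration of the three checks mentioned above, here is a small Python sketch. Python can only approximate this, either at run time as below or through an external checker, which is part of the concern. The names `Status`, `Metres` and `check_column` are made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Enumeration for a categorical column: only these values are allowed."""
    OPEN = "open"
    CLOSED = "closed"


@dataclass(frozen=True)
class Metres:
    """A length with its unit attached, so it cannot be mixed with bare numbers."""
    value: float

    def __add__(self, other):
        if not isinstance(other, Metres):
            return NotImplemented  # Python then raises TypeError for Metres + float
        return Metres(self.value + other.value)


def check_column(name, values, expected_type):
    """Raise TypeError unless every value in the column has the expected type."""
    for row, value in enumerate(values):
        if not isinstance(value, expected_type):
            raise TypeError(
                f"{name}[{row}] is {type(value).__name__}, "
                f"expected {expected_type.__name__}"
            )
    return values


check_column("status", [Status.OPEN, Status.CLOSED], Status)   # passes
print(Metres(1.5) + Metres(2.0))                               # Metres(value=3.5)
# check_column("status", [Status.OPEN, "closed"], Status) would raise TypeError
```

In a language with static types the same mistakes would be rejected before anything runs, rather than when the offending row is reached.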

> In practice, we see complexity explosion is the result of combining data exploration and analysis. In the exploration phase users apply temporary filters, column transformations, etc. But they don't want to take those transformations with them. What is exploratory and analysis work is often not known until after the analysis, so it's a hard problem to design for, but it's something we spend a lot of time talking about.

Making a UI good for both data exploration as well as more in-depth analysis is an interesting problem, and I’m not convinced we’ve found a good solution yet. Spreadsheets are good for the former, but not for the latter; programming is good for the latter, but not the former. Inserting a spreadsheet into a notebook interface seems a reasonable compromise, but I’m sure it’s possible to find something better and more tightly integrated.

> Lastly, I'd love to engage with you more about this

Sure, thanks! I’ll send you an email now.


Beautiful piece. A poem in parts.

I think spreadsheets are one of the most powerful mind-extensions people developed from the Information Age.

But as far as they can take you, it's worth remembering they're only extending our human minds, going in the direction we point them. They labor only for the narratives we set them to and fail according to our own misconceptions.


Totally agree. It's interesting to ponder where the distaste for spreadsheets comes from. You can make a mess in any programming language after all.

With conventional imperative programming languages there is a kind of "culture" of sorts, collectively owned by programmers. It presents a fairly difficult barrier to entry, and has been built up over time as people learn from mistakes. But it's interesting that this culture doesn't really "live" anywhere centralized - there is no actual guild of programmers with a book of rules, even if we feel like there is (or is that feeling all that is needed for it to exist?)

But spreadsheets don't really have this. This is a blessing and a curse. When you build something in a spreadsheet it's not going to be critiqued by anyone. But also no one is going to teach you the "right way". There is no "sheethub" to host your little project and get stars and merge requests, and get good and bad feels.

I spend a lot of time thinking about spreadsheets, but I never actually use them. I also spend a lot of time imagining what programming will look like in 100 years' time. I don't have anything to show from that process. Maybe I would sketch my thoughts in a spreadsheet so no one would critique them. But then I would never learn anything I guess.

I would like to see spreadsheets and regular programming languages reconcile somehow. I think both have a lot to learn from each other. Regular programming language culture has a lot of baggage that doesn't get questioned often. It's easy for programmers to look down on spreadsheeters, and yet the sheets keep coming.


The problem with spreadsheets is that they are a straitjacket for many of the users who don't realize that there are so often better ways of achieving their ends.


For people with a self-perception of not being programmers (in other words, most people) spreadsheets are often the easiest and most flexible way to leverage a computer's ability to automate whatever their novel task is. Without spreadsheets, they'd either need to learn how to code or hire a programmer. Hiring a programmer to solve your special problem is a high bar to clear, and most people will never become programmers for various reasons. If you took spreadsheets away, I think most spreadsheet users would begrudgingly go back to pencils and pocket calculators.


True. The levels of complexity taken on by spreadsheets are astounding. Couple this with how difficult it can be to own a database in a corporate context and the company ends up with hundreds of spreadsheets that must be maintained across shared drives along with buggy macros.


Which speaks to the problem not being with spreadsheets, but with their alternatives. You want a better database used by your employees? Don't make it a 6-month requisition process or a $10k/table/month ongoing cost for the team. Spreadsheets can be used now by the people that need to get work done, and many of the better options are barred to them by cost, accessibility, or training. Same for planning programs and workflow management.

Some systems (like JIRA, actually) are quite malleable, if you're permitted to make those changes and create custom workflows. But many corporations mandate a system like that but then lock it down so that the teams aren't able to use it effectively. The same happens with many other better alternatives to spreadsheets.


Another way to put it: Excel is the leading tool to author boring science fiction.


Sure, nightmares qualify as dreams.


That drawing reminds me of a C script I wrote to convert a PGM image into a spreadsheet (fods) (:

I can't find the actual script, but here are two images:

http://1507103400/krneki/stalin.fods http://1507103400/krneki/stalin.xlsx http://1507103400/krneki/lena.fods http://1507103400/krneki/lena.xlsx

XLSX versions are both less than 2 MiB, while FODS uncompressed XML plaintext are around 30 MiB (:
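
Since the original script is lost, here is a guess at how such a conversion could work, written in Python rather than C. It makes two simplifying assumptions that are not from the original: the input is an ASCII (P2) PGM, and the output is CSV (one cell per pixel) rather than FODS. The function names are made up.

```python
import csv
import io


def pgm_to_rows(pgm_text):
    """Parse an ASCII (P2) PGM image into rows of grey values."""
    tokens = []
    for line in pgm_text.splitlines():
        # Drop comments, which run from '#' to the end of the line.
        tokens.extend(line.split("#", 1)[0].split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("only ASCII PGM (magic number P2) is handled here")
    width, height, _maxval = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:]]
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the header")
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


def rows_to_csv(rows):
    """Write one spreadsheet row per image row, one cell per pixel."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


sample = "P2\n# tiny 3x2 test image\n3 2\n255\n0 128 255\n255 128 0\n"
print(rows_to_csv(pgm_to_rows(sample)), end="")
# 0,128,255
# 255,128,0
```

To actually see the picture, each cell would still need a background colour derived from its value, for example through conditional formatting in the spreadsheet program.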


Here's a 3D model rendered in a spreadsheet https://docs.google.com/spreadsheets/d/19Lg8icHa-F0NlGPC5W1H...


Discussed at the time:

Spreadsheets are dreams - https://news.ycombinator.com/item?id=10569155 - Nov 2015 (12 comments)


I'm not the biggest spreadsheet fan but they do make data easy to understand for people who don't understand data or programming. You can have a visual representation of the data itself with the mathematical formulas and results all on one page. They also make entering data easy. Being able to just whip up a spreadsheet in a few minutes to show calculations to someone who isn't tech or math savvy is useful.

That being said, for my own purposes I'd rather use a programming language + file/database representation of the data.


Spreadsheets are one of the most ubiquitous business tools in the world. Similar to email, another ubiquitous business tool, there are good uses and bad uses of the tool. Users need to know when to use them, when to not use them, and when to move onto something more robust.

> So they’re at their best when you have a foundation to build on — a decent number of fixed assumptions atop which you want to see the effect over time or scale of a limited number of variables. See the myriad permutations proliferate from a small number of questions.


    ...
    I'm adding A1 and the 7C
    Everybody's looking to SUM() things


Concrete poetry


In the spirit of dreams and poetry:

__________________

Spreadsheets are my desired cuisine. Excel the dish I adore.

But fill it up with too many ingredients, and you’ll find Excel on the floor.

It happens all the more frequent now, as my grocery store is now a warehouse

So I pound my fists and curse and yell, and leave Excel in the guesthouse.

Lost without Excel, in a ‘landscape where nothing officially exists’

I saw a single cell appear above the dune in a hazy mist

Running up the dune, the grid displayed with all its might

A new spreadsheet designed for my growing appetite.

__________________

I’m one of the founders of Mito [1] (and a very bad poet). As "Spreadsheets are Dreams" discusses, we believe that spreadsheets are the most powerful low-code tool because of their versatility to capture the simplicity or complexity of nearly any analysis.

As my poem tries to highlight, existing spreadsheet tools like Excel are not designed for today’s growing data sets. Today's spreadsheets should help users leverage their Excel skills while upgrading to a more robust and powerful environment like Python.

Mito is a spreadsheet extension to your Jupyter environment. You can display any pandas dataframe as a spreadsheet, and edit it in a very similar way to Excel. For each edit you make, Mito generates the corresponding Python code below. Practically, you can think about Mito as recording a macro, but instead of generating VBA code, it generates Python.

[1] https://www.trymito.io/



