Is Excel not a GUI? Photoshop? Unity? You can get pretty damn far with those too...

haZard_OS · on April 16, 2018

Can you? I have long been under the impression that Excel is to be avoided for all but the lightest, most cursory analyses.

From 2007:

http://people.umass.edu/evagold/excel.html

From 2013:

http://biostat.mc.vanderbilt.edu/wiki/Main/ExcelProblems

I could post more, but I would have to fire up my old spreadsheet ;)

CJefferson · on April 16, 2018

I opened your second link. The first issue is the classic "floating point numbers have rounding errors" problem, which, as far as I know, every system suffers from. That isn't just an Excel problem.
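To make that concrete, here is a minimal Python sketch (nothing Excel-specific) of the behavior any system built on binary floating point will show. The variable name `a` is only for illustration.

```python
from decimal import Decimal
import math

a = 0.1 + 0.2

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum
# is not bit-for-bit equal to the literal 0.3.
print(a == 0.3)              # False

# Decimal(float) shows the exact value that is actually stored for 0.1.
print(Decimal(0.1))

# Comparing with a tolerance is the usual workaround.
print(math.isclose(a, 0.3))  # True
```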

ska · on April 16, 2018

The problem is often related to floating point representation, that is true, but it's not correct to conclude "oh well, everything gets this wrong so I might as well use Excel".

One issue with Excel is that many of the built-in functions and statistical measures are implemented in numerically naive ways (and presumably remain so for reasons of computation speed and backwards compatibility), so if you want to do robust analysis you have to avoid them entirely, at which point you are far better off with a language designed for this. This is particularly an issue with larger data sets, where accumulation errors can become acute. Excel also introduces additional error terms due to binary encoding.

By the way: it is misleading to think of "rounding error problems". Far better to think about it as "rounding properties"/"truncation properties" and the like, then realize that you can't (in general) write floating point operations as if they were utilizing real numbers and expect correct behavior. That doesn't mean correct behavior is not achievable.
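As an illustration of "numerically naive" versus "written with floating point properties in mind", here is a small Python sketch comparing a one-pass textbook variance formula with a two-pass version. The function names and the data set are made up for the example, and this is not a claim about how any particular Excel function is implemented.

```python
def naive_variance(xs):
    """One-pass 'textbook' formula: (sum(x^2) - (sum x)^2 / n) / (n - 1).

    Algebraically correct for real numbers, but it subtracts two huge,
    nearly equal floats, so most significant digits cancel.
    """
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(xs):
    """Two passes: compute the mean first, then sum squared deviations.

    The deviations are small numbers, so no catastrophic cancellation.
    """
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Small spread on top of a large offset; the true sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print(naive_variance(data))     # far from 30 in double precision
print(two_pass_variance(data))  # 30.0
```

Both functions compute the same quantity on paper; only the second one respects how doubles round.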

dwaltrip · on April 16, 2018

I think Excel is a bit of both. It is pretty amazing what people do with Excel. It's probably one of the most effective software tools ever created.

It straddles this incredible balance between completely free-form input and structured data enabling very powerful functionality.

scroot · on April 16, 2018

There is a reason VisiCalc was such a big deal when it first came out. This is a UI that is both intuitive to regular people and also incredibly powerful -- and one that takes little effort to learn. It's a sign of the kinds of things that are possible with computing (reducing the gap between what today we call "coders" and "users"). There are few really large efforts towards research in this area these days and we are poorer for it.

Here is a good paper from Alan Kay that relates: https://frameworker.files.wordpress.com/2008/05/alan-kay-com...

cup-of-tea · on April 16, 2018

But there could be errors hiding anywhere. All you know is it looks correct.

dwaltrip · on April 16, 2018

Sure, but the cost to fix that issue, and the flexibility you have to give up to do so, is apparently not worth it most of the time, if we look at how people use the software.

Perhaps there is a systematic undervaluing of more polished, robust custom solutions by Excel users across the world. That could be the case.

But there is also probably an under-supply of adequate custom solutions. I have seen comments over the years on HN from people who have had much success in consulting gigs where they simply built custom tools to replace ad-hoc workflows and processes living in places like Excel.

lmm · on April 16, 2018

All existing business processes tolerate high error rates. They have to, because anything that people do by hand will necessarily have a high error rate. So when programmers come to automate an existing business process, they often vastly overestimate the value of correctness at least in the short term: if the program does the wrong thing even 1% of the time, it's really not a problem, because the processes around this process will be built to tolerate that.

In the long term, correctness may become more valuable. An analogy: when factories first switched to electric power, they simply connected electric motors to the existing driveshafts used to distribute steam power around the factory, and only realised small improvements in productivity that way. But once factories were more fully converted to electric power, it became possible to rearrange machines to suit workflows (rather than being arranged around the driveshafts) and this led to much bigger productivity gains.

cup-of-tea · on April 16, 2018

> Sure, but the cost to fix that issue, and the flexibility you have to give up to do so, is apparently not worth it most of the time, if we look at how people use the software.

That's not true. People just don't know any better. Look around you and you'll see many people using the wrong tools. It doesn't mean they've made a rational decision to use those, it usually means they're not aware of a better alternative.

coldtea · on April 16, 2018

You don't get any better assurances on the cli.

cup-of-tea · on April 16, 2018

What do you mean by "the cli"? Garbage in/garbage out always applies, that's not what I'm talking about. The point is errors in the code doing the processing. If it's written in a language, you can read and understand the entire thing. You cannot read and understand an entire spreadsheet.

coldtea · on April 16, 2018

>You cannot read and understand an entire spreadsheet.

Why wouldn't you?

Except if you mean the internal implementation (by MS). But nobody reads the implementation of Pandas or R either...

cup-of-tea · on April 16, 2018

I'm not talking about reading the underlying implementation, although that is, of course, another advantage of Python and R. I'm talking about reading the project. What's your process for reading a spreadsheet? Reading every single cell?

telchar · on April 16, 2018

That's a good point. In a block of 10,000 cells that should all have the same formula (offset by 1, say) how do you know that they all actually have that same formula? I'm not aware of a way to check that that doesn't involve writing custom VBA.
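Outside of Excel, one way to sketch such a check: export the formulas as text, rewrite every relative reference as an offset from the cell that holds it, and flag cells whose rewritten formula differs from the majority. The Python below is a rough illustration only. The `formulas` dict is hypothetical input (however you get the formula strings out of the workbook), and the regex handles only plain `A1`-style references: no sheet names, no ranges across sheets, and it would misread function names that end in digits.

```python
import re
from collections import Counter

# Simple A1-style references such as B2 or AA10. References anchored
# with $ (e.g. $A$1) do not match and are left untouched.
REF = re.compile(r"\b([A-Z]{1,3})([0-9]+)\b")


def col_to_num(col):
    """Convert a column label to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def relative_form(cell, formula):
    """Rewrite each reference as a row/column offset from `cell`."""
    m = REF.fullmatch(cell)
    host_col, host_row = col_to_num(m.group(1)), int(m.group(2))

    def repl(match):
        dc = col_to_num(match.group(1)) - host_col
        dr = int(match.group(2)) - host_row
        return f"R[{dr}]C[{dc}]"

    return REF.sub(repl, formula)


def odd_cells(formulas):
    """Return cells whose relative formula differs from the most common one."""
    forms = {cell: relative_form(cell, f) for cell, f in formulas.items()}
    majority, _ = Counter(forms.values()).most_common(1)[0]
    return sorted(cell for cell, form in forms.items() if form != majority)


# Hypothetical column of formulas; C3 points at B2 instead of B3.
formulas = {
    "C1": "=A1+B1",
    "C2": "=A2+B2",
    "C3": "=A3+B2",
    "C4": "=A4+B4",
}
print(odd_cells(formulas))  # ['C3']
```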

adrianN · on April 16, 2018

A spreadsheet doesn't offer you a linear story that you can read. It's a strange amalgamation of computation and program.

peterburkimsher · on April 16, 2018

I like iTunes 10 as a GUI.

In fact, I like it so much that I'm hoping to rewrite it in JavaScript so it can run in the browser. I also don't like what Apple did with iTunes 11 and 12.

1. SQL queries are Smart Playlists.

2. Summary statistics are shown on the bottom, and apply to the selection (if there is one), or the playlist (if nothing is selected).

3. It's object-oriented data. This is a valid AppleScript command:

tell application "iTunes" to return name of first track whose artist contains "Blink 182"

4. A browser view along the top to quickly see Genre, Artist, Album.

5. Nested playlist folders, including smart playlists.

6. Support for other data types. I wrote a script to generate m3u files to add virtual "Radio stations" to iTunes. Clicking those triggers a PHP script that usually opens my browser with a URL, but could technically do anything else. (OK, this is a hack - I'll document it if anyone wants to know).

Excel is fine, but it doesn't have the hierarchical structure that iTunes is so good at. It also mixes the data and the instructions.

bonniemuffin · on April 16, 2018

And let us not forget that PowerPoint is Turing complete!

http://www.andrew.cmu.edu/user/twildenh/PowerPointTM/Paper.p...

baldfat · on April 16, 2018

> Is Excel not a GUI?

Yes, and Excel is horrible for Data Science. You have hidden data and, to be honest, almost every complex spreadsheet has at least 2 errors in it. https://www.forbes.com/sites/salesforce/2014/09/13/sorry-spr...

OtterCoder · on April 16, 2018

Excel? Not in the traditional sense. It's more of a... spatial(?) programming language.