The big problem I've always had with "programming" in a spreadsheet is by nature everything is obfuscated and difficult to trace. Yes, you can inspect a cell and see what the source for that cell is, but that might be 10 other cells and you can only really review one cell at a time. It's like a programming language where you only see one line of code at a time. Worse, those references usually aren't named. What does "A1 + SomeOtherTab:B2" mean?
All of this really starts to fall apart when you have 10s of tabs with hundreds of rows of data which are often copy/ pasted. You won't even notice that some intern hard-coded one value into cell F75 until you actually drill down to that cell.
Spreadsheets are great until you hit a certain complexity, then they are unmanageable messes.
Excel offers the ability to name any cell or range of cells. Don't even have to search through menus or the ribbon, it's right there to the left of the formula bar.
I'm well aware that the vast majority of Excel spreadsheets don't use named cells/ranges, but you can't really blame Excel for that. It couldn't be too much easier. Lots of Python programmers don't use comments or descriptive variable names either.
> Excel offers the ability to name any cell or range of cells
Except almost nobody does. It's not intuitive or the way it's taught.
> Lots of Python programmers don't use comments or descriptive variable names either.
You have to have variable names in Python. If you want to give them shitty names, that's your bag, but unlike Excel, it's not an extra step.
You also have to deal with the fact that every cell in a range has its own unique formula. It's like you have a special function for each and every cell. You have nice conveniences for it like copy/ paste and dragging, but ultimately you are copying formulas all over the place. And it's super easy to update all but one of those when you make a change.
Yes, you can create custom functions, but much like named ranges, it's not the default behavior, takes extra steps, and it isn't the way Excel is taught.
Spreadsheets are amazing for small to moderately complex things, but beyond a certain point, they are just an unmanageable mess regardless of who creates them.
Interesting, my comment was solely in response to your complaint about naming convention... you've expanded your criticism quite a bit. I already anticipated your issue with naming being that few people used named cells/ranges -- again, it's not even in a menu or ribbon, it's present at all times, what else do you want? Not "the way it's taught"? Well, blame your teacher.
> You have to have variable names in Python
sure, but `(i, j, k)` isn't any more descriptive than A1 or B7 or CQ85759. `intOrderTotal` may seem better initially, until the summer intern creates `intOrderFinalTotal` (after tax) and `intOrderAllInFinalTotal` (after shipping and tax)
> every cell in a range has its own unique formula
you could use array formulas, or were you never taught those either?
> it's not the default behavior, takes extra steps,
Creating a Python virtual environment is not the default behavior, and it takes extra steps. So does using any packages beyond the standard library. So does source control. Or running Jupyter. Using classes, or type hints, or imports, are all not the "default" of one long script in a single file.
Excel isn't superior to Python, or the best tool to solve every type of problem. Excel has its place, Python has its place. But your specific little nitpicks here are a reflection of the user (who I presume is you) not on the tool itself.
If the code was stored in source control with a separation between dev and production the intern might be able to raise such a Pull Request but it would never get approved without oversight from someone more senior. This Code Review process is almost completely impossible in Excel.
... and then you have two problems. I have never seen an environment converted to SharePoint that didn't suffer badly from the conversion.
And the Excel vs <other language> source control issue isn't history, it's "go ahead and try to diff between two versions of an Excel sheet" vs "diff two versions of that source file".
Unless I've never heard of the tool that can digest two Excel sheets and tell you what formulas differ, or cells. Please correct me, anyone who knows of one.
> Interesting, my comment was solely in response to your complaint about naming convention... you've expanded your criticism quite a bit.
My point was always that spreadsheets are poorly structured for complex problems and that the logic is obfuscated. Just pointing out additional issues. Nor is my previous post exhaustive.
You are comparing worst case Python programming to best case spreadsheet designs. As soon as you compare a typical moderately complex Python program to a similarly complex spreadsheet, things fall apart.
You can F9 on "A1 + SomeOtherTab:B2" (just that selection) and it'll calculate for you. You can do this for all the dependencies under Trace Dependencies.
I admit it's not perfect but I have found it much easier historically to follow a calculation through Excel than through untested pandas code (people inner join and drop rows; they groupby and lose null groups plus related rows; they filter string data without case insensitive matching, etc.).
In Excel there's a toolbar button to toggle showing formulae rather than their results. I realise this doesn't counter your overall objection, but it does mean chasing down logic isn't quite as bad as having to select individual cells one at a time.
I mean... sure? There are a fair number of ways you can mitigate these issues, but the way spreadsheets are structured does not lend itself to structured/ well managed code.
I fully agree, things can get out of hand in spreadsheets. I've built my fair share of such spreadsheets, and as penance, I'm learning R.
The thing with Excel is there's a low barrier to entry, but there are a lot of differences between a great spreadsheet and a bad one. Somewhat like a junior vs senior developer, the quality of code/spreadsheet depends on what they know, how well they can troubleshoot, and how good of a system they can imagine (to then replicate as much as possible).
For example, most people are entirely unaware that Tables exist in Excel. When you want the sum of a column, rather than writing =SUM(F7:F39) and cursing when you realize you added 10 more rows and that's why the sum is not updating, you can do =SUM(tbl_Sales[SalePrice]), and when you add 10 rows, the table will automatically expand. Suddenly your formulas are somewhat self-documenting, regardless of which sheet holds tbl_Sales. Crtl+T when you've selected your data, or Insert -> Tables -> Table.
You can also make named ranges, which I would say is an analog between using {a, b, c, tempVar} versus well named variables in normal programming.
You can also trace dependents/precedents, showing arrows for how the data flows throughout the spreadsheet. Formula -> Trace Precedents/Dependents.
Funny comment in a Python discussion. No type enforcement, no requirement for class/object declarations, circular imports/dependencies allowed, threading/gevent/async messes, variable/class scope weakly enforced
Python is great for a lot of things, but the language is not a beacon of well-managed code. Good Python programmers write nice, easy-to-follow code, just as good Excel builders create very nice, easy-to-follow spreadsheets.
I mean... sure? I never said they did. I even specifically said I wasn't disagreeing with your overall point, in the vain hope of avoiding this redundant discussion. I was just correcting one specific inaccuracy.
I think there could be a middle ground with an excel like tool with some kind of typing. For example I often get confused about what currencies particular cells are, and mixing this up causes a lot of pain. Not sure why the hard coded number can’t just have a “$” and every time it gets multiplied by EUR/USD changes to a euro, and is displayed as such. This little thing would save me so much time.
My first job was cleaning up the mess that was caused by hard coded yahoo finance urls in spreadsheets. One of them died, no one noticed and it cost the company millions of dollars in bad trades over three months.
The guys handling millions of dollars should be trained in finance / accounting, so they aren’t untrained.
Handing over to IT usually means their flexible spreadsheet that they can change as they require, turns into an expensive and inflexible black box that only IT can change and that doesn’t integrate with the rest of their decisions. Also the new solution also probably has errors and pulls currency info from the same endpoint. Excel isn’t perfect, but it’s used for a reason.
To use your analogy, You can wait for an electrician to change your lightbulb, but that means your going to be working in the dark for longer.
logs of what? if a programmer hard-codes URLs, you really think they are following best practices elsewhere? further, how do logs help get back the millions of dollars that were lost?
LOL! as I already stated, a weak coder isn't going to implement best practices for logging. At best, he hard-coded his own email address to receive an alert, however, this also goes unnoticed, since, as you stated initially, he's dead.
Look, whatever company this was, was relying on YAHOO as its data source for making million+ dollar trades -- not Reuters, or Bloomberg, or JP Morgan, but YAHOO -- and then, for MONTHS -- not hours, or days, but MONTHS -- nobody in its finance department or trading desk or whatever happened to notice that the incoming data feeds were not matching up with quotes from counterparties, market makers, CNBC, colleagues, Wall St Journal? Does this company not have auditors? A CFO? Any IT oversight whatsoever? I'm sorry to say that this particular company's problems are rooted much, much deeper than the loss of a particular Excel guy.
All of this really starts to fall apart when you have 10s of tabs with hundreds of rows of data which are often copy/ pasted. You won't even notice that some intern hard-coded one value into cell F75 until you actually drill down to that cell.
Spreadsheets are great until you hit a certain complexity, then they are unmanageable messes.