How do you "strip the formulae out of the XML" then? You sort of hide a lot of complexity there. The Excel file format isn't complicated because it's stored as a zip file, it's complicated because it's complicated.
Again, the use case is straightforward. You have an already-solved accounting problem. But, say, last year's accounting solution is giving different answers than the one you just ran. So what changed? With the script in the linked article, the answer is trivially findable via the git history (and in larger software searchable via tools like "git bisect", etc.). And the reason for this is that source code is intended to be read by human beings, often including things like "comments" and "style guidelines" and "literate programming" to help the process. None of that exists for ad hoc GUI tools, and the result is that you can't meaningfully develop them as software.
That kind of task maps very poorly to "strip the formulae out of the XML, and diff them".
> "trivially findable via the git history (and in larger software searchable via tools like "git bisect", etc...)".
So put a pre-commit hook which dumps the spreadsheet as some kind of text file instead of a binary blob.
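A minimal sketch of what such a hook could call, assuming the file is a modern .xlsx (a zip of XML parts) and using only the Python standard library. The name `dump_formulas` is made up for illustration. It labels cells by the worksheet's part name inside the zip rather than the tab name the user sees, and it skips follower cells of shared formulas, which carry no formula text of their own:

```python
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts in an .xlsx file.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def dump_formulas(xlsx):
    """Return one text line per formula cell: '<part>!<cell>\\t=<formula>'.

    `xlsx` may be a path or a file-like object (anything zipfile accepts).
    """
    lines = []
    with zipfile.ZipFile(xlsx) as zf:
        # Worksheet parts live under xl/worksheets/; the .rels files there
        # do not end in .xml, so this filter leaves them out.
        parts = sorted(
            name for name in zf.namelist()
            if name.startswith("xl/worksheets/") and name.endswith(".xml")
        )
        for part in parts:
            root = ET.fromstring(zf.read(part))
            # Each <c r="B3"> is a cell; a formula, if any, is in a child <f>.
            for cell in root.iterfind(".//m:c", NS):
                formula = cell.find("m:f", NS)
                if formula is not None and formula.text:
                    lines.append(f"{part}!{cell.get('r')}\t={formula.text}")
    return lines
```

A hook script could run this over each staged .xlsx and write the lines to a text file committed next to it, so the history has something diffable.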
> "And the reason for this is that source code is intended to be read by human beings"
That's not "the reason"; that's an unrelated popular saying. If it were true, reverse engineering, maintenance, refactoring, and porting between different languages would be easy. They aren't. Instead, source code appears to be intended to be read by the compiler/interpreter the programmer is using, and if anyone else can make anything out of it, good luck to them.
> "often including things like "comments" and "style guidelines" and "literate programming" to help the process".
If you allow helpers external to the source code as part of the development process, that's good because it cuts off your incoming reply saying that a pre-commit hook writing the formulae from spreadsheet to text is too hard/too much work/unreasonable.
> "None of that exists for ad hoc GUI tools"
Ad hoc GUI tools aren't programmable. Notepad isn't programmable. Calculator isn't programmable. Excel isn't an ad hoc tool; Excel is one of the most famous, most used GUI tools on the planet, with one of the largest ecosystems and communities around it, and one of the most pluggable, scriptable, documented, standardised systems going.
> "and the result is that you can't meaningfully develop them as software".
Humans wrote Microsoft Excel itself, wrote Windows search indexers for searching inside Excel documents, wrote SharePoint which can index and work with Excel content, wrote the Microsoft Graph API and M365 cloud which can integrate with Excel spreadsheets, wrote the OpenOffice/LibreOffice Excel importers/exporters, wrote the ImportExcel module and the DLLs it's based on, rewrote Excel in TypeScript for Office365. Claiming that humans can't meaningfully write code to work with ... weird formats? Excel? spreadsheets? Files that were once touched by a GUI? ... is such a throwing-hands-up-and-giving-up-without-trying take on things. People could if they wanted to. People running $16Bn departments with Excel sheets could if they took it seriously and invested an appropriate amount of money in doing so.
And what happens if the author moved the formula that used to be in B3? How does your diff utility detect that?
I genuinely can't believe you're staking your argument on the idea that you can somehow track code deltas between versions of a Microsoft Excel spreadsheet when literally no one in the world does this.
B3 would be empty. This is the same as someone renaming any function or method and you asking "but how would the diff utility detect the rename?". The diff would show that the method used to exist and now doesn't. And it would show that the new method used not to exist and now does. The Excel-diff would show that B3 used to have a value and now doesn't. That B4 used to be blank and now has a value. It could show it in a rendering of the spreadsheet, even.
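To make that concrete, here is a rough sketch of such a diff, assuming each version of the sheet has already been reduced to a plain mapping of cell reference to formula text. The function name `diff_formulas` and the sample formulas are invented for illustration:

```python
def diff_formulas(old, new):
    """Compare two {cell: formula} mappings and describe what changed.

    '-' marks a cell that lost its formula, '+' a cell that gained one,
    '~' a cell whose formula text changed.
    """
    changes = []
    # Plain string sort: good enough for a sketch, though "B10" sorts before "B2".
    for cell in sorted(old.keys() | new.keys()):
        before, after = old.get(cell), new.get(cell)
        if before == after:
            continue
        if after is None:
            changes.append(f"- {cell}: ={before}")
        elif before is None:
            changes.append(f"+ {cell}: ={after}")
        else:
            changes.append(f"~ {cell}: ={before} -> ={after}")
    return changes


# Hypothetical sheet where the formula in B3 was moved down to B4.
old = {"B3": "SUM(A1:A2)", "C1": "B3*2"}
new = {"B4": "SUM(A1:A2)", "C1": "B4*2"}
for line in diff_formulas(old, new):
    print(line)
# - B3: =SUM(A1:A2)
# + B4: =SUM(A1:A2)
# ~ C1: =B3*2 -> =B4*2
```

Like a text diff of a renamed function, it reports a removal and an addition rather than a "move"; pairing them up would be an extra heuristic on top.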
Your argument is that it's not possible. My argument is that it is possible. Nobody is arguing the strawman positions that you claim I am arguing [it's a good idea, it's a way to build reliable software, everyone does it, etc.].
Well, at the risk of simply saying "why not" - why not? Preferably with an example.