Microsoft Excel. Manipulating tens or hundreds of thousands of rows, including cross-referencing across tables, is just so satisfying and much faster than doing it with a DB or code. Alternative spreadsheet software do some things better, but they don't come close as a complete package, I don't regret paying for it myself for the first time in my life. It such a life saver in a pinch.
Yeah, as someone who doesn't pay for MS, I'm basically using a combination of Calc and Gnumeric, because G can open larger files (handled larger files than Excel in fact), but Calc has more features, and it's more similar to Excel which I was used to. ONLYOFFICE's sheets can also be handy, if MS Excel compatibility is wanted.
So useful I like it despite how it always goes out of its way to waste my time by trying to misrepresent a number of different types of strings and numbers as American dates :-|
100% in agreement on Excel. Even when coding in Python I frequently save an intermediate file as xlsx to explore/debug, or even load into Tableau for viz.
The ability create a relational database in Excel with vlookups and hlookups, then capture it all into a macro is amazing.
I've really enjoyed using Excel as a Postgres frontend, with a real Postgres DB instance handling data, and then using the report functionality to dump to Word.
While a pro reporting engine and cutting out MS Office altogether would be a better longterm solution, it is hard to beat for quick & dirty results.
> The ability create a relational database in Excel with vlookups and hlookups,
Do yourself a favour and ditch vlookup and hlookup in favour of the recently introduced xlookup, which even obsoletes index/match !
I try to keep my exploratory joins out of Excel, but I admit that I often don't resist the immediacy of Excel's poor man's joins located right where I need them.
With just the tidyverse library (which includes dplyr), R can be very useful in a data analysis pipeline. It is great for data cleaning and aggregation, especially when a process needs to be done multiple times. It is much faster than excel/power query. I am an accountant in SaaS and spend a lot of my day waiting for excel/powerbi automations to refresh. Similar solutions in R/sql/python would be nearly instant. Also excel/powerbi automations are a bitch to troubleshoot, and are unnecessarily complex.
When following tidy principles, a framework designed by the tidyverse dev Hadley Wickham, R code can be very easy to interpret, similar to SQL. Additionally the R community has made libraries for everything, and I consider R a great general purpose language as well.
Note that different flavours of R have very different performance. 'Base R' is quite slow. But R + data.table is blinding fast. Power Query perforamnce is awful, even compared to base R. Some benchmarks of these plus other data wrangling software (including my own product) at:
I got to play around with Tableau when I was helping my wife in a collage programming course and though I don't have a current use to justify the significant cost, I must say that the tool was amazingly flexible and easy to use. I'd highly recommend it.
I work at a shop where a few people are JMP users. I think it's a problem when a proprietary app is only available to a handful of users at a site -- they tend to get a lot of work dumped on them, and nobody can use their stuff without going back to them. It's a different experience when a site has a site license.
This isn't specific to JMP of course. A good thing about Excel is that businesses are OK with just paying for "everybody" to have it.
Also, people use Excel for more than analysis. It's also a crude database and platform for creating small "apps" that do things, that are easy to share with others.
Yeah, just yesterday. Hacky output from `nslookup`, not ideal but it’s all we had. Tens of thousands of rows. No IDE with regex capabilities available.
Excel to the rescue! Took me about five minutes to extract exactly what I needed.
I know I should learn it but, I dunno, I just really dislike the PowerShell syntax. I get what they were going for with the Verb-Noun thing but it just reminds me of AppleScript and I can’t stand it.
Often useful CLI tools are linked on HN (like in this thread). And probably if you take the time to learn them or if you integrate them into your automation they are 100% useful.
But, given the problem “I have some unfortunately formatted data which I need to analyze”, there is no better solution than sanitizing it in VS Code and then analyzing it in Excel.
I don't have a Windows machine, and I'm having mixed feeling about Excel for Mac — is it as powerful as the Windows version? Are they exactly the same or are there big difference in functionality? The web / Office360 version doesn't seem much better, either though.
Obligatory Spolsky intro to Excel: https://www.youtube.com/watch?v=0nbkaYsR94c