
Yes, and? It should be a simple checklist item for any scientific journal to ask the author whether they used Excel. If they say yes, reject the paper unless they can show their work hasn't been affected negatively by using Excel.



On the practical side, how would people show that their work hasn't been affected negatively by using Excel? How would the journals evaluate that? I suspect this would just become another box to check - yes, we used Excel; yes, we checked the results - just like the last 50 times.

Intellectually, it feels snobbish to single out Excel like this. I'm a software engineer in science, and I generally agree that scientists should learn some kind of coding. But you can make mistakes in Python or R as well - not to mention in physical experiments. We should check data and methods in general, not presume incompetence if people use one tool.


By providing a repo with the raw data and the code that runs on it to produce the results in the manuscript. Anything else is just a bunch of handwaving.
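
To make that concrete, here's a minimal sketch of what such a repo's entry point could look like (the file layout and column names are made up for illustration): a single script that reads the untouched raw data and regenerates the table cited in the manuscript, so a reviewer can rerun the whole thing.

    # analyze.py - regenerate results/ from data/raw/ (hypothetical layout)
    from pathlib import Path
    import pandas as pd

    Path("results").mkdir(exist_ok=True)

    # Pin the gene column to plain text so symbols are never coerced to dates or numbers.
    counts = pd.read_csv("data/raw/counts.csv", dtype={"gene": str})

    # Example analysis step: mean expression per gene across all samples.
    summary = (
        counts.groupby("gene")["expression"]
              .mean()
              .reset_index(name="mean_expression")
    )

    # The exact table that appears in the manuscript; reviewers rerun `python analyze.py`.
    summary.to_csv("results/table1.csv", index=False)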

It's not coincidental that big evil FDA/pharma requires people to sign - on paper - their lab notes (and to have lab notes, and to sign that they calibrated the equipment, and that they checked for common mistakes).

And yes, I know this costs a lot, and that this is sort of a pipe dream. And I'm not saying this is the best solution. Renaming genes might be a lot better, because it just allows the field to move on, and Excel will eventually die out, or at least researchers will move away from it - maybe to Jupyter, if we're lucky.

So, all in all, of course Excel is just a symptom, but society is already pretty stubborn when it comes to financing R&D and progress.


That doesn't solve the problem downstream, when a non-expert gets started, doesn't know about Excel's anti-features, and does their analysis in the only tool the world has ever told them is acceptable for tabular data.

Ideally Excel would change, but since we know it won't, and we want to work with lots of people with minimal problems, we must adapt.


Why should Excel change? Most people who type "march1" into a cell mean March 1st.

Researchers should use other, more appropriate tools, or at the very least specify the column type when importing data into Excel. It's not that hard.
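
For anyone doing the import in code rather than through Excel's import wizard, the same principle holds. A minimal pandas sketch (the file and column names are assumptions) that pins the identifier column to text so it never goes through type inference:

    import pandas as pd

    # Declaring the dtype up front is the scripted equivalent of choosing "Text"
    # in Excel's import wizard: identifiers stay exactly as written - no date
    # parsing, no dropped leading zeros, no scientific notation.
    genes = pd.read_csv("gene_table.csv", dtype={"gene_symbol": str})
    print(genes["gene_symbol"].head())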


Do they mean March 1 when they import data from a text file?

It's one thing to reformat text a user types in real time - that's not causing any problems. It's another to silently mutate cells scattered among tens of thousands of imported rows. I don't think that has ever helped anyone.


How are these non-experts getting started? If they are in academia, academia should start teaching these practical things too. (Yes, I know the problem is that many old school bigwig researchers are doing even worse things.)


There are people of all sorts: amateurs who want to play with the data but will not make studying it their primary field, trainees who will work with tabular data extensively throughout their careers, and experts in more rarefied areas (e.g. immunologists, clinicians).

If I can get a trained immunologist looking at my data, I'd much rather have 5 more minutes of their analytical skills than spend that time teaching them about common data-exchange pitfalls.


Yeah, but then you'll just make them use SAS, and nobody wants that. Just try convincing anyone who graduated two decades ago to use something reasonable like R.



