As someone who has worked with bits of scientific code: "Does the code you write right now work on another machine?" might be the more appropriate challenge. I've seen a lot of hardcoded paths, unmentioned dependencies, and monkey-patched libraries downloaded from somewhere; just getting the code to run at all is hard enough. And let's not even begin to talk about versioning or magic numbers.
Similar to other comments, I don't mean to fault scientists for this - their job is not coding, and some of the dependencies come from earlier papers or proprietary cluster setups and are therefore hard to avoid - but the situation is not good.
To me, that's like a theoretical physicist saying "My job is not to do mathematics" when asked for a derivation of a formula he put in the paper.
Or an experimental physicist saying "My job is not mechanical engineering" when asked for details of their lab equipment (almost all of which is typically custom built for the experiment).
On one hand, yes. But on the other hand, reusable code, dependency management, linting, portability etc. are not easy problems, and they're something junior developers tend to struggle with (and it's not like the problem never pops up for seniors, either). I really can't fault non-compsci scientists for not handling them well. Of course, part of it (like publishing the relevant code) is far easier and should be done, but some aspects are really hard.
IMO the incentive problem in science (basically number of papers and new results is what counts) also plays into this, as investing tons of time in your code gives you hardly any reward.
> But on the other hand, reusable code, dependency management, linting, portability etc. are not easy problems and something junior developers tend to struggle with
On the original hand, these are easier problems than all the years of math education they have. Once you're relying on simulations to get results to explain natural phenomena, it needs to be put on the same pedestal as mathematics.
There are tons of tutorials on using conda for dependency management, it's not rocket science. And using a linter is difficult? If a scientist needs to read and write code as part of their job then they should learn the basics of programming - that includes tools and 'best practices'.
The point is that as a scientist your code is a tool to get the job done, not the product. I can't spend 48 hours writing unit tests for my library (even though I want to) if it's not going to give me results. It's literally not my job and not an efficient use of my time.
If the code you base your work on is horrible it definitely makes me question your results. That's why it's called the reproducibility crisis.
Writing some tests, using a linter, commenting your code, and learning about best programming practices doesn't take long and pays off - even for yourself while writing the code, or when you need to touch it again. "48 hours writing unit tests" is a ridiculous comparison.
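To make the "doesn't take long" point concrete: a unit test for a small analysis helper can be a handful of lines. The `rescale` function and its test below are a hypothetical sketch, not taken from any particular codebase.

```python
def rescale(values, lo=0.0, hi=1.0):
    """Linearly map a sequence of numbers onto the interval [lo, hi]."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    return [lo + (hi - lo) * (v - vmin) / span for v in values]

def test_rescale_endpoints():
    # The smallest input should map to lo, the largest to hi.
    out = rescale([2.0, 5.0, 8.0])
    assert out[0] == 0.0
    assert out[-1] == 1.0
    assert out[1] == 0.5

test_rescale_endpoints()
```

A runner like pytest would pick up `test_` functions automatically, but even calling them by hand at the bottom of a script catches regressions the next time the code is touched.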
This is the same as any other argument against testing. Unless you are actually selling a library, code is not the product. Customers are buying results, not your code base. Yet, we've discovered the importance of testing to make sure customers get the right results without issues.
If you want your results to be usable by others, the quality of the code matters. If all you care about is publishing a paper, then I guess sure, it doesn't matter if anyone else can build off your work.
But the results are usable by others, in most fields of science the code is not part of these results and is not needed to enjoy, use and build upon the research results.
The only case where the code would be used (which is a valid reason why it should be available somehow) is to check whether your particular results are flawed or fraudulent; otherwise the quality of the code (or its availability, or even its existence - perhaps you could have had a bunch of people do all of it on paper without any code) is simply irrelevant to making your results usable by others.
> The only case where the code would be used (which is a valid reason why it should be available somehow) is to assert that your particular results are flawed or fraudulent;
Not true. Code is often used and reused to churn out many more results than the initial paper. A flaw in the code doesn't just show one paper/result to be problematic; it can show a large chunk of a researcher's work in their area of expertise to be problematic.
> The point is that as a scientist your code is a tool to get the job done and not the product.
Everything you say is just as true of experimental equipment and mathematical tools. Physicists are fantastic at mathematics, yet are among the most anti-math people I know - in the sense of "Mathematics is just a tool to get results that explain nature! Doing mathematics for its own sake is a waste of time!"
The equation is not the product - the explanation of physical phenomena is. If the attitude of "I don't need to show how I got this equation" is unacceptable, the same should go for code.
>Yeah, we built it with duct tape and there's hot glue holding the important bits that kept falling off. Don't put anything metal in there - we use it as a tea heater, but there's 1000A running through it, so it shoots spoons out when we turn the main machine on.
Lots of people are saying it is the scientist's job to produce reproducible code. It is, and the benefits of reproducible code are many. I have been a big proponent of it in my own work.
But not with the current mess of software frameworks. If I am to produce reproducible scientific code, I need an idiot-proof method of doing it. Yes, I can put in the 50-100 hours to learn how [1], but guess what: in about 3-5 years a lot of that knowledge will be outdated. People are comparing it with math, but the math proofs I produce will still be readable and understandable a century from now.
Regularly used scientific computing frameworks like MATLAB, R, the Python ecosystem, and Mathematica need a dumb, guided method of producing releasable and reproducible code. I want to click through a bunch of Next buttons that help me fix the problems you indicate, and finally release a final version that has all the information necessary for someone else to reproduce the results.
[1] I have. I would put myself in the 90th percentile of physicists familiar with best practices for coding. I speak for the 50th percentile.
(1) Use a package manager that stores hash sums in a lock file.
(2) Install your dependencies from the lock file as the spec.
(3) Do not trust version numbers; trust hash sums. Do not believe in "But I set the version number!"
(4) Do not rely on download URLs. Again, trust hash sums, not URLs.
(5) Hash sums!!!
(6) Wherever there is randomness, as in random number generators, use a seed. If the interface does not allow specifying the seed, throw the trash away and use another generator. Be careful when concurrency is involved; it can destroy reproducibility. For example, this was the case with TensorFlow - not sure it still is.
(7) Use a version control system.
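Points (4) through (6) can be sketched in a few lines of Python using only the standard library; the file name and the commented-out expected hash below are placeholders, not real artifacts:

```python
import hashlib
import random

def sha256_of(path):
    """Hash a file in chunks so large artifacts don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# (4)/(5): trust the hash, not the URL the file came from.
# expected = "..."  # the hash you recorded when you first fetched data.bin
# assert sha256_of("data.bin") == expected, "artifact changed under you!"

# (6): seed the generator so a rerun draws exactly the same numbers.
rng = random.Random(42)
draws = [rng.random() for _ in range(3)]
rerun = random.Random(42)
assert draws == [rerun.random() for _ in range(3)]
```

The same idea carries over to NumPy (`numpy.random.default_rng(seed)`) and most other scientific libraries that expose a seedable generator.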
> in about 3-5 years a lot of that knowledge will be outdated
Yup, and most of the points you mentioned will probably not be outdated for quite a while. Every package manager I'm aware of can still consume lock files that old today.
Definitely makes you question it more. Does the paper not explain the contents of the MATLAB code? That's all that is usually needed for reproducibility. You should be able to get the same results no matter who writes the code to do what is explained in their methods.
Of course, I have no idea about the paper you're talking about and just want to say that reproducibility isn't dependent on releasing code. There could even be a case where it's better if someone reproduces a result without having been biased by someone else's code.
I think the idea that scientific code should be judged by the same standards as production code is a bit unfair. The point when the code works for the first time is when an industry programmer starts to refactor it - because he expects to use and work on it in the future. The point when the code works for the first time is when a scientist abandons it - because it has fulfilled its purpose. This is why the quality is lower: lots of scientific code is a first iteration that never got a second.
(Of course, not all scientific code is discardable; large quantities of reusable code are reused every day - we have many frameworks, and the code quality of those is completely different.)
But it often is. For most non-CS papers I've read (mostly biosciences), there are specific authors whose contribution was, to a large degree, the coding.