I agree with everything you wrote. Software development is a mess.
On git specifically: git is hard for everyone, and anyone who says otherwise is either lying or doesn't know what they don't know. I said as much at the Software Carpentry classes I helped with. The lead scientist trainer didn't really like me saying that, but it is the truth.
Humans have a fixed decision budget, we can only handle so much complexity before our brain turns off. Do we want to spend that solving domain problems or on huge software stacks?
I guess this is where the folks like us could do a better job to make simple, obvious ways to do things. There is a lot of value in standardization and a reduced amount of choice. Hobgoblins and all.
I knew there was a lot of complexity in our tech stacks; trying to bootstrap smart people from basically nothing made it even more apparent. But I don't think the answer is to get scientists out of writing code. Code is the mirror to science as science is the mirror to our understanding. When used correctly, it rigorously and formally closes the loop.
For all the problems that Python notebooks have, I think they make an excellent way to convey ideas, but they need history and a way to make them reproducible.
Apologies for the mistaken identity, I thought you were a different seagull.
> Humans have a fixed decision budget, we can only handle so much complexity before our brain turns off. Do we want to spend that solving domain problems or on huge software stacks?
Bingo.
Sounds like we are fairly in agreement. Scientists sometimes have to write code, there's no way around it. But I wish there was more help for all the ancillary stuff.
I want scientists working on their simulation, not debugging why their docs build pipeline fails once in a while, or investigating warnings about needing to upgrade their chosen distro in the CI pipeline.
> There is a lot of value in standardization and a reduced amount of choice.
The only issue here is today's standard is tomorrow's "old-fashioned way" - and I'm only half-serious about "tomorrow". So even if they get everything set up the standard way, they have to know enough at some point to migrate it. I still don't have a good answer for that.
What do you think about having a versioned "science stack" for various disciplines? Genomics, Neuroscience, Astronomy, Condensed Matter, Materials Science, etc. It would be somewhat prescriptive, but if you stay on the ~~rails~~ (not a pun, not intended) happy path, then a lot of your needs are met and a lot of your choices are made for you.
Everything is containerized. All the batteries are included. Each commit also records all the tests to assist with any upgrades that might occur over the life of the project.
The stacks would be versioned, but also include a mechanism to update individual components w/o breaking the whole system. The root-node containers would have source, and one should be able to "rebuild world", so we aren't relying on cached packages from a distro that is long dead. I am trying to revive some software that was last installable on Debian 4.
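To make the "versioned stack plus individual component updates" idea a bit more concrete, here is a rough sketch. Everything in it is hypothetical: the manifest layout, the component names, the version numbers, and the placeholder hashes are made up for illustration, not taken from any existing tool. The idea is that a stack is a pinned manifest, an override produces a new manifest instead of mutating the old one, and a digest of the manifest is what each commit could record.

```python
import copy
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """Return a stable SHA-256 digest of a manifest, using canonical JSON."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def override_component(manifest: dict, name: str, version: str, source_sha256: str) -> dict:
    """Return a copy of the manifest with one component re-pinned.

    The original manifest is left untouched, so the base stack stays reproducible.
    """
    if name not in manifest["components"]:
        raise KeyError(f"unknown component: {name}")
    updated = copy.deepcopy(manifest)
    updated["components"][name] = {"version": version, "source_sha256": source_sha256}
    # Track which components deviate from the released stack.
    updated["overrides"] = sorted(set(updated.get("overrides", [])) | {name})
    return updated


# Hypothetical stack; names, versions, and hashes are placeholders.
BASE_STACK = {
    "stack": "example-discipline",
    "version": "1.0",
    "components": {
        "solver": {"version": "2.3.0", "source_sha256": "placeholder-a"},
        "plotting": {"version": "0.9.1", "source_sha256": "placeholder-b"},
    },
}

patched = override_component(BASE_STACK, "solver", "2.3.1", "placeholder-c")
print(manifest_digest(BASE_STACK))
print(manifest_digest(patched))
```

The two digests differ, so a commit that records the digest also records exactly which stack (and which overrides) the tests ran against.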
The whole workflow's ultimate product is a paper and an executable notebook, packaged so that anyone can easily reproduce them. The full software stack is rebuildable from scratch using only source (like a FreeBSD build world). One should be able to take a paper from an archive 50 years from now and recreate a bit-for-bit copy using one command.
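The "bit-for-bit" part is the easy bit to check, at least. A minimal sketch, assuming the archive stores a SHA-256 digest next to the paper (the function names and the idea of a recorded digest are my assumptions, not an existing convention):

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts don't need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bit_for_bit(rebuilt_path, recorded_digest: str) -> bool:
    """True only if the rebuilt artifact matches the archived digest exactly."""
    return file_sha256(rebuilt_path) == recorded_digest.lower()
```

The hard part is everything upstream of this check: timestamps, build paths, and unordered outputs all have to be pinned down before two builds can hash the same.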