In the age of GitHub/GitLab, federated database systems and search, social media, and many other advances, we need to move off the old model of publishing research. I offer to the HN community that modern research 'publishing' should include:
- providing all raw data
- providing a timestamped electronic lab notebook that documents what you did and when you did it, if applicable.
- working code for your data analysis and all figures/graphs.
- a write-up of your work that can be commented on by the public (questions, critique, supporting information, etc.). Markdown or LaTeX rendered online would be great. The ability to continue updating the same 'publication' in place with new information would be really helpful; for one, this would make review articles much more useful over a longer duration (if they were maintained regularly).
- supporting pictures, video, audio, e.g. a video walk-through of your experimental setup.
- a means for other folks in the field of study to endorse or critique your work, like weighted "GitHub stars".
We need a GitHub for science. Benefits:
- free access for everyone. More people having access and participating will hopefully increase the pace of development.
- better feedback over time on the quality of research, as each 'article' or body of research would have issues raised, people commenting on whether or not they were able to duplicate the findings, revisions over time to deal with comments/issues, etc.
- Potentially easier to search and find relevant research through knowledge graphs of inter-related work, meta tags, and the "awesome X" type of publications/sites that would pop up to aggregate solid research in a particular area of study.
- Low or no barrier to publishing research; scientific community bias will not stop work from being published.
- will encourage and make it easy to publish null results, i.e. "I tried this... didn't work? Here's why I think it didn't work... anyone got ideas?" Null results are very important and seldom published.
- may reduce fraud. No Science or Nature... just publish your work; if it is awesome, it will get lots of views and stars. Requiring all data and code to be available will limit the ability to get away with stuff.
Exploitative publishing is not a technical problem. Publishing a journal isn't very hard, particularly in the academic world, where writing stuff and organizing committees is something people do all the time. Tons of artifacts are already available on various software and data repositories, either public or institutional.
The problem is that, through some hoops of history, a couple of companies quite firmly hold political control of important (and less important) journals and publications. To attack this there is no other way than to organize and move the journals to an alternative administrative "host" entity (university, association, ...) or set up a new one. That takes time, lots of emails, a solid reputation, and some luck in convincing the people who have a say. And it's not like the academic world is doing particularly well currently, with everybody having tons of time to bootstrap such things.
Just calling for change and starting your "GitHub for research" won't do anything (Semantic Scholar exists, and tons of startups exist in this space). Anyone a bit serious who studies what's happening for 1h can come up with tons of deficiencies. The question to ask is why it stays in this obviously deficient state: corporate political control, time, support from employers, ...
In fields where there aren't too many adversarial players (I'm talking about big industrial companies), like math or compsci, things are kind of ok already. It's in biology and medicine where it's the worst, obviously because those fields are much more penetrated by industry and, more generally, by economic goals. At some point you gotta be blind not to see that the ones dragging us behind (or more generally holding the levers) are the guys with the big bucks.
GitHub, the company with the closed-source platform, training Copilot on their users' hosted code against license terms. That's to be our guiding star?