
Well, one can argue here that a pre-print is not of the same quality as a published work. Not that published works cannot have errors (and be retracted), but at least they have passed a first level of scrutiny from an editorial board.



Yes, but passing through the editorial board takes time. In some fields research moves extremely fast, and by the time a paper is properly published it's already obsolete.


That's the price of scrutiny, man. Time. And these "extremely fast fields" are the same fields that need to backtrack and retract their claims. The loss of time, effort, and resources for other scientists who build on pre-printed claims and data from faulty work can be enormous, and it is probably already wildly underestimated.


The process is significantly slower than checking the work requires. I'm fairly sure my wife's papers didn't really take nearly two years to review.

Machine learning is a field that's moving quickly, and its results can largely be checked quickly by other people in the field. The problems that can't be checked easily are also those that won't be checked by the journals anyway (no journal I know of will retrain one of Google's deep nets to see if they get the same result before publishing).


If you think peer-reviewing is slow then it's probably because you don't get the PEER part of the reviewing process. The PEERS are simply colleagues who have their own lives, priorities, and research. They don't stop what they are doing to review your wife's papers.

And please, I see what's happening here: I have been portrayed as the "traditional journal apologist", although that's definitely not what I wanted here. I simply wanted to express my doubts that pre-printing is the answer to it all.

Anyway, is there a study that shows how many pre-printed articles in ML have been retracted or refuted so far? Until this is done, and the rate is shown to be as small as in peer-reviewed journals, keep a small basket.


> If you think peer-reviewing is slow then it's probably because you don't get the PEER part of the reviewing process.

I understand the process fully. That doesn't make it fast, nor does it make it a necessary cost to pay. Delaying access to content for several years does not solve a problem. A not-insignificant amount of time was spent bouncing between people to sort out who was paying the costs, and then there is also a delay between acceptance and publication. This now averages just a month on PubMed, but papers can bounce around at this point for a lot longer.

I am not arguing for pre-prints to replace traditional publishing, but information undeniably spreads faster through them, and that's what started this whole chain of comments.

> Anyway, is there a study that shows how many pre-printed articles in ML have been retracted or refuted so far? Until this is done, and the rate is shown to be as small as in peer-reviewed journals, keep a small basket.

I don't get the phrase "keep a small basket", but no, I've not seen such a study. The point is simply that PEERS (if we need to shout the word) can in many cases replicate the work and assess the results much more quickly than the traditional review-and-edit process. I picked the field because I see people re-implement work described in the papers very quickly, or the original authors share the trained models and code.

I'd also caution against using "retracted" papers as a measure; some journals charge for a retraction.


[flagged]


> You don't understand a thing, and that was the proof of it, at least for me. The process of peers evaluating a paper takes a lot of time because it cannot be automated and is serious, especially for the better journals. Of course bouncing people (referees) is part of the process, to find the best and/or most available ones.

The time of the actual reviewing does not alter either of the two other time sinks that I posted. The median acceptance-to-publication time for the Journal of Clinical Neuroscience is over three months, and at other journals it stretches to over a year. [0]

> Of course you can cut corners and pre-print,

Preprints are not an alternative to publication. They are something you can do before publication. Hence the name.

> Pre-printing might be the case in fields like CS and its subfields where verification is a very quick thing, but totally unsuitable in fields such as biology or medicine

There is absolutely nothing unsuitable about releasing your work early in any field. There is a problem in assuming un-vetted work is vetted, but preprints don't make any claim to have been vetted.

[0] http://www.nature.com/news/long-wait-for-publication-plagues...


"The time of the actual reviewing does not alter either of the two other time sinks that I posted. The median post acceptance to publication time for the journal of clinical neuroscience is over three months, and other journals head over a year."

Hey, you started by claiming two years; now it's three months. Three months is perfectly acceptable for quality peer review. For good papers from experienced authors (who know what criticism to expect) the wait might even be shorter. High-quality publishing demands this.

"Preprints are not an alternative to publication. They are something you can do before publication. Hence the name."

Yeah, but the pre-printed paper is almost never retracted, even if it has been completely revamped for the final publication after corrections through peer review.

This does science a disservice, because in the meantime many scientists might have used the wrong data/methods found in the pre-printed version. That's the price of speedy publishing. In some fields (biology, medicine) that price is very high.

"There is absolutely nothing unsuitable about releasing your work early in any field. There is a problem in assuming un-vetted work is vetted, but preprints don't make any claim to have been vetted."

Here we go again. I never said it was unsuitable. I myself sometimes pre-print. I am just trying to explain to you that pre-prints are not the silver bullet you imagine for high-quality dissemination of scientific advances.

Yes, they spread fast and with no control, but their quality leaves much to be desired. That's why it is probably prudent for readers to accept pre-prints only from established scientists who are known for their work ethos (by past results) and who will most probably submit their work to peer-reviewed journals as well.

Last but not least, everyone should be cautious about what they read in pre-prints, especially in "slow" fields.


> Hey, you started by claiming two years; now it's three months. Three months is perfectly acceptable for quality peer review. For good papers from experienced authors (who know what criticism to expect) the wait might even be shorter. High-quality publishing demands this.

You're not interpreting this correctly. This is not the time for review; this is after the paper has been reviewed and accepted. The journal has agreed to publish the paper, but there is still a significant delay before anyone actually gets to read it. This is why I'm saying it doesn't take two years to review the papers. The review won't have taken that long, but it still took that long between submission and the time others could actually read it.


This happens because you forget there is a queue, a pipeline so to speak. Papers pile up, especially at highly desirable journals that command many eyeballs.

If you do not want that, choose another journal. Most new ones can guarantee a maximum turnaround time. All things considered, things are getting better here, but you still get that fundamental step of peer review, which is the main difference between traditional journals and pre-print servers. In the latter you are essentially on your own.


That's not necessarily true across all "fast fields". There is definitely selection pressure against "moving fast" in fields sensitive to faulty experimentation. Computer science is one of the bigger subjects on arXiv; retraction rates within the discipline are extremely low, and the field as a whole moves very rapidly. Having access to preprints makes it much more scalable for researchers to stay at the cutting edge.


Look, you are free to abstain from using pre-printed articles for your research if you do have doubts. Why start a crusade though?

arXiv is working very well for a lot of scientists in machine learning, since most results can be double-checked by simply running the code. I won't disregard a proven approach that I've seen working myself just because it hasn't been published yet.


> In some fields research moves extremely fast, and by the time a paper is properly published it's already obsolete.

Could you provide some examples for a layman? To me it always seems that science moves far slower than one could hope for (e.g. every "breakthrough" in batteries/graphene of the last ten years, and still no products).


Think of it in terms of the speed of the field relative to the size of a single paper's contribution: you aren't going to see no-graphene to industrial-graphene in a month, but you might see Johnson's 8% graphene be replaced by Smith's 8.01% graphene.


Hmm, actually, in my experience the versions of papers uploaded to preprint servers are the final versions, including reviewers' comments. This is in cryptography, but I think the same can be said of complexity theory.


It is pretty controversial. The value of peer review has never really been demonstrated, and despite peer review, the median scientific paper is wrong.


> the median scientific paper is wrong

What does "wrong" mean in this context? Does it mean that most scientific papers will be shown to be inaccurate in due course? That is bound to happen given the nature of scientific progress.

But peer review aims to assess the methodology and rigour of the papers being presented. I agree that it is often debatable whether this happens in reality, but that doesn't explain your statement I quoted above.


How do you want this value to be demonstrated? Peer review is exactly this: peers evaluating the methodology and results of your experiment or idea. If this is faulty, imagine how much faultier an attempt to pre-print is without even passing that step, which, mind you, is pretty important for major journals such as Nature or Science.

Peer review has a lot of weak points, but saying that pre-printing is the answer to them all is plainly wrong.


It doesn't seem that anyone is saying we need to get rid of the (admittedly valuable and proven) peer-review model of publishing in journals, or that pre-printing is uniformly better for all use-cases, especially those that peer review excels at.

It's merely a supplement to peer-reviewed journals that has some nice characteristics, for some use-cases, which has been beneficial, to some researchers, in some fields.


>the median scientific paper is wrong.

Are you being literal about that or is that just a figure of speech?

Could you please provide a source for those claims, if true?


I assume he's alluding to Ioannidis's 2005 paper "Why Most Published Research Findings Are False"[0] and/or the replication crisis[1] in general.

[0] http://journals.plos.org/plosmedicine/article?id=10.1371/jou...

[1] http://www.nature.com/news/1-500-scientists-lift-the-lid-on-...


Partly that, and also based on my experience getting a PhD in physics, where I found: (i) published physics papers with 30 pages of math in them frequently had errors, (ii) if you have a large number of signs, the odds are 50-50 that you will get the final sign right in a long calculation, and (iii) you can't say high energy physics papers are "right" or "wrong" anyway... I mean, does anybody think we really live in an anti-de Sitter space?
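
(A back-of-the-envelope way to see (ii), under the simplifying assumption that each of the n signs is flipped independently with some probability q: only the parity of the flips matters for the final sign, so

    P(final sign correct) = P(even number of flips) = (1 + (1 - 2q)^n) / 2

which tends to 1/2 as n grows, for any fixed 0 < q < 1.)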

Medicine on the other hand has the problem that you can't really afford large enough sample sizes to have sufficient statistical power.


>"Medicine on the other hand has the problem that you can't really afford large enough sample sizes to have sufficient statistical power."

This really isn't the major problem w/ modern medical research. In fact, if they had properly powered studies there would be far too many "discoveries" and the real problem would become obvious.

The real issue is that efforts to come up with and study models capable of precise predictions (e.g. Armitage-Doll, SIR, Frank-Starling, Hodgkin-Huxley) have been all but choked out in favor of people testing the vague hypothesis "there is a correlation". There is always some effect/correlation in systems like the human body, so it is only a matter of sample size. As explained long ago by Paul Meehl, this is a 180-degree about-face from what was previously called the scientific method: http://www.fisme.science.uu.nl/staff/christianb/downloads/me...
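
(To make the "only a matter of sample size" point concrete, here is a minimal simulation sketch; the tiny correlation of 0.02 and the use of a Pearson test are my own illustrative assumptions, not anything from the papers above:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    true_r = 0.02  # a tiny, practically meaningless correlation (assumed for illustration)

    for n in (100, 10_000, 1_000_000):
        x = rng.normal(size=n)
        # construct y so that corr(x, y) is approximately true_r
        y = true_r * x + np.sqrt(1 - true_r**2) * rng.normal(size=n)
        r, p = stats.pearsonr(x, y)
        print(f"n={n:>9}: r={r:+.3f}, p={p:.3g}")

At n=100 the correlation is indistinguishable from noise, but by n=1,000,000 the same negligible effect comes out "statistically significant", so a big enough study will always confirm "there is a correlation" whether or not it matters.)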



