Hacker News

By the time I was in year 4 of my PhD, reading a paper mostly involved: look at the title, then go directly to figure 1 and figure 8 to judge whether the title actually matches the result. If the result is of genuine interest, read the abstract, and then go through the figures carefully. Otherwise, the remainder of the paper is mostly left alone. This only applies to papers where I already know the field and am only trying to gauge the incremental addition the paper makes to the knowledge. The approach is different for a paper in an unknown field; there it's best to just read it as prose from top to bottom with a marker at hand.

To add more - not reading any of the text makes you laser-focused on figuring out what the data means, without the authors sugar-coating anything with their perspective/agenda.




I've had many people suggest just reading the figures, but I've found that most scientists hide their sins in the methods section, and the figures cannot be properly interpreted without careful inspection of the methods (which often demonstrates the authors didn't really do a good job).

Also, I've noticed that a very large number of papers with faked image data in figures have gone unnoticed by most readers. People look at the figures hoping to see what they want to see and aren't critical enough about the process used to generate them. (When I wrote my PhD thesis, all figures were programmatically generated by version-controlled code on well-managed data.)
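A minimal sketch of the kind of provenance record such a pipeline can emit alongside each figure, so a reader can check which data and code revision produced it. The file names and the `provenance_record` helper are hypothetical, and actual plotting is omitted to keep the sketch dependency-free:

```python
import hashlib
import json
from pathlib import Path

def provenance_record(data_path: Path, script_path: Path, commit: str) -> dict:
    """Record exactly which data, code, and revision produced a figure."""
    def sha256(p: Path) -> str:
        return hashlib.sha256(p.read_bytes()).hexdigest()
    return {
        "data_sha256": sha256(data_path),
        "script_sha256": sha256(script_path),
        "git_commit": commit,  # e.g. the output of `git rev-parse HEAD`
    }

# Stand-ins for well-managed data and a version-controlled plotting script.
data = Path("measurements.csv")
data.write_text("x,y\n1,2\n")
script = Path("make_figure.py")
script.write_text("# plotting code lives here\n")

# Write the record next to the figure so anyone can verify the figure
# was regenerated from exactly the stated inputs.
record = provenance_record(data, script, commit="deadbeef")
Path("figure1.json").write_text(json.dumps(record, indent=2))
```

In practice the plotting script itself would read the same data file and emit the figure, with the JSON record committed or archived alongside it.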


I agree with this. The results and discussion are usually the least interesting parts, and figures are usually meaningless without seeing how the data used to generate them were collected.

The methodology tells you exactly what the authors did. It's where you can see whether they used a sample size of 10 or 1000, what methods they used to sample their data, and the accuracy levels they used in analyzing it; it's also how a study may be replicated.

Without reading the methods, you can't be sure about any other thing in the paper.

That's the problem I see with a lot of news reporting on science papers. They use the GP's method of reading: abstract, images, and maybe the results if they're really trying.

That's how you end up with a lot of sensationalist, contradictory science headlines.


It is good you labeled the stage you were at (4th year of a PhD); the article misses this. In the 1st year of my PhD the path was more like this: read the paper slowly top to bottom, understand nothing, then look for the part that seems most interesting/approachable and read that again. Go to the citations of that section and repeat the process (a recursive algorithm). Take note of any papers being cited by most of the papers, and invest time in those. Look for papers citing the paper you read (not possible at the cutting edge, of course). This process takes probably 2-3 months. Then do your own research on that topic, then go back to the paper, and now things are clearer. Do the same with a related paper in the field. Then one more. Probably now you are in year two and can transition to your approach.
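The citation-chasing described above can be sketched as a breadth-first walk over a citation graph that counts which papers keep reappearing; the ones cited by many papers in the neighborhood are the ones worth real time. The toy graph and paper names here are invented for illustration:

```python
from collections import Counter

# Hypothetical citation graph: paper -> papers it cites.
CITES = {
    "new_paper": ["A", "B"],
    "A": ["C", "seminal"],
    "B": ["seminal", "D"],
    "C": ["seminal"],
    "D": [],
    "seminal": [],
}

def chase_citations(start: str, depth: int = 2) -> Counter:
    """Follow citations a few levels out, counting how often each
    paper is cited by the papers visited along the way."""
    counts = Counter()
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for paper in frontier:
            for cited in CITES.get(paper, []):
                counts[cited] += 1
                next_frontier.append(cited)
        frontier = next_frontier
    return counts

counts = chase_citations("new_paper")
# "seminal" is cited by several papers in the neighborhood,
# so it surfaces as the one to invest time in.
```

Real citation graphs have cycles and huge fan-out, so a practical version would track visited papers and cap the frontier; the counting idea is the same.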


Bugger me, that describes the process I (not a PhD) have gone through trying to digest some papers so well it is eerie.

I've been damned lucky. The worst paper, the one where I had to read most of the citations and quite a few of their citations before I got it, was http://conferences.sigcomm.org/sigcomm/1997/papers/p011.ps and yes, it took over a month before all the pieces settled into the right corners of my mind. Perhaps more accurately, it took months for my mind to create the right corners for the concepts to settle into.

But that was fine - I knew before I started it was the seminal paper on the subject, and so it would be worth whatever effort it took. The idea of wasting that inordinate amount of time going down that path with one dud after another makes me shudder in horror.


Did you read the link to "Adam Ruben’s tongue-in-cheek column"? It had me laughing, and is much more like the process you described.


Now I've read it. Actually, I think it is not 'tongue-in-cheek' (apart from making light of a hard thing).


Can you please share the link? I reached his homepage but no dice: http://adamruben.net/


It's the first hyperlink in the text of the linked-to article, going to http://www.sciencemag.org/careers/2016/01/how-read-scientifi... .


Looking now at various definitions of 'tongue-in-cheek', I think you're right.


I haven't done a PhD, but I took some time off from work to read/research on my own, and this year-1 PhD description resonates with me. Read the paper top to bottom, feel the despair (feeling dumb), and look for the parts I understand.

The process was an utter waste of time except when the paper in question was a survey paper.


Indeed! Even after a decade you still have to do this for any new field, though you feel less anxious about it! I loathe and love journal club topics that make me do this!


Cautionary tale from my CS desk: first dozen papers, just what was described above. Next, complete actual work using the knowledge: hey, I belong here! Next, read a few more, decide "I can do this", and collect three to six dozen additional papers from the reference notes, new discoveries, and latest pubs, mixing them all into the same collection of PDFs (!)

Now you have sixty-plus complex papers, at least a third of which are not actually very important, useful, or thorough. And where are those original, carefully chosen ten you started with?

Side note: the "focus on the figures" reading advice does not scale, since most search is first and foremost over text. Which of the now-eighty papers (and growing; the field is hot) are the ones you cared about and understood? "Piled higher and deeper - PhD" indeed!


As a layman, my impression is that most papers are too verbose, at least the ones I can understand. After reading tons of steganography papers (for example), I found they tend to begin with the same retelling of the history of steganography from the beginning of time up to now, even though it's completely irrelevant to the topic of the paper (which is not history).


While that is true, it also helps people new to a scientific area. I can read 3-5 papers on a topic and the first chapters of each will give me enough to understand the gist of the topic, the status quo.

It's a question of who you prioritize while writing a paper. Experts or the interested layman.

If you prioritize people who are already experts in the topic, you write little to no introduction, you get to the results quickly.

If you prioritize the "curious, interested layman" (or university students in their first years), a short introduction to a topic with references will provide enough information to understand the basics and the reader can continue reading the rest of the paper with enough context to understand why the topic is relevant.


The way I write a paper for "curious, interested layman" is different than the way I write for experts. It's hard to keep both in mind when writing.

Laymen, for example, might need some figures to help understand a topic that experts internalized years ago.

I prefer having occasional well-written review papers, meant for getting non-experts up to speed. Then the domain experts - who are often not experts at writing for laymen - can refer people to those review papers, possibly also with a history delta for what's new since the review.

"For a comprehensive review, see" https://scholar.google.com/scholar?q=%22for+a+comprehensive+...


There's an added difficulty of writing to "tangential experts". I've had papers in the past that were 'tweeners, between two related but different fields. Depending on the audience, background information was needed that may be tedious for the other group.


Does content like this ever get published in two different journals, for different audiences, with different intro/lit reviews?


As far as I know, not really. Most journals require that you not submit to another journal until they've made a decision to reject, so you can't submit to multiple at once. And even with different intros, they'd be presenting the same data.


> While that is true, it also helps people new to a scientific area.

That's what surveys are for.


I also have the impression that CS papers all too often recapitulate the topic's history. I don't see the point. Other fields seem to leave it as "for a review of the topic see", cite someone(s) else, and get on with the paper.

My field of cheminformatics, while CS-adjacent, inherits more strongly from the chemistry traditions. The occasional outsider papers from a CS department generally cause my eyes to glaze over if they follow that history-retelling CS tradition.


I personally find it useful. As a CS researcher, if I want to delve into a new topic (for example, because a specific line of work comes out where I think my research could be applicable) I typically can start by directly reading the paper I'm interested in, and in the introduction I'll find a brief history and some references I need to look at for context.

It's much better than having to look through all the relevant papers in that topic in the last decades in order because they all assume knowledge of the past ones.

Of course, I know there are survey papers, and they can be very useful, but they're typically going to be too general and not specifically oriented to what you need to understand that specific paper you're interested in. Plus, you inherit the biases of whomever compiled the survey, which in my experience are often significant.


I rarely read the CS literature, so I can't say much about that. From reading cheminformatics papers by CS authors, I see that their history section often reflects only an incomplete understanding.

For example, in a paper I reviewed, the CS authors completely misread a paper they cited. They wrote something like "method X has been used in cheminformatics before [cite], but the CS literature has improved on that with method Y." Yet the paper they cited actually used method Y.

Now, the paper itself didn't need that level of detail about the history. That error, and others like it, bugged me because they came across as dilettantes, writing with more assurance than they actually had, and because their history was biased towards the CS methods they knew, which made it feel like they snubbed the cheminformatics methods and treated the previous work in this field as second-class material.

I can't help but think that the process you describe, where someone new to a topic must write a history section just to publish, results in a lot of half-baked history, since newcomers just don't have the experience to give a good treatment of it. Instead, they see that 15 other papers covered points A-F, so they follow the tradition of covering points A-F with a different slant.

I'm not saying my field is immune to that! There's a well-known observation along the lines of "similar structures tend to have similar properties." Many people will cite a 1990 book as the source of that quote. Except that book doesn't contain the quote. Most people know it only second- or third-hand, which has resulted in the common but incorrect practice of making that citation. It's a litmus test I use to tell whether the authors really know their history.

Q: If you write multiple papers on a new topic, do you still write histories for each one? Or can you refer to your previous publications for the history?


What you say does happen, and it's an interesting perspective (I guess I had always seen the "half-baked history" sections as something annoying but inevitable). Pick your poison, I guess.

And the answer is that in general we do write (short) histories for each paper, except maybe in short conference papers (limited to 4 pages or so), where it's OK not to write any.


I've wondered for a while now how various funding and other constraints affect fields of science. In math, CS, or SWE it's easy to pick up a new topic, but in biology and chemistry people seem to have an overwhelming tendency to work for decades at a time on singular problem areas. CS papers likely go over history because it tends to be more useful; in chemistry, all interested readers may already have 5+ years in the field.


> I also have the impression that CS papers all too often recapitulate the topic's history.

It's not really about history so much as context. Usually, you want to set up a paper with "This is the state of X as it exists right now, but there exists this problem Y. We solve Y by starting from X and making Z advancement."

I think part of the issue is CS is exceptionally young (less than 100 years total, really), and the other part is how rapidly it's advanced and diversified in that time (in terms of individual disciplines even within subfields). Without the context, I can't readily jump to a paper from an adjacent-but-not-directly-relevant subfield without needing to look up a bunch of other stuff. And it's not as straightforward to know where to look to find the relevant information. A paper 10 years old might be state-of-the-art or it might be completely outdated, and it's hard to know which if you're not immersed in that discipline. Having the context in the paper itself is a big help to ameliorating this.


In biology, the intro is typically not too long and gives just enough info that even a person just browsing the journal can get context about the field and topic, plus a list of references to fall back on if needed. The perfect introduction would give you both an overview and a reading list that by itself would bring you up to speed to where the paper starts, without any second-level reference digging!


I can attest to this. I've often noticed that papers in biology (systems biology, in my experience at least) go directly to the point, with some amount of context and history leading up to the main results. This is something that's surprisingly lacking in engineering, where reading a paper or grasping the context usually requires at least some prior learning.


That section is helpful for researchers both to contextualize the paper and to leave pointers to other papers the current one builds on. You’re right it’s not directly about the method in the current paper but it’s an important section nonetheless, even if it’s a bit meta.


> most papers are too verbose...After reading tons of steganography papers

Is it possible they were secretly reporting something else too?


Agreed, although the order I learned was title, figures, materials and methods. Sometimes there's useful stuff in the materials and methods, and sometimes that's where the bodies are buried.


At least in bio papers, the good ones give the most important method details in the figure legend itself, so it's rare that I have to refer to the methods section. But yes, that does happen! Especially anything where they cure cancer in mice (the running joke is that you can also cure cancer in mice by stomping on them).


You forgot to mention reading the Related Works section to make sure you're referenced.



