
If anyone is interested, here are my old blog posts on the requirements for getting an interview for a professorship at a top-50 school:

[1] http://www.chriskanan.com/planning-for-life-after-your-phd/

[2] http://www.chriskanan.com/statistics-on-computer-science-pro...

[3] http://www.chriskanan.com/statistics-on-ucsds-computer-scien...




Paul Litvak has also written a few interesting posts, both about moving into industry from academia and about the current academic job market:

[1] http://www.paullitvak.com/2013/05/16/why-social-science-grad...

[2] http://www.paullitvak.com/2013/02/18/simgradschool-a-study-i...


From your [3] above, do theory PhD graduates usually have more or fewer publications?

What would be your guesstimates of the mean/median number of first author publications of faculty candidates at a top school?

"Number First Author Publications (excluding theory folks)

Mean: 12.8±6.5

Median: 14.5

Range: 2 – 25"


In other areas of computer science, first author counts for a lot. It indicates a person did the majority of the implementation work, probably wrote the first draft of the paper, etc. Typically the last author is the PI that came up with the idea, mentored the first author, and funded the work. The contributions of those in the middle are less well defined.

In CS theory (and math), they go by alphabetical author order, so none of the aforementioned rules work unless there is only a single author. So, I excluded the theory folks from the first author analysis.


This is changing, thankfully. First authors should be the ones leading the paper, according to contributions. Some journals are asking for the actual contributions performed by each author, which I think is very good.

Alphabetical order should be banned, period.


Littlewood and Hardy disagree. Ranking or attributing contributions to a joint paper in mathematics is very hard, especially when the results grow out of conversations between all of the authors.


You state your opinion, but don't justify it -- could you share why you think it is important we list first author by contribution? What does this accomplish?

Obviously not everyone agrees (or you wouldn't have felt the need to post), so why not share why you think you're right, so we can discuss the issue?

Edit:

Well, we've proven that people feel very strongly about this, but I'm no closer to understanding why people care so much about it (i.e., what they think it accomplishes).

C'est la Internet, I suppose.


I don't publish papers, so I don't have a strong opinion about it. But like you, I am genuinely curious why alphabetical order was chosen as the convention rather than the amount of contributions (like GitHub does[1]).

If the answer is that contributions to scientific papers can be very fuzzy, e.g. it is hard to determine which contribution is more important than others, then I agree alphabetical order makes sense. But is it really so in most of the papers?

For example, in the RSA paper, Adleman clearly knew his contribution was smaller than Rivest's and Shamir's, and he insisted that his name go last. I believe that in most papers it would be obvious to the co-authors that someone's contribution is more significant than the others'. If this is true, why do they still choose to follow the alphabetical order convention rather than the contribution order convention?

[1]: https://github.com/tensorflow/tensorflow/graphs/contributors (example)


> If the answer is that contributions to scientific papers can be very fuzzy, e.g. it is hard to determine which contribution is more important than others, then I agree alphabetical order makes sense. But is it really so in most of the papers?

I'm probably biased from being in industry, but even when people contribute unequal code to a module or writing in a report, we just list the authors alphabetically because it's obnoxious to do anything with the list otherwise and it stops a huge amount of politicking when the first three out of ten authors contribute equal amounts. At the end of the day, stable collaborations will outpace individual contributors -- so we should focus on what allows us to openly collaborate and tears down barriers to doing that.

But from my point of view, we have a lot of the same incentives between industry and academia:

1. If you want to get promoted, you need projects associated with you and to get noticed.

2. There needs to be some method to decide who gets what amount of bonuses.

3. Funding allocation is tied to past successes.

...and probably some more I'm forgetting.

So I think my experience isn't totally irrelevant -- you get enough signal out of the distribution of the person's name that the loss of whatever from not getting "first" information is eclipsed by the gains from less politics (and if you have a lot of documents, easier parsing).

Again, perhaps it's the industry talking, but I'd rather hire an author who contributed the third most to a large number of papers than someone who contributed the most to a few. The former seems to be a "force multiplier" in that they're keeping several other ICs moving (and contributing on all those efforts), while the latter just seems to be a specialist in one particular line of work.

I don't think removing the "first" signal harms my ability to find those "force multipliers", because they'll still have the same markers (as they weren't showing up first under the other system anyway).

So perhaps I simply have different uses for the rankings than people who feel differently?


I would suggest you consider the emergent properties of these systems of attribution, rather than just the primary effect. Could there be a reason why theoretical fields have small numbers of authors given alphabetically, and experimental fields sometimes have ten or more authors, given in order of "contribution"?


Isn't GitHub just ranking by number of commits? That's not a very good measure of contribution. After all, I don't think tensorflow-gardener is the primary contributor to TensorFlow.


You don't see how ordering purely by name is extremely unfair to those whose names start with one of the last letters of the alphabet?


I have such a name -- doesn't particularly bother me.

I understand how people can feel that way, but my question is: why is that feeling anything more than obsessing over a vanity metric? How is it actually a useful mechanism for author evaluation?


It's important in the way papers are referenced.

https://library.leeds.ac.uk/skills-citations-harvard#activat...

> If a source has three or more authors, the name of the first author should be given, followed by the phrase "et al".

It might be good if this wasn't the case, but since it is, it's a realistic concern for authors.

Beyond that, the real reason is that this is what the people who decide your funding see.

Perhaps a useful analogy: if you have more than a few people on your team, anyone referring to your project must say "Alan's team" because Alan's name is first in the list, even if Alan is junior or part-time. This is what your bosses see come promotion time: all that great work like Alan's team (2016), and who can forget the amazing piece of work Alan's team (2015).


In alphabetical fields people avoid saying "Author et al." for this reason. Lots of famous papers are known by their three- or four-letter acronyms.

If you called "BCS" (superconductivity) "Bardeen et al." it would take everyone a minute to figure out what you meant, and they'd wonder about your background.


It shows who did the most work in an unambiguous way. If you have never collaborated on a paper, you'll probably think this is not important, but in reality it is.

Interestingly, some fields where alphabetical order is used have started adding a note that explicitly shows who did how much work.


I think the problem is that it's really not that unambiguous. There are general rules to the order but it's not explicit, and being second author doesn't explain in any kind of detail what was done.

I'm quite a fan of explicitly tagging authors for what they did. That way you avoid deciding whether the person who did the stats did "more work" than the person who did the code, etc.


Nor does it really explain how much was done: was it 90% of the first author's effort, or 9%?

It's an incredibly noisy measure that people nevertheless take seriously.


Explicit tagging is the best option, yes, but contribution-based is significantly better than just alphabetical.

Typically, the first author did most of the intellectual work. Rarely if ever would the first author be a person who did a lot of practical work but little to no intellectual work.


I agree it's largely better than nothing, though there was that paper recently that got pulled because the authors couldn't agree on the ordering of the names.


In this case, being the last author should matter more. As in, you have a number of first-author publications to prove you can do research yourself, then start publishing as the last author with different first authors, to show you're into management now and can train new first authors.

As an old Russian saying goes, 'a great engineer would never be promoted'.


This is true in the life sciences as well, although IIRC this is more of a tradition than something written down and defined.


Thank you for the explanation!


No wonder there is a reproducibility crisis. This is one of the reasons why so many papers out there are scientifically unimportant and statistically dubious, and are only adding noise to human knowledge. Slow down, scientists!



