More papers should have their algorithms in rhyme!
Algorhyme
I think that I shall never see
A graph more lovely than a tree
A tree whose crucial property
Is loop-free connectivity.
A tree which must be sure to span
So packets can reach every LAN.
First the Root must be selected
By ID it is elected.
Least cost paths from Root are traced.
In the tree these paths are placed
A mesh is made by folks like me
Then bridges find a spanning tree.
How about his master's thesis. In "A Symbolic Analysis of Relay and Switching Circuits" Shannon proved that you could use boolean algebra to simplify the arrangement of electromechanical relays in telephone routing networks and then in the same thesis inverted the reasoning to show that by using these relays in the proper arrangement you could solve boolean algebra problems. All digital computers have some DNA that leads back to this paper he wrote when he was 21...
As comic-book guy would say: Best. Master's Thesis. Ever.
Widely known as the best masters thesis ever, yeah, but I still put his Info Theo paper as my favorite. Maybe he just had a little time to age, the theoretical ground he covers in that one paper is incredible and yet the text remains conversational and approachable.
The magic ink paper was on top of HN 2 years ago[1]. I remember bouncing off it the first half dozen times I ran across it on the blogosphere. Eventually I focused, and boy was it worth it:
It's still my #1 paper of all time, for both exposition and presentation (layout, typography, figures), and the paper I have quoted most often on my linkblog (http://akkartik.name/search_results?q=bret+victor)
Not to knock the author, but this essay is basically a rehash of Edward Tufte’s ideas, with a couple examples from computer software, and his own BART schedule program as an extended case study. It’s interesting enough, but too long, unfocused, and disorganized (which is probably why you bounced off a half dozen times).
If you’re interested in the topic (information design), you’ll get more mileage just going straight to Tufte’s books. Scott McCloud’s book, which the author recommends, is also excellent.
Of course I've heard of and read Tufte! But he doesn't emphasize nearly as much the idea that you can use visual design to eliminate interactions. The closest I can think of is his critique of iPhone UI:
Also agree - but the wake up call for me was that my lack of understanding of a topic biased me towards thinking that the topic was trivial - ie: I was unable to appreciate how ignorant I was. This was a world shaker for me. My own cognitive filters were denying me understanding.
I agree. The impact of this paper on my understanding of my own nature, the nature of those around me and the overall social-political structure of the World is immense.
A few of my favorites, chosen from computer science and from physics:
D. E. Knuth, "An empirical study of FORTRAN programs," Software: Practice and Experience, vol. 1, no. 2, pp. 105-133, 1971. http://dx.doi.org/10.1002/spe.4380010203
A choice quote: "A first idea for obtaining 'typical' programs was to go to
Stanford's Computation Center and rummage in the waste baskets and the recycling
bins. This gave results but showed immediately what should have been obvious:
waste-baskets usually receive undebugged programs."
Einstein's seminal paper "On the Electrodynamics of Moving Bodies" that proposed
the theory of special relativity. A good English translation is available online
at http://www.fourmilab.ch/etexts/einstein/specrel/specrel.pdf
Ben Fry (of Processing fame)'s PhD thesis: "Computational Information Design".
Abstract:
The ability to collect, store, and manage data is increasing quickly, but
our ability to understand it remains constant. In an attempt to gain
better understanding of data, fields such as information visualization,
data mining and graphic design are employed, each solving an isolated
part of the specifi c problem, but failing in a broader sense: there are
too many unsolved problems in the visualization of complex data. As a
solution, this dissertation proposes that the individual fi elds be brought
together as part of a singular process titled Computational Information
Design.
This dissertation first examines the individual pedagogies of design,
information, and computation with a focus on how they support one
another as parts of a combined methodology for the exploration,
analysis, and representation of complex data. Next, in order to make the
process accessible to a wider audience, a tool is introduced to simplify
the computational process for beginners, and can be used as a sketching
platform by more advanced users. Finally, a series of examples show
how the methodology and tool can be used to address a range of data
problems, in particular, the human genome.
Abstract: The exponential dependence of resistivity on temperature in germanium is found to be a great big lie. My careful theoretical modeling and painstaking experimentation reveal 1) that my equipment is crap, as are all the available texts on the subject and 2) that this whole exercise was a complete waste of my time.
It's in the field of population biology, but the principles are broadly applicable to any system where self-interested actors can gain more than they lose by actions that hurt the community.
378 page attempt to proove 1+1=2
http://scienceblogs.com/goodmath/2006/06/extreme_math_1_1_2....
"after 378 pages, <Bertrand Russell and Alfred North Whitehead> were able to talk about how you could prove that 1+1=2. But they couldn't actually do it yet, because they hadn't yet managed to define addition."
"Massive IQ Gains in 14 Nations: What IQ Tests Really Measure" by James R. Flynn in Psychological Bulletin. Here is the correct APA citation form, modernized with the doi reference:
Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101, 171-191. doi: 10.1037/0033-2909.101.2.171
This was a path-breaking paper, which, as N.J. Mackintosh comments, introduced many psychologists to long-ignored data that could have been noticed long ago: "the data are surprising, demolish some long-cherished beliefs, and raise a number of other interesting issues along the way."
Cavendish's Experiments to determine the Density of the Earth is still worth reading. In the 18th century, it took some impressive hacking to isolate two balls sufficiently that their mutual gravity was the strongest force acting on them.
This paper describes a visual object detection framework that is capable of processing images extremely
rapidly while achieving high detection rates. There are three key contributions. The first is the introduction
of a new image representation called the “Integral Image” which allows the features used by our detector
to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small
number of critical visual features and yields extremely efficient classifiers [6]. The third contribution is a
method for combining classifiers in a “cascade” which allows background regions of the image to be quickly
discarded while spending more computation on promising object-like regions. A set of experiments in the
domain of face detection are presented. The system yields face detection performace comparable to the best
previous systems [18, 13, 16, 12, 1]. Implemented on a conventional desktop, face detection proceeds at 15
frames per second.
Dietary pesticides (99.99% all natural). By Bruce Ames, inventor of the Ames test on how virtually all of our toxic chemical ingestion comes from natural, plant sources. And completely destroying the myth of organic food being less harmful.
A Meta-Analytic Examination of Assumed Properties of Child Sexual Abuse Using College Samples. The only paper in history to be condemned by the US House of Representatives.
At some point in its development, every industry can be considered a growth industry, based on the apparent superiority of its product. But in case after case, industries have fallen under the shadow of mismanagement. What usually gets emphasized is selling, not marketing. This is a mistake, because selling focuses on the needs of the seller, whereas marketing concentrates on the needs of the buyer. In this widely quoted and anthologized article, first published in 1960, Theodore Levitt argues that "the history of every dead and dying 'growth' industry shows a self-deceiving cycle of bountiful expansion and undetected decay." But, as he illustrates, memories are short.
My current favorite paper of the month: "An efficient implementation of graph grammars based on the RETE matching algorithm" by H. Bunke, T. Glauser, T. Tran
http://dx.doi.org/10.1007/BFb0017389
Abstract: This paper is concerned with the efficient determination of the set of productions of a graph grammar that are applicable in one rewriting step. We propose a new algorithm that is a generalization of a similar algorithm originally developed for forward chaining production systems. The time complexity of the proposed method is not better than that of a naive solution, in the worst case. In the best case, however, a significant speedup can be achieved. Some experiments supporting the results of a theoretical complexity analysis are described.
"Estimating betas from nonsynchronous data" by Myron Scholes and Joseph Williams in Journal of Financial Economics, December 1977, pgs. 309-327.
Abstract is below.
Nonsynchronous trading of securities introduces into the market model a potentially serious econometric problem of errors in variables. In this paper properties of the observed market model and associated ordinary least squares estimators are developed in detail. In addition, computationally convenient, consistent estimators for parameters of the market model are calculated and then applied to daily returns of securities listed in the NYSE and ASE.
One of my favorites as well. What really fascinates me about it is it's length, it can't be more than a couple of pages printed yet it introduces a very powerful concept.
- Codd 1969/1970 Relation algebra, operations, model papers
- Crick, Watson, (Franklin, Wilkin) Nature article on DNA.
- Shannon, Information Theory as listed/ mentioned.
I am quite surprised that I don't see Roy Fielding's "Architectural Styles and the Design of Network-based Software Architectures" on this list (aka his REST PHD paper). I have not read many dissertation, in fact i think this is the only one, but I found it very approachable.
It isn't quite a scientific paper, but Feynman's "Personal Observations on the Reliability of the Shuttle", http://www.fotuva.org/feynman/challenger-appendix.html, published as an appendix to the commission's report is an excellent piece of work.
abstract: We consider the question of how mathematicians view themselves and how non-mathematicians view us. What is our role in society? Is it effective? Is it rewarding? How could it be improved? This paper will be part of a forthcoming volume on this circle of questions.
I'm reading a book at the moment called 'the mathematical experience' by Davis and Hersh which looks at some of these same issues. I haven't got very far through it yet but it seems like a good book so if you're interested in this sort of thing you might want to check it out.
http://www1.cs.columbia.edu/~ji/F02/ir02/p44-perlman.pdf
More papers should have their algorithms in rhyme!