Claude Shannon's original 1948 paper "A Mathematical Theory of Communication" launched the entire field of information theory. It's 50 pages, highly readable, and pedagogical. The source of its magic is that Shannon introduces and concretely grounds an essentially new ontological concept of vast applicability. And it has 100,000 citations.
This could easily have been 3-4 landmark papers, but instead it's packed into one cogent idea.
That common interview question about query autocomplete/sentence completion? Shannon solved it and demonstrated it in this paper, almost a decade before FORTRAN existed. New grads still struggle with that problem. PhDs still struggle with that problem.
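(For fun, here's a toy sketch of that n-gram completion trick - my own illustration of the idea, not anything from the paper, which of course predates programming languages as we know them:)

    # Toy sketch (mine, not Shannon's): count next-word frequencies in a corpus
    # and complete a prompt with the most frequent continuation - the n-gram
    # idea Shannon demonstrates by hand in the paper.
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ate the fish".split()

    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1          # count how often nxt follows prev

    def complete(word: str) -> str:
        """Return the most frequent word observed after `word`."""
        return bigrams[word].most_common(1)[0][0]

    print(complete("the"))   # -> "cat"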
Pretty much every machine learning classifier uses a loss function described in that paper.
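(Presumably the loss in question is cross-entropy / log loss, which is built directly on Shannon's entropy -sum p log p. A minimal sketch with my own made-up numbers:)

    # Minimal sketch (my own, not from the paper): cross-entropy between the
    # true label distribution p and the model's predicted probabilities q.
    import numpy as np

    def cross_entropy(p_true, q_pred):
        return float(-np.sum(p_true * np.log(q_pred)))

    print(cross_entropy(np.array([0.0, 1.0, 0.0]),    # one-hot true label
                        np.array([0.1, 0.7, 0.2])))   # predicted probs -> ~0.357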
I always thought it'd be a really cool start-up idea: a service that prints, binds, and mails you a paper like this within a few days.
I probably have 10 papers floating around as loose pages. Annoying. I print them when I want to read them, but of course I rarely can do so immediately.
Modern Higher Algebra by A. Adrian Albert (1937, Dover/Cambridge). It covers both abstract algebra and linear algebra.
Most modern textbooks tend to approach linear algebra from geometric perspectives. Albert's text is one of the few that introduce the subject in a purely algebraic way. With a solid algebraic foundation, the author was able to produce elegant proofs and results that you don't often see in modern texts.
E.g. Albert's proof of the Cayley-Hamilton theorem is essentially a one-liner. Some modern textbooks (such as Jim Hefferon's Linear Algebra) try to reproduce the same proof, but without setting up the proper algebraic framework, their proofs become much longer and much harder to understand. Readers of these modern textbooks may not realize that the theorem is simply a direct consequence of the factor theorem for polynomials over non-commutative rings.
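(Roughly, and in my own words rather than Albert's: write p(\lambda) = \det(\lambda I - A) for the characteristic polynomial. The adjugate identity

    p(\lambda) I = adj(\lambda I - A) (\lambda I - A),

read as an identity between polynomials in \lambda with matrix coefficients, exhibits (\lambda I - A) as a right factor of p(\lambda) I; the factor theorem over that non-commutative ring then gives p(A) = 0 by right evaluation at \lambda = A.)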
With only about 300 pages, the book's coverage is amazingly wide. When I first read the table of contents, I was surprised to see that it covers not only undergraduate topics such as groups, rings, fields, and Galois theory, but also advanced topics such as p-adic numbers. I haven't read the part on abstract algebra in detail. However, if you want to re-learn linear algebra, this book may be an excellent choice.
For those who are reading this, let me stress that it is not that Prof. Hefferon's proof of the Cayley-Hamilton theorem is bad (it is actually better than some really horrible proofs that appear in some well-received textbooks), but that Albert's treatment is superb --- it is far better than the treatments of the theorem in most modern textbooks, including Prof. Hefferon's. Also, I was certainly not commenting on the overall quality of Prof. Hefferon's book, and I thank him for offering his textbook for free.
Oh, forgive me, no offense taken. I should have put a smiley. I read your post with interest and shall check out the book.
(As you no doubt know, different books have different audiences. Before I wrote my Linear book, when I looked at the available textbooks I thought that there were low-level computational books that suited people with weak backgrounds, and high-level beautiful books that show the power of big, exciting, ideas. I had a room with students who were not ready for high. I wrote the book hoping that it could form part of an undergraduate program that deliberately worked at bringing students along to where they would be ready for such things. Naturally, with that mindset I read your post as meaning that the audience for the book you described is just different. Anyway, thanks again for the pointer.)
This is the piece I love about HN. A comment refers to an accomplished person, and he/she happens to be right there, either supplying a correction or taking the feedback positively.
"The Art of Electronics" Paul Horowitz, Winfield Hill. Electronics for people who want to do stuff.
I loved this book as a teenager and recently got reacquainted with it after years, thanks to the almighty AvE on youtube when he took apart an old helicopter radio/telephone here: https://www.youtube.com/watch?v=6eoBj5W7Vdc
When I was an undergrad (enrolled in 2011) I decided to screw my professors' book recommendations, look up what the internet seemed to think was the best book in each subject, and read that instead. Most of them were fairly old. Some examples that I remember off the top of my head:
Intro to CS: SICP (1985)
Algorithms/data structures: CLRS (1989)
Theory of computation: Sipser (1996)
Compilers: Dragon book (1986)
Calculus: Spivak (1967)
Linear Algebra: Shilov (Dover, 1971)
The given year is for the first publication; some of them are still being updated, and I probably read a newer edition.
Code: The Hidden Language of Computer Hardware and Software by Charles Petzold
Reading this book is the most beautiful and simple way for a person to really understand what a computer is and come to the realization that it is not black magic.
The World of Mathematics - an amazing compendium of accessible content straight from the masters - Poincaré, Jonathan Swift, von Neumann, Bishop Berkeley, Cayley, etc. https://www.amazon.com/World-Mathematics-James-Newman-Hardco... I believe it is also available on archive.org
What Is Mathematics? by Richard Courant and Herbert Robbins published in 1941. One of the most beginner friendly yet rigorous books out there for a survey of many areas in mathematics.
I worked with an ex-ecology professor on some 'data science' projects at work, who suggested this older book I enjoyed called "The Ecological Detective: Confronting Models with Data." Good read imho for ideas about generating hypotheses, exploring data, and comparing models to explain the data, and not just in ecology (though that is the context obviously).
Computer Graphics: Principles and Practice in C (2nd Edition) is an incredibly deep look at the cutting edge of computer graphics technology as it stood in the late 1980s.
It's full of beautiful renderings and diagrams, covers the core algorithms of 2D and 3D graphics, introduces the mathematics required, and many other related subjects such as user interface design.
Apparently there is a 3rd edition from 2013 which looks at modern GPU-based rendering, though I don't own a copy.
I’ve studied both. They are totally different books. The 2nd edition is the “Art of Computer Programming” for computer graphics. The 3rd edition is a fantastic overview of GPU based graphics libraries (the fundamentals not the APIs). Which is to say, they have almost nothing in common.
The Structure of Scientific Revolutions (1962) by Thomas Kuhn is a fantastic book that explores the history of science while also debunking the commonly held belief that discoveries (of gravity, oxygen gas, etc.) are instantaneous observations rather than a gradual weaving together of several seemingly contradictory observations.
The Art of Money Getting (1) by P. T. Barnum is definitely one of those books. Even though it was written in 1880, most of the advice transcends time and technology and is still as evergreen as ever.
Well, I don't know if mine will help. I love Goffman's book; it's super-readable and enlightening. It's about the different roles and... settings people operate in, and the rules and customs of those places, e.g. backstage, shopfront, military ranks, hospitality, etc. Full of great stories quoted from an impressive number of very diverse sources.
Korzybski's book used to be huge, recommended by all kinds of famous people. I spent a few hours reading in it one day, to see for myself. (Plus I had heard a fair bit about it before.) Korzybski basically seems a huge crank, who thought himself and his baby, General Semantics[0], as important as Aristotle. The quote one always hears from it is "the map is not the territory", and well, that's about the only thing worth quoting from it. Plus he tried to get rid of "is" from the language, i.e. sentences of the form "A is B".[1] Seemingly because such sentences are deceptive - if you say "The car is red", well, it's many things besides red, so the sentence is a lie in many ways. It's a very strange objection. As if it's bad because it doesn't say everything, just one thing. Aristotle he's not.
Also there's an interesting contraption featured in the book, made of metal with holes, strings, plugs, used to make maps of levels of concepts. I don't know if it's practically useful.
Apart from that, what makes it a big slab of a book is a host of chapters on different academic subjects serving as introductions to those subjects, e.g. one on maths (calculus, I think), supposedly illustrating general semantics applied there. These seem mostly intended to give the impression that Korzybski is a genius polymath. People who didn't know anything about a given subject might learn something from them, and feel the book taught them something. But it has nothing to do with Korzybski's theories.
Not sure you’re looking for philosophy, but I keep a translation of the Tao Te Ching nearby at all times. It’s helped me stay centered and humble.
Just to go meta: whatever book you learned something from originally/in college you should keep. It might not always be the best, but keeping the context of your original understanding can really help and speed up recollection when needed. (This probably applies most to textbooks used for whole classes as opposed to minor topic references.)
Good idea! I am a complete sucker for tracking down CS textbooks from my courses 20 plus years ago. And I like the hard copy versions. My outlook is, an engineer is known by the books he keeps.
Mechanics and Thermodynamics of Propulsion by Hill and Peterson explains most of the rocketry systems still in use today, and the first edition was published in 1965. I think it is interesting that fundamental boost rocketry has changed very little since that time.
Previous posts have nicely covered almost all of the textbooks and papers I would have mentioned myself, including K&R's _C Programming_, Brooks' _Mythical Man-Month_, Feynman's Lectures, etc.
The only glaring omission was any mention of the classic (1965) FFT paper by Cooley and Tukey: "An algorithm for the machine calculation of complex Fourier series." _Mathematics of Computation_, 19(90), 297-301. doi:10.1090/s0025-5718-1965-0178586-1 (https://www.eit.lth.se/fileadmin/eit/courses/eit085f/Cooley_...)
I might also have added the _CRC Handbook of Chemistry and Physics_ that seems to have been on every working scientist's and engineer's bookshelf (including mine) during the mid-to-late 20th century. Still in print, nearing the 100th edition.
Also, some of my old favorite textbooks that I used at university were not mentioned, including Thomas' _Calculus and Analytic Geometry_. I still have my red, cloth-covered Addison-Wesley 3rd edition, no longer in print, but a much later edition might still be.
Chandrasekhar's "Newton's Principia for the common reader" is a reading of Newton's book in modern mathematical notation and with commentary on the methods Newton was using.
"The Design of the UNIX operating system" by Bach holds up well. Feynman's lectures on physics (3 volume set). Electrodynamics by Jackson is also up there.
Something I read a long time ago, and which was also recently recommended by Ed Witten, is John Wheeler's essay "Information, Physics, Quantum: The Search for Links", also famously known as the "It from Bit" essay (which, given our current understanding, should now probably be "It from Qubit").
"The History of Fortran I, II, and III" (1979) - because this historical piece by the author of the first high level language brings home the core principles of language design [https://archive.org/details/history-of-fortran]
George E. Forsythe and Cleve B. Moler, Computer Solution of Linear Algebraic Systems

Paul R. Halmos, Naive Set Theory, Van Nostrand, Princeton, NJ, 1960.

More has been done since this book, but this book is a gorgeous introduction to axiomatic set theory. So, even people who want to dig into the latest work would do well to have this as the first book. And for people wanting to read any of the more advanced material here, knowledge of this book will be from good to have to important.

The third edition is a lot better than the first two.

H. L. Royden, Real Analysis: Second Edition

Beautifully written, elegant, but maybe don't work way too hard on the exercises about upper/lower semi-continuity, and there is a better summary than Littlewood's three principles.

Bernard R. Gelbaum and John M. H. Olmsted, Counterexamples in Analysis

John C. Oxtoby, Measure and Category: A Survey of the Analogies between Topological and Measure Spaces

Walter Rudin, Real and Complex Analysis

Walter Rudin, Functional Analysis

Leo Breiman, Probability

Kai Lai Chung, A Course in Probability Theory, Second Edition

Jacques Neveu, Mathematical Foundations of the Calculus of Probability

Erhan Cinlar, Introduction to Stochastic Processes

J. L. Doob, Stochastic Processes

I. I. Gihman and A. V. Skorohod, The Theory of Stochastic Processes I, II

Donald E. Knuth, The TeXbook

Donald E. Knuth, The Art of Computer Programming, Second Edition

Leo Breiman, "Statistical Modeling: The Two Cultures," Statistical Science, Vol. 16, No. 3, 199-231, 2001.

Paul R. Halmos, "The Theory of Unbiased Estimation," Annals of Mathematical Statistics, Volume 17, Number 1, pages 34-43, 1946.

Paul R. Halmos and L. J. Savage, "Application of the Radon-Nikodym Theorem to the Theory of Sufficient Statistics," The Annals of Mathematical Statistics, Volume 20, Number 2 (1949), 225-241.
FDVS (Finite-Dimensional Vector Spaces) was written in 1942, when Halmos had just gotten his Ph.D. from J. Doob, author of Stochastic Processes in my list, at U. Illinois, and was an assistant to John von Neumann at the Institute for Advanced Study in Princeton.
IIRC Hilbert space was a von Neumann idea. It is, at first, just a definition -- a complete inner product space (inner product = dot product in much of physics and engineering). But the good stuff is (1) the importance of the examples and (2) the theorems that show the consequences, e.g., in Fourier theory.
Well, the vector spaces of most interest in linear algebra are actually (don't tell anyone) finite dimensional Hilbert spaces. So, one role of FDVS is to provide a text on linear algebra that is also an introduction to Hilbert space, that is, that tries to use ideas that work in any Hilbert space to get the basic results in linear algebra.
The treatments of self-adjoint transformations and spectral theory are likely the most influenced by this role.
This role is accomplished so well that sometimes physics students starting on quantum mechanics are advised to get at least the start they need on Hilbert space from FDVS.
Sure, a better start is the one chapter on Hilbert space in Rudin's Real and Complex Analysis. The chapter there on the Fourier transform is also good, short, all theorems nicely proved, the main, early results made clear.
Also, a good start on the basic results for self-adjoint matrices is via the inverse and implicit function theorems, given as nice exercises in the third edition of Rudin's Principles .... And spectral theory is in Rudin's Functional Analysis. There you also get the bonus of a nice treatment of distributions, that is, a replacement for the Dirac delta function usage in quantum mechanics.
How to get the eigenvalue and orthogonal eigenvector results for self-adjoint matrices from the inverse and implicit function theorems (these two go together like ice cream and cake) is in Fleming, Functions of Several Variables. Then you will be off and running on factor analysis, principal components, the polar decomposition, the singular value decomposition, and more.
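(A tiny numerical illustration of where that lands you - my own sketch, nothing to do with Fleming's proofs: for a real symmetric matrix the eigenvalues are real and the eigenvectors can be chosen orthonormal, which is exactly the story the SVD generalizes.)

    # My own toy illustration (numpy), not from any of the books above: a real
    # symmetric (self-adjoint) matrix has real eigenvalues and an orthonormal
    # set of eigenvectors -- the finite-dimensional spectral theorem.
    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    A = (B + B.T) / 2                              # make a symmetric matrix

    w, V = np.linalg.eigh(A)                       # eigh exploits the symmetry
    print(np.allclose(V @ np.diag(w) @ V.T, A))    # True: A = V diag(w) V^T
    print(np.allclose(V.T @ V, np.eye(4)))         # True: eigenvectors orthonormal

    # The SVD of an arbitrary matrix is the same story applied to B^T B and B B^T.
    U, s, Vt = np.linalg.svd(B)
    print(np.allclose(U * s @ Vt, B))              # True: B = U diag(s) V^T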
I once read original works of Sir William Rowan Hamilton and was amazed at the clarity of thought he demonstrated, a truly open mind to understanding nature.
Books by Max Born (Atomic Physics, and Theory of Relativity) are absolute awesomeness.
A mathematical theory of communication (information theory, Claude E. Shannon) [1]
Three approaches to the quantitative definition of information (A. N. Kolmogorov) [2]
And from inductive inference and computational learning theory:
A formal theory of inductive inference (Ray Solomonoff, 1964) [3]
Language identification in the limit (Mark E. Gold, 1967) [4]
Inductive Inference of formal languages from positive data (Dana Angluin, 1980) [5]
A theory of the learnable (PAC learning, Leslie Valiant, 1984) [6]
Occam's Razor (Blumer et al, 1987) [7]
Bonus: a deep learning paper
Long Short-Term Memory (Hochreiter and Schmidhuber, 1997)
The first two papers - well, one launched information theory and the other is Kolmogorov's paper where he introduced the idea of Kolmogorov complexity.

The second batch of papers starts with Solomonoff's inductive inference papers, kinda important if you want to learn things from other things. Mark Gold's paper proves that it is impossible to learn a non-finite automaton from examples. Dana Angluin's follow-up extends this with learnability results about various classes of CFGs. Any time someone claims that their deep neural net has learned a CFG, point them to these two papers.
Valiant's paper is the theory of machine learning as we know it today. It basically relaxes the assumptions made in inductive inference and introduces the notion of error: if you can't learn some concept perfectly, what degree of error is likely from some set of training data? Blumer's paper discusses a further bound on that amount of error that follows from an Occamist bias (simplest truths are better) and is a basis for understanding overfitting (error increases as the hypothesis space does).
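(For concreteness, here's the flavor of bound involved - this is the standard finite-hypothesis-class statement, my paraphrase rather than a quote from either paper:)

    # Sketch (my own, the standard PAC bound for a finite hypothesis class H,
    # not a quote from Valiant or Blumer et al.): a consistent learner needs
    # roughly m >= (1/eps) * (ln|H| + ln(1/delta)) examples to reach error
    # <= eps with probability >= 1 - delta. Note how m grows with the size of H.
    import math

    def pac_sample_bound(hypothesis_count, eps, delta):
        return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / eps)

    # e.g. boolean conjunctions over 20 variables: |H| = 3**20
    print(pac_sample_bound(3**20, eps=0.05, delta=0.01))   # -> 532 examples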
These two sets of papers probably look disconnected - but learning is compression. Compression with generalisation, I guess. Anyway, no, they're not unrelated.
The final paper is the one that introduced LSTMs and the, er, "constant error carousel" (the solution to vanishing gradients, which alone makes the paper worth reading).
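(A minimal sketch of one LSTM step - my own numpy rendering rather than the paper's notation, and note the forget gate below is a slightly later addition, not in the 1997 paper:)

    # Minimal LSTM step in numpy (my own sketch). The key point is the additive
    # cell-state update: gradients flow through c largely unchanged, which is
    # the "constant error carousel".
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, U, b):
        """One time step; W, U, b stack the four gates along the first axis."""
        n = h_prev.shape[0]
        z = W @ x + U @ h_prev + b        # all gate pre-activations at once
        i = sigmoid(z[0:n])               # input gate
        f = sigmoid(z[n:2*n])             # forget gate (post-1997 addition)
        o = sigmoid(z[2*n:3*n])           # output gate
        g = np.tanh(z[3*n:4*n])           # candidate cell state
        c = f * c_prev + i * g            # additive update: the carousel
        h = o * np.tanh(c)                # hidden state read from the cell
        return h, c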
These are papers that one must read, and read carefully, if they're interested in machine learning. They're not even "old" papers - more like essential ones.
I'm totally omitting a whole bunch of others, obviously.
http://math.harvard.edu/~ctm/home/text/others/shannon/entrop...