One course I've been going through lately is the MIT course on Abstract Interpretation ( http://web.mit.edu/16.399/www/ ). Programmers tend to break execution into "cases" to reason about it; e.g.: when writing a routine to reverse a string, you might think about the cases where the string is of even or odd length. Abstract Interpretation lets you capture this kind of reasoning precisely, and thereby automate it.
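Here is a minimal sketch of that even/odd reasoning done as an abstract interpretation over a toy parity domain (the domain and operations are my own illustration, not taken from the course):

```python
# Abstract interpretation over a parity domain: values are abstracted to
# 'even', 'odd', or 'top' (unknown), and arithmetic is performed on the
# abstraction instead of on concrete numbers.
EVEN, ODD, TOP = "even", "odd", "top"

def abstract(n):
    """Abstract a concrete integer to its parity."""
    return EVEN if n % 2 == 0 else ODD

def add(a, b):
    """Abstract addition: even+even and odd+odd are even; mixed is odd."""
    if TOP in (a, b):
        return TOP
    return EVEN if a == b else ODD

def mul(a, b):
    """Abstract multiplication: anything times even is even."""
    if EVEN in (a, b):
        return EVEN
    if TOP in (a, b):
        return TOP
    return ODD

# 'Casing' on parity without ever running concrete numbers:
print(add(abstract(3), abstract(5)))   # odd + odd is even
print(mul(TOP, abstract(4)))           # unknown * even is still even
```

The point is that the case split a programmer does in their head ("what if the length is even? odd?") becomes a small algebra a machine can execute.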
Olin Shivers's "Control-Flow Analysis of Higher-Order Languages" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.9.39...) is all about using abstract interpretation for compiler analysis. It's focused on Scheme, but is applicable to higher-order languages in general.
MIT's OpenCourseWare is a fantastic project. It's been a while since I was involved with it, but I contacted them a few years ago and helped TeX up some of the notes (for Physics courses, not CS). Anyhow, if you feel inspired or just want to learn a subject even better by reading its notes carefully enough to typeset them, consider contacting OCW and asking if they've got anything available. It's a good experience and it's nice to think about how many people benefit from it.
Why does a HS/freshman-level discrete math course count as "advanced"? Same with that undergrad algorithms course at UIUC (though it does have great lecture notes).
You're right, it doesn't really. When I started building this list it was more carefully given over to graduate-level courses. It evolved into a bit of a repository of courses that aligned with my interests and that I thought were good.
Meh, I think that's more comparing apples to oranges. I really like this list. It's a bit more academic, and, like someone else here said, it's a great resource for those who have a weakness in one subject they'd like to tighten up, or who just want to learn more about what's on the frontier of CS classes these days. The 4chan one seems more like a how-to guide. Just my 2 cents :)
Ha, I'm currently wearing my 251 course staff ("yer sould = pwned") shirt as I type this.
The most recent website is at https://colormygraph.ugrad.cs.cmu.edu/ . One of the TAs used the course as a testing ground for his pedagogy research, which has students grading each other (doing "verifications") as part of their assignment, and he spent a while building the course infrastructure to support it. (We then grade them on their grading.) After a bit of tweaking (e.g., we quickly realized the students would primarily benefit from verifying a couple of problems rather than all of them), it wound up working really well!
"We begin with the notion of a finite probability distribution D, which consists of a finite set S of elements, or samples, where each x in S has a weight, or probability, p(x) in [0,1]."
Sorry, guys. They blew it. That sentence is without a doubt the most mixed up, confused, uninformed, misinformed, just plain wrong mess I've ever seen in what purports to be some important mathematics. We're talking total upchuck here. Don't read that garbage.
(1)
"finite probability distribution"
Likely what he means is a discrete distribution.
(2)
"distribution D, which consists of a finite set S of elements, or samples"
Total nonsense. A "distribution" does NOT consist "of a finite set".
The rest is also nonsense.
He wants to discuss random variables but gets off on distributions far too soon.
Here is a much better way to proceed:
Suppose we perform an experiment and measure some number X. If we do the experiment again, then the number we get for X might be different. We call X a 'real random variable'.
For a real number x, we can consider the probability that X <= x. We write this probability as P(X <= x). We also write F_X(x) = P(X <= x) and call F_X the 'cumulative distribution' of X. [Note: Here F_X borrows from Knuth's TeX notation for F with a subscript X.]
If X takes on only finitely many values, then we might say that X and its cumulative distribution F_X are 'discrete'.
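A hedged illustration of these definitions in Python (the die, the trial count, and the function names are all made up for illustration): X is the value of a fair six-sided die, and repeated trials give an empirical estimate of F_X.

```python
import random
from collections import Counter

# Hypothetical experiment: X is the value of a fair six-sided die.
# Each repetition of the experiment is one trial; the observed
# relative frequencies estimate P(X <= x).
random.seed(0)
trials = [random.randint(1, 6) for _ in range(100_000)]

counts = Counter(trials)
n = len(trials)

def empirical_cdf(x):
    """Estimate F_X(x) = P(X <= x) from the observed trials."""
    return sum(c for v, c in counts.items() if v <= x) / n

for x in range(1, 7):
    print(x, round(empirical_cdf(x), 3))  # should be close to x/6
```

Since X takes only the six values 1..6, X and F_X are 'discrete' in the sense above, and F_X is a step function jumping by about 1/6 at each value.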
Here is a still better way to proceed: We have a non-empty set S (usually denoted by capital omega) of 'trials'. Each experiment we perform is one 'trial' and corresponds to some point s in S (usually a trial is denoted by a lower case omega).
Given a subset A of S, we call A an 'event'. We have a probability P defined on events. The 'probability' of an event A is written P(A) and is a number in [0,1].
If in our experiment we observe a number, that number is a real random variable; call it X. Then X is a function from the set of trials S to the set of real numbers R. So, X: S --> R.
Then for a real number x, there is the event
{s | s is in S and X(s) <= x}
with shorthand notation {X <= x}. That is, we usually suppress mention of a trial s.
Then the probability that X <= x is written
F_X(x) = P(X <= x)
and is the 'cumulative distribution' of X.
For more details, we ask that the set of all events includes S and is closed under complements and countable unions. Usually the set of all events is denoted by script upper case F.
And we ask that for disjoint events A(i), i = 1, 2, ..., the probability of the union of the A(i) is the sum of P(A(i)). That is, we ask that P be 'countably additive'.
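In TeX, the countable-additivity requirement just stated can be written:

```latex
% Countable additivity: for pairwise disjoint events
% A_1, A_2, \ldots in the sigma-algebra of events,
P\Bigl( \bigcup_{i=1}^{\infty} A_i \Bigr) \;=\; \sum_{i=1}^{\infty} P(A_i).
```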
Suppose X is a real random variable with cumulative distribution F_X, and suppose that for some positive integer n and each i = 1, 2, ..., n, Y(i) is a real random variable. Suppose the set
{Y(i) | i = 1, 2, ..., n}
is independent. And suppose for each i, the cumulative distribution of Y(i) is F_X. Then we can regard
{Y(i) | i = 1, 2, ..., n}
as a 'sample' of size n from cumulative distribution F_X.
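As a concrete sketch of drawing such a sample (the choice of F_X here, the exponential CDF with rate 1, is my own illustration, as is the inverse-transform method used to generate it):

```python
import math
import random

# Drawing an i.i.d. sample Y(1), ..., Y(n) from a given cumulative
# distribution F_X via inverse-transform sampling. Here F_X is taken
# to be the exponential CDF F_X(x) = 1 - exp(-x), purely for illustration.
random.seed(1)

def draw_from_exponential():
    # If U is uniform on [0, 1), then -ln(1 - U) has CDF F_X.
    u = random.random()
    return -math.log(1.0 - u)

n = 200_000
sample = [draw_from_exponential() for _ in range(n)]

# Each Y(i) has cumulative distribution F_X and the Y(i) are independent,
# so the sample mean should be near the exponential mean, 1.
print(sum(sample) / n)
```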
Full details are in each of:
M. Loève, 'Probability Theory, I and II, 4th Edition', Springer-Verlag, New York.
Jacques Neveu, 'Mathematical Foundations of the Calculus of Probability', Holden-Day, San Francisco.
Leo Breiman, 'Probability', ISBN 0-89871-296-3, SIAM, Philadelphia.
Kai Lai Chung, 'A Course in Probability Theory, Second Edition', ISBN 0-12-174650-X, Academic Press, New York.
Loève was long at Berkeley, and Neveu and Breiman were among his students. Neveu has long been in Paris, and Breiman has long been at Berkeley. Chung has long been at Stanford. Other experts in such math include Avellaneda at Courant, Bertsekas at MIT, Çinlar at Princeton, Dynkin at Cornell, Karatzas at Columbia, Karr at UNC, Shiryaev at Moscow, Shreve at CMU, Wierman at Johns Hopkins, among others.
This disaster illustrates an important lesson: Computer science is out of gas, that is, doesn't know what to do next. It really has only one promising way out now, and that way is to 'mathematize' the field. So, the progress needs to be essentially applied math. For this progress, computer science needs to know some appropriate math. Basically each person trying to do such work needs a good undergraduate major in pure math together with some selected graduate work in pure and applied math. However, only a tiny fraction of professors of computer science have these prerequisites. Thus their work that needs math is often upchuck as in the example here although usually not quite this bad.
Net, anyone who wants to make progress in computer science should f'get about current academic computer science, study math, and then attack problems in computing as an applied mathematician. Bluntly, the alternative is just upchuck as here. Sorry 'bout that.
Any student trying to learn some topics in math in a computer science department is likely wasting time and money and filling in much needed gaps in his knowledge. To learn math, go to a math department. To learn probability, start with one of the books and/or professors above or equivalents.
15-251 TA here. Probability is one of my favorite topics, but I agree with (this part of) the course's treatment. If we really wanted to do probability the right way, we'd tell them that a random variable is a structure-preserving map between measure spaces. You can't do that with freshmen. Indeed, we deliberately avoided any mention of continuous probability spaces.
Your conclusion may be generally true, but it's definitely off the mark here. The associate math head, who guest-taught a few lectures, has joked that "251 is listed in the wrong department." Many math majors take it to satisfy their discrete math requirement, and it's not because they're being lazy. We frequently borrow homework problems from the USAMO and other olympiad-level math competitions. 251 is far harder than any undergraduate math course at CMU.
251 is far harder than any undergraduate math course at CMU.
I had to register an account just to say one thing ... BULLSHIT.
Math Studies was an order of magnitude harder than 251. Even factoring in the two homeworks and lectures of math studies, it's still a night and day difference. I took both last semester, and halfassed 251 and did well. There's no possible way to do that with math studies.
Just read your profile and realized who you were ... but I still hold to my opinion.
I considered adding an exception, but then I remembered Math Studies is really two classes. I definitely found Math Studies less than twice as hard as 251.
Of course, I might also say that I five-sixths assed Math Studies and three-halves-assed 251.
Maybe I shouldn't have rushed to make that statement. It certainly made graph theory with Mackey look like kindergarten, but I'm now remembering the stories about graph theory with Pikhurko and Set Theory.
P.S.: Hacker News is a proven way to break reddit addictions. Welcome!
I got off easy with doing Set Theory with Greggo. I know that Cummings once taught it ... and assigned 6 homeworks! That class must have been impossible.
EDIT: I just read over their final ... and it looks as hard as either of the math studies algebra finals.
"Probability is one of my favorite topics, but I agree with (this part of) the course's treatment."
This statement is an unfortunate juxtaposition: Glad probability is one of your "favorite topics", but you make serious mistakes right at the beginning and show that you don't know anything significant at all about the subject, not at any level from freshman to the texts I listed.
That you "agree" with that "part of the course's treatment" is absurd, especially after what I wrote. That "treatment" is a total upchuck. You, the professor, and CMU should all be humiliated and ashamed. It would be tough to get worse even from a storefront for-profit 'college'. For probability at CMU, I listed an excellent source, Steve Shreve.
"If we really wanted to do probability the right way, we'd tell them that a random variable is a structure-preserving map between measure spaces."
There you go again. You are digging your hole longer, wider, and deeper. You are spouting nonsense.
A "random variable" is definitely not a "structure-preserving map between measure spaces". Not a chance. You won't find "structure-preserving" mentioned anywhere in the definition of a random variable in any of the four texts I listed. Your "structure-preserving" is just gibberish you got from some source you should not touch and then burn outdoors and then flush.
The most advanced definition, as in the four texts I listed, is that a 'random variable' is a measurable function from a probability space into a measurable space. What I described for the set of all events is called a 'sigma algebra'. Then a 'measurable space' is a non-empty set and a sigma algebra of subsets of it. A function is 'measurable' if for each set in the sigma algebra in the range its inverse image under the function is a set in the sigma algebra of the domain. A 'probability space' is a measurable space and a probability measure on that space, and a 'probability measure' is a non-negative measure with total mass 1. For a 'measure', see any of, say,
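In TeX, the measurability requirement on a random variable X described above reads:

```latex
% A random variable as a measurable function: (S, \mathcal{F}, P) is a
% probability space, and \mathcal{B} is a sigma-algebra on the reals
% (say, the Borel sets).
X : S \to \mathbb{R},
\qquad
X^{-1}(B) \in \mathcal{F} \quad \text{for every } B \in \mathcal{B}.
```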
Paul R. Halmos, 'Measure Theory', D. Van Nostrand Company, Inc., Princeton, NJ.
Walter Rudin, 'Real and Complex Analysis', ISBN 07-054232-5, McGraw-Hill, New York.
H. L. Royden, 'Real Analysis: Second Edition', Macmillan, New York.
I omitted the requirement for a random variable being measurable because for the more important measurable spaces, say, the real numbers with the Borel sets or the Lebesgue measurable sets, finding a function that is not measurable is super tough: The usual construction uses the axiom of choice.
"You can't do that with freshmen."
Well, no one should do "that" with anyone, but your "agree" with that "part of the course's treatment" shows that you have been doing even worse, apparently with freshmen.
You are flatly refusing to take me at all seriously and refuse to 'get it': Your definition of a random variable is sewage, and you are even defending it.
For what to do with freshmen, I gave more than one appropriate, accurate enough, and easy-to-take definition of a random variable. There are also other sources. You very much need to take some such source seriously.
You just won't stop digging your hole longer, wider, and deeper: There you go again with your:
"Indeed, we deliberately avoided any mention of continuous probability spaces."
You are spouting more gibberish from some source that should not be touched and then burned and flushed. Your comments continue to fill much needed gaps in the teaching of probability.
There is no such thing as a "continuous probability" space.
Continuity is based on a topology, and there is not necessarily, and usually is not, any topology on the probability space in question. It is true that the Borel sets are the smallest sigma algebra containing a given topology, usually the 'usual topology' of the real line or Euclidean n-space.
Where continuity enters is in a continuous density or an absolutely continuous cumulative distribution.
In practice what this means is that a 'density' is a continuous function f on, usually, the reals where f is non-negative and its integral is 1. Then the corresponding cumulative distribution F is the 'indefinite' integral of f, in TeX:
F(x) = \int_{-\infty}^x f(t) \; dt
Copy this line into TeX, and the result will look nice, and, more importantly, be correct.
It's crucial to discuss continuous densities in order to discuss random variables with Gaussian, uniform, exponential, chi-squared, etc. distributions. Gaussian is crucial for the central limit theorem. Uniform is crucial for discussing the usual random number generators, e.g., as in the recipes in Knuth's TAOCP that pass the spectral test. The exponential distribution is crucial for discussing arrival processes, e.g., when the next Web page request will arrive at a Web server. Chi-squared is one of the first distributions encountered in elementary statistical tests, e.g., for independence of two random variables. These distributions are commonly taught in first courses in statistics to students in the social sciences. Looks like the CMU computer science students are falling behind the sociology students!
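As a numerical sanity check of the relationship between a density f and its cumulative distribution F (the exponential density with rate 2 is chosen here purely for illustration, and the trapezoid rule is my own choice of quadrature):

```python
import math

# Check that integrating the density recovers the CDF for the
# exponential density f(t) = lam * exp(-lam * t) on t >= 0.
lam = 2.0

def density(t):
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def cdf_numeric(x, steps=100_000):
    """Trapezoid-rule approximation of the integral of f from 0 up to x."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = 0.5 * (density(0.0) + density(x))
    total += sum(density(i * h) for i in range(1, steps))
    return total * h

# Closed form for comparison: F(x) = 1 - exp(-lam * x).
for x in (0.5, 1.0, 3.0):
    print(x, cdf_numeric(x), 1.0 - math.exp(-lam * x))
```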
More details on distributions involve the Lebesgue decomposition and absolute continuity, and you will find solid treatments in each of the four texts I listed along with both Rudin and Royden.
Loève also outlines the bizarre 'singular continuous' case, as I recall, based on the Cantor function.
"Your conclusion may be generally true, but it's definitely off the mark here."
I'm not "off the mark"; I'm dead on target. I gave some good, elementary definitions. Elementary should not mean sewage, but your elementary definition of a random variable is sewage.
"251 is far harder than any undergraduate math course at CMU."
Sounds like the CMU math department doesn't teach, say,
Walter Rudin, 'Principles of Mathematical Analysis, Third Edition', McGraw-Hill, New York.
or the equivalent. Tough to believe.
The material you are teaching is easy, plenty easy enough for a moderately easy course for freshmen. Any difficulty is from the need for students to make sense out of sewage content such as your definition of random variable.
Again, for probability done seriously, see the texts I listed; for good elementary texts, there are many, but I have none at hand; often there are good, elementary treatments of random variables in the better texts on statistics for the social sciences; consider also texts on signals in EE; for probability at CMU, see Shreve.
For your course 251, do everyone a big favor and quit teaching it, burn it, and flush the ashes. Then start with some people who actually know the relevant math and develop a good course. And, really, from the list of topics, the course belongs in a math department. And, the course need not be difficult.
Bluntly, computer science very much needs to know math, especially probability, but generally gets a grade of D or less and on probability, an F.
I listed some world class experts in probability (I learned from one of them), but it is true that probability is not popular in US pure math departments, and a significant fraction of math professors will never have seen the more advanced definition of a random variable I have given here; they may not know the strong law of large numbers, conditional expectation, the definition of a Markov process or a martingale, etc.
You are badly wrong. Many people don't know anything about probability; that you don't is not so bad. But that you spout total nonsense about probability is bad; that you defend this nonsense is much worse. Keep fighting me on this and you will dig a seriously big hole for yourself and CS at CMU.
You want me to write Jerry and advise him that 251 needs to be cleaned up?
Yes, and the definition of a measurable map is a function between two measure spaces which takes measurable sets to measurable sets, i.e.: a structure-preserving map between measure spaces. I stopped reading after that. I have no need to defend my mathematical knowledge, and no time to listen to someone who wants me to.
I sincerely apologize for my above post; I did not realize I was dealing with a troll.
"Yes, and the definition of a measurable map is a function between two measure spaces which takes measurable sets to measurable sets, i.e.: a structure-preserving map between measure spaces."
That's not at all what I wrote. You really don't even know how to read a definition in math, do you? Do you know any math at all?
You are wrong again; a counterexample is trivial to construct.
Here you have no need to defend your knowledge of math in general, just on one point, the definition of a random variable.
You are seriously, flatly wrong mathematically. Name calling and refusing to read won't make your nonsense correct.
Enjoy looking like a fool before the world of computing, forever.
- Systems, especially performance testing. Needs some statistics, but it's not hardcore math. No more than psychology or economics.
- Systems architecture. Not at all mathy. More like the biology of computer systems.
- Best practices. Software engineering. Not really mathy. Not really science either.
- OOP. Theology?
Of course, if you are good at math, then the most mathematical parts of computer science may be the ones that catch your eye. So you equate computer science with math + a bit of fluff. But there are things to study that aren't just math, even in the field of computer science.
Well, one can do what Knuth did in TAOCP. There he did a lot with combinatorial formulas. To make much more progress, one will have to get serious about math. E.g., the leading question in algorithms is just P versus NP, and that is now darned serious math. Don't attack that or even parts of it without a good background in math.
Other new and challenging questions in algorithms promise to need math for progress.
As I look at algorithms in 'advanced computer science', commonly they want to treat optimization. Tilt! Optimization is a huge field from applied math, operations research, and electrical engineering. There is deterministic and stochastic optimal control, Kalman filtering, integer linear programming, and much more. It's darned good applied math, and the math background I outlined is needed.
"numerical analysis"
That's a field of applied math. E.g., quickly get into advanced parts of matrix theory. E.g., consider R. Horn's books. E.g., quasi Newton quickly becomes an exercise in matrix norm theory. Long one of the more important tools in numerical analysis has been functional analysis. Likely the leading reason to pursue numerical analysis is just to get solutions to partial differential equations, and don't go there without a good background in math and likely the corresponding mathematical physics.
For the last two, they are just fields of applied math.
"Data mining and AI"
The way computer science pursues these two, they are nonsense fields. The first should be just mathematical statistics, and for that the background I gave in probability is crucial. E.g., will want to know sufficient statistics, and that is based on the Radon-Nikodym theorem, and that is graduate pure math.
For AI, if someone can write a computer program that really has 'intelligence', fine. If all they use are intuitive ideas, good for them. But so far, the field of AI is very far from this goal in spite of decades of DARPA funding.
For now, if want a system that solves a problem well enough to look 'intelligent', then just engineer the system with the usual role for applied math. I gave a paper at an AAAI IAAI conference with the "25 best applications of AI in the world", and the best applications, really, were just good engineering.
"Systems, especially performance testing. Needs some statistics, but it's not hardcore math. No more than psychology or economics."
For some simple applications, yes, can just borrow statistics from the social sciences.
But for progress, it's back to "hardcore math": E.g., I published a paper on 'performance monitoring', that is, zero day anomaly detection in server farms and networks, and the math was based on some of the more advanced parts of the texts I listed. Basically I found a collection of multivariate, distribution free hypothesis tests. The social sciences have been using univariate distribution free hypothesis tests for over 60 years; my work was apparently the first good progress to multivariate distribution free tests. Multivariate tests are just crucial for analyzing performance data. I used a finite group (from abstract algebra) of measure preserving transformations something like in ergodic theory. My work was similar to some of what Diaconis at Stanford has done with exchangeability in distribution free statistics. This material needs all the background I outlined and more.
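For flavor, here is a sketch of a standard univariate two-sample permutation test, the simplest member of the distribution-free family mentioned above (this is not the multivariate method from the paper, and the data, sample sizes, and replication count are made up for illustration):

```python
import random

# Distribution-free (permutation) two-sample test. Under the null
# hypothesis that both samples come from the same distribution, every
# relabeling of the pooled data is equally likely, so the observed mean
# difference is compared against the permutation distribution.
random.seed(2)
a = [random.gauss(0.0, 1.0) for _ in range(50)]
b = [random.gauss(0.8, 1.0) for _ in range(50)]

def mean(xs):
    return sum(xs) / len(xs)

observed = abs(mean(a) - mean(b))
pooled = a + b

count = 0
reps = 2000
for _ in range(reps):
    random.shuffle(pooled)
    if abs(mean(pooled[:50]) - mean(pooled[50:])) >= observed:
        count += 1

p_value = count / reps  # small p-value: reject equal distributions
print(p_value)
```

No distributional assumptions on the data enter anywhere; only the exchangeability of the labels under the null is used, which is what makes the test distribution free.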
"Systems architecture"
For the future, before we build a large system, we will want the 'architecture' to have some known properties. We do this for bridges, buildings, dams, ships, airplanes, etc. So, we will want to do it for systems. We will want to know about reliability and security, at least. And we may want to 'optimize', that is, get what we need at minimum price.
E.g., consider part of the core of the Internet: Suppose we are given nodes and flows at the nodes. Then our mission, should we decide to accept it, is to connect the nodes at minimum cost to provide desired capacity, performance, and reliability.
Dean of Engineering at MIT T. Magnanti gave a Goldman lecture at Johns Hopkins on this problem; first-cut it's a super tough problem in integer linear programming. Actually, it was such problems in network design, and integer linear programming, that got Bell Labs going on the work that resulted in
Michael R. Garey and David S. Johnson, 'Computers and Intractability: A Guide to the Theory of NP-Completeness', ISBN 0-7167-1045-5, W. H. Freeman, San Francisco, 1979.
which is one of the key books early in the problem P versus NP.
It is common now to throw together a system, have various intuitive approaches to parallelism and redundancy, and then get a problem and see the whole server farm go south. Recall that this is just what happened at Amazon a few weeks ago.
Maybe some people, say, in national security, finance, or air traffic control would like such things not to happen?
A common situation is parallelism that is not more reliable but worse: E.g., there was a parallel transaction processing system. The load balancing sent the next transaction to the least busy computer. Then one day, one of the computers got a little sick, started throwing all its work into the bit bucket, was not very busy, was getting nearly all the transactions, and, thus, killed the whole 'cluster'. Bummer.
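A toy simulation of that failure mode (server count, drain rate, and all other numbers here are invented for illustration): the sick server finishes instantly, always looks idle, and so absorbs a large share of the traffic.

```python
import random

# 'Least busy' load balancing with one server that silently drops work.
random.seed(3)

NUM_SERVERS = 4
SICK = 0                       # server 0 throws its work in the bit bucket
queue_len = [0] * NUM_SERVERS
completed = 0                  # transactions routed to healthy servers
lost = 0                       # transactions routed to the sick server

for _ in range(10_000):
    # Route to the least-busy server, breaking ties at random.
    target = min(range(NUM_SERVERS),
                 key=lambda s: (queue_len[s], random.random()))
    if target == SICK:
        lost += 1              # work vanishes; the sick queue stays empty
    else:
        queue_len[target] += 1
        completed += 1
    # Healthy servers drain their queues slowly.
    for s in range(NUM_SERVERS):
        if s != SICK and queue_len[s] > 0 and random.random() < 0.2:
            queue_len[s] -= 1

print(lost, completed)  # a large fraction of transactions is lost
```

Because the sick server's queue never grows, the 'least busy' heuristic keeps feeding it, which is exactly the cluster-killing behavior described above.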
Something similar happened with Google's e-mail servers.
Instead, we need some theory with some theorems and proofs that provide some guarantees that we are getting the performance and reliability we need. That work will be mathematical; without a good background in math, don't try. And, yes, likely the work will need probability such as I discussed.
"Best practices. Software engineering. Not really mathy. Not really science either."
Not really computer science research either.
"OOP. Theology?"
There have been some connections with category theory, but I'm reluctant to take those seriously.
Mostly OOP in practice is simple and works well for some simple things. Otherwise OOP is not well thought out. And long OOP was a 'theology'.
My old view of programming languages is that they should be designed so that they 'admit' some source code transformations that have some useful properties. E.g., suppose we have such a language with such properties and have two pieces of code. Can we use the transformations to check if the two pieces of code are equivalent? So, right, we will have 'equivalence classes' of code. Some of these will do better on processor time, main memory usage, exploitation of multiple cores, etc. than others. So, with some code and these transformations, we have an optimization problem: Transform the code to equivalent code with the 'resource' properties we need. Sounds like math to me.
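A small sketch of one such meaning-preserving transformation, using constant folding as the illustrative example (the transformation choice is mine; the point is only that the original and transformed programs sit in the same equivalence class while differing in resource use):

```python
import ast

# A tiny source-to-source transformation that preserves meaning:
# fold binary + and * over constant operands.
class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first
        if isinstance(node.left, ast.Constant) and isinstance(node.right, ast.Constant):
            if isinstance(node.op, ast.Add):
                return ast.copy_location(
                    ast.Constant(node.left.value + node.right.value), node)
            if isinstance(node.op, ast.Mult):
                return ast.copy_location(
                    ast.Constant(node.left.value * node.right.value), node)
        return node

source = "y = (2 + 3) * x + 4 * 5"
tree = ast.fix_missing_locations(ConstantFolder().visit(ast.parse(source)))
print(ast.unparse(tree))  # y = 5 * x + 20
```

The transformed code computes the same y for every x but does less work at run time, so among the equivalent programs it is the better representative on processor time.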
"But there are things to study that aren't just math, even in the field of computer science."
The problems are from computer science. But the good solutions are necessarily applied math because our civilization knows no other way to proceed. This situation is much the same as in all fields that have been 'mathematized', especially theoretical physics.
I feel like you may be ignoring special functions, although you did mention NP-completeness. The un-mathematized areas of CS make liberal use of intuitive formulations of feedback systems, and are more or less impervious to analysis. Computer Engineering can tackle some of these, but there is a lot of pressure to swap these intuitively good solutions for worse, but more tractable, ones.
CS can take on, and has taken on, head-on a large number of problems mathematics cannot handle. These are hard to sciencify, which is why the prominent direction of CS seems to instead be expressiveness. You can't take engineering out of computing, and CS seems to be yet another attempt at systems study.
Some progress can only be made using advanced math. But that doesn't mean that no progress can be made without advanced math. That said, I wouldn't recommend a PhD in computer science to anyone afraid of mathematics.
Either way, I think that more advanced math needs to be taught earlier. Otherwise you get big cargo-cult disciplines with their own re-invented wheels and funny terminology, and no interest in math from other fields. Economics in particular.
We tell our 251 students that the class should be called "Some Theoretical Ideas for Computer Science." Last semester, the professor and six of the eight TAs were math majors. We definitely don't disagree with you. You're barking up the wrong tree.
Some of these courses look really interesting. Having just passed my Algorithms Qual it is interesting to see the different ranges of topics covered in equivalent courses.
Nice, but it seems to me that a thorough study of Knuth Vols. 1-4 plus computational mathematics might be a better approach. Then cherry-pick the offered list.
I don't think that's why. I think it's typical linkbait.
Anyone who is serious about learning will follow the obvious trails of references in papers they read, course listings and book requirements for those courses at university websites, Wikipedia links, and so on. Anybody who found HN can find CS courses.
If you'd look closely these resources are grouped around a single topic, and carefully selected by an expert in a highly specialized field. It saves me, personally, a ton of time reading marginal papers and browsing through a clusterfuck of references.
I did look closely at this list. I commented because it's such a typical example of unearned upvote candy.
I could have assembled this list (as a non-computer scientist) in less than an hour just by looking at the usual places I look for lecture notes and following the typical linkbait-maker advice of leaving a brief comment and headings. The author admits s/he is still working through the material and the details on each are minimal, so this curator ≠ expert.
Nor is the field "highly specialized". arxiv/math.QA is specialised. Welding less-than-1cm aluminium for a particular part in powerboat engines is highly specialised. Computer Science Major is an extremely broad area of study.
To me this is exactly a "clusterf_ck of references".
Ok, I should have renamed it from the original title given by the author to "Excellent List of Distributed Systems and Relevant CS Courses With Good Lecture Notes, Assembled by a University of Cambridge PhD in Mobile Distributed Systems, Zookeeper Committer and a Glorious Systems Engineer Who Lives and Breathes Distributed Systems at Cloudera" to get less upvotes.
Just read the rest of his blog. This is not a trivial list of references that you can compile by googling without thorough understanding of the subject matter, trust me, I've been digging this field for too long [1]
Alternatively you can ask hundreds of people who re-tweeted this link[2], bookmarked it on their Delicious [3], or those who study it now, why did they think it was good. Look at their profiles, what percentage of them are dumb link-followers?
Or simply try to compile and post a list of courses with the same title and see how many upvotes/retweets/bookmarks it gets and how quickly it disappears from the front page if it gets there in the first place. HN users are largely linkbait-averse, that's what I like about this community.
helwr, I respect the fact that you're intelligently and politely continuing this debate. Maybe it is not worth your time because of my cynical attitude.
I have very little respect for Quora, HN, retweets, or bookmarks which are accomplished with a click. If it makes you feel better, something I wrote also reached the front page of HN and received an undue number of upvotes, retweets, and social bookmarks by people who I guarantee have not read even 1% of the works I pointed to -- all because the blog post followed the linkbait formula.
I Truly Believe that people bookmark/upvote/retweet things they "intend to read" but never actually read, and that this systematically increases the level of noise masquerading as signal on the web.
Having stated that background - some of the response you got on Quora looks fine; the intelligent professors retweet just as sheeply as everyone else since the private cost is $0; and bookmarks signify nothing more than that this story reached the front page of HN.
Cheers, thank you for the rational back-and-forth.