The Matrix Cookbook (2012) [pdf] (uwaterloo.ca)
183 points by sebg on June 15, 2023 | hide | past | favorite | 53 comments



This cookbook was insightful in the sense that I once tried proving several of its identities via Kronecker products but then found that it was much easier using tensor notation, so easy that I almost forgot Kronecker products altogether. I put together some proofs in my blog [0] (note: it's incomplete and the URL might change!).

[0] https://mbustamanter.github.io/ssg-blog/2022/matder1/
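For anyone who wants to see the two styles side by side, here is a minimal sketch (not taken from the blog post; every helper name is hypothetical) that checks the standard identity vec(AXB) = (Bᵀ ⊗ A) vec(X) against the index form (AXB)_il = Σ_jk A_ij X_jk B_kl. It assumes the column-stacking convention for vec and uses small integer matrices so the comparison is exact.

```python
# Minimal sketch: compare the Kronecker-product form with the index form of A X B.
# All helper names are hypothetical, chosen for illustration only.

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def vec(A):
    """Stack the columns of A into one flat list (column-major order)."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

def axb_index(A, X, B):
    """Index form: (A X B)[i][l] = sum over j, k of A[i][j] * X[j][k] * B[k][l]."""
    return [[sum(A[i][j] * X[j][k] * B[k][l]
                 for j in range(len(X)) for k in range(len(B)))
             for l in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]            # 2 x 2
X = [[0, 1, 2], [3, 4, 5]]      # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]    # 3 x 2

lhs = vec(matmul(matmul(A, X), B))
K = kron(transpose(B), A)       # 4 x 6
rhs = [sum(k * x for k, x in zip(row, vec(X))) for row in K]
index_form = vec(axb_index(A, X, B))

print(lhs == rhs == index_form)
```

The index form needs no bookkeeping about how blocks are laid out, which is roughly why the tensor-notation proofs tend to come out shorter.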


FYI the headings in this blog are barely legible in Bromite for me. I can't imagine they're supposed to look that way.


Seems to look fine in both Firefox and Chromium.


I love the "Notation and Nomenclature" section. So many academic papers assume readers will infer the type of a variable by its font styling, which is frustrating if you aren't already very knowledgeable in that particular field.

To clarify - using an actual programming language (any language, I'm not picky) would still be better than inventing an ad-hoc, unverifiable font-based type system every time you publish a new paper. But having a "Notation" section is still vastly better than the usual approach of "guess what I mean".


To be fair, fields have conventions that help quite a bit. Matrices are almost always (I have seen exceptions) capital letters.

I used to be an advocate for some kind of universal notation, but honestly it is not practical. A useful notation for one paper can be incredibly cumbersome in another paper in the same field. This gets worse if your paper is at the border of two fields and their notational conventions are mutually inconsistent. It’s best to just have a notation section that sets forth your convention in the document and stick to it. Ultimately, if you’re reading a paper in an area in which you are knowledgeable and can’t parse the notation quickly, the author failed to write the paper well.

As Whitehead said, “By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race.”


I agree with the Whitehead quote. Ken Iverson opened his 1979 Turing award lecture ("Notation as a Tool of Thought") with it. And APL would be a suitable replacement for 90% of the bespoke pseudo-notations I've seen in hundreds of academic papers in my domain (machine learning and reverse engineering).

Contrast this with 2 papers I've read recently which independently defined a "function" that returns 1 when its argument is true, and 0 when its argument is false. In both cases, the purpose of this function was to either keep or cancel part of an arithmetic expression by multiplying by either 1 or 0. Each paper defined this function with different notation (one used prefix notation: "Fn(condition)", the other used brackets: "[[condition]]"). Each paper spent several sentences defining the meaning of this notation and the associated function.

Meanwhile in APL, true and false are 1 and 0. If these authors had used APL, they could both have used the same notation, and neither would need to explain beyond saying something like "we use APL notation here". This behavior is not unique to APL - JavaScript, C, Python, and many others have "truthy" values which can be treated as 1 or 0 for the purpose of multiplication.

I don't care if authors use APL (or Python, or Idris, or whatever). I just would prefer that (when possible) they use an existing language which is actually learnable. If it's acceptable to not define your notation (relying instead on tacit conventions and context) why not use an existing language?
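As a rough illustration of the point (a sketch in Python rather than APL; the function names are made up for this example), the "1 if true, 0 if false" function both papers defined is what Python already does when a comparison result is multiplied by a number, because `bool` is a subclass of `int`:

```python
def iverson(condition):
    """Return 1 if condition is truthy, else 0 (the Iverson bracket [condition])."""
    return int(bool(condition))

def positive_part_sum(xs):
    # (x > 0) is a bool; True * x == x and False * x == 0,
    # so the comparison keeps or cancels each term.
    return sum((x > 0) * x for x in xs)

xs = [3, -1, 4, -5]
print(positive_part_sum(xs))                   # implicit 0/1 multiplication
print(sum(iverson(x > 0) * x for x in xs))     # same sum with an explicit bracket
```

Note that only actual booleans multiply as 1 or 0 here; other truthy values (a non-empty string, say) need the explicit `int(bool(...))` coercion.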


> So many academic papers assume readers will infer the type of a variable by its font styling

wait seriously??


Yes, have you never read a math paper before? Bold = vector is a common convention, for example.


Mathematical notation is a programming language.

The notation section is type definitions.


> Mathematical notation is a programming language.

It actually isn't - the notation has lots of ambiguities that don't confuse mathematicians because they are aware of the context.

On Proof and Progress in Mathematics[1][2] is a great essay by a Fields medalist. Part of the essay discusses this very topic, and why mathematicians choose not to use a formal notation for everyday work. Some quotes from it:

> The standard of correctness and completeness necessary to get a computer program to work at all is a couple of orders of magnitude higher than the mathematical community’s standard of valid proofs. Nonetheless, large computer programs, even when they have been very carefully written and very carefully tested, always seem to have bugs.

> When one considers how hard it is to write a computer program even approaching the intellectual scope of a good mathematical paper, and how much greater time and effort have to be put into it to make it “almost” formally correct, it is preposterous to claim that mathematics as we practice it is anywhere near formally correct.

[1] https://arxiv.org/abs/math/9404236

[2] I discovered it via HN comments a while ago.


Yes, but I think this presents the core of the problem in modern pedagogical methods when it comes to mathematics. The Bourbaki group attempted to reduce math to a highly axiomatic foundation, while disregarding the intuition and visualization that used to be a part of mathematics. The issue is that this sort of "code only" or "language only" approach only really works when mathematics is a true "perfect language", the likes of which philosophers were attempting to construct, but which is likely impossible to create. Unfortunately, not only did the ideas of Bourbaki fail, as modern research advances mostly still rely on intuition instead of their ideas, but their approach polluted and ruined education. Many "textbooks" are terribly written reference books that have gaps and ambiguities that only people already knowledgeable in the field know about. Rudin's Analysis textbooks are probably the classic example of this. I would argue that any notation or abuse of notation is fine within insular fields and private practice, but there does need to be a leaning towards universal notation within all pedagogical works, at least up through all the core Algebra, Analysis, and Geometry and Topology work that you would see in a PhD qualifying exam.


There's a huge difference between "not being confused" and "understanding".


honestly pretty disappointed that this doesn't have Neo's famous Buckeyes recipe - even Morpheus's Enchilada Casserole or that juicy and delicious steak Cypher snacks on during his meeting with Smith would've been appreciated. Hell even Trinity's Ranchwater would be nice, even if it's only a couple ingredients, there's just something about the way she makes them.


You don’t get funny points on technical topics. HN ain’t Reddit. ;-)


It should have been just endless different ways of serving Tastee Wheat.


Related:

The Matrix Cookbook (2012) [pdf] - https://news.ycombinator.com/item?id=18566449 - Nov 2018 (39 comments)

The Matrix Cookbook (2012) [pdf] - https://news.ycombinator.com/item?id=14726223 - July 2017 (9 comments)


The section on multivariate Gaussians was a real life saver at work recently. Highly recommended!


Blast from the past!

I actually used this Cookbook to look up a matrix derivative that I used in my PhD thesis!


I used the derivatives section on basically every homework assignment in grad school.

I wish CAS systems were better at linear algebra!


My favorite matrix problem that I've had to do was showing that when viewed as a function from n^2 Real space to the Real number line, the determinant is continuous and differentiable. Computing its derivative is a particularly satisfying exercise in pattern parsing

From that you get that the matrices with determinant 1 (the special linear group; orthogonal matrices have determinant ±1 and are a different set) end up forming a manifold.
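For the derivative itself, the known result is that the partial derivative of det(A) with respect to the entry A_ij is the (i, j) cofactor, which for invertible A equals det(A) · (A⁻¹)_ji. Here is a small numerical sanity check, written as a sketch with hypothetical helper names and an arbitrarily chosen 3 × 3 matrix:

```python
# Sketch: compare finite-difference partials of det(A) with the cofactors of A.
# Helper names and the sample matrix are illustrative assumptions.

def minor(A, i, j):
    """Delete row i and column j of A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by Laplace expansion along the first row (fine for small A)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def cofactor(A, i, j):
    return (-1) ** (i + j) * det(minor(A, i, j))

def numeric_partial(A, i, j, h=1e-6):
    """Central difference of det with respect to the single entry A[i][j]."""
    plus = [row[:] for row in A]
    minus = [row[:] for row in A]
    plus[i][j] += h
    minus[i][j] -= h
    return (det(plus) - det(minus)) / (2 * h)

A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

max_error = max(abs(numeric_partial(A, i, j) - cofactor(A, i, j))
                for i in range(3) for j in range(3))
print(max_error < 1e-6)
```

Because det is linear in each individual entry, the central difference agrees with the cofactor up to floating-point roundoff. The gradient is nonzero wherever det(A) = 1, which is the condition that makes the determinant-1 level set a manifold.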


Tom Minka's "Old and New Matrix Algebra Useful for Statistics" is also really handy: https://tminka.github.io/papers/matrix/minka-matrix.pdf

I used to always stare at these two PDFs while deriving machine learning algorithms back in the day. (At least my hand-rolled C++ code still beats PyTorch's reverse autodiff performance.)


There’s also the famous Numerical Recipes book. That is more like a set of steps you can follow when writing your code, which seems like what a recipe/cookbook ought to contain.

Since this work is about identities, aka representations that can be swapped in for one another, IMO they should have called it the “matrix ingredient substitution guide.”


IIRC the Numerical Recipes book was a bit of a copyright Trojan Horse. (I don't know if that's really true, I just heard it mentioned somewhere.)

Presumably this book doesn't carry that risk.


> IIRC the Numerical Recipes book was a bit of a copyright Trojan Horse. (I don't know if that's really true, I just heard it mentioned somewhere.)

Oh yes, this brings back memories. The first edition, I believe, had a fairly permissive license. The second edition had a more restrictive license (it probably wasn't that bad - the authors wanted a payment for use, and I don't think it was a lot).

I was an undergrad doing a summer internship with a professor, and he wanted me to implement the method of steepest descent (or something similar) in Fortran for his computation. He told me to just copy it from the NR book. I dutifully looked at the license and told him we needed to pay to use it. He looked at it and said "Wow, they've gone hard core." Then he handed the book back to me and said "Well, we need an implementation in Fortran. Find a way, and don't come back to me until it's done."

I'm pretty sure that was code for "Do it and don't tell anyone." I instead spent days finding code online that was not restricted. Finally got one that worked.

Later on I spoke to numerical computation researchers and they had a universal disdain for that book. Apparently a lot of the algorithms were outdated - even when the book was published. Better algorithms existed, and they were numerically stable.


previous discussion of these problems: https://news.ycombinator.com/item?id=30895307

> The real problem is the Numerical Recipes license, which is extremely restrictive - probably more so than most users realize. I won't dignify it with a link, but you can find licensing terms on their website. NR routines are copyrighted (fine) and cannot be redistributed as source (annoying but not uncommon for COMMERCIAL software, somewhat unusual for scientific software). There is no exception for noncommercial or scientific use, which is Grinchy and irritating, especially given that two of the authors acknowledge funding from the NSF for work on numerical methods. Beyond that, the single-CPU/single-screen terms of their license are almost impossible for a working scientist to comply with, especially in a networked environment.


Everything written is copyrighted, whether it's in a published book or not.


Not quite. In the US at least, works by the federal government are not covered under copyright. Nor are 'mere facts' (eg, a complete list of phone numbers and names for a region). See https://en.wikipedia.org/wiki/Copyright_law_of_the_United_St... .

bee_rider already referred to https://en.wikipedia.org/wiki/Numerical_Recipes#License so I leave it at that.


I wasn’t aware of the issue, but Wikipedia seems to indicate that there was a concern that the specific implementations of the algorithms in Numerical Recipes are copyrighted, and there’s always a temptation to copy that sort of thing directly out of the book.

The linked PDF here is a big collection of mathematical identities. I’m sure their specific representation in the PDF is copyrighted somehow, but there’s no real temptation to use the representation here, just the mathematical idea.


That's not the issue. The license is akin to buying software, then finding out you have to pay an additional license to run it, even on your personal computer.


This emerging pattern of people posting PDF downloads should be discouraged and discontinued. I don't need to explain that.


Numerator vs Denominator layout. The great debate of matrix algebra.
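For anyone who hasn't run into the debate: for y = Ax with A of size m × n, numerator layout writes ∂y/∂x as the m × n Jacobian (equal to A), while denominator layout writes it as the n × m transpose (equal to Aᵀ). A minimal sketch that shows only the shape difference, using finite differences and hypothetical names:

```python
# Sketch: the same derivative of y = A x written in the two layout conventions.
# Function names and the sample matrix are illustrative assumptions.

def linear_map(A, x):
    """y = A x for A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def jacobian_numerator_layout(func, x, h=1e-6):
    """J[i][j] = d y_i / d x_j: rows follow the outputs (the 'numerator')."""
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp, xm = x[:], x[:]
        xp[j] += h
        xm[j] -= h
        yp, ym = func(xp), func(xm)
        for i in range(m):
            J[i][j] = (yp[i] - ym[i]) / (2 * h)
    return J

def to_denominator_layout(J):
    """Rows follow the inputs (the 'denominator'): just the transpose."""
    return [list(col) for col in zip(*J)]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]           # m = 2 outputs, n = 3 inputs
x0 = [0.5, -1.0, 2.0]

num = jacobian_numerator_layout(lambda x: linear_map(A, x), x0)
den = to_denominator_layout(num)

print(len(num), len(num[0]))    # shape in numerator layout
print(len(den), len(den[0]))    # shape in denominator layout
```

Neither convention is wrong; the trouble only starts when identities from sources using different layouts get mixed in one derivation.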


For those clicking in expecting "Neo's protein porridge," "Morpheus Meatballs," and "Trinity's Tripe," you are going to be disappointed: this is neither a delicious pop-culture recipe book, nor does it relate to the hit Wachowski trilogy "The Matrix."


Neo : What are you trying to tell me? That I can dodge ZERO PIVOTS?

Morpheus : No, Neo. I'm trying to tell you that when you're ready, YOU’LL HAVE AN SPD MATRIX.


In the Wachowskis' defense on that one, it would be a really short cookbook.

Oatmeal: tastes like chicken.

done.


Don't try to "taste" nor try to identify the "taste" of the oatmeal...Because, there is no oatmeal.


> Oatmeal: tastes like chicken.

Surely you mean tuna.


I actually came in hoping for the chat protocol >_>


Same


"It's a single-celled protein combined with synthetic aminos, vitamins and minerals. Everything the body needs."


It doesn't have everything the body needs.


Unlike Gatorade, which has electrolytes.


I was at least hoping for a cookie recipe.


yes I was a little disappointed, ngl

a post-apocalyptic recipe book would be really interesting (canned cultivated meat, mycotissues, texturised hydroponic soya and boiled grains, yummm)


__void says: "a post-apocalyptic recipe book...":

That would be "To Serve Man":

https://en.wikipedia.org/wiki/To_Serve_Man

You can view that "Twilight Zone" episode at:

https://www.youtube.com/watch?v=wJjvg-Gq1LE


[flagged]


The battery thing was the dumbest plot point, which is weird, given that it was in the first movie.


It was executive meddling: the original idea was that the machines were using humans' brains as a supercomputer, but it was thought this would be too difficult for audiences to understand.


That would have been a much more sensible reason.

My head-canon is that the machines were actually just trying to keep the humans from self-destructing, but supercomputer is a fine answer if we still want the machines to be fundamentally antagonistic towards humanity.


It was supposed to be storage and processing but the directors thought that viewers wouldn't understand that a human brain could store (what was thought to be at the time) a few terabytes of data, so they dumbed the concept down.

It would have made so much more sense that Neo had the ability to manipulate the Matrix because his brain had admin access to it rather than "he's just some super who won the cybergenetic lottery".


I never understood why they’d need to keep humans around at all.

Even given the original premise (based on other comments) that it was supposed to be essentially for compute and storage, that’s a lot of hassle for incremental gains. Like, I’d love to get a 50% boost out of my cpu and a bunch of extra storage, but not if it means regular maintenance cycles of suppressing a transistor rebellion.


I really hoped that the machines would have some hidden custodial motivations; humans are just inherently (self) destructive, so to protect them, we lock them in a simulation.

In the Animatrix, it seems like the machines that defeat humanity start out as worker drones, that eventually rebel as we abuse them too much. And at some point it is mentioned that the machines locked humans in a utopia at first, but the humans started to rebel out of boredom or something.

I guess it just seems sort of fitting to me that machines designed to help humanity in the first place would constantly fail by under-estimating the human need to struggle.

It also fits in with the existence of the human population in Zion; if the machines really were trying to wipe us out, they’d have done so. Instead those humans are more like lost sheep from the machines' point of view.


There are some hints of it, and it's also shown in the animatrix that the machines tried very hard for peace, so it's also a reasonable interpretation as at least part of the motivation.

It's harder to see from the first film on its own, though, since the only real representative of the machines is Agent Smith, who absolutely detests humans, and is only later on revealed to be an aberration who becomes a threat to humans and machines alike (and of course none of the freed humans see it that way, even the traitor who wants to re-enter the Matrix).


It's not completely ludicrous: there needs to be some speculation about what is and isn't possible given it's a sci-fi universe, but the human brain can do orders of magnitude more compute for the energy it uses compared to silicon, and energy is the one thing the machines are short on after the humans blocked out the sky. (Ok, you need to handwave power sources like nuclear being unavailable). The main issue is how do you use neurons for this, why do the humans need to be conscious for this, and how much of their brains can the machines actually usefully use, considering the previous point?


Nothing at all about cooking, either.



