omg just had a look and this one is just everything I hate about mathematics and academia.
Starts with lots of random definitions, remarks, axioms and new notation, while completely disregarding any introduction to what it's supposed to do, explain or help with.
All self-aggrandizement by creating complexity, with zero intuition or simplification. Isn't there anybody close to being the Feynman of Linear Algebra?
Yeah, a good example is on the second page of the first chapter:
> Remark. It is easy to prove that zero vector 0 is unique, and that given v ∈ V its additive inverse −v is also unique.
This is the first time the word "unique" is used in the text. Students are going to have no idea whether this is meant in some technical sense or just conventional English. One can imagine various meanings, but that doesn't substitute for real understanding.
This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous. On the surface the opposite is true - you complain, for instance, that the text jumps immediately into using technical language without any prior introduction or intuition building. My take is that intuition building doesn't need to replace or preface the use of formal precision, but that what is needed is to bridge concepts the student already understands and has intuition for to the new concept that the student is to learn.
In terms of intuition building, I think it's probably best to introduce vectors via talking about Euclidean space - which gives the student the possibility of using their physical intuitions. The student should build intuition for how and why vector space "axioms" hold by learning that fundamental operations like addition (which they already grasp) are being extended to vectors in Euclidean space. They already instinctively understand the axiomatic properties being introduced, it's just that the raw technical language being thrown at them fails to connect to any concept they already possess.
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous.
The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor. Especially in comparison to the average programmer.
They are conceptually/abstractly rigorous, but in "implementation" are incredibly sloppy. But they've been in that world so long they can't really see it / just expect it.
And if you debate with one long enough, they'll eventually concede and say something along the lines of "well math evolved being written on paper and conciseness was important so that took priority over those other concerns." And it leaks through into math instruction and general math text writing.
Programming is forced to be extremely rigorous at the implementation level simply because what is written must be executed. Now engineering abstraction is extremely conceptually sloppy and if it works it's often deemed "good enough". And math generally is the exact opposite. Even for a simple case, take the number of symbols that have context-sensitive meanings: mathematicians will use them without declaring which context they are using, and a reader is simply supposed to infer correctly. It's actually somewhat funny because it's not at all how they see themselves.
> The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor. Especially in comparison to the average programmer.
Not sure why you say that. Mathematicians are pretty open about it. The well-known essay "On Proof and Progress in Mathematics" discusses it. It is written by a Fields medalist.
This drove me mad when I had to do introductory maths at uni. Maths as written did seem pretty sloppy and not at all like a programming language whose expressions I could parse as I expected. Obviously most simple algebra looks about as you'd expect, but I clearly recall feeling exactly what you describe in some cases, and I asked the lecturer during a tutorial why it was that way. He humoured me, was a good guy.
But I think mathematicians probably have a point - it did evolve that way over a long time and anyone practicing it daily just knows how to do it and they're not going to do a thorough review and change now.
It's us tourists that get thrown for a loop, but so it goes. It's not meant for us.
> Maths as written did seem pretty sloppy and not at all like a programming language whose expressions I could parse as I expected.
Look at Lean's mathlib: that's what fully formal mathematics looks like. It's far too verbose to be suitable for communicating with other people; you might as well try to teach an algorithms course transistor by transistor.
You’re confusing syntax and semantics. Programmers write code for syntax machines (Turing machines). The computers care a lot about syntax and will halt if you make an error. They do not care at all about semantics. A computer is happy to let you multiply a temperature in Fahrenheit times a figure in Australian dollars and subtract the volume of the earth in litres, provided that these numbers are all formatted in a compatible enough way that they can be implicitly converted (this depends on the programming language but many of them are quite liberal at this).
If you want the computer to stop you from doing such nonsense, you’ve got to put in a lot of effort to make types or contracts or write a lot of tests to avoid it. But that’s essentially a scheme for encoding a little bit of your semantics into syntax the computer can understand. Most programmers are not this rigorous!
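To make that concrete, here's a minimal Python sketch (the `Fahrenheit` and `AUD` wrappers are purely illustrative) of encoding a little semantics into syntax so a static checker like mypy objects, even though the bare floats underneath would combine happily at runtime:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fahrenheit:
    value: float

@dataclass(frozen=True)
class AUD:
    value: float

def add_temperatures(a: Fahrenheit, b: Fahrenheit) -> Fahrenheit:
    # The arithmetic is still plain float addition; only the type annotations
    # carry the "these are temperatures" semantics.
    return Fahrenheit(a.value + b.value)

body_temp = Fahrenheit(98.6)
lunch = AUD(12.50)

add_temperatures(body_temp, body_temp)   # fine
# add_temperatures(body_temp, lunch)     # mypy: argument 2 has incompatible type "AUD"
#                                        # (the check is static; at runtime nothing stops you)
```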
Mathematicians, on the other hand, write mathematics for other humans to read. They expect their readers to have done their homework long before picking up the paper. They do not have any time to waste in spelling out all the minutiae, much of which is obvious and trivial to their peers. The sort of formal, syntax-level rigour you prefer, which can be checked by computers, is of zero interest to most mathematicians. What matters to them, at the end of the day, is making a solid enough argument to convince the establishment within their subfield of mathematics.
But programmers are expected to get the semantics right. Sure, it happens to mismatch temperatures and dollars, but it’s called a bug and you will be expected to fix it
Why do mathematicians hide their natural way of thinking? They provide their finished work and everyone is supposed to clap. Why can't they write long articles about false starts, dead ends and so on? It's only after magazines like Quanta and YouTube channels that we get to feel the thinking process. Math is not hard. At least the mathematics we are expected to know.
Mathematics is extremely hard. The math people are expected to know for high school is not hard, but that is such a minuscule amount of math compared to what we (humans) know, collectively.
Mathematicians do speak and also write books about the thinking process. It’s just very difficult and individualized. It’s a nonlinear process with false starts and dead ends, as you say.
But you can’t really be told what it feels like. You have to experience it for yourself.
>Even for a simple case, take the number of symbols that have context-sensitive meanings: mathematicians will use them without declaring which context they are using, and a reader is simply supposed to infer correctly.
Yes!! Like, oh, you didn't know that p-looking thing (rho) means Pearson's correlation coefficient? That theta means an angle? Well just sit there in ignorance because I'm not going to explain it. And those are the easy ones!
My experience with the average programmer is...different from yours. The software development field is exceptionally bad in this regard. Physicists are mathematically sloppy sometimes (why, yes, I will just multiply both sides by `dy` and take as many liberties with operators, harmonics/series, and vector differential operations as I care to, thanks).
Mathematics, like any other academic field, has jargon (and this includes notation, customary symbols for a given application, etc.), and students of the field ought to learn the jargon if they wish to be proficient. On the other hand, textbooks meant to instruct ought to teach the jargon. It's been forever since I've opened a mathematics textbook; I don't recall any being terribly bad in this regard.
Well I have a different approach. Sometimes I write and hack it to solve a particular problem. The code might be elegant or not, but if you understand the problem you can probably grok the code.
Next I generalize it a bit. Specific variables become configurable parameters. Something that happened implicitly or with a single line of code gets handled by its own function. Now it's general but makes much less sense at first because it's no longer tied to one problem, but a whole set of problems. It's a lot less teachable and certainly not self-evident any more.
The problem with math education is that we think the second, generalized approach would be inherently superior to the first, and would make a better textbook—because it's more generic. But that is not how real people learn—they would all "do" math the first way. By taking away the ability of the student to do the generalization themselves we are depriving them of the real pleasure of programming (or math).
Maybe back when paper was scarce this approach made sense but not any more.
Ideally I would love to present a million specific solutions and let them generalize themselves. That is exactly how we would train an ANN: not by regurgitating the canned solution but by giving it all sorts of data and letting it generalize for itself. So why don't we do this for human students? When it comes to education I think people have a blind spot towards how learning is actually done.
Notation and terminology? Sure, some explanations and mechanical manipulations are elided in mathematics because with context they're clear.
Rigor? Ha, you have got to be kidding me. Math is rigorous to a fault. Typical computer programming has no rigor whatsoever in comparison. A rigid syntax is not rigor. Syntax, in turn, is certainly not the difficult part of developing software.
This, really. Sometimes, when reading math papers, you find that they end up being very hand-wavy with the notation, e.g. with subscripting, because "it's understood". But without extensive training in mathematics, a lot of it is not understood.
Programmers will use languages with a lot of syntactic sugar, and without knowing the language, code can be pretty difficult to understand when it is used. But even then, you can't be sloppy, because computers are so damn literal.
> The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor.
The "refuse" part is imo very dependent on the person. Nearly all of my professors for theoretical CS courses just plainly said that their "unique" notation is just their way because they like it.
It's more or less just a simple approach to alter the language to fit your task. This is also not unfamiliar to the programmers who may choose a language based on the task, with, e.g. Fortran for vector based calculus or C for direct hardware access.
Bro I know this feeling. Even books teaching algorithms written by mathematicians are full of errors.
They never state the type or class, no comments, no explanation, reads that exceed the last index... the list can go on endlessly. When they say "let's declare an empty set for variable T", you don't know whether the thing is a list, set, tuple, ndarray, placeholder for a scalar, or a graph.
Some even provide actual code, but never actually run the code to verify its correctness.
Try this guy then: he's got a PhD in mathematics from the California Institute of Technology, with a thesis on Finite Semifields and Projective Planes, but he's written a bunch of stuff on algorithms and will write you a check for any errors you find in his work: https://en.wikipedia.org/wiki/Donald_Knuth
Church organist actually - serious enough to have a two story pipe organ built into his own home.
Enough of the No True Scotsman .. it's clear as day from the introduction and opening chapters of TAOCP that he approaches programming as a mathematician.
I believe that any mathematician who took a differential geometry class must have already realized this: the notation is so compressed and implicit that some proofs are practically "by notation", as it can be a daunting prospect to expand a dozen indexes.
The average computer scientist (not only "programmer", as a JS dev would be) never wrote Lean/Coq or similar, and is not aware of the Curry-Howard-like theorems and their implications.
I think you entirely missed the point. GP put it well:
>> They are conceptually/abstractly rigorous, but in "implementation" are incredibly sloppy.
Maturity in concept-space and the ability to reason abstractly can be achieved without the sort of formal rigor required by far less abstract and much more conceptually simple programming.
I have seen this first hand TAing and tutoring CS1. I regularly had students who put off their required programming course until senior year. As a result, some were well into graduate-level mathematics and at the top of their class but struggled deeply with the rigor required in implementation. Think about, e.g., missing semi-colons at the end of lines, understanding where a variable is defined, understanding how nested loops work, simple recursion, and so on. Consider something as simple as writing a C/Java program that reads lines from a file, parses them according to a simple format, prints out some accumulated value from the process, and handles common errors appropriately. Programming requires a lot more formal rigor than mathematical proof writing.
You have a valid point, which is that we are not even being rigorous enough about the meaning of the word “rigor” in this context.
- One poster praises how programming needs to be boiled down into executable instructions as “rigor,” presumably comparing to an imaginary math prof saying “eh that sort of problem can probably be solved with a Cholesky decomposition” without telling you how to do that or what it is or why it is even germane to the problem. This poster has not seen the sheer number of Java API devs who use the Spring framework every day and have no idea how it does what it does, the number of Git developers who do not understand what Git is or how it uses the filesystem as a simple NoSQL database, or the number of people running on Kubernetes who do not know what the control plane is, do not know what etcd is, no idea of what a custom resource definition is or when it would be useful... If we are comparing apples to apples, “rigor” meaning “this person is talking about a technique they have run across in their context and rather than abstractly indicating that it can be used to fix a problem without exact details of how it does that, they know the technique inside and out and are going to patiently sit down with you until you understand it too,” well, I think the point more often goes to the mathematician.
- Meanwhile you invoke correctness and I think you mean not just ontic correctness “this passed the test cases and happens to be correct on all the actual inputs it will be run on” but epistemic correctness “this argument gives us confidence that the code has a definite contract which it will correctly deliver on,” which you do see in programming and computer science, often in terms of “loop invariants” or “amortized big-O analysis” or the like... But yeah most programmers only interact with this correctness by partially specifying a contract in terms of some test cases which they validate.
That discussion, however, would require a much longer and more nuanced discussion that would be more appropriate for a blog article than for an HN comment thread. Even this comment pointing out that there are at least three meanings of rigor hiding in plain sight is too long.
>> Programming requires a lot more formal rigor than mathematical proof writing.
> This is just wrong? Syntax rigour has almost nothing to do with correctness.
1. It's all fine and well to wave your hand at "Syntax rigour", but if your code doesn't even parse then you won't get far toward "correctness". The frustration with having to write code that parses was extremely common among the students I am referring to in my original post -- it seemed incidental and unnecessary. It might be incidental, but at least for now it's definitely not unnecessary.
2. It's not just syntactic rigor. I gave two other examples which are not primarily syntactic trip-ups: understanding nested loops and simple recursion. (This actually makes sense -- how often in undergraduate math do you write a proof that involves multiple interacting inductions? It happens, but isn't a particularly common item in the arsenal. And even when you do, the precise way in which the two inductions proceed is almost always irrelevant to the argument because you don't care about the "runtime" of a proof. So the fact that students toward the end of undergraduate struggle with this isn't particularly surprising.)
Even elementary programming ability demands a type of rigor we'll call "implementation rigor". Understanding how nested loops actually work and why switching the order of two nested loops might result in wildly different runtimes. Understanding that two variables that happen to have the same name at two different points in the program might not be referring to the same piece of memory. Etc.
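A small, hedged illustration of the loop-order point (NumPy, with timings that will vary by machine): the two traversals below do identical arithmetic, but the column-wise one walks memory with a large stride and is typically several times slower on a C-contiguous array.

```python
import time
import numpy as np

n = 4000
a = np.random.rand(n, n)   # C-contiguous: each row sits contiguously in memory

start = time.perf_counter()
by_rows = sum(a[i, :].sum() for i in range(n))   # inner work reads contiguous memory
t_rows = time.perf_counter() - start

start = time.perf_counter()
by_cols = sum(a[:, j].sum() for j in range(n))   # same arithmetic, strided reads
t_cols = time.perf_counter() - start

print(f"by rows: {t_rows:.3f}s  by columns: {t_cols:.3f}s  sums agree: {np.isclose(by_rows, by_cols)}")
```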
Mathematical maturity doesn't traditionally emphasize this type of "implementation rigor" -- even a mathematician at the end of their undergraduate studies often won't have a novice programmer's level of "implementation rigor".
I am not quite sure why you are being so defensive on this point. To anyone who has educated both mathematicians and computer scientists, it's a fairly obvious point and plainly observable out in the real world. Going on about Curry-Howard and other abstract nonsense seems to wildly and radically miss this point.
Having taught both I get what you are saying, but the rigor required in programming is quite trivial compared to that in mathematics. Writing a well structured program is much more comparable to what is involved in careful mathematical writing. It's precisely the internal semantic coherence and consistency, rather than syntactic correctness, that is hardest.
You need more rigour to prove, let's say, the Beppo Levi theorem than to write a moderately complex piece of software.
Yet you can write it in crappy English; the medium is not the point. The ideation process, even poorly transcribed into English, needs to be perfectly rigorous. Otherwise, you proved nothing.
> Syntax rigour has almost nothing to do with correctness.
I see your point: has almost nothing correctness with rigour do to Syntax.
Syntax rigor has to do with correctness to the extent that "correctness" exists outside the mind of the creator. Einstein notation is a decent example: the rigor is inherent in the definition of the syntax, but to a novice, it is entirely under-specified and can't be said to be "correct" without its definition being incorporated already...which is the ultimate parent post's point and I think the context in which the post-to-which-you're-replying needs to be taken.
And if you're going to argue "This is just wrong?" (I love the passive-aggressive '?'?) while ignoring the context of the discussion...QED.
but programmers don't write underspecified notational shortcuts, because those are soundly rejected as syntax errors by the compiler or interpreter
this is not about semantics (like dependent types etc) this is just syntax. it works like this in any language. the only way to make syntax accepted by a compiler is to make it unambiguous
... maybe LLMs will change this game and the programming languages of the future will be allowed to be sloppy, just like mathematics
Yep, but for any notational shortcut the compiler infers a single thing for it. It's still unambiguous as far as computers are concerned (but it may be confusing reading it without context)
It’s not only a notational shortcut as in syntactic sugar though.
It can apply a set of rewrite rules given you were able to construct the right type at some point.
It’s type inference on steroids because you can brute force the solution by applying the rewrite rules and other tactics on propositions until something (propositional equality) is found, or nothing.
> Remark. It is easy to prove that zero vector 0 is unique, and that given v ∈ V its additive inverse −v is also unique.
I'm sorry, but this book is meant for an audience who can read and write proofs. Uniqueness proofs are a staple of mathematics. If the word "unique" throws you off, then this book is not meant for you.
I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra regardless of what book or course you're trying to learn from. The purely computational approach to mathematics used up through high school (with the oddball exception of Euclidean geometry) and many introductory calculus classes can't really go much further than that.
Or, you know, mathematics can be viewed as a powerful set of tools…
Somehow I seem to remember getting through an engineering degree, taking all the optional extra math courses (including linear algebra), without there ever being a big emphasis on proofs. I’m sure it’s important if you want to be a mathematician, but if you just want to understand enough to be able to use it?
> taking all the optional extra math courses (including linear algebra), without there ever being a big emphasis on proofs
Sorry to break it to you, but you didn't take math classes. You took classes of the discipline taught in high school under the homonymous name "math". There is a big difference.
It's the same difference as there is between what you get taught in grade school under the name "English" (or whatever is the dominant language where you live): the alphabet, spelling, pronunciation, basic sentence structure... And what gets taught in high school under the name "English": how to write essays, critically analyze pieces of literature, etc. The two sets of skills are almost completely unrelated. The first is a prerequisite for the second (how can you write an essay if you can't write at all?), so somehow the two got the same name. But nobody believes that winning a spelling bee is the same type of skill as writing a novel.
I know it's a shock to everyone who enters a university math course after high school. Many of my 1st year students are confounded about the fact that they'll be graded on their ability to prove things. They expect the equivalent of cooking recipes to invert matrices, compute a GCD, solve a quadratic equation, or whatever, and balk at anything else. I want them to understand logical reasoning, abstract concepts, and the difference between "I'm pretty sure" and "this is an absolute truth". There's a world of difference, and most have to wait a few years to develop enough maturity to finally get it.
> Sorry to break it to you, but you didn't take math classes. You took classes of the discipline taught in high school under the homonymous name "math". There is a big difference.
If you look at the comments below, you’ll see that this can’t be strictly true. At least, not 20+ years ago in Australia when I was a student. Some of the courses I took were in the math faculty with students who were going on to become mathematicians. At that time this would have been a quarter load of a semester, and was titled “Linear Algebra”, but I can’t remember if it was 1st/2nd or even 3rd year subject (it’s been too long).
Perhaps the lack of emphasis on proofs (I am not saying proofs were absent, I made another comment with more explanation), was a combination of these being introductory courses, the universities knowledge that there were more than just math faculty students taking them, or changes with time in how the pedagogy has evolved.
What is more interesting to me, is what do you think a student misses out on, from a capability point of view, with an applications focused learning as opposed to one focused on reading and writing proofs?
Would a student who is not intending to become a mathematician still benefit from this approach? Would a middle aged man who was taught some “Linear Algebra” benefit from picking up a book such as the one referenced here?
> What is more interesting to me, is what do you think a student misses out on, from a capability point of view, with an applications focused learning as opposed to one focused on reading and writing proofs?
The generalizable value is not so much in collecting a bunch of discrete capabilities (they're there, but generally somewhat domain-specific) as it is in developing certain intuitions and habits of thought: what mathematicians call "mathematical maturity". A few examples:
- Correcting trivial errors in otherwise correct arguments on the fly instead of getting hung up on them (as demonstrated all over this comment section).
- Thinking in terms of your domain rather than however you happen to be choosing to represent it at the moment. This is why math papers can be riddled with "syntax errors" and yet still reach the right conclusions for the right reasons. These sorts of errors don't propagate out of control because they're not propagated at all: line N+1 isn't derived from line N: conceptual step N+1 is derived from conceptual step N, and then they're translated into lines N+1 and N independently.
- Tracking, as you reason through something, whether your intuitions and heuristics can be formalized without having to actually do so.
- More generally, being able to fluently move between different levels of formality as needed without suffering too much cognitive load at the transitions.
- Approaching new topics by looking for structures you already understand, instead of trying to build everything up from primitives every time. Good programmers do the same, but often fail to generalize it beyond code.
> Would a student who is not intending to become a mathematician still benefit from this approach?
If they intend to go into a technical field, absolutely.
> Would a middle aged man who was taught some “Linear Algebra” benefit from picking up a book such as the one referenced here?
Depends on what you're looking for. If you want to learn other areas of math, linear algebra is more or less a hard requirement. If you want to be able to semiformally reason about linear algebra faster and more accurately, yes. If you just want better computational tricks, drink deep or not at all: they're out there, but a fair bit further down the road.
The sibling comment answered most of what you wrote. So, I'll just add that I'm talking about the present day, not 20+ years ago. I don't know about your experience in Australia 20+ years ago, but I'm teaching real students, today, who just got out of high school, in Western Europe. Not hypothetical students 20 years ago in Australia. And based on what the Australian colleagues I met at various conferences told me, their teaching experience in Australia today isn't really different from mine.
FWiW I started out in Engineering and transferred out to a more serious mathematics | physics stream.
The Engineering curriculum as I found it was essentially rote for the first two years.
It had more exams and units than any other courses (including Medicine and Law, which tagged in pretty close) and included Chemistry 110 (for Engineers) in the Chemistry Department, Physics 110 (for Engineers) in the Physics Department, Mathematics 110 (for Engineers) in the Mathematics Department, and Tech Drawing, Statics & Dynamics, Electrical Fundamentals, etc. in the Engineering Department.
All these 110 courses for Engineers covered "the things you need to know to practically use this information" .. how to use Linear Algebra to solve loading equations in truss configurations, etc.
These were harder than the 115 and 130 courses that were "{Chemistry | Math | Physics} for Business Majors" etc. that essentially taught familiarity with subjects so you could talk with the Engineers you employed, etc.
But none of the 110 courses got into the meat of their subjects in the same way as the 100 courses, which were taught to instruct people who intended to really master Maths, Physics, or Chemistry.
Within a week or two of starting first year university I transferred out of the Maths 110 Engineering unit and into Math 100, ditto Chem and Physics. Halfway through second year I formally left the Engineering curriculum altogether (although I later became a professional Engineer .. go figure).
The big distinction between Math 100 vs. Math 110 was that the 110 course didn't go into how anything "worked"; it was entirely about how to use various math concepts to solve specific problems.
Math 100 was fundamentals, fundamentals, fundamentals - how to prove various results, how to derive new versions of old things, etc.
Six months into Math 100 nothing had been taught that could be directly used to solve problems already covered in Math 110.
Six months and one week into Math 100 and suddenly you could derive for yourself from first principles everything required to be memorised in Math 110 and Math 210 (second year "mathematics for engineers").
I'm incredulous that a linear algebra course taught by mathematics faculty didn't have a lot of theorem proving.
Maybe that would be the case if the intended audience is engineering students. But for mathematics students, it would literally be setting them up for failure; a student who can't handle or hasn't seen much theorem-proving in linear algebra is not going to go very far in coursework elsewhere. Theorem proving is an integral part of mathematics, in stretching and expanding tools and concepts for your own use.
Maybe the courses are structured so that mathematics students normally go on to take a different course. In that case, GP's point would still have been valid - the LA courses you took were indeed ones planned for engineering, not for those pursuing mathematics degrees. At my alma mater, it was indeed the case that physics students and engineering students were exposed to a different set of course material for foundational courses like linear algebra and complex analysis.
Just like compiler theory, if you don't write compilers maybe it's not that useful and you shouldn't be spending too much time on it, but it would be presumptuous to say that delivering a full compiler course is a fundamentally incorrect approach, because somebody has to make that sausage.
I can only speak to my own experiences, but the math courses were not customised for engineering students. I sat next to students who were planning to become mathematicians. Linear Algebra was an optional course for me.
Having said that, I’m sure theorem proving was part of it (this was many years ago), I just don’t recall it as being fundamental in any sense. I’m sure that has something more to do with the student than the course work. I liked (and like), maths, but I was there to build my tool chest. A different student, with a different emphasis, would have gotten different things out of the course.
But I think my viewpoint is prevalent in engineering, even from engineers who started with a math degree. The emphasis on "what can I do with this" relegates theorem proving to annoying busywork.
I can second this: in my Engineering degree the Linear Algebra course (and the Calculus course) were both taught by the Math Faculty at my Uni.
The textbook we used was "Linear Algebra and Its Applications" by David C. Lay. 20 years later I still keep this textbook with me at my desk and consult it a few times a year when I need to jog my memory on something. I consider it to be a very good textbook even if it doesn't contain any rigorous proofs or axioms...
Engineers can learn linear algebra from an engineering perspective, i.e. not emphasizing proofs, and that’s fine, but the books being discussed are not intended for that audience.
I don't know whom to agree with. Maybe there need to be two tracks, and it might not even depend on discipline, but just personal preference. Do you love math as an art form, or as a problem solving tool? Or both?
I went back and forth. I was good at problem solving, but proofs were what made math come alive for me, and I started college as a math major. Then I added a physics major, with its emphasis on problem solving. But I would have struggled with memorizing formulas if I didn't know how they were related to one another.
Today, K-12 math is taught almost exclusively as problem-solving. This might or might not be a realistic view of math. On the one hand, very few students are going to become mathematicians, though they should at least be given a chance. On the other hand, most of them are not going to use their school math beyond college, yet math is an obstacle for admission into some potentially lucrative careers.
At my workplace, there's some math work to be done, but only enough to entertain a tiny handful of "math people," seemingly unrelated to their actual specialty.
> I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra regardless of what book or course you're trying to learn from.
Engineers frequently need to learn some fairly advanced mathematics. Are you suggesting they can’t use the same textbooks?
By the way; I don’t think the original poster is wrong as such (every similar textbook is undoubtedly full of proofs), I’m just suggesting a different viewpoint. Not everyone learning Linear Algebra is intending to become a mathematician.
Most engineers don’t learn linear algebra, they learn a bit of matrix and vector math in real Euclidean spaces. They don’t learn anything about vector spaces or the algebra behind all of it. What they learn would only be considered “advanced mathematics” 200 years ago, when Gauss first developed matrices.
You can see with a quick skim that the content is very application focused. I just don’t know enough to know what I don’t know. If one were to learn Linear Algebra using this textbook, would it be a proper base? Would you have grasped the fundamentals?
It covers most of the topics covered in a first course in linear algebra, it’s just very application-specific. It has some basic proofs in the exercises, but nothing overly difficult, and the more involved proofs give you explicit instructions on the steps to take.
There is a chapter on abstract vector spaces and there are a few examples given besides the usual R^n (polynomials, sequences, functions) but there is almost no time spent on these. There is also no mention of the fact that the scalars of a vector space need not be real numbers; that you can define vector spaces over any number field.
There is only a passing discussion of complex numbers (as possible Eigenvalues and in an appendix) but no mention of the fact that vector spaces over the field of complex numbers exist and have an even more well-developed theory than for real numbers.
But more fundamental than a laundry list of missing or unnecessary topics is the fact that it’s application focused. Pure mathematics courses are proof and theory focused. So they cover all the same (and more) topics in much richer theoretical detail, and they teach you how to prove statements. Pure math students don’t just learn how to write proofs in one or two courses and then move on; all of the courses they take are heavily proof-based. Writing proofs, like programming, is a muscle that benefits from continued exercise.
So if you’re studying (or previously studied) science or engineering and learned all your math from that track, switching to pure math involves a bunch of catch up. I’ve met plenty of people who successfully made the switch, but it took a concerted effort.
There seems to be a fundamental difference in mindset between the “applications” based learning of mathematics, and this pure math based version. Are there benefits to be had for a person that only intends to use mathematics in an applied fashion?
This depends on who you ask. Personally, I found studying pure math incredibly rewarding. It gave me the confidence to be able to say that I can look at any piece of mathematics and figure it out if I take the time to do so.
I can't speak for those who have only studied math at an applied level directly. My impression of them (as an outsider but also a math tutor) is that they are fairly comfortable with the math they have been using for a while but always find new mathematics daunting.
I have heard famous mathematicians describe this phenomenon as "mathematical maturity" but I don't know if this has been studied as a real social/educational phenomenon.
Are there courses/books on "applied linear algebra"? You are right in some sense, but wrong in some sense. Linear algebra at a 100 level without any really deep understanding is still incredibly useful. Graphics (i guess you sort of call this out), machine learning etc.
Most university math curriculums have a clear demarcation between the early computation-oriented classes (calculus, some diff eq.) and later proof-oriented classes. Traditionally, either linear algebra or abstract algebra is used as the first proof-oriented course, but making that transition to proof-based math at the same time as digesting a lot of new subject matter can be brutal, so many schools now have a dedicated transition course (often covering a fair bit of discrete mathematics). But there's still demand for textbooks for a linear algebra course that can serve double-duty of teaching engineering students a bag of tricks and give math students a reasonably thorough treatment of the subject.
>I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra
I can't write a proof to save my life, but I'm going to keep using linear algebra to solve problems and make money, nearly every day. Sorry!
We had this discussion about Data Science years ago: "you aren't a real Data Scientist unless you fully understand subjects X, Y, Z!"
Now companies are filled to the brim with Data Scientists who can't solve a business problem to save their life, and the companies are regretting the hires. Nobody cares what proofs they can write.
There are (at least) two different things we're calling "linear algebra" here; roughly speaking, one is building the tools and one is using the tools.
The mathematicians need to understand the basics of mathematical proofs to learn how to prove new interesting (and sometimes useful) stuff in linear algebra. You have to do the math stuff in order to come up with some new matrix decomposition or whatever.
The engineers/data scientists/whatever people just need to understand how to use them.
You don't need to know how to build a car to drive one. The mathematicians are building the cars, you're using them.
I don't think I've ever done more rote manual calculation than for my undergrad linear algebra class! On tests and homework just robotically inverting matrices, adding/subtracting them (I think I even had to do some of that in high school algebra), multiplying them (yuck). It was tedious and frustrating and anything but theoretical.
I've learned linear algebra course quality varies substantially. One acquaintance whom I met after they graduated from a big university in Canada reported having to do things like by-hand, step-by-step reduced row echelon form computations for 3x4 matrices or larger. I had to do such things in "Algebra 2" in junior high (9th grade), until our teacher kindly showed us how to do the operations on the calculator and stopped demanding work steps. If we had more advanced calculators (he demoed on some school-owned TI-92s, convincing me to ask for a TI-89 Titanium for Christmas) we could use the rref() function to do it all at once.
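For anyone curious, that by-hand drudgery is a single call in SymPy today. A sketch with an arbitrary 3x4 matrix; `Matrix.rref()` returns the reduced matrix together with the pivot columns:

```python
from sympy import Matrix

# An arbitrary 3x4 matrix, e.g. an augmented system, just for illustration.
A = Matrix([
    [1, 2, -1, 3],
    [2, 4,  0, 2],
    [1, 1,  1, 0],
])

R, pivot_cols = A.rref()
print(R)           # reduced row echelon form, computed exactly over the rationals
print(pivot_cols)  # indices of the pivot columns
```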
In my actual linear algebra class in freshman year college we were introduced to a lot of proper things I wish I had seen before, along with some proofs but it wasn't proof heavy. I did send a random email to my old 9th grade teacher about at least introducing the concept of the co-domain, not just domain and range, but it was received poorly. Oh well. (There was a more advanced linear algebra class but it was not required for my side. The only required math course that I'd say was proof heavy was Discrete Math. An optional course, Combinatorial Game Theory, was pretty proof heavy.)
Linear algebra is usually a required (or at least strongly encouraged) course for an undergraduate degree in basically any engineering discipline, and it is usually not preceded by a course in "the basics of mathematical proofs".
> this book is meant for the audience who can read and write proofs
It seems like the opposite is true:
"It is intended for a student who, while not yet very familiar with abstract reasoning, is willing to study more [than a] "cookbook style" calculus type course."
(from the link).
If your point is that one can't learn linear algebra before learning "abstract [mathematical] reasoning"...I don't think you're describing the main target audience of a subject as practical as linear algebra.
> Besides being a first course in linear algebra it is also supposed to be a first course introducing a student to rigorous proof, formal definitions---in short, to the style of modern theoretical (abstract) mathematics.
So I think it's fair to say that the book (ought to) assume zero knowledge of proofs, contra your parent's claim that the audience is expected to be able to read and write proofs.
From the second paragraph of the introduction to the book we are discussing:
> Besides being a first course in linear algebra it is also supposed to be a first course introducing a student to rigorous proof, formal definitions---in short, to the style of modern theoretical (abstract) mathematics.
So it's certainly meant to be the first math book one sees in their life that discusses rigorous proofs.
A vector space is defined as having a zero vector, that is, a vector v such that for any other vector w, v + w = w.
Saying the zero vector is unique means that only one vector has that property, which we can prove as follows. Assume that v and v’ are zero vectors. Then v + v’ = v’ (because v is a zero vector). But also, v + v’ = v’ + v = v, where the first equality holds because addition in a vector space is commutative, and the second because v’ is a zero vector. Since v’ + v = v’ and v’ + v = v, v’ = v.
We have shown that any two zero vectors in a vector space are in fact the same, and therefore that there is actually only one unique zero vector per vector space.
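Written as a single chain of equalities, the argument above is just:

```latex
\begin{aligned}
v' &= v + v'  && \text{($v$ is a zero vector)} \\
   &= v' + v  && \text{(commutativity of addition)} \\
   &= v       && \text{($v'$ is a zero vector)}
\end{aligned}
```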
We used this in my Discrete Mathematics class (MATH 2001 @ CU Boulder) (it is a pre-requisite for most math classes). The section about truth tables did overlap a bit with my philosophy class (PHIL 1440 Critical Thinking)
> The above statement of zero vector is unique, I have no idea what is that means.
In isolation, nothing. (Neither does the word “vector”, really.) In the context of that book, the idea is more or less as follows:
Suppose you are playing a game. That game involves things called "vectors", which are completely opaque to you. (I'm being serious here. If you've encountered some other thing called "vectors", forget about it—at least until you get to the examples section, where various ways to implement the game are discussed.)
There’s a way to make a new vector given two existing ones (denoted + and called “addition”, but not the same as real-number addition) and a way to make a new vector given an existing one and a real number (denoted by juxtaposition and called “multiplication”, but once again that’s a pun whose usefulness will only become apparent later) (we won’t actually need that one here). The inner workings of these operations in turn are also completely opaque to you. However, the rules of the game tell you that
1. It doesn’t matter in which order you feed your two vectors into the “addition” operation (“add” them): whatever existing vectors v and w you’re holding, the new vector v+w will turn out to be the same as the other new vector w+v.
2. When you “add” two vectors and then “add” the third to the result, you’ll get the exact same thing as when you “add” the first to the “sum” of the second and third; that is, whatever the vectors u, v, and w are, (u+v)+w is equal to u+(v+w).
(Why three vectors and not four or five? It turns out that if you have the rule for three, you can prove the corresponding rules for four, five, and so on, even though there are going to be many more ways to place the parens there. See Spivak's "Calculus" for a nice explanation, or if you like compilers, look up "reassociation".)
3. There is [at least one] vector, call it 0, such that adding it to anything else doesn’t make a difference: for this distinguished 0 and whatever v, v+0 is the same as v.
Let’s now pause for a moment and split the last item into two parts.
We’ll say a vector u deserves to be called a “zero” if, whatever other vector we take [including u itself!], we will get it back again if we add u to it; that is, for any v we’ll get v+u=v.
This is not an additional rule. It doesn’t actually tell us anything. It’s just a label we chose to use. We don’t even know if there are any of those “zeros” around! And now we can restate rule 3, which is a rule:
3. There is [at least one] “zero”.
What the remark says is that, given these definitions and the three rules, you can show, without assuming anything else, that there is exactly one “zero”.
(OK, what the remark actually says is that you can prove that from the full set of eight rules that the author gives.
But that is, frankly, sloppy, because the way rule 4 is phrased actually assumes that the zero is unique: either you need to say that there's a distinguished zero such that for every v there's a w with v + w equal to that zero, or you need to say that for every v there's a w such that v + w is a zero, possibly a different one for each v. Of course, it doesn't actually matter!—there can only be one zero even before we get to rule 4. But not making note of that is, again, sloppy.
This kind of sloppiness is perfectly acceptable among people who have seen this sort of thing before, say, have done finite groups or something like that. But if the book is supposed to give a first introduction, this seems like a bad idea. Perhaps a precalculus course of some sort is assumed.
Read Spivak, seriously. He’s great. Not linear algebra, though.)
Yes, and this book: https://openlibrary.org/books/OL28292750M (no opinion about differences between the editions, this is the final one). Yes, it’s a doorstopper, but unlike most other people’s attempts at the kitchen sink that is the traditional calculus course, it actually affords proper respect to the half-dozen or so different subjects whose basics are crammed in there. The discussion of associativity I was referring to is in Chapter 1, so perhaps it can serve as a taster.
I think a lot of people just need an opportunity to see math demonstrated in a more tangible way.
For example, I learned trig, calculus, and statistics from my science classes, not from my math classes (and that's despite getting perfect A's in all of my math classes). In math class, I was just mindlessly going through the motions and hating every second of it, but science classes actually taught me why it worked and showed me the beauty and cleverness of it all.
I think most college math depts have "applied math" majors. I like both sides of math, but I found it incredibly frustrating when I would try to study just the equations for that chapter, only to be tested on a word problem. The whole "trying to trick you" conspiracy turned me off to college in general. If I'm trying to teach someone how to do something, I would show them "A, then B, and you get C", then assign a variety of homework of that form, and on the test, say "A, then B, then _____", and they would be correct if they concluded C. But for some reason this method isn't used much in university. If I wanted to teach a student how to start with C and deconstruct it into A, B, that's what I would have taught them!
If you study mathematics at a rigorous level then you learn by writing proofs. Then you will rack your brain for hours or even days trying to figure out how to prove some simple things. It is not at all “going through the motions” at that point!
A first course in linear algebra still assumes background information, because linear algebra is not a basic topic. It’s not meant to be a first course in math. Math builds on itself and it would be incredibly inconvenient if every course everywhere would have to include a recap of basic things. And proofs are among the most fundamental things in math!
Programming courses or articles or books, beyond the 101 level, don’t teach you again and again the basics of declaring a variable and writing a loop either! No field does that.
Wrt linear algebra in particular, there are plenty of resources aimed at programmers thanks to its relevance in computer graphics and so on. They typically skip proofs and just tell you that this is how matrix multiplication is defined, but they don’t teach you math, merely using math. Which can be plenty enough to an engineer.
This is not snobbery, some subjects just have prerequisites.
You can't learn computer science without having a good sense of what an "algorithm" is, you have to know how to read and write and understand algorithms. Similarly you can't learn math without having a good sense of what a proof is, reading, writing and understanding proofs is the heart of what math is.
Even more strongly, trying to learn math without a solid understanding of how proofs work is something like studying English literature while refusing to learn how to read English.
> Even more strongly, trying to learn math without a solid understanding of how proofs work is something like studying English literature while refusing to learn how to read English.
It depends why you're trying to learn math. Are you interested in math for math's sake, or are you trying to actually do something with it?
If it's the former, then yeah, you need proofs. Otherwise, like in your analogy, it's like studying English literature without knowing any english grammar rules.
But if you're trying to apply the math, if you're studying linear algebra because it's useful rather than for its own sake, then you don't need proofs. To follow the same analogy, it's like learning enough English to be conversational and get around America, without knowing what an "appositive" is.
The software industry, similarly, is full of people who make use of computer science concepts without having rigorously studied computer science. You can't learn true "computer science" without an understanding of discrete math, but you can certainly get a job as an entry-level SWE without one. You don't need discrete math to learn Python, see that it's useful, and do something interesting with it.
The same applies to linear algebra. Not everyone who does vector math needs to be able to prove that the tools they are using work. If everyone who does vector math is re-deriving their math from first principles, then something's gone terribly wrong. There's a menu of known, well-defined treatments that can be applied to vectors, and one can read about them and trust that they work without having proven why they work.
EDIT: it occurs to me, an even stronger analogy of this point, is that it is entirely possible to study computer science, without having any understanding of electrical engineering or knowing how a transistor works.
> But if you're trying to apply the math, if you're studying linear algebra because it's useful rather than for its own sake, then you don't need proofs. To follow the same analogy, it's like learning enough English to be conversational and get around America, without knowing what an "appositive" is.
Sure, but then you're not studying math, you're studying applications of math or perhaps you're even studying some other subject like engineering which is built on top of applications of math.
To add an extra analogy to the pile, it's like learning to drive a car vs learning how to build a car. Sure, it's completely valid to learn how to drive a car without knowing how to build one, but no one says they're learning automotive engineering when they're studying for their driving test. It's a different subject.
Mathematicians are well aware of complaints like these about introductions to their subjects, by the way.
It is for a reason that this book introduces the theory of abstract vector spaces and linear transformations, rather than relying on the crutch of intuition from Euclidean space. If you want to become a serious mathematician (and this is a book for such people, not for people looking for a gentle introduction to linear algebra for the purposes of applications) at some point it is necessary to rip the bandaid of unabstracted thinking off and engage seriously with abstraction as a tool.
It is an important and powerful skill to be presented with an abstract definition, only loosely related to concrete structures you have seen before, and work with it. In mathematics this begins with linear algebra, and then with abstract algebra, real analysis and topology, and eventually more advanced subjects like differential geometry.
It's difficult to explain to someone whose exposure to serious mathematics is mostly on the periphery that being exposed forcefully to this kind of thinking is a critical step to be able to make great leaps forward in the future. Brilliant developments of mathematics like, for example, the realisation that "space" is an intrinsic concept and geometry may be done without reference to an ambient Euclidean space begin with learning this kind of abstract thinking. It is easy to take for granted the fruits of this abstraction now, after the hard work has already been put in by others to develop it, and think that the best way to learn it is to return back to the concrete and avoid the abstract.
The point of starting with physical intuition isn't to give students a crutch to rely on, it's to give them a sense of how to develop mathematical concepts themselves. They need to understand why we introduce the language of vector spaces at all - why these axioms, rather than some other set of equally arbitrary ones.
This is often called "motivation", but motivation shouldn't be given to provide students with a reason to care about the material - rather the point is to give them an understanding of why the material is developed in the way that it is.
To give a basic example, high school students struggle with concepts like the dot and cross products, because while it's easy to define them, and manipulate symbols using them, it's hard to truly understand why we use these concepts and not some other, e.g. the componentwise product (a_1 * b_1, a_2 * b_2, ...).
While it is a useful skill to be adroit at symbol manipulation, students also need an intuition for deciding which way to talk about an unfamiliar or new concept, and this is an area in which I've found much of mathematics (and physics) education lacking.
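As a small, hedged sketch of the kind of motivation I mean (NumPy, with throwaway numbers): the dot product is the particular combination of components that recovers lengths and angles, something a student can check numerically long before seeing the proof.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

dot = np.dot(a, b)                                   # 3*4 + 4*3 = 24
cos_theta = dot / (np.linalg.norm(a) * np.linalg.norm(b))
angle_deg = np.degrees(np.arccos(cos_theta))

print(dot, cos_theta, angle_deg)                     # 24.0  0.96  ~16.26 degrees
```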
Physical intuition isn’t going to help when you’re dealing with infinite-dimensional vector spaces, abstract groups and rings, topological spaces, mathematical logic, or countless other topics you learn in mathematics.
Not at all! I fully endorse learning. My point is that physical intuition will only get you so far in mathematics. Eventually you have to make the leap to working abstractly. At some point the band-aid has to come off!
You just visualize 2 or 3 and say "n" or "infinite" out loud. A lot of the ideas carry over with some tweaks, even in infinite dimensions. Like spectral theorems mostly say that given some assumption, you have something like SVD.
Now module theory, there's something I don't know how to visualize.
>... rather than relying on the crutch of intuition from Euclidean space
Euclidean space is not a good crutch, but there are other, much more meaningful, crutches available, like (orthogonal) polynomials, Fourier series etc. Not mentioning any motivations/applications is a pedagogical mistake IMO.
I think we need some platform for creating annotated versions of math books (as a community project) - that could really help.
On that of course I agree, but mathematicians tend to "relegate" such things to exercises. This tends to look pretty bad to enthusiasts reading the book, because the key examples aren't explored in detail in the main text; but those exercises become the foundation of learning for people taking a structured course, so it's a bit of a disconnect when just reading the PDF. When you study such subjects in structured courses, 80%+ of your engagement with the subject will be in the form of exercises exploring exactly the sorts of things you mentioned.
Axler serves as an adequate first introduction to linear algebra (though it is intended to be a second, more formal, pass through the subject; think analysis vs. calculus), but it isn't intended to be a first introduction to all of formal mathematics! A necessary prerequisite is understanding some of the formal language used in mathematics; what "unique" means is included in that.
Falling entirely back on physical intuition is fine for students who will use linear algebra only in physical contexts, but linear algebra is often a stepping stone towards more general abstract algebra. That's what Axler aims to help with, and with arbitrary (for instance) rings there isn't a nice spatial metaphor to help you. There you need to have developed the skill of looking at a definition and parsing out what an object is from that.
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous.
This is _precisely_ the opinion of Roger Godement, French mathematician and member of the Bourbaki group.
I would highly recommend his books on Algebra. They are absolutely uncompromising on precision and correctness, while also being intuitive and laying down all the logical foundations of their rigor.
Overall, I cannot recommend enough the books of the Bourbaki group (esp. Dieudonné & Godement). They are a work of art in the same sense that TAOCP is for computer science.
Unfortunately, some of the Bourbaki books need to be read in French, because the typesetting on the English translations is so atrocious as to be unreadable; as a consolation, the typesetting on the original French is, as always, immaculate.
I absolutely agree about additional rigor and precision making math easier to learn. Only after you're familiar with the concepts can you be more lazy.
That's the approach taken by my favorite math book:
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous. On the surface the opposite is true - you complain, for instance, that the text jumps immediately into using technical language without any prior introduction or intuition building. My take is that intuition building doesn't need to replace or preface the use of formal precision, but that what is needed is to bridge concepts the student already understands and has intuition for to the new concept that the student is to learn.
If you read the book in the original post you may find it's absolutely for you.
Axler assumes you know only the real numbers, then starts by introducing the commutative and associative properties and the additive and multiplicative identities of the complex numbers[1]. Then he introduces fields and shows that, hey look, we have already proved that the real and complex numbers are fields because we've established exactly the properties required. Then he goes on to F^n, lists of n elements of an arbitrary field F (so F could be either the real or the complex numbers), and proves the same properties (commutativity, associativity, and identities) there.
Then he moves onto vectors and then onto linear maps. It's literally chapter 3 before you see [ ] notation or anything that looks like a matrix, and he introduces the concept of matrices formally in terms of the concepts he has built up piece by piece before.
Axler really does a great job (imo) of this kind of bridge building, and it is absolutely rigorous each step of the way. As an example, he (famously) doesn't introduce determinants until the last chapter because he feels they are counterintuitive for most people and you need most of the foundation of linear algebra to understand them properly. So he builds up all of linear algebra fully rigorously without determinants first and then introduces them at the end.
[1] e.g. he proves that there is only one zero and one "one" such that A = 1*A and A = 0 + A.
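For anyone curious what such a proof looks like, here is a minimal sketch of the standard uniqueness argument for the additive identity (my paraphrase, not a quotation from the book):

```latex
% Uniqueness of the additive identity: suppose 0 and 0' both satisfy the identity axiom.
\[
  0 = 0 + 0' = 0'
\]
% The first equality holds because 0' is an identity; the second because 0 is.
% The same two-candidate trick gives uniqueness of 1 and of additive inverses.
```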
A lot of people think Gil Strang was that. Certainly his 18.06SC lecture series is fabulous.[1]
I really like Sheldon Axler and he has made a series of short videos to accompany the book that I think are wonderful. Very clear and easy to understand, but with a little bit more of the intuition behind the proofs etc.
This, betterexplained, ritvikmath, and SeeingTheory will give you a very solid math background (I think they are better than 90% of the intro math classes in colleges).
> Isn‘t there anybody close to the Feynman of Linear Algebra?
No. The subject is too young (the first book dedicated to Linear Algebra was written in 1942).
Since then, there have been at least 3 generations of textbooks (the first one was all about matrices and determinants). That was boring. Each subsequent iteration is worse.
What is dual space? What motivates the definition? How useful is the concept? After watching no less than 10 lectures on the subject on youtube, I'm more confused than ever.
Why should I care about different forms of matrix decomposition? What do they buy me? (It turns out, some of them are useful in computer algebra, but the math textbook is mum about it)
My overall impression is: the subject is not well understood. Give it another 100 years. :-)
Gilbert Strang (already mentioned by fellow commenters).
> The subject is too young
"The first modern and more precise definition of a vector space was introduced by Peano in 1888; by 1900, a theory of linear transformations of finite-dimensional vector spaces had emerged." (from Wikipedia)
The first book was written in 1942 - it's mentioned explicitly in LADR.
It doesn't mean the concepts didn't exist - they did, Frobenius even built a brilliant theory around them (representation theory), but the subject was defined quite loosely - apparently no one cared to collect the results in one place.
It doesn't even matter much: I remember taking the course in 1974, and it was totally different from what is being taught today.
What? Linear Algebra is easily one of the best understood fields of mathematics. Maybe elementary number theory has it beat, but the concepts that drive useful higher level number theory aren't nearly so clear or direct as those driving linear algebra. It's used as a lingua franca between all sorts of different subjects because mathematicians of all stripes share an understanding of what it's about.
From what you said there, it seems like you tried to approach linear algebra from nearly random directions- and often from the end rather than the beginning. If you're in it for the computation, Axler definitely isn't for you. There are texts specifically on numeric programming- they'll jump straight to the real world use. If you want to understand it from a pure math perspective, I'd recommend taking a step back and tackle a textbook of your choosing in order. The definition of a dual space makes a lot more sense once you have a vector space down.
I sympathize with the person you're responding to a lot more than you.
It's very easy to understand what a dual space is. It's very hard to understand why you should care. Many of the constructions that use it seem arbitrary: if finite-dimensional vector spaces are isomorphic to their duals, why bother caring about the distinction? There are answers to this question, but you get them somewhere between 1 and 5 years later. It is a pedagogical nightmare.
Every concept should have both a definition and a clear reason to believe you should bother caring about it, such as a problem with the theory that is solved by the introduction of that concept. Without the motivating examples, definitions are pointless (except, apparently, to a certain breed of mathematicians).
I've read something like 100 math textbooks at this point. I would rate their pedagogical quality between an F and a D+ at best. I have never read a good math textbook. I don't know what it is, but mathematicians are determined to make the subject awful for everybody who doesn't think the way they do.
(I hope someday to prove that it's possible to write a good math textbook by doing it, but I'm a long way away from that goal.)
I absolutely see what you're saying with that. I think I'm definitely the target audience of the abstracted definition, but I've long held that every new object should be introduced with 3 examples and 3 counter-examples. But you said it yourself- that's the style pure math texts are written in! Saying that "we" as a species don't have a good understanding of linear algebra is unbelievable nonsense. I can't conceive of the thought process it would take to say that with a straight face. The fact is, 10 separate YouTube lectures disconnected from anything else is just the wrong way to try and learn a math topic. That's going to have as much or more to do with why dual spaces seem unmotivated as the style of pedagogy does.
It's not that we don't have a good understanding of linear algebra at all. It's that we don't understand how to make it simple. It's like a separate technological problem than actually building the theory itself.
I'm not the person you were originally replying to, but I have taken all the appropriate classes and still find the dual space to be mostly inappropriately motivated. There is a style of person for whom the motivation is simply "given V, we can generate V* and it's a vector space, therefore it's worth studying". But that is not, IMO, sufficient. A person new to the subject can't make sense of that without understanding the alternative (not defining it, and discarding it) and ultimately why one approach was chosen over the others.
I think in 50 years we will look back on the way pure math was written today as a great tragedy of this age that is thankfully lost to time.
> I think in 50 years we will look back on the way pure math was written today as a great tragedy of this age that is thankfully lost to time.
That could very well be true. I mean, just 100 years ago mathematics (and most education) consisted almost exclusively of the most insane drudgery imaginable. I do sometimes wonder what the world could have been like if we didn't gate contributions in math or physics behind learning classical Greek.
I do think that some of the issues come down to different learning styles. I personally like getting the definition up front; it keeps me less confused, and I can properly appreciate the examples down the line. The way Axler introduces the dual space was really charming for me, and it clicked in a way that "vectors as columns, covectors as rows" never did. But that's not everyone! It's by no means everyone in pure math, and it's definitely not everyone who needs to use math. I've met people far better than me who struggled just because the resources weren't tuned towards them; there's a huge gap.
My argument is: whoever understands linear algebra has to be able to explain it to anyone with a sufficient math background. The failure to do so signals a lack of understanding. Presenting it as a pure algebraic game cleverly avoids the problems of interpretation, but when you proceed to applications, it leads to conceptual confusion.
One "discovery" I made while learning LA is that most applications are based on mathematical coincidence. Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between the "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in another.
I submit that not only the subject is not well understood, but even the name of the subject is wrong. It should be called "The study of orthogonality". This change of perspective will naturally lead to discussion of orthogonal polynomials, orthogonal functions, create a bridge to representation theory and (on the other end) to the applications in data science. What say you? :-)
I think that "when you proceed to applications" is the issue there. Applications where? For applications in field theory, the spatial metaphor is exactly incorrect! For applications in various spectral theories, it's worse than useless.
What you say regarding the seemingly coincidental nature of "real world" applications is basically correct (with correlation specifically there's some other stuff going on, so it isn't that surprising, but in general), but unavoidable for any aspect of pure mathematics. Math is the study of formal systems, and the real world wasn't cooked up on a blackboard. If we can demonstrate that some component of reality obeys laws which map onto axioms, we can apply math to the world. But re-framing an entire field to work with one specific real world use (not even, imo, the most important real world use!) is just silly.
I love the idea of encouraging students early on to look at different areas of math and see the connections. But linear algebra is connected in more ways to more things than just using an inner product to pull out a nice basis. Noticing that polynomials, measurable functions, etc. are vectors is possible without reframing the entire field, and there are lots of uses of linear algebra that don't require a norm! Hell, representation theory only does in some situations.
You start with a controversial statement ("Math is the study of formal systems"), and the rest follows. Not everyone agrees with this viewpoint. I think algebraic formalization provides just one perspective of looking at things, but there are other perspectives, and their interplay (superposition) constitutes the "knowledge". Focusing just on the algebraic perspective is a pedagogical mistake IMO.
Some say it's all a kind of hangover from Bourbakism, though.
(Treating math as a game of symbols is equivalent to artificial restriction to use just 1% of your brain capacity IMO)
Hmm, I do see where you're coming from. To me, saying math is the study of formal systems is a statement of acceptance and neutrality- we can welcome ultrafinitists and non-standard analysts under one big tent. But you correctly point out that it's still a boundary I've drawn, and it happens to be drawn around stuff I enjoy. I'm by no means saying that there isn't room for practical, grounded math pedagogy with less emphasis on rigor.
However, there's plenty of value in the formal systems stuff. Algebraic formalization is just one way of looking at the simplest forms of linear algebra, but there really isn't any other way of looking at abstract algebra. Or model theory, or the weirder spectral stuff. Or algebraic topology. And when linear algebra comes up in those contexts (which it does often; it's the most well developed field of mathematics), it's best understood from an abstract, formal perspective.
And, just as a personal note, I personally would never have pursued mathematics if it were presented any other way. I'm not trying to use that as an argument- as we've discussed, the problem with math pedagogy certainly isn't a lack of abstract definitions and rigor. But there are people who think like me, and the reason the textbooks are written like that is because that's what was helpful to the authors when they were learning. It wasn't inflicted on our species from the outside.
> the reason the textbooks are written like that is because that's what was helpful to the authors when they were learning
The author writing a book after 30 years of learning, thinking, and talking with other people cannot easily reconstruct what was helpful and what wasn't. Creating a 1-dimensional representation of the state of mind (which constitutes "understanding") is a virtually impossible task. And here algebraic formalism comes to the rescue. "Definition" - "Theorem" - "Corollary" structure looks like a silver bullet, it fits very well in a linear format of a book. Unfortunately, this format is totally inadequate when it comes to passing knowledge. Very often, you can't understand A before you understand B, and you can't understand B before understanding A - the concepts in math are very often "entangled" (again, I'm talking about understanding, not formal consistency). You need examples, motivations, questions and answers - the whole arsenal of pedagogical tricks.
Some other form of presentation must be found to make it easier to encode the knowledge. Not sure what this form might be. Maybe some annotated book format will do, not sure. It should be a result of a collective effort IMO. Please think about it.
BTW, this is not a criticism of LADR book in particular. The proofs are concise and beautiful. But... the compression is very lossy in terms of representing knowledge.
> "Definition" - "Theorem" - "Corollary" structure looks like a silver bullet, it fits very well in a linear format of a book. Unfortunately, this format is totally inadequate when it comes to passing knowledge.
I really can't emphasize enough that this is exactly how I learn things. I don't claim to be in the majority! But saying that no one can learn from that sort of in-order, definition-first method is like saying no one can do productive work before 6am. It sucks that morning people control the world, but it's hardly a human universal to sleep in.
> Some other form of presentation must be found to make it easier to encode the knowledge. Not sure what this form might be. Maybe some annotated book format will do, not sure. It should be a result of a collective effort IMO.
I 100% agree. Have you seen the napkin project? I don't love the exposition on everything, but it builds up ideas pretty nicely, showing uses and motivation mixed in with the definitions. I've been trying to write some resources of my own intended for interested laymen, so more focus on motivation and examples and less on proofs and such. I like the challenge of trying to cut to the core of why we define things a certain way- though I'm biased towards "because it makes the formal logic nice" as an explanation.
What do you mean with correlation and orthogonality? Like with signal processing, you might calculate the cross-correlation of two signals, and it basically tells you, at each possible shift, to what extent one signal projects onto the other (so what's their dot product). Orthogonality is not invariant under permuting/shifting entries in just one of the vectors, obviously (e.g. in your standard 2-d arrows space, x-hat is orthogonal to y-hat but not to x-hat).
Linear algebra studies linearity, not (just) orthogonality. Orthogonality requires an inner product, and there isn't a canonical one on a linear structure, nor is there any one on e.g. spaces over finite fields. Mathematics, like programming, has an interface segregation principle. By writing implementations to a more minimal interface, we can reuse them for e.g. modules or finite spaces. It also makes it clear that questions like "are these orthogonal" depend on "what's the product", which can be useful to make sense of e.g. Hermite polynomials, where you use a weighted inner product.
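To make that interface-segregation analogy concrete, here is a minimal Python sketch (my own illustration, not from any of the texts discussed): code written against the bare vector-space interface, addition plus scalar multiplication, is reusable unchanged over a finite field, while orthogonality questions only make sense once an inner product is supplied.

```python
from typing import Callable, TypeVar

V = TypeVar("V")

# Needs only the vector-space interface: addition and scalar multiplication.
def axpy(a: float, x: V, y: V, add: Callable[[V, V], V],
         scale: Callable[[float, V], V]) -> V:
    """Compute a*x + y using nothing but the linear structure."""
    return add(scale(a, x), y)

# Needs the richer inner-product interface.
def are_orthogonal(x: V, y: V, inner: Callable[[V, V], float]) -> bool:
    """Orthogonality is only meaningful once an inner product is chosen."""
    return inner(x, y) == 0

# Ordinary R^2 with the usual dot product...
add2 = lambda u, v: (u[0] + v[0], u[1] + v[1])
scale2 = lambda a, u: (a * u[0], a * u[1])
dot2 = lambda u, v: u[0] * v[0] + u[1] * v[1]

print(axpy(2.0, (1, 0), (0, 1), add2, scale2))   # (2.0, 1.0)
print(are_orthogonal((1, 0), (0, 1), dot2))      # True

# ...but axpy also works over GF(2), where no inner product is assumed.
addf2 = lambda u, v: tuple((a + b) % 2 for a, b in zip(u, v))
scalef2 = lambda a, u: tuple((int(a) * b) % 2 for b in u)
print(axpy(1, (1, 0, 1), (1, 1, 0), addf2, scalef2))  # (0, 1, 1)
```

The point is just the dependency direction: are_orthogonal needs strictly more structure than axpy, mirroring how "which product?" is a separate choice from the linear structure itself.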
> Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between the "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in another.
Of course there is. Covariance looks like the L2 inner product (what you're calling the scalar product) because it is one, applied to mean-centered data. They're the exact same object.
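A quick numerical check of that claim (my own sketch with numpy, made-up data): center the two series, and the Pearson correlation is exactly the cosine of the angle between them under the ordinary dot product.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.0, 1.0, 5.0, 9.0])

# Center each series: covariance is the dot product of the centered vectors (up to 1/n).
xc, yc = x - x.mean(), y - y.mean()

cos_angle = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))
pearson = np.corrcoef(x, y)[0, 1]

print(cos_angle, pearson)          # the two numbers agree
assert np.isclose(cos_angle, pearson)
```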
Why should it buy you something is the real question.
You don't need to understand it the way the "initial" author thought about it, even supposing that person had given it more thought...
History of maths is really interesting but it's not to be confused with math.
Concepts are not useful in the economic-opportunity sense you're thinking of. Think about them as "did you notice that property?" and then you start doing math by playing with those concepts.
Otherwise you'll be tied to someone's way of thinking instead of hacking into it.
I know more math than the average bear, but I think the parent has a point even if I don’t totally agree with them.
Take for instance the dual space example. The definition of it to someone who hasn’t been exposed to a lot of math seems fine but not interesting without motivation — it looks just another vector space that’s the same as the original vector space if we’re working in finite dimensions.
However, the distinction starts to get interesting when you provide useful examples of dual spaces. For example, if your vector space is interpreted as functions (for the novice, even they can see that a vector can be interpreted as a function that maps an index to a value), then the dual space consists of measures: weightings of the inputs of those functions. Even if they are just finite lists of numbers in this simple setting, it's clear that they represent different objects, and you can use that when modeling. How those differences really manifest can be explored in a later course, but a few bits of motivation as to "why" can go a long way.
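In the finite setting that interpretation fits in a few lines of Python (a toy sketch of the idea, not anyone's official formulation): a vector is a function from indices to values, a dual vector is a weighting of those indices, and pairing them is a weighted sum.

```python
# A "vector": a function from indices to values.
f = {0: 3.0, 1: -1.0, 2: 2.0}

# A "dual vector" / measure: a weighting of the same indices.
w = {0: 0.5, 1: 0.25, 2: 0.25}

# Pairing the two: a weighted sum (an expectation, if the weights sum to 1).
pairing = sum(w[i] * f[i] for i in f)
print(pairing)  # 1.75
```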
Mathematicians don’t really care about that stuff — at least the pure mathematicians who write these books and teach these classes — because they are pure mathematicians. However, the folks taking these classes aren’t going to all grow up and be pure mathematicians, and even if they are, an interesting / useful property or abstraction is a lot more compelling than one that just happens to be there.
Your post represents a common viewpoint, but I don't agree with it. I'm a retired programmer trying to learn algebra for the purposes of education only. I am not supposed to take an exam or use the material in any material way, so to speak. I'd like to understand. Without understanding motivations and (on the opposite end) applications I simply lose interest. I happen to have a degree in math, and I know for a fact that when you know (or can reconstruct) the intuition behind the theory, it makes a world of difference. If this kind of understanding is not a goal, then what is?
BTW, by "buying" I din't mean that it should buy me a dinner, but at least it's supposed to tell me something conceptually important within the theory itself. Example: in the LADR book, the chapter on dual spaces has no consequences, and the author even encourages the reader to skip it :).
> Why should I care about different forms of matrix decomposition? What do they buy me?
A natural line of questioning to go down once you're acquainted with linear maps/matrices is "which functions are linear"/"what sorts of things are linear functions capable of doing?"
It's easy to show dot products are linear, and not too hard to show (in finite dimensions) that all linear functions that output a scalar are dot products. And these things form a vector space themselves, the "dual space" (because each element is a dot-product mirror of some vector from the original space). So linear functions from F^n -> F^1 are easy enough to understand.
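That last claim is easy to check numerically (a small sketch of the finite-dimensional case, nothing from the book): write down any linear functional on R^3, evaluate it on the standard basis, and the resulting vector reproduces the functional as a dot product.

```python
import numpy as np

# Some linear functional on R^3, written without reference to a dot product.
def phi(v):
    return 2.0 * v[0] - v[1] + 0.5 * v[2]

# Evaluate it on the standard basis to recover "its" vector.
basis = np.eye(3)
u = np.array([phi(e) for e in basis])   # -> [2.0, -1.0, 0.5]

# Now phi is just dotting with u.
v = np.array([3.0, 4.0, -2.0])
print(phi(v), u @ v)                    # both 1.0
assert np.isclose(phi(v), u @ v)
```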
What about F^n -> F^m? There's rotations, scaling, projections, permutations of the basis, etc. What else is possible?
A structure/decomposition theorem tells you what is possible. For example, the Jordan canonical form tells you that with the right choice of basis (i.e. coordinates), matrices all look like a group of independent "blocks" of fairly simple upper triangular matrices that operate on their own subspaces. Polar decomposition says that just like complex numbers can be written in polar form re^it, where multiplication scales by r and rotates by t, so can linear maps be written as a higher-dimensional multiplication/scaling and an orthogonal transformation/"rotation". The SVD says that given the correct choice of basis for the source and image, linear maps all look like multiplication on independent subspaces. The coordinate change for SVD is orthogonal, so another interpretation is that, roughly speaking, SVD says all linear maps are a rotation, a scaling, and another rotation. The singular vectors tell you how space rotates and the singular values tell you how it stretches.
So the name of the game becomes to figure out how to pick good coordinates and track coordinate changes, and once you do this, linear maps become relatively easy to understand.
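That "rotation, scaling, rotation" reading of the SVD is easy to poke at numerically (my own sketch with numpy, random matrix, not tied to any particular text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

# SVD: A = U @ diag(s) @ Vt, with U and Vt orthogonal ("rotations", possibly
# with a reflection) and s the non-negative stretch factors.
U, s, Vt = np.linalg.svd(A)

print(np.allclose(U @ U.T, np.eye(3)))       # U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(3)))     # so is Vt
print(np.allclose(U @ np.diag(s) @ Vt, A))   # rotate, stretch, rotate = A
```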
Dual spaces come up as a technical thing when solving PDEs for example. You look for "distributional" solutions, which are dual vectors (considering some vector space of functions). In that context people talk about "integrating a distribution with test functions", which is the same thing as saying distributions are dot products (integration defines a dot product) aka dual vectors. There's some technical difficulties here though because now space is infinite dimensional, and not all dual vectors are dot products, e.g. the Dirac delta distribution delta(f) = f(0) can't be written as a dot product <g,f> for any g, but it is a limit of dot products (e.g. with taller/thinner gaussians). One might ask whether all dual vectors are limits of dot products and whether all limits of dual vectors are dual vectors (as limits are important when solving differential equations). The dual space concept helps you phrase your questions.
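The "delta is a limit of dot products" remark can be illustrated numerically (a rough sketch of my own; the L2 pairing is discretized as a plain sum): integrating a test function against ever-narrower Gaussians approaches evaluation at 0.

```python
import numpy as np

x = np.linspace(-5, 5, 200001)
dx = x[1] - x[0]
f = np.cos(x)          # a test function; f(0) = 1

for sigma in [1.0, 0.1, 0.01]:
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    print(sigma, np.sum(g * f) * dx)   # <g, f> -> f(0) = 1 as sigma -> 0
```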
They also come up a lot in differential geometry. The fundamental theorem of calculus/Stokes theorem more-or-less says that differentiation is the adjoint/dual to the map that sends a space to its boundary. I don't know off the top of my head of more "elementary" examples. It's been like 10 years since I've thought about "real" engineering, but roughly speaking, dual vectors model measurements of linear systems, so one might be interested in studying the space of possible systems (which, as in the previous paragraph, might satisfy some linear differential equations). My understanding is that quantum physics uses a dual space as the state space and the second dual as the space of measurements, which again seems like a fairly technical point that you get into with infinite dimensions.
Note that there's another factoring theorem called the first isomorphism theorem that applies to a variety of structures (e.g. sets, vector spaces, groups, rings, modules) that says that structure-preserving functions can be factored into a quotient (a sort of projection) followed by an isomorphism followed by an injection. The quotient and injection are boring; they just collapse your kernel to zero without changing anything else, and embed your image into a larger space. So the interesting things to study to "understand" linear maps are isomorphisms, i.e. invertible (square) matrices. Another way to say this is that every rectangular matrix has a square matrix at its heart that's the real meat.
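One concrete way to see that "square matrix at its heart" remark (my own sketch; the thin SVD is just one of several ways to exhibit such a factorization) is to factor a rank-deficient map as a quotient, then an invertible square core, then an injection:

```python
import numpy as np

# A rank-2 matrix mapping R^4 -> R^3.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))                 # numerical rank: 2

proj = Vt[:r]                              # "quotient": R^4 -> R^2 (kills the kernel)
core = np.diag(s[:r])                      # invertible 2x2 heart (an isomorphism)
embed = U[:, :r]                           # "injection": R^2 -> R^3 (embeds the image)

print(r)                                   # 2
print(np.allclose(embed @ core @ proj, A)) # True: A factors through the square core
```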
The thing is, you can teach linear algebra as a gateway to engineering applications or as a gateway to abstract algebra. The second one will require a hell of a lot more conceptual baggage than the first one. It’s also what the book is geared towards.
It is also intended for people who know something about the trade; it isn’t “baby’s first book on maths”. (Why can you graduate high school, do something labelled “maths” for a decade, and still be below the “baby’s first” level, incapable of reading basically any professional text on the subject from the last century? I don’t know. It’s a failure of our society. And I don’t even insist on maths being taught—but if they don’t teach maths, at least they could have the decency to call their stupid two-hundred-year-old zombie something else.)
That conceptual baggage is not useless even in the applied context. For example, I know of no way to explain the Jordan normal form in the 19th-century “columns of numbers” style preferred by texts targeted at programmers. (Not point at, not demonstrate, not handwave; explain, i.e. make it obvious and inevitable why such a thing must exist.) Or the singular value decomposition, to take a slightly simpler example. (Again, explain. Your task, should you choose to accept it, is to see a pretty picture behind it.) And so on.
Again, you can certainly live without understanding any of that. (To some extent. You'll have a much harder time understanding, say, the motivation behind PageRank. And ordinary differential equations, classical mechanics, or even just multivariable calculus will look much more mysterious than they actually are.) But in that case you need a different book and a different teacher.
I like the free course on linear algebra by Strang’s Ph.D student Pavel Grinfeld. It's a series of short videos with online graded exercises. Most concepts are introduced using geometric vectors, polynomials, and vectors in ℝⁿ as examples. https://www.lem.ma/books/AIApowDnjlDDQrp-uOZVow/landing
> Isn‘t there anybody close to the Feynman of Linear Algebra?
That would probably be Gilbert Strang.
While, as a maths person I would prefer a bit more rigour, his choice of topics and his teaching skill make his the most outstanding introductory course I have seen.
I would run a mile from any course that disrespects determinants. And that includes Axler's!
Also I wish more Linear Algebra courses would cover Generalized Inverses.
As mentioned, the book was intended to be a "second course" in linear algebra. I personally self-studied out of the 3rd edition of Axler, and found it very helpful for understanding exactly what is going on with all the matrix computations we do.
Plus, the same can be said about artists. After all, it's all self-aggrandization, and art is not made to be simple or intuitive.
I actually found the book quite intuitive and helpful in understanding linear algebra. It does explain a lot of the intuition for many definitions, as well as mathematical techniques.
It's easy when presented with new things that you don't understand to reflexively dismiss them, but the ideas here are quite solid. It's also a textbook which aims to introduce students to a slightly higher level of mathematical thinking.
I self studied from this book as an undergrad. I was an EE major and took linear algebra as part of the mandatory ODEs class but didn’t “get it.” At a certain point, it became clear that if I wanted to learn the more advanced applied math I was interested in studying, I needed to really understand linear algebra. I thought Axler was great at introducing both the material and teaching me how to prove things rigorously. The month or so I spent that summer reading that book made the rest of the math I took in undergrad trivial.
I have surveyed every LA book out there, and a lot of Amazon reviews claimed Axler's book is the best LA book.
It might be the case for printed books for sale. But I stumbled upon Terrance Tao's PDF LA lecture slides on his website and they are so much better than all the books I've surveyed.
The writing is super clear and everything is built from the first principles.
(BTW terry’s real analysis book did the same for me. Much more clear and easy to follow than the classics out there)
These notes are excellent. One good thing is how often Terence Tao gives real-life examples and analogies, contrary to what one may expect from a Fields Medal winner. From a utilitarian perspective, reading Axler's book looks like a comically bad use of one's time.
Tao's notes seem to be based on the book Linear Algebra by Friedberg, Insel and Spence. I found it to be one of the best books on Linear Algebra, better than even Hoffman/Kunze. The proofs are extremely clear, it has examples like PageRank, Markov Chains, PCA and the solutions to just about every exercise is available on Quizlet.
Because the poor guy contributes so much to math and math exposition and yet has his name misspelled everywhere, I'll mention that it's Terence, not Terrance.
I'm not sure that Axler's book is great as a first LA book. I would go with something more traditional like Strang.
Although I really didn't feel like I "got" LA until I learned algebra (via Artin). By itself LA feels very "cookbook-y", like just a random set of unrelated things. Whereas in the context of algebra it really makes a lot more sense.
> You are probably about to begin your second exposure to linear algebra. Unlike
your first brush with the subject, which probably emphasized Euclidean spaces
and matrices, this encounter will focus on abstract vector spaces and linear maps. These terms will be defined later, so don’t worry if you do not know what they mean. This book starts from the beginning of the subject, assuming no knowledge of linear algebra. The key point is that you are about to immerse yourself in serious mathematics, with an emphasis on attaining a deep understanding of the definitions, theorems, and proofs.
It is definitely a hard text if you haven't had exposure to linear algebra before.
The thing is, by the time you get to this book, most students have probably taken DiffEq or multivariable calculus, and had exposure to linear algebra there. (If not in high school.)
My weekly chance to gripe: unfortunately nobody who writes about GA seems to be bothered by the fact that the geometric product is basically meaningless (outside of a couple of specific examples, complex numbers and quaternions).
If they would just write only about the wedge product and omit the geometric product entirely, it would actually be a great book.
There are other models of the two that don't require the geometric product at all. The rest of linear algebra doesn't need it, and recasting all of it in terms of a frankly terrible operation is not helpful for intuition.
Talking about Amazon, someone suggested I get Gareth Williams' Linear Algebra with Applications (5 bucks on eBay).
It's a good applied primer, not big on concepts, more about the mechanics, and it unlocked a lot of things in my head, because dry textbook morphism definitions sent me against imaginary walls faster than c.
I have trouble believing an author has the students' interests at heart when the solutions to the exercises are not /in/ the book. Why on earth not? It makes no sense from any student perspective.
I know it's common for textbooks not to have them available at all. Any idea why it's the norm not to put them where they clearly belong? It just seems so hostile.
It’s too easy to just look up the solution instead of being forced to think hard about the problem. This is training for when you’ll later encounter problems outside of textbooks where you’ll have no choice but to solve them on your own. And you’re supposed to have teaching assistants or similar available when you really remain stuck.
I care not for this explanation and opinion.
If someone is unable to work on practice problems without just looking up the solutions that is really their problem, not mine.
> And you’re supposed to have teaching assistants or similar available when you really remain stuck.
That just translates to giving the middle finger to self learners.
I have not yet seen a decent explanation for withholding sample solutions/explanations. It usually just boils down to "I want this to be only for university professors to give homework from and I am of the opinion that university students lack the minimal discipline to work on problems in their own"
Yeah sorry that's just total and utter BS. No sale.
I am an adult. >99% of students of university subjects are adults. Even assuming your premise totally, anyone so incredibly stupid they can't have it explained in the text when is the right time to use the solutions is too stupid to learn from the text. Axler puts them on a website that is less than 30 seconds away.
"No choice..." Describes what proportion of texbook sales?
“No choice” was when I was in university. Text books didn’t have web sites then. To be honest I have no idea how frequent they are now. In any case, solutions not being available to students was a normal and expected situation back then, and it helped me develop grit for solving problems, because I’m otherwise quite lazy.
"No choice" here means you didn't choose the textbook, you (or your benefactor) simply paid for it.
I don't think "normal" is any excuse for any behavior. At one time, not so long ago, <insert evil thing here> was normal.
If you need information withheld from you to learn because of laziness, fine, sorry you have that problem, good luck solving it. It should not become the problem of anyone else nor be considered acceptable to impose a solution for you on them at their expense. That's just rotten.
Any textbook that is incomplete to learn from independently, without a "professor" reading a slide deck, without TAs, and without a yearly bill of over $10k on top of the textbook, is not worthy of the name textbook and should be treated by every educated person with contempt. It's the same rent-seeking companies that do academic journals, right. Hate isn't too strong a word for them if we care about education over rent-seeking.
4th edition usually means 3rd time the publisher re-ordered the exercises so the numbers don't match to kill the 2nd hand market. The whole thing blows goats.
About avoiding determinants to the degree that this book does: while I agree it makes sense to delay introducing them, the goal should be clarity, not avoidance. The way the author has to bend over backwards here when dealing with eigenvalues isn't great either.
I would recommend Strang for a healthy balance in handling determinants.
Axler is pathological in his avoidance of determinants. I've heard (third hand) that he once pulled aside some fields medalist into a classroom after a talk and asked them "Do you like determinants?" I imagine him drawing the curtains and sweeping for bugs first.
I attended a (remote) seminar where he was talking about this book, and this seems more or less accurate. Mathematicians are a weird lot.
The response that he received in the story was "I feel about them the same way I feel about tomatoes. I like to eat them, but other than that, no, I don't like them."
I read Strang and then Axler. Strang is great at numerics but weak at presenting the abstract picture. I feel like if I had taken, say, finite elements (or any other subject where it's important to take the abstract / infinite dimensional picture seriously before reducing to finite dimensions) right after Strang without reading LADR then I'd have been seriously underprepared.
You have a point in that to understand any particular subject well, it makes sense to read more than one book on it, at least to compare the different perspectives.
Also worth noting that Strang has a couple of similar linear algebra books, so we might not even be discussing the same text.
That's entirely possible, but in the context of introductory books I think it's fair to assume & limit scope to Strang's "Introduction to Linear Algebra" and Axler's "Linear Algebra Done Right."
I am an applications-oriented person and my inclination was to go directly from a matrix/determinant heavy picture into applications. Strang['s intro text] only. I am extremely glad that someone intercepted me and made me get some practice with abstract vector spaces, operators, and inner product spaces first, using Axler. This practice bailed me out and differentiated me from peers on a number of occasions, so I want to pass down the recommendation.
FWIW I think this is the benefit of Strang. If you're in science or engineering or statistics, often you don't need the general picture, and IMO too much generality gets in the way of understanding. Start with a good understanding of the most important cases that appear in applied work, and drill them until you're fluent with them. Then generalizing will be easier.
Honestly, I think Strang is overrated. Yeah, I know, on HN that's like criticizing Lisp or advocating homebrew cryptography or disagreeing that trains fix everything. But still.
I bought his 6th ed. Introduction to Linear Algebra textbook, and he doesn't get more than two pages into the preface before digressing into an unjustified ramble about something called "column spaces" that appears in no other reference I've seen. (And no, boldfacing every second phrase in a math book just clutters the text, it doesn't justify or explain anything.) Leafing through the first few chapters, it doesn't seem to get any better.
The lecture notes by Terence Tao that someone else mentioned look excellent, in comparison.
I definitely covered the column space and row space in my undergrad LA class, long before I had ever heard of Strang.
An exceptional minority of people has the ability to learn linear algebra in its full abstract generality as their first treatment of the material, and come away with something resembling an understanding.
The rest of us dopey oafs must develop intuition carefully from specific concrete examples that extend gradually from algebra and geometry that we are familiar with already. Those of us in this sad deficient category must be led painstakingly over several weeks of course material to even the basic idea that a matrix is just a particular representation and special case of something called a linear transformation.
If you are one of the former type, you are blessed, but it's unfair to sneer at the latter, and it will only do your students a disservice.
Perhaps, but that's about as useful as pointing out that monads are a monoid in the category of endofunctors. What's the "image of a matrix?" Coming at LA from a 3D graphics background, I've never heard that term before. And what does the "span of its columns" mean?
To me, each column represents a different dimension of the basis vector space, so the notion that X, Y, and Z might form independent "column spaces" of their own is unintuitive at best.
These are all questions that can be Googled, of course, but in the context of a coherent, progressive pedagogical approach, they shouldn't need to be asked. And they certainly don't belong in the first chapter of any introductory linear algebra text, much less the preface.
> To me, each column represents a different dimension of the basis vector space, so the notion that X, Y, and Z might form independent "column spaces" of their own is unintuitive at best.
I can't help but feel a treatment of linear algebra that assumes all matrices are invertible by default isn't a very good treatment at all. Column spaces are exactly how you harness your (very useful!) intuition that the columns of a matrix are where the basis goes. I agree that it should be defined before use, but it is- in the textbook proper. The preface is for the author to express themselves!
Now row spaces are an abomination, but that's because I'm not really a computation guy. I'm sure they're great if you get to know them.
In the context of linear algebra, a matrix is a linear map. A map is characterized by its domain and its image. These are very important characteristics.
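To tie that back to the 3D-graphics framing upthread, a couple of lines make the definitions concrete (my own sketch, made-up matrix): applying A to a vector takes a linear combination of A's columns, so the image ("column space") is exactly the span of the columns, and its dimension is the rank.

```python
import numpy as np

A = np.array([[1., 0., 2.],
              [0., 1., 3.],
              [0., 0., 0.]])   # the image is a plane inside R^3, not all of it

x = np.array([4., 5., 6.])

# A @ x is literally x[0]*col0 + x[1]*col1 + x[2]*col2.
combo = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]
print(np.allclose(A @ x, combo))    # True

# The rank is the dimension of that span (here 2: the columns only span a plane).
print(np.linalg.matrix_rank(A))     # 2
```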
His lectures are great but I definitely agree about the book. It reads like one of the TAs transcribed the lectures and added some exercises to the end.
In most LA courses determinants just... feel almost completely unmotivated. Their definition just "comes down from the heavens in all its mysterious glory", and wow, how convenient that those things have all those nice properties!.. Although they don't seem to actually be used for much unless your LA course contains elements of elimination theory, which most of them don't, for some reason (even though it would seem to be quite a useful part of mathematical knowledge; apparently not).
Also, if you prefer an abstract approach, the determinant is just the nth exterior power of a linear transformation :) No need to introduce a basis at all, at least in principle.
Note that Axler intended this book to be the second reading of Linear Algebra after you've already taken a first course, but it is doable for a first reading.
If you want to be crazy you can also check out A (Terse) Introduction to Linear Algebra by Katznelson & Katznelson.
I did undergrad linear algebra with my daughter last semester, and Strang and Axler were a good one-two punch, Strang for the computation, Axler for the proofs homework.
Yeah, my math class followed Axler, which was great - but I didn't really get a feel for how useful linear algebra was until I read through Strang on my own. The applications are endless!
Pretty much all of grad level engineering classes, all of grad level Stats classes, all of grad level Econ classes include applications of LA. Heck quantitative research work in social sciences are still included via stats.
Same time. I'd taught myself linear algebra from the Strang lectures (and a Slack study group we set up with some random university's syllabus, which gave us a set of homework problems to do) long before this, so mostly I just matched the professor's lectures to the Strang material, and dipped in and out of Axler when proof and conceptual stuff came up; it's not like we did Axler cover to cover.
Before doing this, I'd only ever sort of skimmed Axler; it's sort of not the linear algebra you care about for cryptography, and up until the spectral theorem stuff that's exactly what Strang was. It was neat to get an appreciation for Axler this way.
Dang, I remember everyone not enjoying linear algebra class with Katznelson in undergrad. I did ok but it felt like way more focus on things like row elimination algorithms than why any of it works. It wasn't until I worked with a PhD geometer that any of it made sense and they largely cribbed from Linear Algebra Done Right. Hopefully the book is better than a class aimed at a generic mix of STEM undergrads.
I would definitely choose Strang to begin learning Linear Algebra. He develops the geometric intuition you need for applying it in other subjects such as Calculus or Statistics or Physics. Linear Algebra is not just Abstract Algebra, at least for most of us, including most mathematicians. After Strang, or even alongside him, if you want more rigour then you would benefit from a text like that of Anton, or Friedberg, or Curtis or, God forbid, Axler.
Yep. Huge kudos to the author for making it available, but the PDF does feel sloppy with all the bright colors and images. In science textbooks, less is more.
Compare this to the very latest edition of Stewart's calculus, which now uses even more pastel, subdued colors for diagrams.
Like basically everybody else I teach out of this book, and I'm happy to see a new edition. I'm curious what's changed/added -- I already am unable to get through the whole thing in a semester.
At our school students take a computational linear algebra course first (with a lot of row reduction). So I am slowed down a bit by constantly trying to help the students see that the material is really the same thing both times through. I do wish there were a little more of that in Axler.
Sure, I am very familiar with them -- I actually TAed 18.06 for Strang once upon a time. They're great books too. Which is better is mostly a question of what point of view you're after -- if you want to actually calculate anything, Axler's book is not going to help you, but if you want a more conceptual view of the subject it's best place. If you're really serious about learning linear algebra, you probably want to read both, first Strang, then Axler.
This is linear algebra for undergraduate math majors, but if you just want a basic understanding of the topic with a focus on computational applications, Poole's "Linear Algebra: A Modern Introduction" is probably more suitable, as it's heavy on applications such as Markov chains, error-correcting codes, spatial orientation in robotics, GPS calculations, etc.
From the preface. "You cannot read mathematics the way you read a novel. If you zip through a page in less than an hour, you are probably going too fast." Sadly, he's probably right.
- Be an active reader. Open to the page you need to read, get out some paper and a pencil.
- If notation is defined, make sure you know what it means. Your pencil and paper should come in handy here.
- Look up the definitions of all words that you do not understand.
- Read the statement of the theorem, corollary, lemma, or example. Can you work through the details of the proof by yourself? Try. Even if it feels like you are making no progress, you are gaining a better understanding of what you need to do.
- Once you truly understand the statement of what is to be proven, you may still have trouble reading the proof—even someone’s well-written, clear, concise proof. Try to get the overall idea of what the author is doing, and then try (again) to prove it yourself.
- If a theorem is quoted in a proof and you don’t know what it is, look it up. Check that the hypotheses apply, and that the conclusion is what the author claims it is.
- Don’t expect to go quickly. You need to get the overall idea as well as the details. This takes time.
- If you are reading a fairly long proof, try doing it in bits.
- If you can’t figure out what the author is doing, try to (if appropriate) choose a more specific case and work through the argument for that specific case.
- Draw a picture, if appropriate.
- If you really can’t get it, do what comes naturally—put the book down and come back to it later.
- You might want to take this time to read similar proofs or some examples.
- After reading a theorem, see if you can restate it. Make sure you know what the theorem says, what it applies to, and what it does not apply to.
- After you read the proof, try to outline the technique and main idea the author used. Try to explain it to a willing listener. If you can’t do this without looking back at the proof, you probably didn’t fully understand the proof. Read it again.
- Can you prove anything else using a similar proof? Does the proof remind you of something else?
- What are the limits of this proof? This theorem?
- If your teacher is following a book, read over the proofs before you go to class. You’ll be glad you did.
[1] Reading, Writing, and Proving: A Closer Look at Mathematics By Ulrich Daepp and Pamela Gorkin.
I think in the modern era a very good piece of advice, particularly for those of us without gorilla-like stamina to comb through a math text, is to go on your favorite video website and watch through multiple videos on the topic.
Meh, there's different goals you could have. I actually find it enjoyable to read math more quickly (almost like a novel) which gives you a good sense of a lot of the higher-level themes and ideas. Then if it's interesting I might spend more time on it.
I don’t know if it’s just me, but I’m terribly lost in the search for the right texts. Every time I come across a new book/resource, it compounds the confusion. I find myself incapable of sitting down with a book and working through it without switching to another book in-between. I’d be really happy to hear if anyone has found a solution to this unproductive but sticky habit.
Because complex numbers make the fundamental theorem of Algebra nice and simple rather than complicated and ugly. In turn, this makes the spectral theorem of linear algebra nice and simple rather than complicated and ugly. In turn, this makes a bunch of downstream applications nice and simple rather than complicated and ugly.
You will get a feel for this if you work Axler's problems. More importantly, you will gain an intuition for the fact that if you turn up your nose at complex numbers while going into these application spaces, you are likely to painstakingly reinvent them except harder, more ugly, and worse.
Example: in physics, oscillation and waves A. underpin everything and B. involve energy sloshing between two buckets. Kinetic and potential. Electric and magnetic. Pressure and velocity. These become real and imaginary (or imaginary and real, it's arbitrary). This is where complex numbers -- where you have two choices of units -- absolutely shine. Where you would have needed two coupled equations with lots of sin(), cos(), trig identities, and perhaps even bifurcated domains you now have one simple equation with exponentials and lots of mathematical power tools immediately available. Complex numbers are a huge upgrade, and that's why anything to do with waves will have them absolutely everywhere.
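A small illustration of the "one simple equation with exponentials" point (my own sketch; the numbers are made up): adding two sinusoids of the same frequency but different phases reduces to adding two complex amplitudes, no trig identities required.

```python
import numpy as np

# Two signals at the same frequency w, different amplitudes and phases.
w = 2 * np.pi * 50
A1, ph1 = 3.0, 0.4
A2, ph2 = 2.0, -1.1

# Phasor representation: a*cos(w t + ph) is Re(a * e^{i ph} * e^{i w t}).
p = A1 * np.exp(1j * ph1) + A2 * np.exp(1j * ph2)   # just add the complex amplitudes

t = np.linspace(0, 0.04, 1000)
direct = A1 * np.cos(w * t + ph1) + A2 * np.cos(w * t + ph2)
via_phasor = np.real(p * np.exp(1j * w * t))

print(np.allclose(direct, via_phasor))   # True: no trig identities needed
```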
You might think that in every real world application, complex numbers are introduced as a convenience, and that every calculation that takes advantage of them ends with taking the real part of the result, but that's not the case. In QM, the final answer contains an imaginary part that cannot be removed.
Imagine you're a 16th century Italian mathematician who is trying to solve cubic equations. You notice that when you try to solve some equations, you end up with a sqrt(-1) in your work. If you're Cardano, you call those terms "irreducible" and forget about them. If you're Bombelli, you realize that if you continue working at the equation while assuming sqrt(-1) is a distinct mathematical entity, you can find the real roots of cubic equations.
So I would say that it's less that "Complex numbers were invented so that we can take square roots of negative numbers", and more "Assuming that sqrt(-1) is a mathematical entity lets us solve certain cubic equations, and that's useful and interesting". Eventually, people just called sqrt(-1) "i", and then invented/discovered a lot of other math.
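Bombelli's situation can be replayed in a few lines of Python (a sketch using the built-in complex numbers; the cubic x^3 = 15x + 4 is the classic example, and its root x = 4 is perfectly real):

```python
import cmath

# Cardano's formula for x^3 = p*x + q, applied to x^3 = 15x + 4.
p, q = 15, 4
disc = cmath.sqrt((q / 2) ** 2 - (p / 3) ** 3)   # sqrt(4 - 125) = sqrt(-121) = 11i

# Both cube roots are genuinely complex (approximately 2 + i and 2 - i)...
u = (q / 2 + disc) ** (1 / 3)
v = (q / 2 - disc) ** (1 / 3)

# ...yet their sum is the real root Bombelli was after.
x = u + v
print(u, v, x)                           # approx (2+1j), (2-1j), (4+0j)
print(abs(x**3 - (p * x + q)) < 1e-9)    # True: x really solves the cubic
```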
I know people have jumped in here to try to explain it to you in a comment, but I highly recommend a recent talk by Freya Holmer. It's a fun little exploration of a seemingly simple question about vectors that motivates the answer to the question you asked. It also culminates with an introduction to Clifford algebras, which you may or may not want to learn about after the talk.
That is already a first and most important "why". It may be a little bit too terse, but mathematics didn't evolve by knowing the applications before doing something.
Without doing some playing around from sqrt(-1) and discovering how it connects one thing to another, nobody is able to come up with real applications. You need to at least build a placeholder of the concept in your mind before you can examine what's possible.
So a person with a mindset similar to the first people who used complex numbers would just try to find a way to express the square root of a negative number and see how it goes. It starts with a limitation of an important tool and tries to close a perceived conceptual gap. The mathematicians who write this kind of book probably all think that way. I wouldn't call myself a mathematician, but I didn't need anything more than that sentence to believe someone was motivated enough by that reason alone.
So really there's a whole audience out there - arguably that a professional mathematician most wants to address - that could appreciate this sentence just as-is. So it's not true that it's natural to think the audience requires further explanation. Whether you should care is another matter. But as this is a book about linear algebra not complex numbers, some others would have accused the author of digression if he granted your wishes.
So I don't think what you're demanding is fair. Maybe it's a reasonable request after the fact, but it's a little too harsh to think it's something the author must have addressed in his text to his intended audience. This kind of inquiry is what in-person teaching is useful for.
This is completely false, there is a 'why' and it's that people needed to permit taking square roots of negative numbers to find (real!) roots of cubic polynomials. I don't think this book needs to digress into that, sure.
I prefer a simpler perspective on complex numbers: "defined latently, then discovered, accepted, named and given notation", rather than "invented".
Invented implies some degree of arbitrariness or choice, but complex numbers are not an arbitrary construct.
Zero, negative numbers, and imaginary numbers were all latently defined by prior concepts before they were recognized. They were unavoidable, as existing operations inevitably kept producing them. Since they kept coming up, it forced people to eventually recognize that these seemingly nonsensical concepts continued to behave sensibly under the operations that produced them.
Once addition and subtraction were defined on natural numbers, (1, 2, 3, ... etc), the concept of zero was latently defined. The concept of "nothing" was not immediately recognized as a number, but there is only one consistent way of dealing with 2-2, 5-5, 7-7, etc. Eventually that concept was given a name "zero", notation "0", and adopted as a number.
It was discovered, in that it was already determined by addition and subtraction, just not yet recognized.
Similarly with negative numbers. They were also latently determined by addition and subtraction. At first subtracting a larger number from a smaller number was considered nonsensical. But starting from the simple acceptance that "5-8" can at least be consistently viewed as the number which added to 8 gives 5, and other similar examples, it was discovered that such numbers had only one consistent behavior.
So they were accepted, given a name, "negative numbers", and a notation, "-x", shorthand for "0-x".
And again, once addition, multiplication (and optionally exponentiation) were defined, expressions like x*x = -1 (or x = sqrt(-1)) were run into, and they were initially considered nonsensical.
But starting from the acceptance that it at least makes sense to say that "the square of the square root of -1" is "-1", it was discovered that roots of -1 could be worked with consistently using the already accepted operations that produced them.
The numbers that included square roots of -1 were given a name, "imaginary numbers", the square root of -1 was given the notation "i", and we got complex numbers, which have both real and imaginary parts.
I think it's the same reason negative numbers were invented: it lets you do more with algebra than you could before (some of which, like raising e to an imaginary power producing sines and cosines, is pretty weird but turns out to be useful in engineering, etc.), and everything else still "just works" the same way as before.
(Admittedly the applications of negative numbers are much more obvious.)
Part of the issue is that there is no simple why. To invent one is either to appeal to history or to invent some perspective. It's not a bad idea to do these things, didactically, but they're unnecessary for the material. Or, said another way, there's something to be said for discovering your own "why" through familiarity with the many wondrous properties the complex numbers enjoy.
That said, that's a frustrating answer. An excellent book which does just what I said above and tells a lightly fictionalized "just so" story of the "history and development" of mathematics as an excuse to introduce everything in a motivated fashion is MacLane's Mathematics: Form and Function which I just recommend endlessly.
If you're a programmer, consider whether it's easier to reason about a function that always returns a value, or a function that sometimes returns a value and sometimes throws an exception. The latter is a partial function and typically complicates reasoning because of the exceptional cases; the former is a total function and is fairly trivial to reason about (like multiplication vs. division, where you have to consider division by zero).
Before complex numbers, the square root function was partial, but adding complex numbers made it total, so it simplified a lot of theory and enabled new types of analysis. Fortuitously, it also turned out to be very useful when applied to the real world.
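A minimal sketch of that contrast in Python (toy functions of my own, just to make the partial/total distinction concrete):

    import math
    import cmath

    # Partial: blows up on negative inputs, so every caller must handle the exception.
    def real_sqrt(x: float) -> float:
        return math.sqrt(x)        # raises ValueError for x < 0

    # Total: defined for every real input once we allow complex outputs.
    def total_sqrt(x: float) -> complex:
        return cmath.sqrt(x)

    # real_sqrt(-4.0)   -> ValueError: math domain error
    # total_sqrt(-4.0)  -> 2j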
The complex numbers are algebraically closed; the reals are not. This means that if you write a polynomial with complex coefficients, it will have (only) complex roots. The analogous statement for the reals isn't true.
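For example (a quick numpy check of my own, not from the book): x^2 + 1 has no real roots, but it does have complex ones.

    import numpy as np

    # Coefficients of x^2 + 1, highest degree first.
    print(np.roots([1, 0, 1]))    # approximately [0.+1.j, 0.-1.j], i.e. the roots ±i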
It might have been a reason, mathematicians wanting everything well defined etc. But here's a better way to think about it: on the real number line, addition defines shifts and multiplication defines scaling. If you are in two dimensions, what is the equivalent? We define a 2-dimensional number such that multiplication defines scaling + rotation. The "complex" in "complex number" should not be read as "complicated" but more like "duplex": two things intertwined together.
The next question is: why bother? What's the point? It turns out that important real-life signals, like AC voltage and current, are sinusoidal. And real-life electrical machines shift the phase of these signals. By using complex numbers to represent these signals, you can continue to use the simple maths of DC circuits to analyze AC circuits. So you can still use V = IR, but the R of an AC machine like a motor will be an impedance (generally called Z), represented by a complex number.
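A toy sketch of that in Python (the component values are made up; the point is that the AC calculation looks exactly like Ohm's law once the impedance is complex):

    import cmath
    import math

    R = 10.0     # ohms
    L = 0.05     # henries
    f = 50.0     # hertz
    V = 230.0    # volts, taken as the zero-phase reference

    Z = R + 1j * 2 * math.pi * f * L   # impedance of a series R-L branch
    I = V / Z                          # "V = I Z", same shape as the DC law

    print(abs(I), math.degrees(cmath.phase(I)))   # current magnitude and phase shift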
I found the first few pages of MD Alder's Complex Analysis for Engineers indispensable in demystifying this complex stuff. Here's a quote from the first paragraph: "If Complex Numbers had been invented thirty years ago instead of over three hundred, they wouldn't have been called `Complex Numbers' at all. They'd have been called `Planar Numbers', or `Two-dimensional Numbers' or something similar, and there would have been none of this nonsense about `imaginary' numbers"
Because mathematicians like to make up things and theories to feel important, since they are impractical people who don't do anything important in the real world.
Half-joke apart (and I studied math in college, BTW, as my major, with Sanskrit as a minor), complex numbers have many uses in the real world, in engineering and other areas.
Tons of things and phenomena in the real world are based on mathematics. Plant and leaf patterns, ocean waves, water flowing in tubes or channels, the weather, mineral and plant and animal structures, rain and snow and ice, mountains, deserts, glaciers, floods, thunder and lightning, electromagnetism, fire, etc., etc., etc. And some of those things are really based on imaginary numbers.
Mathematicians are infinitely better than statisticians, though, because the definition of a statistician is "a person who can have his head in an oven and his feet in a freezer", and say, "on the average, I am feeling quite comfortable".
If you think a textbook is good, the digital edition should be available for review, since such books are often invaluable in hardcopy for continued reference. I am pleased the author adheres in some part to this mode of thinking.
The book starts out writing vectors as row vectors (1 x n matrices), then later introduces the matrix of a vector as a column vector.
Just represent vectors as column vectors from the start, define R^n as R^n,1 (in the notation of the book) and then there's no confusion and you get matrix multiplication of column vectors as students learned in high school: M.x where M is an m x n matrix and x is an n x 1 column vector, with product an m x 1 column vector. Stick to this all the way through a first linear algebra textbook and undergrad tears can be avoided.
e.g. example 3.32 in the text: T(1,0) = (1,2,7). Just write these as column vectors and the matrix multiplication is easier to visualize. (Also, T acting on a real number was written T(x) with brackets so T acting on a vector should really be written T((1,0)). Inconsistencies in notation like this lead to pain.)
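A quick numpy illustration of that convention (only the first column of the matrix comes from example 3.32; the second column, i.e. what T does to (0,1), is made up here):

    import numpy as np

    # Matrix of T in the column-vector convention: the columns are the images of
    # the basis vectors. First column from T(1,0) = (1,2,7); second column invented.
    M = np.array([[1, 4],
                  [2, 5],
                  [7, 6]])       # 3 x 2

    x = np.array([[1],
                  [0]])          # 2 x 1 column vector

    print(M @ x)                 # 3 x 1 column vector: [[1], [2], [7]]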
Then, for example, when you want to show that the rows of a matrix are independent, you can use a row vector to write succinctly x.M = 0 => x = 0 where x is now a row vector. And it's clear we're doing something other than transforming space vectors here as we've used column vectors for vectors representing points in space.
For undergrad textbooks you want to decide on a convention, one that matches what was done in high school preferably, and stick to it.
You do realize it makes no difference whether the vectors are written horizontally or vertically? Part of mathematical maturity is not getting hung up on pointless notation like this.
Clearly the goal of a first course in linear algebra is not mathematical maturity. If you don't set up the notation consistently from the start, once you start introducing change of bases formulae or proving row rank = column rank, etc., some students will get confused. See all the questions from confused students on Maths Stack Exchange about this point.
The way this particular author has handled the issue is perverse: first he writes vectors as row vectors, then he introduces a mapping M(v) which maps the vector v to its representation as a column vector. Also, as I tangentially noted, he misses out the brackets for a function writing T(0,1) instead of T((0,1)), which is just incompetent. The book should be titled "Algebra Done Shite" not "Algebra Done Right" ;)
The 3rd edition of this book is what my undergraduate linear algebra course used. It was a fantastic book. I feel like more computation- and determinant-heavy approaches can make the subject feel like a slog, but this book made me really enjoy, appreciate, and get a gut-level understanding of the subject.
This is a great book, but you must first be familiar with proof-focused mathematics (logic and set theory). If you are not, I first suggest you study a book called "How to Prove It: A Structured Approach" by Daniel J. Velleman before studying LA Done Right.
You need a setting in which to learn proof based mathematics, and linear algebra really is the first place where students are ready for that journey. Not everyone is going to be able to do it, but it's very incorrect to say that you must be familiar with proof. One must start somewhere, and ZF ain't it.
Zermelo-Fraenkel, the axioms of set theory (minus choice). This would be what you were teaching if you were to start teaching mathematics by teaching its (classical) foundations.
I briefly checked the axioms, and if I recall correctly, Velleman's book does not delve into ZF in detail, but it is a beginner-friendly book that gives enough exposure to quantifier logic and set theory to feel comfortable working on proof-based mathematics such as LA (not the survey version of LA, which should be called matrix algebra instead).
Sorry, I wasn't talking about Velleman's book on whatever, I was just challenging the sentiment that you should learn the "basics of proof based math" before linear algebra. Linear algebra is the basics of proof based math.
Edit: I should clarify that I believe there are many approaches to learning mathematics, and I'm sure Velleman's book is fine; it just isn't true that you shouldn't try to learn from Axler, though perhaps from someone with a bit less of an axe to grind.
The look on Christina of Sweden's face on page 1 made me laugh, she looks exactly how you would imagine a 17th century princess hearing about Linear Algebra.
I know this book is not intended to be used as a first reading on Linear Algebra but, to me, this text isn't good even for a second (or even third) read. The author hurries too much in certain parts. I think that Serge Lang's Linear Algebra does a better job of explaining pretty much every topic of the subject.
There's so much discussion here but I couldn't decide what is a good first course/book on linear algebra. Can someone please summarize and share a couple of books/courses ? Thanks
4th edition and still hasn't bothered to disambiguate between polynomial and polynomial function.
Still a good example of mathematical writing. But, as Einstein supposedly said, "as simple as possible but not simpler". Why is it always American authors that forget that last part?
I do not know what the previous poster meant to say, but polynomials can be defined in a more abstract way, as just elements of certain vector spaces on which an additional operation of multiplication between polynomials is defined, obeying special rules ("carry-less multiplication", which is implemented in hardware by most modern CPUs, is an example of polynomial multiplication).
In computers there are many applications of such abstract polynomials, e.g. for error detecting or correcting codes, pseudo-random number generation, authenticated encryption and others, which depend only on the rules for addition, internal multiplication and multiplication with scalars, and which have nothing to do with the polynomial functions associated with the polynomials.
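A tiny sketch of such a product (my own toy function, treating the bits of an integer as polynomial coefficients over GF(2); this is the operation behind CPU instructions like x86's PCLMULQDQ):

    # Carry-less multiplication: multiply two polynomials over GF(2) whose
    # coefficients are the bits of the inputs. Adding coefficients is XOR,
    # so there are no carries between bit positions.
    def clmul(a: int, b: int) -> int:
        result = 0
        while b:
            if b & 1:
                result ^= a
            a <<= 1
            b >>= 1
        return result

    # 0b11 * 0b11 is (x + 1)(x + 1) = x^2 + 1 over GF(2), i.e. 0b101 = 5,
    # whereas ordinary integer multiplication gives 3 * 3 = 9.
    print(clmul(0b11, 0b11))    # 5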
I've been waiting for this for six months. Thanks to Sheldon Axler for making it available for free. This is intended to be a second book on Linear Algebra.
For a first book I suggest "Linear Algebra: Theory, Intuition, Code" by Mike X Cohen. It's a bit different from a typical math textbook: it has more focus on conversational explanations using words, although the book does have plenty of proofs as well. The book also has a lot of code examples, which I didn't do, but I did appreciate the discussions related to computing; for example, the book explains that several calculations that can be done by hand are numerically unstable when done on computers (those darn floats are tricky). For the HN crowd, this is the right focus: math for the sake of computing, rather than math for the sake of math.
One insight I gained from the book was the 4 different perspectives of matrix multiplication. I had never encountered this, not even in the oft-suggested "Essence of Linear Algebra" YouTube series. Everything I had seen explained only one of the 4 views, and then I'd encounter a calculation that was better understood by another view and would be confused. It still bends my mind to think all these different perspectives describe the same calculation, they're just different ways of interpreting it.
At the risk of spamming a bit, I'll put my notes here, because this is something I've never seen written down elsewhere. The book has more explanation; these are just my condensed notes (with a small numpy check after them).
4 perspectives on matrix multiplication
=======================================
1 Element perspective (all possible dot / inner products)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(row count × row length) × (column length × column count)
In AB, every element is the dot product of the corresponding row of A
and column of B.
The rows in A are the same length as the columns in B and thus have
dot products.
2 Layer perspective (sum of outer product layers)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(column length × column count) × (row count × row length)
AB is the sum of every outer product between the corresponding columns
in A and rows in B.
The column count in A is the same as the row count in B, thus the
columns and rows pair up exactly for the outer product operation. The
outer product does not require vectors to be the same length.
3 Column perspective (weighted sums / linear combinations)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(column length × column count) × (column length × column count)
In AB, every column is a weighted sum of the columns in A; the weights
come from the columns in B.
The weight count in the columns of B must match the column count in A.
4 Row perspective (weighted sums / linear combinations)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(row count × row length) × (row count × row length)
In AB, every row is a weighted sum of the rows in B; the weights come
from the rows in A.
The weight count in the rows of A must match the row count in B.
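A small numpy check that the four views agree (random matrices, made up just for this verification):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.random((3, 4))
    B = rng.random((4, 2))

    ref = A @ B

    # 1. Element: every entry is a dot product of a row of A and a column of B.
    elem = np.array([[A[i, :] @ B[:, j] for j in range(B.shape[1])]
                     for i in range(A.shape[0])])

    # 2. Layer: sum of outer products of columns of A with rows of B.
    layer = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

    # 3. Column: each column of AB is a weighted sum of A's columns,
    #    with weights taken from the corresponding column of B.
    col = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

    # 4. Row: each row of AB is a weighted sum of B's rows,
    #    with weights taken from the corresponding row of A.
    row = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

    for view in (elem, layer, col, row):
        assert np.allclose(view, ref)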
The most important interpretation IMO is that a matrix is a specification for a linear map. A linear map is determined by what it does to a basis, and the columns of a matrix are just the list of outputs for each basis element (e.g. the first column is `f(b_1)`. The nth column is `f(b_n)`). If A is the matrix for f and B the matrix for g (for some chosen bases), then BA is the matrix for the composition x -> g(f(x)). i.e. the nth column is `g(f(b_n))`.
The codomain of f has to match the domain of g for composition to make sense, which means dimensions have to match (i.e. row count of A must be column count of B).
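A short numpy illustration of that composition rule (the matrices here are made up):

    import numpy as np

    A = np.array([[1, 2],       # matrix of f
                  [0, 1]])
    B = np.array([[0, -1],      # matrix of g
                  [1,  0]])
    x = np.array([3, 4])

    # Applying f then g agrees with applying the single matrix B @ A.
    assert np.array_equal(B @ (A @ x), (B @ A) @ x)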
It's debatable what the "most important" perspective is. For example, if I need a bunch of dot products between two sets of vectors, that doesn't seem like a linear map or a change of basis (not to me at least), and yet it's exactly what matrix multiplication is, just calculating a bunch of dot products between two sets of vectors.
Or when I think about the singular value decomposition, I'm not thinking about linear maps and change of basis, but I am thinking about a sum of many outer product layers.
If you don't have a linear map in mind, why do you write your dot products with one set of column vectors and another set of row vectors? Computationally, the best way to do dot products would be to walk all of your arrays in contiguous memory order, so the row/column thing is an unnecessary complication. And if you have more than 2 matrices to multiply/"steps of dot products to do in a pipeline", there's almost certainly a relevant interpretation as linear maps lurking.
Outer products are one way to define a "simple" linear map. What SVD tells you is that every (finite dimensional) linear map is a sum of outer products; there are no other possibilities.
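In numpy terms (a random matrix, just to illustrate the decomposition into rank-1 layers):

    import numpy as np

    rng = np.random.default_rng(1)
    M = rng.random((5, 3))

    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # M is exactly the sum of rank-1 layers s_k * outer(u_k, v_k).
    layers = sum(s[k] * np.outer(U[:, k], Vt[k, :]) for k in range(len(s)))
    assert np.allclose(M, layers)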
This is great! I eventually figured out all these interpretations as well, but it took forever. In particular, the "sum of outer products" interpretation is crucial for understanding the SVD.
The "sum of outer products" interpretation (actually all these interpretations consist in choosing a nesting order for the nested loops that must be used for computing a matrix-matrix product, a.k.a. DGEMM or SGEMM in BLAS) is the most important interpretation for computing the matrix-matrix product with a computer.
The reason is that the outer product of 2 vectors is computed with a number of multiplications equal to the product of the numbers of elements of the 2 vectors, but with a number of memory reads equal to the sum of the numbers of elements of the 2 vectors.
This "outer product" is much better called the tensor product of 2 vectors, because the outer product as originally defined by Grassmann, and used in countless mathematical works with the Grassmann meaning, is a different quantity, which is related to what is frequently called the vector product of 2 vectors. Not even "tensor product" is correct historically. The correct term would be "Zehfuss product", but nowadays few people remember Zehfuss. The term "tensor" was originally applied only to symmetric matrices, where it made sense etymologically, but for an unknown reason Einstein has used it for general arrays and the popularity of the relativity theory after WWI has prompted many mathematicians to change their terminology, following the usage initiated by Einstein.
For long vectors, the product of 2 lengths is much bigger than their sum and the high ratio between the number of multiplications and the number of memory reads, when the result is kept in registers, allows reaching a throughput close to the maximum possible on modern CPUs.
Because the result must be kept in registers, the product of big matrices must be assembled from the products of small sub-matrices. For instance, supposing that the registers can hold 16 = 4 * 4 values, i.e. the tensor/outer product of 2 vectors of length 4, each tensor/outer product, which is an additive term in the computation of the matrix-matrix product of two 4x4 sub-matrices, can be computed with 4 * 4 = 16 fused multiply-add operations (the addition is used to sum the current tensor product with the previous tensor product), but only 4 + 4 = 8 memory reads.
On the other hand, if the matrix-matrix product were computed with scalar products of vector pairs, instead of tensor products of vector pairs, that would have required twice more memory reads than fused multiply-add operations, therefore it would have been much slower.
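A sketch of that loop ordering in Python (not a real register-blocked kernel, just the accumulation pattern; numpy's own A @ B is what you'd actually use):

    import numpy as np

    def matmul_outer(A, B):
        # Matrix product as a sum of outer products (the "layer" loop ordering).
        m, k = A.shape
        k2, n = B.shape
        assert k == k2
        C = np.zeros((m, n))
        for p in range(k):
            # One column of A and one row of B are read (m + n values) and
            # combined with m * n multiply-add operations.
            C += np.outer(A[:, p], B[p, :])
        return C

    A = np.random.default_rng(2).random((4, 4))
    B = np.random.default_rng(3).random((4, 4))
    assert np.allclose(matmul_outer(A, B), A @ B)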
To the author: I commend you on your work! As a former full-time educator, I believe that one of the best things you can do to promote usage is to create supplementary materials (lecture slides, quiz decks in Blackboard format, etc.).
I do realize there are differences between everyone, but when I was an educator, there were many times when I was expected to teach multiple subjects that I had little to no experience with, with as little as one day's notice in many cases ("the adjunct had a baby, can you teach this?").
TLDR; Having a good text is a good start. To increase adoption and usage, create teaching materials, which saves valuable time for instructors.
Peace out
A lot of the discordance in this thread boils down to "different people having different goals when reading". As for Linear Algebra, due to its wide relevance, it is hard for one book to accommodate the needs of people from different fields and levels of mathematical maturity. I would say there are 2 main reasons for one to study LA:
- To understand and use its toolbox in practice: in fields like computer graphics, engineering, statistics, data analysis, modeling, etc.
- To understand and use its toolbox in other fields of (pure) math: single, multivariable, and functional analysis, abstract algebra, etc.
(Sometimes the latter goal is coupled with a secondary goal of introducing the student to mathematical proof. However, for self-learners, I don't think an LA book is a good place to learn how to write proofs for the first time.)
Unlike in calculus, where Spivak serves as the one-size-fits-all recommendation, readers should identify their reasons before picking an LA book. I don't have much experience with the first goal. Strang's book seems quite popular, and given the title, I would say it fits this goal. Axler's book falls squarely into the second goal, and I'd wager it would be a good book for this purpose if not for two things: his avoidance of determinants and the dearth of computational aspects or applications (yes, even theoretical LA books ought to be grounded in reality). There are three books that, in my opinion, do better than Axler here:
- "Linear Algebra" by Stephen H. Friedberg, Arnold J. Insel, and Lawrence E. Spence: dry, does not instill in me the sense of excitement that Axler does. However, it is comprehensive and rigorous, with a decent amount of explanation and a good selection of exercises. Application aspects are present but quite disparate and incoherent. This is the old and long-standing recommendation.
- "Linear Algebra" by Elizabeth S. Meckes and Mark W. Meckes: a new book that somehow manages to combine the material of Friedberg and the eloquence of Axler. The rigor and exercises are comparable to Friedberg, but the applications are presented much better here.
- "Linear Algebra" by Sterling K. Berberian: a little bit more abstract than the other two, but the exposition is still very clear. It connects LA to other fields of math and gives plenty of examples and motivations (shocking I know). The exercises department is good.
Honorable mentions to the classics of Serge Lang and Paul Halmos. They are concise and, as with the rest of their works, wonderfully written. For "Linear Algebra Done Wrong", well, I feel like it's a stripped down Friedberg: shorter but not better. (The proof of the multiplicativity of determinant is just some hand-waving at "a lucky coincidence".)
Seriously, this is the worst book in textbook history. People always laugh at lines like "a tensor is something that transforms like a tensor", but this book is literally doing that on every page.
https://www.math.brown.edu/streil/papers/LADW/LADW.html