I actually object a little bit to the claim that matrices are representations of linear transformations. No, matrices are just two-dimensional arrays of numbers. If you then define a specific 'multiplication' operation on those arrays, and a mapping from linear functions to matrices, it turns out that that multiplication operation corresponds to composition of the corresponding linear functions. That's neat! But it doesn't mean that that is what matrices are.
If you came up with an isomorphism from matrices to a domain where it made sense to define a different multiplication operation - like maybe placewise multiplication, where
    [a b]   [e f]   [ae bf]
    [c d] * [g h] = [cg dh]
then that would be just as valid, but it wouldn't change what matrices 'are'. In fact, because you know how to map matrices to linear functions, it would let you describe an operation to combine two linear functions in a new way and that might lead to some new insight about linear algebra!
It's like how, in school, you were taught that you can't multiply vectors together. Yet, in shader languages, it turns out that it's really useful to be able to multiply two vectors just by multiplying each component ([a,b] * [c,d] = [ac,bd]), so they define that as a valid operation.
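If you want to poke at the difference concretely, here's a minimal sketch (numpy is just my choice of tool here, and the matrices and vectors are arbitrary examples):

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[5, 6],
                  [7, 8]])

    # Element-wise ("placewise"/Hadamard) product: multiply matching entries.
    print(A * B)        # [[ 5 12]
                        #  [21 32]]

    # Standard matrix product: composition of the linear maps that A and B represent.
    print(A @ B)        # [[19 22]
                        #  [43 50]]

    # Same idea for vectors, as in shader languages: component-wise multiply.
    v = np.array([1.0, 2.0])
    w = np.array([3.0, 4.0])
    print(v * w)        # [3. 8.]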
As with most things in mathematics, the varying perspectives on an object and the relationship between those perspectives are more important than what an object is per se, to the point where the idea of what an object "is" per se is often meaningless.
Are the real numbers "actually" Dedekind cuts or equivalence classes of Cauchy sequences? If we prove that both constructions result in isomorphic objects, what difference does it make? Once equivalence has been established, we're free to adopt either perspective as the situation warrants.
Matrices represent linear transformations whether we want them to or not. As someone else pointed out, the operation you've defined is the Hadamard product[1], which is totally valid but doesn't correspond to the composition of linear transformations.
There are other "products", too, like the Kronecker product[2] and the Frobenius product[3], each with their own properties, motivations, and relationships to other parts of mathematics. These are neither good nor bad nor anything else; they just are.
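For anyone who wants to see these side by side, here's a quick numpy sketch (the matrices are arbitrary; np.kron is numpy's Kronecker product, and the Frobenius inner product is the sum of the entrywise products, equivalently trace(AᵀB)):

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[0, 1],
                  [1, 0]])

    # Hadamard product: entry by entry.
    hadamard = A * B

    # Kronecker product: every entry of A scales a full copy of B (4x4 here).
    kron = np.kron(A, B)

    # Frobenius inner product: a scalar, not a matrix.
    frobenius = np.sum(A * B)
    assert frobenius == np.trace(A.T @ B)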
I think it was a misstep for the article to be titled What Matrices Are, because the real idea is that, when we think of matrices as representing linear functions, the formula for "standard" matrix multiplication corresponds to the composition of linear functions. It's not just some crazy scheme we invented to torture Algebra II students in high school, but a different perspective on the composition operation that has its own advantages and disadvantages relative to other perspectives.
I think I'd be more comfortable with this article making the claim that linear transformation composition is "What matrix multiplication is". Because really, that's a more defensible position. If I weren't treating my matrices as representations of linear functions, I'd really have little reason to define the matrix multiplication operation that we all know and love - it's not a particularly useful operation on a rectangular array in general. So I guess, if you consider 'matrix multiplication' to be part and parcel of matrices, then sure - matrices are linear functions.
And yes, understanding that is very important to motivate high school students. Affine transformations provide a good context for that motivation, as well as a good framework for intuiting noncommutativity of multiplication.
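Here's a minimal sketch of that noncommutativity, using plain linear maps rather than full affine transformations (the angle and scale factor are arbitrary choices):

    import numpy as np

    theta = np.pi / 2
    rotate = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
    scale_x = np.array([[2.0, 0.0],
                        [0.0, 1.0]])

    p = np.array([1.0, 0.0])

    # "Scale, then rotate" vs "rotate, then scale" land the point in different places,
    # so the corresponding matrix products differ: composition is not commutative.
    print(rotate @ (scale_x @ p))   # approximately [0., 2.]
    print(scale_x @ (rotate @ p))   # approximately [0., 1.]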
To be honest, I'm not sure what point you're trying to make. It feels like you're over-interpreting the title because the author uses much more precise language in the article.
From the first paragraph, where he explains the purpose of the article:
> The two fundamental facts about matrices is that every matrix represents some linear function, and every linear function is represented by a matrix. Therefore, there is in fact a one-to-one correspondence between matrices and linear functions. We’ll show that multiplying matrices corresponds to composing the functions that they represent.
And later:
> The connection is that matrices are representations of linear transformations, and you can figure out how to write the matrix down by seeing how it acts on a basis.
If there's a meaningful difference between "a matrix is a representation of a linear transformation" and "a matrix can be viewed as a representation of a linear transformation" it seems largely philosophical and, in any case, tangential to the author's stated goal of explaining why matrix "multiplication" is defined the way it is.
Whether or not matrix multiplication is a "particularly useful operation on a rectangular array in general" boils down to a debate about what is or isn't useful to do with a rectangular array. I'll leave that to other folks with stronger opinions on the matter.
But, a "matrix" comes with the matrix multiplication we all know and love, right? The word doesn't just mean a 2d array. I think of it as a math term, not a CS data structure. More of a class, if you will, data and operations bound together. Not literally and not always, my analogy is imperfect, but, if you asked someone to perform an inner product on a block of numbers, you'd just confuse people if you said "matrix multiply".
I think it is much more intuitive and powerful to make the reverse reasoning. Linear algebra studies vectors belonging to some vector space and the linear transformations between those spaces. You can then realize that abstract concept with, for instance, ordered 3-tuples of real numbers, in which case 3x3 matrices represent linear mappings between those R^3 vector spaces, in which case we can use them to do physics in a 3D world. But it's misguided to say matrices come "before" abstract linear algebra.
Honestly, the alternative is to teach people component-wise calculations like those in old school (and some new school) GR with a bunch of indices everywhere. As others have pointed out, a 2x2 matrix or a square matrix in general is a nice representation because operations involving them follow basic rules.
As for your qualm about being told you "can't" multiply vectors together, what they should have said is "we won't define a multiplication between vectors like we can between scalars", where "we won't" means "we won't for this course." Something I agree isn't stressed enough in math in the early years is that it's an act of creation: mathematicians define what they need in order to prove other things, and as long as a definition is consistent with the others and logically coherent, it goes. Just like you can define element-wise multiplication; of course it's valid.
Final nitpick: I'd argue an isomorphism between objects A and B is enough to say that A is B, up to some non-isomorphic details, like how they are typeset in an article.
Set, list, array: these are all words to describe collections of numbers, including an NxM array. But a matrix is a very specific thing. It is not just an array of numbers; it is a mathematical object with defined mathematical operations applicable to it, and as such it does define a set of linear combinations.
Similarly, there is a difference between an N-element set of numbers and an N-dimensional vector.
Traditionally, multiplication takes two elements of a set and turns them into another element of that set. The dot product doesn't do that, it takes two vectors and turns them into a scalar.
The cross product only exists in three dimensions. And it's not associative (A×B×C gives a different answer depending on which order you do it in), which is another thing multiplication usually satisfies.
There are two other not-quite-multiplication operators that I recall seeing. There's an analog of the cross product in two dimensions: (a,b,0)×(c,d,0) = (0,0,ad-bc), so it can be useful to have an operator (a,b)×(c,d) = ad-bc, again turning two vectors into a scalar.
And if the dot product is defined in terms of matrix multiplication by A·B = AᵀB, then you can also define an operator ABᵀ, turning two vectors into a matrix. These vectors don't even need to have the same length.
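For concreteness, here's a small numpy sketch of the 2D "cross product", the dot product, and the outer product ABᵀ (the vectors are arbitrary examples; np.outer is numpy's outer product):

    import numpy as np

    a = np.array([1.0, 2.0])
    b = np.array([3.0, 4.0])

    # 2D "cross product": a scalar, ad - bc (the z-component of the 3D cross
    # product of (a1, a2, 0) and (b1, b2, 0)).
    cross_2d = a[0] * b[1] - a[1] * b[0]     # 1*4 - 2*3 = -2

    # Dot product A·B = AᵀB: two vectors in, one scalar out.
    dot = a @ b                              # 1*3 + 2*4 = 11

    # Outer product ABᵀ: two vectors in, one matrix out; lengths may differ.
    c = np.array([5.0, 6.0, 7.0])
    outer = np.outer(a, c)                   # 2x3 matrix [[5, 6, 7], [10, 12, 14]]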
> The cross product only exists in three dimensions
While true, the wedge product is a useful concept that generalizes the cross product to arbitrary dimensions and is used in multivariable calculus for proving various integral theorems in high dimensions. Here, the generalized Stokes theorem applies despite the cross product not being defined. Admittedly, it isn't really a map on the vector space, but the fact that Stokes-type theorems still hold makes it pretty darn useful to me.
To add to the other responses at this level, I want to point out that one form of vector-vector "multiplication"—inner products—corresponds to applying linear functionals, i.e., to linearly mapping a vector space into its underlying scalar field.
So just as every matrix is a representation of a linear transformation of vectors into other vectors, with matrix-vector multiplication corresponding to function application, it is also true that each vector in a vector space represents a linear transformation that turns vectors in the space into a scalar, with vector-vector multiplication in the form of inner products corresponding to function application. The converse is also true: every linear functional on a vector space can be represented by a vector in the space.
This last insight is known (in various forms) as the Riesz representation theorem and holds not only on finite inner-product spaces (i.e., vector spaces on which an inner product is defined) but also on Hilbert spaces (complete inner product spaces, whether finite or infinite). It turns out to be quite powerful.
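A toy illustration of the finite-dimensional case, where the representing vector is found by applying the functional to the standard basis vectors (the functional f below is an arbitrary example):

    import numpy as np

    # A linear functional on R^3, written as an ordinary Python function.
    def f(v):
        return 2.0 * v[0] - v[1] + 0.5 * v[2]

    # The vector that represents it: apply f to the standard basis vectors.
    w = np.array([f(e) for e in np.eye(3)])   # [2., -1., 0.5]

    # Now f(v) is just the inner product <v, w> for every v.
    v = np.array([1.0, 4.0, -2.0])
    assert np.isclose(f(v), v @ w)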
Well, it actually depends on what you've been told "multiplication" is. Multiplication should be closed, hence the dot product is not a multiplication, because the result is not a vector (unless you are using 1-dimensional vectors, sure, but even then the result is a scalar rather than a vector). The wedge (or outer, or cross) product is a delicate issue: it works as a product, but to get it properly defined you end up with the generalisation (exterior algebras), and those are also not closed, because the exterior algebra is different from the source space and only has the same dimension in a few cases.
You can do the arithmetic any way you like, I suppose, but there is definitely a reason to always think of a matrix as a rectangle, and the article went to some length to explain why. A 2x2 is not 4 independent real numbers thrown into a box, it is a set of vectors that describe a linear space in relation to the "world", and it doesn't really make sense to describe a linear space any other way than by writing down the vectors that make up the axes of that space.
Just one of the many reasons you don't want to use R^4 to describe a matrix is that, for a matrix whose columns form an orthonormal basis, A^-1 == A^T. The inverse and the transpose are the same thing. That doesn't work in any arrangement except a rectangle.
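A quick numerical check of that fact, using a rotation matrix as an example of a matrix whose columns form an orthonormal basis (the angle is arbitrary):

    import numpy as np

    theta = 0.7
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    # Columns are orthonormal, so R^T is R^-1: R^T R is the identity.
    assert np.allclose(R.T @ R, np.eye(2))
    assert np.allclose(R.T, np.linalg.inv(R))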
I always had trouble with math in high school. I managed to pass in the end, but I never really got it.
I think math should be taught like programming. They should present you with a problem and then show you a cool way to solve it. My math classes consisted of memorizing formulas and algorithms for standardized tests. I learned about hyperbolas, matrices, integrals, etc., but none of them stuck with me. One of the few things I remember is basic trigonometry, because we used to do practical stuff like measuring an angle from the ground and the distance from the building to calculate a person's height.
Moreover, I think math seriously needs a REPL. At my math exams (not tests, but written exams) I could never calculate the solution right; I would always make a mistake.
We need to acknowledge the fact that we are human and humans make errors. We need to teach high school math in a hackable, practice-oriented way.
The current math curriculum excludes pupils who think differently. If you can't solve it in the traditional way, you're doomed. But you're actually smart and can understand math if you learn it by doing, hacking, programming.
Most math reads like very bad code to me: insufficiently commented, bad variable names, excessive (and wildly, ah, creative) operator overloading, no explicit types. Plus the debugging and analysis tools blow, forcing you to keep way too much in your head at any given time.
The documentation is extensive, but unfortunately too much of it takes the form of the same garbage code.
I had the luck of having very good math teachers, plus a share of natural "easiness" with math (I wouldn't precisely describe it as talent). All the things you mentioned can be solved by having good, dedicated teachers (I mean dedicated to teaching, not just to one specific student). Unfortunately, they're not as common as they should be.
And/Or a REPL. I started programming with Apple LOGO in summer school as a small child, the math came alive. Much more malleable feedback and exploration than the 100%'s on quizzes.
I was taught discrete math from the building blocks of matrices and it seemed as useful at the time as learning driver's ed in a license plate factory.
Mathematics can be taught by good teachers and bad teachers, in a very good or very bad way. It's definitely an important and difficult issue, because teaching mathematics the right way in order to capture the attention and interest of school children is a very difficult problem, not least of all because there isn't even a consensus on what teaching mathematics the right way is.
However, I don't think in the slightest that throwing CS buzzwords such as "REPL" and "hackable" at the problem is the way to go.
They aren't buzzwords, they are simple ideas and don't have to be referred to with those terms if they are triggering you. The OP just wants interactive, flexible systems to learn math from, which I wholeheartedly agree with.
Okay, my bad, then. Can you tell me what exactly is meant by a "hackable" math learning environment? And how will a REPL (however it is realized) make learning mathematics better?
Imagine if all the proofs in your linear algebra textbook were done in Isabelle, Idris or something similar to that. Then you'd be able to interactively explore the proof, it would be unambiguous, and you would know exactly how it worked. If you don't understand how a C program generates its result, you could start a debugger and single-step through it. With a paper proof, how can you convince yourself that there is nothing missing in the proof? Maybe you think you understood, but you didn't.
The great tragedy of maths education is that people come out the other side and think "fill in the box" is maths. No, that is doing you sums, not living in the world of mathematics.
The filling in of the boxes is most fairly compared to finding the typo in a complicated regex, or the file with incorrect permissions on the web server.
"Fill in the box" is not referring to sums. It's referring to extremely rigid homework where you fill out each step in a process to finding out a result very precisely. It coincides with the horrific bubble tests where you select 1 of 5 answers and there is no partial credit.
This style of learning is what dominates US schools now from basic arithmetic all of the way up through advanced calculus.
Then replace the domain specific terminology with something more palatable to educators.
The point made in the parent comment is that teaching math often fails in its intent, to provide students with the rewards of mathematical insight and ability to reason.
Instead basic math education results in a rift between those who "aren't good with numbers" and, well, masochists.
What is it about math education that so abhors references to applications?
What is so crucial about references to applications? Sometimes the applications are so distant that it is not imaginable for a high school teacher to dig into them.
People learn about mammals in biology and don't ask for references to applications. Why is it different in math?
What equivalence do you see between math and biology? The pedagogy seems to begin from opposite directions. What axioms of biology are comprised of "mammals"?
This thread began with a parent article on matrix algebra, which has many, many exciting real-world applications.
It depends on the audience, but I think you are largely correct. Computer graphics / demos / games are a seductive way to introduce linear algebra. "Here's how you rotate a point on screen, using only numbers" is way more interesting than solving linear equations.
That said, there is an audience for whom the beauty of the subject is enough. Now when I hear "Galois theory" or "Navier-Stokes" I just think "I don't know what that is, but I want it!"
At least in Denmark, your method counts almost as much as your result. If you make a mistake in your calculations, but your method is good, you get partial, near-full credit.
In your weekly assignments (but not exams) the teacher may even fail a problem if you only write the correct answer, but fail to show how you got there.
Not exactly. I want to be able to see step by step what I'm calculating. When you're learning programming, you write a very small block of code, (e.g. getting inputs, printing a welcome screen) and then build it from there. With math assignments you have to solve the problem entirely and then check the results.
Wouldn't it be better if we could graph every step of our solution, write small loops to test out different values and observe their effect?
The math questions I encountered at school were based on tedious and often mysterious manipulations of numbers.
Even though I memorized every formula I would still fail. With computer programming you don't have to trust anything. If, say, you want to learn about pointers, you write some code and get a segmentation fault. You learn from your mistake instantly.
If you still don't get it, you read code examples, modify them and see the results instantly.
That's what I mean when I say math seriously needs a REPL.
In English classes teachers take pupils to the computer lab and give them online exercises. The pupil can fill in the blanks, press a button, and check his/her answers instantly. Why don't we do this with math?
Computers give the pupil the chance to try out different things. If you get stuck, you can always ask your teacher.
I know many have experience with Gilbert Strang's Introduction to Linear Algebra textbook and course (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebr...), but I thought it would be useful to mention them here. I've found his explanations to be very intuitive.
Agreed, I bought the book and am doing the class as a "refresher". I put refresher in quotes because there is stuff in Strang's class that I know we didn't cover in my university class, notably SVD. Strang definitely has a way of really helping you deeply understand the concepts. I remember the subject pretty well but I don't think I understood it as well as I do now.
I don't know if Strang's approach would be good as an initial run through Linear Algebra, but it's a great refresher/enhancer (I read through it in grad school to prepare for prelims, back when I thought I was going to go for an applied mathematics PhD.)
Khan Academy is great for people with no access to educational materials -- that's why it was created. But it isn't material developed by professional or expert educators.
So, matrices are a convenient way to write down linear transformations. But then the student might ask, why study linear transformations at all? Just because they have nice properties? Why this particular set of nice properties, and not some other set? As a rule of thumb, a good math explanation shouldn't start with axioms and claim that they are "nice". First give some intuitive examples, and only then say which axioms they satisfy.
For linear transformations, one possible avenue is start with the notion of derivative. If we take a real-valued function, its derivative at a particular point is just a single number, which represents the function's rate of change. But what if we have a function that accepts, say, two real numbers and outputs three? It turns out that the natural generalization of "derivative" to such functions is a rectangular array of numbers:
dy1/dx1 dy1/dx2
dy2/dx1 dy2/dx2
dy3/dx1 dy3/dx2
If we know these numbers (and nothing else), we can linearly approximate the values of a function near a particular point, with at most quadratic error.
Now let's say we have two functions. The first one takes two numbers and outputs three, and the second one takes three numbers and outputs four. If we compose them together, can we find the derivative of the composite from the two simpler derivatives by some kind of chain rule, like the one we have for ordinary real-valued functions? It turns out that yes, we can, if we replace the product of two numbers with the product of two matrices (defined in a particular way).
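Here's a rough numerical sketch of that chain rule; the functions f and g below are made up purely for illustration, and the Jacobians are estimated with finite differences rather than derived symbolically:

    import numpy as np

    # f: R^2 -> R^3 and g: R^3 -> R^4, just example functions.
    def f(x):
        return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

    def g(y):
        return np.array([y[0] + y[1], y[1] * y[2], np.cos(y[0]), y[2]])

    def jacobian(func, x, eps=1e-6):
        # Numerical Jacobian: one column of partial derivatives per input variable.
        fx = func(x)
        J = np.zeros((len(fx), len(x)))
        for j in range(len(x)):
            dx = np.zeros_like(x, dtype=float)
            dx[j] = eps
            J[:, j] = (func(x + dx) - fx) / eps
        return J

    x = np.array([0.5, 1.5])

    # Chain rule: the 4x2 Jacobian of the composite is the 4x3 times the 3x2.
    J_composite = jacobian(lambda t: g(f(t)), x)
    J_chain = jacobian(g, f(x)) @ jacobian(f, x)
    assert np.allclose(J_composite, J_chain, atol=1e-4)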
Now it's easy to explain what linear transformations are. They are just multidimensional functions whose derivative (matrix) is the same at every point. They are just like one-dimensional linear functions, whose derivative (number) is the same at every point. (For convenience, people also say that every linear transformation must take the point (0,0,...) to the point (0,0,...), so that matrices correspond one-to-one to linear transformations and vice versa.)
If you want to work with linear transformations effortlessly, there's a lot more intuition to develop, but this should serve for the basics.
A matrix is not merely a convenient way to write a linear transformation; it is a good way to reason about linear transformations, because among other things the matrix product is function composition. Loose definitions and incoherent reasoning are not good motivation for anything, let alone a mathematical subject.
Bringing in matrix analysis as an introduction to linear algebra has problems of its own. For example, there are different ways of writing the chain rule for matrices with radically different-looking formulas; higher derivatives of vector functions are multilinear forms; and convergence in multidimensional space has too many subtleties for beginning students.
Solving systems of linear equations is a perfect way of introducing linear algebra. Not only is it vitally important in math and used everywhere, it is also almost half of linear algebra. Gaussian elimination is the elementary-school method formulated as an algorithm and, not coincidentally, a sequence of linear transformations culminating in the LU decomposition, because we can represent a sequence of linear transformations as a series of matrix products. Writing Gaussian elimination in partitioned matrix form, which includes regular scalar manipulation as a special case, yields the Schur complement, and taking the inverse of that is a good way to deductively prove the matrix inversion lemma, an important formula in numerical analysis and signal processing usually taught in advanced courses. Nothing I mentioned is inaccessible to an undergraduate.
Linear algebra is a unique mathematical subject because it doesn't require much background to begin and is very visual, unlike calculus, which needs a good foundation in analytic geometry and has many techniques that beginning students call tricks, not to mention mental obstacles like limits. Mathematical intuition comes with practice, experience, and mathematical maturity, not from dumbing down the topic. I agree that motivations should be taught along with any mathematical subject, but they should be mathematical motivations. Applications are important too, but teaching theory and applications well and thoroughly at the same time is impossible and undesirable; different people need different applications, which in turn emphasize different aspects of the theory; it is just too much.
> If we know these numbers (and nothing else), we can linearly approximate the values of a function near a particular point, with at most quadratic error.
Ignoring the fact that you need the function's values actually at the particular point in question, you do in fact need to know something else: you need to know that the function in question has sufficient regularity (enough smoothness) to allow an application of Taylor's theorem.
Why would you want to teach "every well behaved function has a family of linear functions associated with it" before you teach how to work with linear functions?
CCSS.MATH.CONTENT.HSN.VM.C.6
(+) Use matrices to represent and manipulate data, e.g., to represent payoffs or incidence relationships in a network.
CCSS.MATH.CONTENT.HSN.VM.C.12
(+) Work with 2 × 2 matrices as transformations of the plane, and interpret the absolute value of the determinant in terms of area.
Math education gets a bad rap in the USA, because anything that students don't remember learning is claimed to never have been taught. And that is just not true. Math teachers aren't as dumb as people pretend they are.
The word "linear" is never used in that part of the standard, nor is any relationship to calculus expressed explicitly (and there is not really any relationship to anything else; just saying that a matrix can be used to express incidences in a network is useless if they never study networks). Probably because calculus is not part of the CC standard, which is my point exactly. Besides, Common Core is not what most living people in the US experienced in school.
I have seen the standards for high school math teachers and trust me, they are low. Most of my classmates in undergrad (which was supposed to be an excellent program for math teachers) are now high school teachers. Their complaints about how much they hate basic linear algebra and "just want to be done with it so they can go teach high school" are still ringing in my ears.
Matrices are axes. Vectors, or points, side by side. It wasn't until graduate school, after years of math and OpenGL, that what that meant really sunk in and I really got how and why matrix ops are made of dot products.
The article is right in my case: I didn't learn the intuitive understanding at first, and it could have been taught that way. It is also well written, for people who know math, but I also feel like the article describes math using more math, and that the point could be better made with a picture or two. There's something about just seeing the correspondence between matrix rows, or columns, and the axes of a space, or transform, that finally helped it all sink in for me.
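For anyone who wants the one-line version of "columns are axes": the columns of a transformation matrix are exactly where the basis vectors land. A tiny numpy sketch (the rotation angle is an arbitrary example):

    import numpy as np

    theta = np.pi / 6
    M = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    # Column 0 is the image of the x-axis (1, 0); column 1 is the image of the y-axis (0, 1).
    assert np.allclose(M @ np.array([1.0, 0.0]), M[:, 0])
    assert np.allclose(M @ np.array([0.0, 1.0]), M[:, 1])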
3 years into a math degree and this never clicked until I watched Feynman's QED lectures.
Everyone just tells you, "oh, now this can be a matrix." Nobody tells you why matrices are useful, or how we ended up with them. Just, you know, matrices; use them.
The problem can be that most mathematicians maintain that mathematics can be done for its own sake (something I 100% agree with). However interesting mathematics is, though, there will be plenty of students more interested in its applications to the real world (and there's nothing wrong with that), or they may even have no interest in the mathematics until they see some clever ways it can be used to solve or abstract a specific problem, at which point they take a greater interest in the math itself.
As anecdotal evidence, most math teachers I had in college were at best uninterested in and at worst disdainful of applying the math to the real world beyond theorems on a blackboard. Most physics teachers, however, when introducing us to new concepts, made an effort to show as soon as possible how the definitions we made were inspired by real-world problems and in turn simplified or helped create new physics. This made it much easier to appreciate the pure math itself.
Just explaining the pure mathematical roots, or the cross-field relations, isn't always done. It drives me insane when you use cross-field tools and bring up, "Oh, so we're doing X but with Y."
The response is often, "No, we do X because of Z," where Z is often something you're still learning. So now you just feel lost and confused. It's not until two classes later that you learn A maps Y to Z, and X is actually an operation of A, so it applies to both.
I guess it's just me but the joy of math has always been its tangled relationships to itself.
Yes, I think typical intro classes focus too much on the rows and not enough on the columns. We learn about change of basis where each element in the new vector is a dot product with a row of the matrix. But you can also look at the new vector as a weighted combination of all the columns in the matrix.
You mentioned opengl and it's a perfect example of why you need to learn both. The Model matrix is column-oriented - if Z is up in the model but Y is up in the world, you need a Y unit vector in the third column of the matrix. The View matrix is row-oriented - each row is an axis of the camera's orientation. The columns perspective also makes it easier to understand homogeneous coordinates.
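Here's a tiny sketch of the two views (the matrix and vector are arbitrary examples): each output entry is a row of M dotted with v, and the whole output is a weighted combination of the columns of M.

    import numpy as np

    M = np.array([[1.0, 4.0],
                  [2.0, 5.0],
                  [3.0, 6.0]])
    v = np.array([10.0, 0.5])

    # Rows view: each output entry is a dot product of v with a row of M.
    rows_view = np.array([row @ v for row in M])

    # Columns view: the output is a weighted combination of the columns of M.
    cols_view = v[0] * M[:, 0] + v[1] * M[:, 1]

    assert np.allclose(M @ v, rows_view)
    assert np.allclose(M @ v, cols_view)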
Can you elaborate on that a bit or give an example maybe? It seems to me that mathematics does pay attention to what things are, since mathematicians often start their arguments with getting themselves and their readers to agree on rigorous definitions of mathematical structures, before they do anything with those.
It's basically duck-typing. If you can do arithmetic on it, it's a number. If you can do matrix operations, it's a matrix. If it satisfies the axioms of X, it's an X.
This is just something I thought of and I am not sure if it makes sense, but mathematicians explore objects in much the same way particle physicists do. Physicists bounce particles off each other and see what happens in order to understand how those particles work.
Similarly, mathematicians study objects by looking how they act on other objects. So if it turns out that two differently defined objects have the same actions on other objects, we call them isomorphic and don't really distinguish between them.
So for instance, we say that the set of rigid motions that preserve the triangle and its orientation is the same as the set of permutations of the roots of, say, x^3 - 3x + 1, even if the two sets are absolutely not defined in the same way.
I think he means that mathematical objects are pure structure without substance. They are defined only in relation to other objects and there is no deeper meaning.
I guess you've been to a US school. In France the problem is the opposite. They generally teach you theory first, and you pretty much have to guess for yourself what all those things are for (beyond the obvious problems you find in exercises).
By "do" he means the behaviour of the objects, not their use in real-world applications. In general, you consider a bunch of isomorphic things as one, rather than worrying about the differences between them which have no effect on their behaviour. This is sort of handwavy because it depends on context.
For example, if we are considering sets and functions between them, we generally don't care about the exact names of the elements of the set, only the fact that they are distinct elements. The important aspects of a function in this case are properties such as injectivity, surjectivity, etc, not that it sends one particular element of one set to another particular element of another set.
Another example is in linear algebra: we really care about linear transformations on an abstract vector space more than we care about what that linear transformation looks like relative to a specific set of coordinates.
This point of view is espoused in category theory, where the important information is carried in the morphisms between objects, not really the objects themselves.
But to me, it's better to start with a context, a purpose for using matrices & linear algebra first, and learn what and how to use matrices in that context. The contexts that helped me included 3D graphics/games and later circuit simulations.
In general I find it helpful, in understanding mathematical concepts and notations, to learn about their history as well: what sorts of problems they were meant to solve in the first place, and what methods were used before.
From that point of view, the interesting thing would then be why this particular abstraction layer works well which is what linear algebra answers.
This is in some sense the process all math students go through. The formulas for computing the determinant and multiplying matrices look really complicated, and it feels like a mystery why they work at all, but then linear algebra explains all of that slowly.
Perhaps it would help students to start with an application before digging into the theory. Coordinate transformations might not be too forbidding for high school algebra students, especially if aided by some math software or a Python notebook.
I haven't done anything linear algebra related since my high school Algebra II course. This was really simple for me to follow, so thanks for posting. It's a shame that the authors don't update it; they have some great content and I would've definitely followed it.
Well, what you see as a matrix (that is, the array of numbers) is underneath a function f: N x N -> F
(where F is a field of your choice, or a ring if you so desire). So it's a function that takes two natural numbers (i.e., column and row) and outputs a number.
What is more interesting to me is that linear functions f: R^n -> R^m turn out to be useful when applied in many, many different problem areas. Call this "the unreasonable effectiveness of linear operators" if you will.
Linear operators are useful in many situations because
(1) They are "nice" operators which carry properties most functions can only dream of having: f(v + w) = f(v) + f(w), and f(av) = af(v). From these properties you can develop a very rich theory (along with the properties of vectors). These are the sorts of properties that we want all of our functions to have when we are young and first learning algebra - how many prealgebra students wish they could simplify (a + b)^2 to a^2 + b^2?
(2) A linear approximation is the first useful approximation for most behaviours, and at a small enough scale almost anything looks linear. Once you've made this approximation, you get to exploit the properties of (1).
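A small sketch of both points, assuming nothing beyond numpy (the matrix, vectors, and base point are arbitrary examples):

    import numpy as np

    # (1) A linear map f(v) = A v satisfies f(v + w) = f(v) + f(w) and f(a v) = a f(v).
    A = np.array([[2.0, -1.0],
                  [0.0,  3.0]])
    f = lambda v: A @ v

    v, w, a = np.array([1.0, 2.0]), np.array([-3.0, 0.5]), 4.0
    assert np.allclose(f(v + w), f(v) + f(w))
    assert np.allclose(f(a * v), a * f(v))

    # (2) A nonlinear function looks linear up close:
    # near x0, sin(x) is approximately sin(x0) + cos(x0) * (x - x0).
    x0, h = 1.0, 1e-3
    approx = np.sin(x0) + np.cos(x0) * h
    print(abs(np.sin(x0 + h) - approx))   # tiny, on the order of h^2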
I learnt this from looking at code for old 3d engines, where you have to first make up your matrix functions, but then also unroll them for optimising stuff like rotations round a single axis.