Linear algebra for programmers (coffeemug.github.io)
237 points by coffeemug on Sept 1, 2023 | 76 comments


If anyone wants to try things hands-on, I highly recommend SymPy (in particular the online live shell https://live.sympy.org/ ). The `Matrix` class can be used to create matrices (as lists of lists of numbers) and vectors (as lists of numbers). Here is a clickable link that demos the rotation from the first example: https://live.sympy.org/?evaluate=A%20%3D%20Matrix(%5B%5B0%2C...
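Here's roughly what that demo does (a minimal sketch; I'm assuming the 90-degree rotation from the article's first example):

    from sympy import Matrix

    # Rotation by 90 degrees counterclockwise (assumed to match the article)
    A = Matrix([[0, -1],
                [1, 0]])
    v = Matrix([1, 0])  # a column vector along the x axis

    print(A * v)      # Matrix([[0], [1]])  -- the x axis lands on the y axis
    print(A * A * v)  # Matrix([[-1], [0]]) -- two rotations make 180 degrees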

For more info about SymPy, see section "VI. Linear algebra" in the SymPy tutorial I wrote https://minireference.com/static/tutorials/sympy_tutorial.pd... (also available as notebook https://github.com/minireference/sympytut_notebooks/blob/mas... )


Oh, and for even more linear algebra stuff, here is a 30 min condensed video tutorial that introduces most of the topics in a standard LA course, also using SymPy for demonstrations: https://www.youtube.com/watch?v=2G3PmEZI6n8&list=PLGmu4KtWiH...


That brings back some memories


For someone who actually needs to understand linear algebra to create novel complex programs, this is completely insufficient and possibly needlessly distracting. I suggest this one instead: https://www.youtube.com/playlist?list=PLE7DDD91010BC51F8 (Gilbert Strang)

Aggressively taking the fewest shortcuts possible is the fastest shortcut.


Agree. I love the Gilbert Strang series and am working through it myself. Also, the book "Linear Algebra Done Right" by Sheldon Axler[1] is amazing. He has on his website a set of short videos and slides to accompany it[2]. It's incredibly comprehensive and takes an interesting strategy of teaching all of linear algebra in a very rigorous fashion without introducing determinants until the end. His reasoning is explained in a paper he published called "Down with Determinants!"[3].

[1] https://linear.axler.net/

[2] https://linear.axler.net/LADRvideos.html

[3] https://www.axler.net/DwD.html


Axler and Strang seem like pretty different approaches to the same subject, and if your goal is just to do stuff with linear algebra, rather than deepening your understanding of it or really getting how it generalizes to functions and stuff, Strang's approach is likely to be more immediately profitable. I like 'em both. I helped my daughter through UIUC linear algebra last semester, and Strang definitely prepped me for all the computation, all the way through the end of the class, but I needed to go back to Axler for all the proof stuff.


I'm really enjoying them both for that precise reason.


I never found Strang's lectures out of the ordinary or particularly enlightening. But what managed to give me perspective was Pavel Grinfeld's playlists, starting from https://www.youtube.com/watch?v=Fnfh8jNqBlg&list=PLlXfTHzgMR....


A student of Strang's, for whatever that's worth to you.


Yes, I was aware, though only much after I watched his videos.

To me, the didactic style is what matters, not the intellectual prowess of the mathematician in question. For this topic, and the depth to which I wanted to expose myself, I've found Pavel more accessible than anyone else.


I think his videos are great, too. Of course, he made his years after Strang's, which were really among the first of their kind. But I learned linear algebra from those Strang videos, so I have a soft spot for them.

(Watch them at 3x, and then at the end of each one, reward yourself by setting it back to 1x, and Gil Strang sounds like he's on a Robitussin bender.)


Can anyone suggest something that teaches linear algebra with practical applications, especially for software engineering?

I can sorta kinda get the theory, but every demonstration involves moving an arrow around, which is... not something I need to do frequently. So I'm not sure how to actually apply linear algebra to solve actual problems.

I'm a software developer and I know it's useful; I just don't get where to use it, and I'm struggling to actually understand the different operations, the purpose of the dot product, etc. I have a decent base in basic stats and calc, both of which I can "conceptually apply" near daily to understand how things work.

3Blue1Brown is helpful, but I just kinda go "yeah I guess that looks right" without knowing what to do with it.

EDIT: Thank you!


Boyd & Vandenberghe's new book: https://web.stanford.edu/~boyd/vmls

It has companion code in Julia and Python, and addresses many important applications, such as convex optimization, the subject for which the authors' previous book is really famous.


One place to start might be a tutorial on principal component analysis, which will take you through some of the intuitions for applying SVD.

You can also go in the direction of cryptography; here, for instance, is a really excellent LLL tutorial that builds on Gram-Schmidt: https://kel.bz/post/lll/
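For a first taste, here is a minimal PCA-via-SVD sketch (made-up data, NumPy only):

    import numpy as np

    # Made-up 2D data with a dominant direction of variance
    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0],
                                                 [1.0, 0.5]])

    # PCA via SVD: center the data; the rows of Vt are the principal axes
    centered = data - data.mean(axis=0)
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    print(Vt[0])             # first principal direction
    print(S**2 / len(data))  # variance captured by each component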


This is the PCA tutorial that worked for me: https://arxiv.org/abs/1404.1100.


Imperial College has a pretty good course on Linear Algebra for Machine Learning on Coursera. It's part of their Mathematics for Machine Learning Specialization. Deep Learning also has a course that appears to be similar.

The specialization as a whole starts simple, and works its way up to PCA.

https://www.coursera.org/specializations/mathematics-for-mac...


Find and work through some old OpenGL tutorials from the fixed-function era. It's long obsolete, but there's a nice immediate connection between your matrix algebra and what you see on the screen. Putting chunky triangles on the screen is fun!

Edit: I should mention that this is a good way to learn to appreciate what modern shaders are doing. You could jump into those as well, but I think it's a little harder to get your first triangle on the screen. All this assumes you're comfortable writing C, of course.


I found a lot of practical CS applications in this one:

"Coding The Matrix: Linear Algebra Through Computer Science Applications" https://codingthematrix.com/

It used to be on Coursera too, has video lectures and a book to go with it.


I endorse this as well. I worked through the course back when it was on Coursera, and thoroughly enjoyed it. I've also found what I learned applicable many times since - most recently when experimenting with some computer vision algorithms.


Two chapters of my book https://pimbook.org focus on linear algebra, with the first covering SVD (a workhorse of data analysis) and the second demonstrating physics modeling using eigenvectors and eigenvalues.

But if you really want practical applications, you probably want to read a book on numerical analysis, which is where linear algebra really finds its glory.


Love this book, by the way.


In my experience it largely depends on the field within Software Engineering, and where it blurs into subfields of Applied Math / Engineering.

Machine Learning needs some things (mainly "easy" things like matrix multiplication and convolution with the occasional truncated SVD), Modeling of Physical Systems needs other things (PDE solvers), Computer graphics different stuff (mainly focusing on small rotation/transformation matrices applied to many many points), Nonlinear Optimization yet another subset (solution of large sparse systems), and that's before you get to signal processing and statistics.

The main commonality is that once you have the basic terminology down, and understand that if you're explicitly inverting a matrix you're probably doing it wrong, you should just use the existing highly tuned libraries for your use area (at least until you decide it would be cool to try to beat them). Once you need to go beyond that, you're more into the realms of matrix analysis and structure exploitation, and firmly have at least one foot in the math camp.


You might want to watch this short video a couple of times - https://www.youtube.com/watch?v=rowWM-MijXU

A simple way to think about linear algebra is as a set of rules, i.e. functions (matrices), applied to input points (vectors) to give output points in a vector (i.e. multidimensional) space. Thus multi-valued inputs give multi-valued outputs. The key is to keep both the algebraic manipulations and their geometric interpretation simultaneously in mind. The basic example of solving a system of linear equations in two variables will cement this idea when you map it to the XY coordinate system.
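To make that concrete, a minimal SymPy sketch (the system is made up):

    from sympy import Matrix, symbols, linsolve

    x, y = symbols('x y')

    # Two lines in the XY plane: x + 2y = 5 and 3x - y = 1.
    # Algebraically we solve the system; geometrically we intersect the lines.
    print(linsolve([x + 2*y - 5, 3*x - y - 1], x, y))  # {(1, 2)}

    # The same system as "function applied to input point": A*v = b
    A = Matrix([[1, 2],
                [3, -1]])
    b = Matrix([5, 1])
    print(A.solve(b))  # Matrix([[1], [2]]), the point where the lines cross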

You might find the book Practical Linear Algebra: A Geometry Toolbox by Farin and Hansford useful for further study.


Much of the use of linear algebra in programming is for machine learning. To a first approximation, ML is statistics on huge datasets, and linear algebra makes it possible because matrix operations are massively, if not embarrassingly, parallel.


I just posted this in another comment: https://archive.org/details/build-your-own-flight-sim-in-c-d...

It turns out dot products, matrix multiplication, etc. are fairly common in projecting a 3D world onto a 2D display (flight simulators, for example).


I don't have a specific recommendation to make; it probably makes more sense to find a field you're interested in (pretty much any STEM field will do) and learn the more math-heavy version of that. Inevitably linear algebra will come up, giving some motivation for the pure theory.

As for why it comes up so much: it concerns itself with solving systems that look like y = Ax + b, where the Ax term works similarly to multiplication in 1-D. The point is that these simple equations are ones we can actually understand! Everything else is too hard.

But there's a trick we have for everything else: if you have some y = f(x) where f is super complicated, you can differentiate. The derivative of f at a point x_0 is the best linear approximation to f, i.e. f'(x_0) is the best matrix A such that y ~= Ax + b near x_0. Now your problem is linear and you can understand it (locally)! Then you can integrate your local solutions into a global one.
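A quick numerical illustration (the function is chosen arbitrarily):

    import numpy as np

    # f is "super complicated"; near x0 the best linear stand-in has slope f'(x0)
    f = np.sin
    x0 = 1.0
    A = np.cos(x0)  # f'(x0), here a 1x1 "matrix"

    for dx in (0.1, 0.01, 0.001):
        exact = f(x0 + dx)
        linear = f(x0) + A * dx    # y ~= Ax + b near x0
        print(dx, exact - linear)  # the error shrinks like dx**2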

The purpose of dot products is that they let you talk about things like angles, lengths, and projections. The point is you learn how it works for arrows and shadows and stuff, and figure out some equations that hopefully make intuitive sense in 2- and 3-D, and then it turns out those equations work in higher (even infinite) dimensions too.

Projections are useful because they let you break vectors down and build them back up, and hopefully the broken-down version is easier to understand. Understanding projections in high or infinite dimensions gives some intuition for things like the Fourier transform, where you project a function onto simpler waves, maybe study how a system reacts to those waves, and then use that description to build back up how the system reacts to your original function.

Angles give one way to measure closeness. If you have some machine learning model that figures out a way to map text into a 50,000-dimensional space, you might be able to do it in a way where two sentences are intuitively similar if they are mostly pointing in the same direction, i.e. if the angle between them is small.
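A sketch of that last idea, with made-up 3-D stand-ins for those 50,000-dimensional vectors:

    import numpy as np

    def cosine_similarity(u, v):
        # cos(angle) = (u . v) / (|u| |v|); near 1 means "pointing the same way"
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    a = np.array([0.9, 0.1, 0.2])
    b = np.array([0.8, 0.2, 0.1])   # similar direction to a
    c = np.array([-0.7, 0.6, 0.3])  # quite different direction

    print(cosine_similarity(a, b))  # ~0.99
    print(cosine_similarity(a, c))  # negative, i.e. pointing away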

So tl;dr the idea is you learn some geometry with arrows and all that, you figure out some equations from that geometry, and then you realize that those equations and that geometric intuition work anytime you have a linear (i.e. f(ax+b) = af(x) + f(b)) system. Calculus gives you ways to turn non-linear problems into linear ones, so you will find examples of linear systems everywhere.


In addition to what has already been recommended, I would recommend going to a used book site and typing in "applied linear algebra" or "linear algebra applications". You'll see a ton of books, many under $10 shipped. Usually old textbooks about 5-10 years old, but the math doesn't go out of date.


Look into making a Doom-like game, tons of linear algebra there.


Fast.ai had a computational linear algebra class, IIRC.


As a mathematician, if there's one piece of advice I'd give people who need 1-2 math courses, it would be LEARN IT RIGHT. Pick up a rigorous textbook and read through it. Don't be lazy. I've spent decades reading many texts and papers. If you're smart, reading one real and thorough book will not take a lot of time.

And truly understanding the subject will pay you back in spades. Don't go for fluff. Linear algebra isn't some Facebook post. Truly go through the book line by line and really understand it down to the nuts and bolts. Suffer a little.


This is the advice for the ages!

However, I fear it is lost on the current generation, which is always looking for that quick/short article/blog/video/etc. that will teach them and make them understand the subject as painlessly as possible. Everything must be "Fun and Enjoyable" (I don't even know what that means anymore!). They have forgotten the maxim "There is no Royal Road to Mathematics" (or to any other scientific field). The number of people (even on HN!) who espouse disdain for textbooks is astonishing. By definition, the process of learning involves diving into the unknown, and hence there will be discomfort/difficulty/effort/time needed; it is not going to be easy. But the student doesn't want to put in any effort at all; everything is the fault of the teacher (can't teach properly), the teaching method (I am a visual learner), too much rigor/mathematics, or ADHD/ADD/autism/whatever.

The real tragedy is that used books are now so easily/cheaply/universally available that there is no reason not to have one's own library of books on subjects of interest; it is "food" for the mind.


We have a Discord, "HN Learn", where we collaborate to learn things, especially maths. If you want some company to delve into linear algebra, feel free to join.

[0] https://discord.gg/RxSjEMnW


> This is actually not so strange– you can think of many structures as functions. For example, you can think of a number 3 as a function. When you multiply it by things, it makes them three times bigger.

I don't see how 3 can be a function from this example. "3*" (partially applied multiplication by 3) looks more like it.

Matrices and vectors as functions? Yeah, if the argument is within bounds. That makes it just an indexing operation.

(I guess one can view 3 as a one element vector but that sounds like a degenerate case)

Or maybe I'm missing something...?


Check out Peano arithmetic.

Intuitively, natural numbers come from and are defined by counting, and that implies that "3" means inherently that something (could be anything) happened or was repeated three times. For example, if you have three apples, that means that you can identify one particular apple that you have, then do that again, then do that again.

Adding a unit to a number is like adding further information on what it is that is being repeated. Three pairs of apples? You just invented the number six!

The meaning of doing something three times (most abstractly: applying the successor function) is already inherent in the meaning of three, so multiplication isn't something that has to be added on top. It's already in there.
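You can see this in code with a Church-numeral-style sketch (Python, purely illustrative): a number is "apply f that many times".

    # A number *is* repetition: n(f) applies the function f n times
    zero = lambda f: lambda x: x
    succ = lambda n: lambda f: lambda x: f(n(f)(x))
    two = succ(succ(zero))
    three = succ(two)

    print(three(lambda x: 2 * x)(1))  # 8: doubling, done three times

    # Multiplication is composed repetition: three twos make six
    mul = lambda m, n: lambda f: m(n(f))
    print(mul(three, two)(lambda x: x + 1)(0))  # 6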



3 is the following function: 3 == lambda x: 3*x

But I think that the technical, mathematical way to think about it is:

The monoid of linear functions L:R->R is isomorphic to the monoid (R, *)

Meaning, the structure of 1x1 matrices under multiplication is exactly the same as the structure of real numbers under multiplication.


Importantly, matrix multiplication is the same as function composition of the corresponding linear functions, hence the analogy to functions that multiply by a factor.

Seems trivial, but among other things it implies associativity, which is not quite trivial for larger matrices.
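In code, with made-up numbers (a sketch):

    import numpy as np

    # "3" and "5" as linear functions; composing them multiplies the factors
    three = lambda x: 3 * x
    five = lambda x: 5 * x
    compose = lambda f, g: (lambda x: f(g(x)))
    print(compose(three, five)(2))  # 30, i.e. (3*5)*2

    # The same story with 1x1 matrices: composition *is* matrix multiplication
    A = np.array([[3.0]])
    B = np.array([[5.0]])
    print(A @ B @ np.array([2.0]))  # [30.]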


I think the mathematical concept you are looking for is the dual space. Essentially, if you have a vector space V, you can construct a dual space V* whose elements are functions taking elements of V to the underlying field F, and under certain conditions these spaces are isomorphic (the same), so there is a 1:1 correspondence between elements of the vector space and the functions in the dual space.
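A tiny sketch of that correspondence (NumPy, made-up vector):

    import numpy as np

    # The vector w, viewed as a function on vectors: an element of V*
    w = np.array([1.0, 2.0, 3.0])
    w_star = lambda v: np.dot(w, v)  # a linear map from V to R

    print(w_star(np.array([1.0, 0.0, 0.0])))  # 1.0 -- reads off a coordinate
    print(w_star(np.array([0.0, 1.0, 1.0])))  # 5.0 -- and is linear in v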


Hmm, you're right, I didn't quite explain it well. The idea is that you can think of `3` as a function that makes things three times bigger, and think of `*` as a function application operator. So `3*x` is equivalent to `three(x)`. I'll think on this more and change the wording to try to make it clearer.


I take it as analogous to the association of a matrix to a linear transformation. This association is via multiplication.


If you haven't seen it yet, even if you are proficient in LA, 3Blue1Brown's playlist on YouTube is good to watch.

It really helps build intuition beyond the typical teaching methods.

It will really help connect the dots.

https://youtube.com/playlist?list=PL0-GT3co4r2y2YErbmuJw2L5t...


weird to talk about linear algebra and never invoke linearity.

also, why say these 2 things?

>If you forget how matrix-vector multiplication works, just remember that its definition flows out of the notation.

>Another way to think of matrix-vector multiplication is by treating each row of a matrix as its own vector, and computing the dot products of these row vectors with the vector we’re multiplying by. How on earth does that work?! What does vector similarity have to do with linear equations, or with matrix-vector multiplication?

But you just told me to plug into the definition? Which is a dot product.

Pretty incoherent.


IMO it is much clearer to justify something like matrix multiplication via a simple real-life example like Markov chain computations; it might add a bit of complexity in understanding the application, but it motivates the definition much better.
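For instance, a made-up two-state weather chain (NumPy):

    import numpy as np

    # P[i, j] = probability of going from state i today to state j tomorrow
    #            sunny rainy
    P = np.array([[0.9, 0.1],   # sunny today
                  [0.5, 0.5]])  # rainy today

    p = np.array([1.0, 0.0])  # start out definitely sunny

    # One step of the chain is a vector-matrix product; k steps is a matrix
    # power, which is exactly why the product is defined the way it is
    print(p @ P)                              # after one day: [0.9 0.1]
    print(p @ np.linalg.matrix_power(P, 10))  # near the steady state [5/6 1/6]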


i think it's a rushed draft

it hasn't gone through a spell-checker, with words in there like "hight" and "strage"


True.

Moreover, here as in most other writings about linear algebra that I have seen, there is the very bad habit of describing the more complex operations as being composed from dot products.

On modern CPUs, dot products must be avoided because, like all reduction operations, they consist of one chain of dependent operations, so their speed is limited by the latency of the fused multiply-add (FMA) operations, instead of being limited by the much higher throughput of the FMA units.

When vectors are multiplied with vectors, there is no alternative to dot products, so the only way to accelerate them is to reorder the operations into a tree, so that the parts that are independent can be overlapped.

When arrays with more dimensions are multiplied, e.g. matrices with vectors or matrices with matrices, the multiplications correspond to nested for loops: 2 nested loops for matrix-vector multiplication and 3 nested loops for matrix-matrix multiplication.

The nested loops can be reordered arbitrarily. In each case, one of the possible loop orders computes a dot product in the innermost loop.

This is the only order mentioned in the parent article and in most other linear algebra manuals. However, this order is exactly the worst possible computationally.

For 2 or more nested loops, there is always another order where the innermost operation is a so-called AXPY operation (the BLAS function name): the scalar A multiplied by the vector X Plus the vector Y, with the result stored in Y. AXPY operations are always better than dot products because their FMA operations are independent, so they can be pipelined; and for 3 or more nested loops there are even better orders, where the innermost loop needs far fewer load and store operations than AXPY.

For 3 or more nested loops, there is always an order where the innermost operation is a tensor product of 2 vectors. This is a more attractive operation than both AXPY and dot products. If the tensor product of 2 vectors with N elements is stored in registers, then its computation needs N+N loads from memory but N*N FMA operations, so if there are enough registers that N>2, there will be many more FMAs than loads, allowing full utilization of the execution units of a modern CPU.

In conclusion, matrix-vector products must not be described as being composed of N dot products that are executed separately for each element of the result vector, but as being the sum of N AXPY operations, which are accumulated into the result vector.

Similarly, a matrix-matrix product must not be described as being composed of N^2 dot products, one for each element of the result matrix, but as being the sum of N vector-vector tensor products, which are accumulated into the result matrix.

Such descriptions would be much more useful in practice, where the definitions based on dot products are just a hindrance.
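To make the two loop orders concrete, a pure-Python sketch (illustrative only; real code would be vectorized or call BLAS):

    import numpy as np

    def matvec_dot_order(A, x):
        # Row order: each y[i] is one dot product, i.e. a serial reduction chain
        m, n = A.shape
        y = np.zeros(m)
        for i in range(m):
            for j in range(n):
                y[i] += A[i, j] * x[j]
        return y

    def matvec_axpy_order(A, x):
        # Column order: accumulate x[j] * (column j) into y; the FMAs inside
        # each AXPY are independent of each other and can be pipelined
        m, n = A.shape
        y = np.zeros(m)
        for j in range(n):
            y += x[j] * A[:, j]  # one AXPY per column
        return y

    A = np.arange(6.0).reshape(2, 3)
    x = np.array([1.0, 2.0, 3.0])
    print(matvec_dot_order(A, x), matvec_axpy_order(A, x), A @ x)  # all [8. 26.]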


This reads like someone who thinks they absolutely nailed matrices and is using that as a proxy for linear algebra expertise.

I wouldn't know for sure because it all whooshed about 60 feet over my head


"In hight school your math teacher may have started a treatment of linear algebra by making you solve a system of linear equations, at which point you very sensibly zoned out because you knew you’d go on to program computers and never have to solve a system of linear equations again (don’t worry, I won’t be talking much about them here)."

No offense but I stopped reading there. Too many software developers have this weird superiority complex when it comes to math. When they struggle with math, I've seen devs criticize everything from naming conventions to curricula to it being "useless". A lot of them seem unwilling to acknowledge that math is sometimes... simply hard.

If your attitude to math is "I don't need any of this useless stuff, so I'll zone out", then I kindly suggest you at least try to learn linear algebra for mathematicians first.


Or write a flight simulator from scratch.

I followed a book in the 90's to create a flight simulator from scratch. Besides learning Bresenham's line algorithm, I learned a lot of linear algebra.

Probably this book: https://archive.org/details/build-your-own-flight-sim-in-c-d...


Really interesting recommendation! I had never seen this before, I’ll definitely give this a read.

I always recommend the "Ray Tracing in One Weekend" series of books [1] for some light coverage of linear algebra/geometry/graphics.

[1] https://raytracing.github.io/


I spent a ton of time with this book too! It really lays everything out. The feeling of building something like that from absolute scratch was amazing. I can’t get motivated to do that again these days — ready-made, better implementations are too accessible.

I was 13 at the time, so I struggled with the math (this was pre-internet so I couldn’t just search stuff). I’d never seen matrices and wasn’t able to figure them out from that book. I distinctly remember hitting a wall in the chapter on lighting because it mentioned taking the dot product of two vectors, but I didn’t understand why that would give you the cosine of the angle between them because I’d never heard of the dot product before.

Edit: right at the bottom of page 405. I reread that sentence so many times.


As a programmer, I like maths and find it very useful, albeit difficult at times.

However, the naming conventions and syntax overloads/ambiguities really can be a pain as a non-mathematician.

I get it. Everything has warts, and the mathematical syntax is as much for thinking and experimentation as it is for communication, and verbosity hinders understanding. I'm also aware computing has its horrible dark corners. But from the outside, it seems like maths practitioners often oppose any attempts to make it any clearer, even to people in other mathematical subfields.

Papers leave variables undefined because everyone working in that subfield is expected to just know what each variable means in that context, possibly derived from some book that everyone in the field knows about, so nobody feels the need to specify it. Just a stupid but simple example: I learnt maths at an applied engineering school, so the imaginary unit was j. Always j. It was never specified that it could be anything else, nor was it specified anywhere that j was the imaginary unit. Then I start exploring the topic outside textbooks and everyone's using i. Again, no indication that this is the imaginary unit; it's just a given. Obviously in this particular case it's easy to notice that it's been swapped over, but there are plenty of other, much more subtle ambiguities that can make trying to understand anything a real slog.


As someone who is currently on a path to self-study maths, I understand what you're saying and sympathise; this is my predicament at the moment, to a certain extent. And of course it all depends heavily on context: in \frac{\sum_{i=1}^{n} x_i}{n}, i isn't the imaginary unit, it's the index looping over the elements of x so we can compute the average, and everyone just knows that (or is expected to), just like r^2 means the square of some variable r, or the extent to which one variable is explained by another in a regression, etc., whereas R^2 means the two-dimensional space over the real numbers.

That said, I think and hope that I may feel differently in the future as my skills improve. My theory is that if everyone always added in that foundational knowledge to each paper etc., it would make everything really verbose and make getting to the point a real slog for the author and the experienced practitioners. Being able to be concise means you get to the heart of the new stuff quickly, without having to slog through a bunch of "C is the set of complex numbers a+bi where a and b are in R and i^2 = -1" first.


>My theory is that if everyone always added in that foundational knowledge to each paper etc

Except once someone did that, you could literally just cite their paper or book.


Software is frankly worse. Take JavaScript. What does that even mean? ECMAScript, CommonJS, something else? Almost nobody clarifies what they mean precisely when they use an ambiguous, overloaded term like "javascript".

Does the "es" in ESLint mean it only works for ECMAScript? Or does the "ES" mean something entirely different? The homepage doesn't say. Are eslint, ESLint, and Eslint the same thing? Capitalization usually matters in software, after all, but nobody is consistent here, not even eslint.org. Why are ES6 and ES2015 used interchangeably? That's unnecessarily confusing.

All of this is far more confusing than "i" vs "j" for sqrt(-1).


Author here.

The point is to teach it to people who never understood its value but now want to learn, and to do that it helps to relate to their experience.


Nice to see you back, @coffeemug. I haven't touched linear algebra for 10 years, since data science jobs are hard to come by in my country and data jobs are rarely outsourced. Really nice refresher, and well written. Btw, I have been trying to get hold of you on Signal. What platform are you active on these days?


coffeemug@gmail.com


That’s how I was in high school, and immediately regretted it the minute I found interest in a domain where strong math ability was required.


Many devs who "zone out" because they won't "need math" still manage to do just fine?

Software is a big field and there's room for folks with different interests and talents.


The thing about math is that developers live in a well-specified world. The moment they have to deal with mathematical notation, not only is there no way to look it up, it is inherently ambiguous and arbitrary, and nobody tells you that. No math teacher on this planet is going to tell you: "alpha, beta, gamma usually refer to angles; by the way, this is completely made up. In fact, it is the result of a popularity contest. If you see anyone else use the same symbols it is sheer coincidence, as all mathematical notation is made up on the spot. The moment you read another book, they are free to do things as they see fit."

But here is the thing. As a software developer, there IS a "higher being", a designer of the programming language or library, and they try and try and try their best to maintain "rhyme and reason". With math? There is no such thing. Or rather, everyone pisses in the pool of math notation, but nobody wants to admit that, and so you get confused people, people who mistakenly look for the "rhyme and reason" and find none; they find no pattern.

So the first lesson about math notation that you need to learn is that there is no such thing. There is this chicken scratch that makes it easier to write on the blackboard or on paper and that is about it.

Math isn't "hard..." it is abstract.


I don't understand why you take issue with that statement; it reads as a description of an attitude that may be common in the audience. Perhaps you are interpreting it as the author endorsing that view? I imagine they might actually agree with you.


I was nodding in agreement until "naming conventions". Naming conventions and common notation really are bad, to the point that it's sometimes easier to invent your own for personal use.


It's not a criticism of math. It's a criticism of high school math education.

> criticize everything from naming conventions

A programmer criticizing naming conventions, a.k.a. another Tuesday.


I strongly suggest that anyone getting into linear algebra have a project to work on. It makes everything so much easier when you get to play with the stuff.

My hint for something to play with is that basic linear algebra applies very directly to graphics: rotation matrices and so on. If you know how to multiply a matrix with a vector, you basically know what you need to render basic line-art 3D graphics. You may want to look into dot and cross products as well as vector projection, but all of it is fairly basic.
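A minimal sketch of the core of it (NumPy; the camera placement is made up):

    import numpy as np

    def rotation_z(theta):
        # Rotate around the z axis by theta radians
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s, 0.0],
                         [s,  c, 0.0],
                         [0.0, 0.0, 1.0]])

    # A "model" is just a list of 3D points; rotating it is one
    # matrix-vector multiplication per point
    points = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
    R = rotation_z(np.pi / 2)
    rotated = points @ R.T  # apply R to every point at once

    # Crude perspective projection onto the screen plane (camera on the z axis)
    z_cam = 3.0
    screen = rotated[:, :2] / (z_cam - rotated[:, 2:3])
    print(screen)

From there it's mostly drawing lines between the projected points.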


I did this a while ago, but found myself simply copying and tweaking graphics algorithms rather than gaining any intrinsic understanding of what linear algebra is really doing. I guess I just didn't have enough computer graphics background? Yes, I can use lighting equations to define a pixel shader, but I'm basically copying and translating the algorithm from a book.


No I mean do full software rendering, no shaders, no graphics card.


> No I mean do full software rendering, no shaders, no graphics card.

But why?

There doesn't seem to be a lot to learn about applied linear algebra (in the sense discussed here) by implementing a rasterizer.

But there is plenty of LA above and below that (in the scene management, in the shaders).


Eh, it's basically a project consisting entirely of the parts of LA that the article is talking about.


I'm sorry if this comes across as curt, but can't you just decide not to do that? Decide not to look at the examples in the book, but write it from the explanation instead. You could even forgo the book and just sit down with a piece of paper and do the work from first principles. You might not come up with the most efficient algorithm, but you'll have a foothold into understanding the one you can then go and look up.


Sure, that would be ideal. But I would have needed some 101 knowledge that I couldn't really find in the books I had, or maybe I wasn't patient enough in reading them; maybe somewhere there was a key realization of linear algebra that would have made doing everything from first principles possible, or perhaps I could just have read Phong's original paper.


Deep learning! It's all "just" (more or less) high school calculus (partial derivatives, chain rule) and matrix multiplication.


I feel like I saw one once but lost it:

Is there a GitHub repo/tutorial showing how linear algebra is used in a very small model, just to demonstrate how it allows the model to "learn"?

I've got the calc; I just don't understand what the matrix multiplication "does".


Watch Karpathy's recent lectures. They're gold. Start here [1] with micrograd [2]. It doesn't use linear algebra/matrices to start, but the principles are the same. The matrix multiplication is how the weights of the connections between neurons and the input values are combined (to form an activation value that may then lead to that neuron "firing" or not, depending on whether it passes some threshold function). We use matrices to model the connections between neurons: each row holds one neuron's incoming connection weights, and each column corresponds to one input.

[1] https://www.youtube.com/watch?v=VMj-3S1tku0

[2] https://github.com/karpathy/micrograd
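Not a repo, but the core of what the matrix multiply "does" in one layer fits in a few lines (a sketch, all numbers made up):

    import numpy as np

    # One layer: 3 inputs -> 2 neurons. Each row of W holds the weights of
    # one neuron's incoming connections.
    W = np.array([[0.2, -0.5, 0.1],
                  [0.7,  0.3, -0.4]])
    b = np.array([0.1, 0.5])
    x = np.array([1.0, 2.0, 3.0])  # one input example

    z = W @ x + b         # the matrix multiply: all weighted sums at once
    a = np.maximum(z, 0)  # a nonlinearity (ReLU) decides which neurons "fire"
    print(z, a)           # [-0.4  0.6] [0.  0.6]

"Learning" is then nudging the entries of W and b to reduce a loss, using those high school partial derivatives.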


I cannot recommend Andrew Ng's courses on Machine Learning enough. Something like this seems like it would cover everything you're looking for.

https://www.coursera.org/learn/machine-learning

I can't speak to the author of this GitHub repo, but it appears they have completed the course and included all of the solutions here. It might let you jump right to what you're looking for.

https://github.com/greyhatguy007/Machine-Learning-Specializa...


What math prerequisites does it need for someone who never made it to college-level maths?


Based on my experience, as long as you have a good foundation in the basics of algebra, you'll be able to pick up the rest from the course.



