It is a simple iterative algorithm that goes from one point to the next. It doesn't even keep memory of previous steps (caveat: the authors used BFGS, which approximates the Hessian from previous gradient iterates, but this is still not AI). There is no fitting of weights or anything of the sort.
If every for loop is AI, then we might as well call everything AI. Can you pass me the AI, please?
Not every for loop is AI, but a complex equation with unknown parameters that has to be fit by gradient descent is, the overwhelming majority of the time, an ML problem.
Insofar as AI has come to mean data-driven algorithms, I don't see this as AI. What they describe is a local-global optimization method with BFGS as the local optimizer (this is not AI) and a noised average, weighted by local optimizer performance, as the means to produce new starting points. This is simply a heuristic, similar to particle swarm optimization. As far as I know, these are not called AI, and funding for this type of research doesn't typically come from programs targeting AI.
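For concreteness, here is a minimal, hedged sketch of that kind of multi-start heuristic. Plain gradient descent stands in for BFGS to keep it self-contained, and the test function, weighting scheme, and noise level are arbitrary choices of mine, not the authors' actual setup:

```python
import math
import random

def f(x):
    # arbitrary 1-D multimodal test function (Rastrigin-like)
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))

def grad_f(x, h=1e-6):
    # central finite difference; stands in for an analytic gradient
    return (f(x + h) - f(x - h)) / (2.0 * h)

def local_opt(x, lr=0.002, steps=500):
    # plain gradient descent standing in for BFGS
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

def multistart(n_starts=8, rounds=5, noise=1.0, seed=0):
    rng = random.Random(seed)
    starts = [rng.uniform(-5.0, 5.0) for _ in range(n_starts)]
    best = None
    for _ in range(rounds):
        results = [local_opt(x) for x in starts]
        # weight each local result by performance (lower f => heavier)
        weights = [1.0 / (1.0 + f(x)) for x in results]
        centre = sum(w * x for w, x in zip(weights, results)) / sum(weights)
        candidates = results if best is None else results + [best]
        best = min(candidates, key=f)
        # noised weighted average seeds the next round of starts
        starts = [centre + rng.gauss(0.0, noise) for _ in range(n_starts)]
    return best

best = multistart()
print(round(f(best), 6))
```

Note there is still no training and no learned weights anywhere: each round just restarts a local optimizer from noised, performance-weighted guesses.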
This conflation of everything with AI is precisely why people say things like "gradient descent is most often used in ML" without evidence, even though that is likely wrong. No: rather, 1) ML is currently the most prominent use of mathematical optimization (to the public), and 2) everything else is called AI to the public, so they conflate it with ML even when it isn't.
Take a random employee at an engineering or applied-sciences (non-experimental) lab and ask them if they ever use mathematical optimization; chances are a majority will tell you they do. The vast majority of these people are not using or devising ML algorithms.
This matters because of what is clear from this thread. Some people devise a classic algorithm that requires intimate knowledge of the problem at hand; the press calls it AI; the public thinks it's AI and registers one more case of "AI as the tool to replace all others". The Zeitgeist becomes that everything else can go in the bin, and AI (by the more restrictive definition) receives disproportionate attention and funds. Note that funding AI research would not fund the people in the headline, unless they act like the minority of bandits who rebrand their non-AI work with AI keywords.
I’m talking about the application of gradient descent: i.e., when it is used, it is used on an equation that is too complex for analytic methods.
When the equation is too complex for analytic methods but tractable by gradient descent, that equation is, the overwhelming majority of the time, characterized as AI.
Gradient descent is used to solve optimization problems, and those arise in many, many cases unrelated to ML. Research the history of this field a little, and notice how it predates ML by decades (even 180 years in the case of gradient descent specifically, which goes back to Cauchy in 1847).
A great deal of applied mathematics is about finding a minimum or maximum of some quantity. There are not always constructive methods; sometimes (often) there is no better way than to step through a generic optimization method.
Some quick examples clearly unrelated to ML, and very common in CAD (everywhere from in silico studies to manufacturing) and computer vision:
- projecting a point on a surface
- fitting a parametric surface through a point cloud
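As an illustration of the first bullet, a point can be projected onto a parametric shape by minimizing squared distance with a generic descent method; no ML anywhere. The curve and target point below are illustrative choices of mine, and a 1-parameter curve stands in for a surface to keep it short:

```python
import math

def curve(t):
    # illustrative parametric curve c(t) = (t, t^2)
    return (t, t * t)

def sq_dist(t, p):
    # squared distance from the point p to the curve point c(t)
    x, y = curve(t)
    return (x - p[0]) ** 2 + (y - p[1]) ** 2

def project(p, t0=1.0, lr=0.05, steps=500, h=1e-6):
    # minimise squared distance with plain gradient descent,
    # gradient taken by central finite differences
    t = t0
    for _ in range(steps):
        g = (sq_dist(t + h, p) - sq_dist(t - h, p)) / (2.0 * h)
        t -= lr * g
    return t

t_star = project((0.0, 1.0))
print(round(t_star, 4))  # → 0.7071, i.e. t = 1/sqrt(2)
```

Fitting a surface through a point cloud is the same idea with more parameters: sum the squared distances over all points and descend on the surface's control parameters.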
Another example is non-linear PDEs. Some notable cases are the Navier-Stokes equations, non-linear elasticity, and reaction-diffusion; these are used in many industries. To solve non-linear PDEs, a residual is minimized using, typically, quasi-Newton methods (gradient descent's buff cousin). This is because numerical schemes only exist for linear equations, so you must first recast the problem as a linear one (or a succession of them, as it were).
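A hedged toy version of that linearize-and-iterate idea, with a scalar equation of my choosing (u = cos(u)) standing in for a nonlinear PDE residual:

```python
import math

def residual(u):
    # toy nonlinear residual: r(u) = u - cos(u), zero at u ≈ 0.739
    return u - math.cos(u)

def jacobian(u):
    # derivative of the residual (a 1x1 "Jacobian")
    return 1.0 + math.sin(u)

def newton(u=0.0, tol=1e-12, max_iter=50):
    # each step solves the linearised problem J(u) * du = -r(u);
    # for a real PDE, du would come from a linear solver instead
    for _ in range(max_iter):
        r = residual(u)
        if abs(r) < tol:
            break
        u -= r / jacobian(u)
    return u

root = newton()
print(round(root, 6))  # → 0.739085
```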
By the way, I might add that most PDEs can be equivalently recast as optimization problems.
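A minimal illustration of that equivalence, under assumptions of my own choosing (1-D, Dirichlet boundary conditions, arbitrary discretization): gradient descent on the discrete Dirichlet energy recovers the solution of the 1-D Laplace equation, with no PDE solver in sight:

```python
# Solve u'' = 0 on [0, 1] with u(0) = 0, u(1) = 1 by minimising the
# discrete Dirichlet energy E(u) = sum_i (u[i+1] - u[i])^2 rather than
# discretising the PDE directly. Exact solution: u(x) = x.

N = 10                      # number of intervals (arbitrary choice)
u = [0.0] * (N + 1)
u[N] = 1.0                  # boundary conditions

lr = 0.2                    # descent step (arbitrary, small enough)
for _ in range(2000):
    for i in range(1, N):   # interior nodes only
        # dE/du[i] = 2(u[i] - u[i-1]) - 2(u[i+1] - u[i])
        g = 2.0 * (u[i] - u[i - 1]) - 2.0 * (u[i + 1] - u[i])
        u[i] -= lr * g

print(round(u[5], 4))  # → 0.5, the midpoint of u(x) = x
```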
Yet another is inverse problems: imaging (medical, non-destructive testing...), parameter estimation (subsoil imaging), or even shape optimization. Similarly, optimal control (similar in that it, too, minimizes a quantity under PDE constraints).
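A tiny, hedged example of such a parameter-estimation inverse problem (the model, the synthetic data, and the step size are all illustrative choices of mine): recovering a decay rate from observations by descending on a least-squares misfit:

```python
import math

K_TRUE = 1.3  # "unknown" decay rate used to synthesise observations

ts = [0.1 * i for i in range(20)]
ys = [math.exp(-K_TRUE * t) for t in ts]  # noiseless synthetic data

def loss(k):
    # least-squares misfit between model exp(-k t) and observations
    return sum((math.exp(-k * t) - y) ** 2 for t, y in zip(ts, ys))

def estimate(k=0.5, lr=0.1, steps=5000, h=1e-6):
    # gradient descent on the misfit, gradient by finite differences
    for _ in range(steps):
        g = (loss(k + h) - loss(k - h)) / (2.0 * h)
        k -= lr * g
    return k

k_hat = estimate()
print(round(k_hat, 3))  # → 1.3: the decay rate is recovered
```

Gradient descent, unknown parameter, data: all the ingredients people associate with ML, yet nothing here is ML.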
To summarize: almost every time you seek to solve a non-linear equation of any kind (and there are many completely unrelated to ML), numerical optimization is right around the corner. And whenever you seek "the best", "the least", or "the most" of something: optimization. Clearly, that is all the time.
I think I've provided a broad enough set of fields with ubiquitous applications that it is clear optimization is omnipresent and used considerably more often than ML is. As you can see, there is no implication from optimization to ML or AI, although there is one the other way around (much as a bird is not a chicken, though a chicken is a bird).
Right, but gradient descent is not used for non-linearity. The neural net is linear. Gradient descent is used because of sheer complexity. That's why you know it's AI.