A pet peeve of mine is when writers, particularly journalists, use the word "algorithm" when they really mean "heuristic". I understand the word is slightly nebulous, particularly in the context of real programs.
See, an algorithm, both in comp sci and in cognitive science, always gets the same correct answer, but a heuristic may or may not be correct, and may even produce different results with the same input.
afaik an algorithm is literally a logical sequence of steps to produce some output. Whether the output is correct, or approximately correct, or even wrong, is irrelevant to the usage of "algorithm".
Which ofc is why we can say an algorithm is incorrect; correctness is not part of the definition.
What you are describing might be called a computational method or process. "Algorithm" has been formalized over the years to have a very well defined technical definition:
To be considered an algorithm, a method must be:
1. Finite. It must terminate after a finite number of steps. (The program, i.e. the expression of the method, must also be finite.)
2. Definite. Each step must be precisely defined, requiring no 'interpretation' or ingenuity on the part of the executor of the algorithm.
3. It must take zero or more inputs, and these inputs must be well-defined.
4. It must produce one or more well-defined outputs.
5. It must be an "effective method", meaning (in addition to the above) that it:
5a. must always produce a result;
5b. must always produce a correct result;
5c. must always arrive at a result after a finite number of steps;
5d. must work for all defined categories of inputs.
This is (almost) literally page 1, chapter 1, book 1 of Knuth, and probably one of the first things you'll ever be taught on any respectable computer science course.
Of course, the term is used more vaguely and everyone still knows what we all mean, but the above is the widely-accepted formal definition.
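For a concrete illustration, Euclid's gcd procedure (the opening example in Knuth) satisfies all of the criteria above. A minimal sketch in Python:

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm, ticking Knuth's boxes:
    - finite: b strictly decreases toward 0, so the loop terminates
    - definite: each step is a precisely defined arithmetic operation
    - input/output: two non-negative integers in, their gcd out
    - effective: always produces the correct result in finitely many steps
    """
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(48, 18))  # 6
```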
A point I made myself, and I didn't say otherwise. I also clearly mentioned that was the formal definition, not a working definition.
In mathematics and computer science that is indeed the proper definition.
And HackerNews is definitely a mathsy/computersciencey sort of a place, so it might be nice to use the word correctly here.
Out of interest (or is it flippancy?), where do you use the word algorithm, when not talking about mathematics or computing?
Do you consider a discussion about heuristics vs algorithms (which is what this thread is about), on a site like Hacker News, to be 'general language'...? Or would you consider that this might indeed be a suitable context to comment on their proper formal definitions?
Another way of looking at it: If we have a function f(x), we can have an algorithm that computes f(x). In fact, there are an infinite number of algorithms that compute f(x). But there are also an infinite number of algorithms that compute f'(x), where f'(x) is an approximation of f(x). And such algorithms are called heuristics for f(x).
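A toy sketch of that framing (the function names are mine, invented for illustration): math.sin plays the role of f(x), and a truncated Taylor polynomial plays the approximating f'(x). The truncation is a perfectly good algorithm for the polynomial it computes, and at the same time a heuristic for sine:

```python
import math

def taylor_sin(x: float, terms: int = 5) -> float:
    """An exact algorithm for g(x) = the degree-(2*terms - 1) Taylor
    polynomial of sine -- and thereby a heuristic for f(x) = sin(x)."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))

# Near zero the approximation is excellent; the error grows with |x|.
print(abs(taylor_sin(0.5) - math.sin(0.5)) < 1e-9)  # True
```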
Suppose you have some algorithm for integrating a function exactly (e.g., by restricting the algorithm's domain to simple polynomials and then doing it symbolically).
Now, you can approximate this algorithm's output with the trapezoid rule. However, the trapezoid rule also looks like an algorithm to me. Is the difference that one is exact and the other is not?
Does it switch from being a heuristic to an algorithm if we switch the specification from
integrate(function, a, b)
to something more like
integrate_with_error_tolerance(function, a, b, tol)?
I think of a heuristic more as a procedure that works only in certain cases, than as a numerical approximation. A good example is solving differential equations. There's no algorithm for that AFAIK -- all we have is a mental database of forms of such equations that have been solved. When faced with one to solve, we pattern-match it against our database, and if we have a match, we know the solution. This works okay as long as our database is big enough to contain all the kinds of equations we actually care about for whatever domain we're modeling, but it's always possible to come up with one whose solution isn't known yet.
(Someone will correct me if I have the facts wrong, but even if this isn't true for differential equations, there are certainly areas of mathematics where it is true.)
That is certainly one definition and use case of heuristics.
Another example might be the travelling salesman problem, where we know how to calculate the exact solution for every instance, but it is prohibitively expensive (unless P=NP). There are a number of heuristics that "can find solutions for extremely large problems (millions of cities) within a reasonable time which are with a high probability just 2–3% away from the optimal solution" (Source: https://en.wikipedia.org/wiki/Travelling_salesman_problem)
In that case, you can always come up with a solution, it's just not always the best solution.
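A minimal sketch of one such heuristic, nearest-neighbour (far cruder than the 2–3% methods the Wikipedia article refers to, but it shows the shape: it always returns a valid tour, just not generally the best one):

```python
import math

def nearest_neighbour_tour(cities):
    """Greedy TSP heuristic: from each city, hop to the closest
    unvisited one. O(n^2); always produces a tour, with no
    optimality guarantee."""
    unvisited = list(range(1, len(cities)))
    tour = [0]
    while unvisited:
        last = cities[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, cities[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

print(nearest_neighbour_tour([(0, 0), (5, 5), (1, 0), (1, 1)]))  # [0, 2, 3, 1]
```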
A heuristic is always an algorithm. What's important is what it is a heuristic for and what it is an algorithm for (by which I mean: what function does it approximate and compute, respectively).
If algorithm A is a heuristic for f(x), what we really mean by that is something like "A is an algorithm for g(x), where g(x) ~= f(x) with an epsilon small enough to be useful for all x" or "A is an algorithm for g(x), where g(x) = f(x) for some x (hopefully most of the values of x we care about)".
The heuristics are generally implemented by an algorithm though. I think that is where the confusion arises.
In other words, you've got an algorithmic step that produces some output, followed by a heuristic step that ascribes some meaning to the output.
For example, consider spam detection with Naive Bayes. The parts that tokenize each email, accumulate the counts of each token, and estimate their conditional probability given the class are all algorithmic (even under the formal definition given by @zero_iq, below). Each of these steps is deterministic, finite, etc. The same holds for the part where you calculate the probability of a new email under each class, assuming that each token is independent (it's just multiplication!).
The heuristic part is the assumption that these numbers are meaningful (training data is representative, pre-processing is reasonable, independence assumption isn't bonkers, etc.) and where you choose to set a decision boundary for actually classifying each item.
Unfortunately, people who should know better also get this wrong sometimes. That's why we have "genetic algorithms". I too prefer the stricter definition of algorithm, as a procedure with a formally specifiable postcondition.
The author just quotes an old phrase without knowing the origin or meaning of the verb. It is/was the opposite of wane.
The OED and Shorter Oxford definitions explain the relevant sense of the verb: To start talking moderately, then gradually increase in potency or intensity. Quoting the OED's typically long entry: "Originally a more frequent synonym of GROW v. , which has now superseded it in general colloquial use, exc. with reference to the moon [...] confined to literary use, and have, in varying degrees, a somewhat archaic flavour; [...] The verb is said still to be current in certain dialects"
While I have both of those dictionaries, the Shorter Oxford is the one I use and recommend.
"Waxing poetic" is an idiom meaning "talking longwindedly" or "droning on"; "wax" is used to describe talking, but you'll not find it used that way anywhere else.
Go to the bookmarks manager in your browser and add it the way you'd add a website, with the JS snippet as the URL. Then when you navigate to a WSJ piece, click on the bookmarklet to open it with the Facebook passthrough link.
Since a heuristic may or may not be correct, and may even produce different results with the same input, ML is generally heuristic-based.