Suppose you have some algorithm for integrating a function exactly (e.g., by restricting the algorithm's domain to simple polynomials and then doing it symbolically).
Now, you can approximate this algorithm's output with the trapezoid rule. However, the trapezoid rule also looks like an algorithm to me. Is the difference that one is exact and the other is not?
Does it switch from being a heuristic to an algorithm if we switch the specification from
integrate(function, a, b)
to something more like
integrate_with_error_tolerance(function, a, b, tol)?
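For what it's worth, here is a rough sketch (in Python; every helper name except integrate_with_error_tolerance is mine, and the tolerance/step choices are arbitrary) of the three specs being contrasted: an exact symbolic integrator restricted to polynomials, the trapezoid rule as an approximation, and the tolerance-based spec, under which repeated trapezoid refinement arguably becomes an algorithm again, at least for reasonably well-behaved integrands.

    def integrate_poly_exact(coeffs, a, b):
        """Exact integral of sum(c_i * x**i) over [a, b], done symbolically on
        the coefficients -- an algorithm in the strict sense, but only defined
        on the restricted domain of polynomials."""
        antideriv = [c / (i + 1) for i, c in enumerate(coeffs)]
        F = lambda x: sum(c * x ** (i + 1) for i, c in enumerate(antideriv))
        return F(b) - F(a)

    def integrate_trapezoid(f, a, b, n=1000):
        """Trapezoid rule: works for any callable f, but only approximately."""
        h = (b - a) / n
        total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
        return total * h

    def integrate_with_error_tolerance(f, a, b, tol=1e-8):
        """Keep doubling the number of panels until successive estimates agree
        to within tol -- for reasonably nice f, this meets the spec 'integral
        of f to within tol' exactly, so it reads as an algorithm for that spec."""
        n, prev = 2, integrate_trapezoid(f, a, b, 2)
        while True:
            n *= 2
            cur = integrate_trapezoid(f, a, b, n)
            if abs(cur - prev) < tol:
                return cur
            prev = cur

    # Exact on a polynomial, approximate on the same function given as a callable:
    print(integrate_poly_exact([0, 0, 3], 0.0, 2.0))           # 8.0 exactly
    print(integrate_trapezoid(lambda x: 3 * x * x, 0.0, 2.0))  # ~8.000004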
I think of a heuristic more as a procedure that works only in certain cases than as a numerical approximation. A good example is solving differential equations. There's no algorithm for that AFAIK -- all we have is a mental database of forms of such equations that have been solved. When faced with one to solve, we pattern-match it against our database, and if we have a match, we know the solution. This works okay as long as our database is big enough to contain all the kinds of equations we actually care about for whatever domain we're modeling, but it's always possible to come up with one whose solution isn't known yet.
(Someone will correct me if I have the facts wrong, but even if this isn't true for differential equations, there are certainly areas of mathematics where it is true.)
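As a toy illustration of that "database of solved forms" idea (the form names, parameters, and solution templates below are purely illustrative, not a real computer algebra system):

    # Pattern-match an ODE, given as a tagged form plus parameters, against a
    # small table of forms whose general solutions we already know.
    KNOWN_FORMS = {
        # y' = k*y          ->  y = C * exp(k*x)
        "exponential_growth": lambda p: f"y = C * exp({p['k']} * x)",
        # y'' + w^2 * y = 0  ->  y = A*cos(w*x) + B*sin(w*x)
        "simple_harmonic":    lambda p: f"y = A*cos({p['w']}*x) + B*sin({p['w']}*x)",
    }

    def solve_ode(form, params):
        """Return a closed-form solution if the equation matches a known form,
        otherwise give up -- it works only in certain cases, which is exactly
        the 'heuristic as partial procedure' notion above."""
        template = KNOWN_FORMS.get(form)
        return template(params) if template else None

    print(solve_ode("exponential_growth", {"k": 3}))  # y = C * exp(3 * x)
    print(solve_ode("riccati", {}))                   # None: not in our database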
That is certainly one definition and use case of heuristics.
Another example might be the travelling salesman problem, where we know how to calculate the exact solution for every instance, but it is prohibitively expensive (unless P=NP). There are a number of heuristics that "can find solutions for extremely large problems (millions of cities) within a reasonable time which are with a high probability just 2–3% away from the optimal solution" (Source: https://en.wikipedia.org/wiki/Travelling_salesman_problem).
In that case, you can always come up with a solution, it's just not always the best solution.
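For example, here is a minimal sketch of one such heuristic, nearest neighbour (my example -- it is not one of the 2-3%-of-optimal methods the quote refers to, but it always returns some tour in polynomial time):

    import math

    def nearest_neighbour_tour(cities):
        """cities: list of (x, y) points. Greedily hop to the closest unvisited
        city, starting from city 0 -- O(n^2) instead of the exponential cost
        of exact search, at the price of a usually suboptimal tour."""
        unvisited = set(range(1, len(cities)))
        tour = [0]
        while unvisited:
            last = cities[tour[-1]]
            nxt = min(unvisited, key=lambda i: math.dist(last, cities[i]))
            tour.append(nxt)
            unvisited.remove(nxt)
        return tour

    print(nearest_neighbour_tour([(0, 0), (5, 0), (1, 1), (4, 2)]))  # [0, 2, 3, 1]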
A heuristic is always an algorithm. What's important is what it is a heuristic for and what it is an algorithm for (by which I mean: which function it approximates and which function it computes, respectively).
If algorithm A is a heuristic for f(x), what we really mean by that is something like "A is an algorithm for g(x), where g(x) ~= f(x) with an epsilon small enough to be useful for all x" or "A is an algorithm for g(x), where g(x) = f(x) for some x (hopefully most of the values of x we care about)".
Another way of looking at it: If we have a function f(x), we can have an algorithm that computes f(x). In fact, there are an infinite number of algorithms that compute f(x). But there are also an infinite number of algorithms that compute f'(x), where f'(x) is an approximation of f(x). And such algorithms are called heuristics for f(x).
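A concrete instance of that framing (my example, not from the comment above): let f(n) be the number of primes <= n. A sieve is an algorithm for f itself; n / ln(n) is an algorithm for a different function g that merely approximates f, i.e. a heuristic for f in the sense just described.

    import math

    def prime_count_exact(n):
        """f(n): exact count of primes <= n via a sieve of Eratosthenes."""
        is_prime = [True] * (n + 1)
        is_prime[0:2] = [False, False]
        for p in range(2, int(n ** 0.5) + 1):
            if is_prime[p]:
                is_prime[p * p :: p] = [False] * len(is_prime[p * p :: p])
        return sum(is_prime)

    def prime_count_approx(n):
        """g(n) ~= f(n): the prime number theorem's n / ln(n) estimate."""
        return n / math.log(n)

    print(prime_count_exact(10_000))   # 1229
    print(prime_count_approx(10_000))  # ~1085.7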