There are many evaluation strategies for tensor contractions; naively translating into nested for loops and then doing direct random-access arithmetic is just one approach. If you like numpy's einsum, a drop-in replacement is https://optimized-einsum.readthedocs.io/en/stable/. In general, how best to "compile" a tensor contraction expression into something executable is an open question, for several reasons: there are a lot of possible orderings for the contractions, and if one wishes to rely on BLAS routines for efficient primitives one finds that they don't fit the problem of tensor contraction like a glove, and so on.
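A minimal sketch of what the drop-in usage looks like (shapes and the contraction string here are made up for illustration; only contract and contract_path from the opt_einsum API are used):

    import numpy as np
    import opt_einsum as oe

    a = np.random.rand(16, 32)
    b = np.random.rand(32, 48)
    c = np.random.rand(48, 8)

    # Same subscript string as np.einsum; opt_einsum picks a pairwise
    # contraction order and dispatches to BLAS-backed primitives where it can.
    out = oe.contract('ij,jk,kl->il', a, b, c)

    # Inspect the chosen contraction path, estimated cost, and intermediates.
    path, info = oe.contract_path('ij,jk,kl->il', a, b, c)
    print(info)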
For example, the optimization has been passed upstream to np.einsum, and most of the features found in this repository can be enabled with numpy.einsum(..., optimize=True). So numpy should provide the same performance in most cases.
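For completeness, a small illustration of that flag (shapes arbitrary; only the optimize keyword of np.einsum is assumed here):

    import numpy as np

    a, b, c = (np.random.rand(64, 64) for _ in range(3))

    # Default einsum evaluates the full expression directly; optimize=True
    # first searches for a cheaper pairwise contraction order.
    naive = np.einsum('ij,jk,kl->il', a, b, c)
    fast  = np.einsum('ij,jk,kl->il', a, b, c, optimize=True)
    print(np.allclose(naive, fast))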
This is, strictly speaking, not true. Talking about an "orthonormal" basis implies that you have in mind some Hilbert space; but in any such instance there will be interesting functions that are not in this Hilbert space. So consider for example the standard space L^2(R) of square-integrable functions on the real line. This does not contain the function f(x) = 1, as a really dumb example.
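Spelled out, it's just the norm computation:

    \|f\|_{L^2(\mathbb{R})}^2 = \int_{-\infty}^{\infty} |1|^2 \, dx = \infty,
    \quad \text{so } f(x) = 1 \notin L^2(\mathbb{R}).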
You are nitpicking, and completely missing my point. Let me rephrase it.
The original article is about neural networks being able to represent any function (for some definition of any).
I was just pointing out that there exist much cheaper ways of representing any function. Therefore the article seems very unexciting to me.
Btw, you chose L^2(R) yourself and then used it to show that there are interesting functions not in the space you chose; quite a circular argument.
Since the article on neural networks never mentions functions defined on an infinite domain, one can easily take L^2([0,1]) or L^2([0,Lambda]) up to some cutoff Lambda. I would say that all non-pathological functions you can think of are there!
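(With a cutoff, the constant-function example above is no longer a problem; just computing the norm:)

    \int_0^{\Lambda} |1|^2 \, dx = \Lambda < \infty,
    \quad \text{so } f(x) = 1 \in L^2([0, \Lambda]).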
> You are nitpicking, and completely missing my point.
> I was just pointing out that there exist much cheaper ways of representing any function. Therefore the article seems very unexciting to me.
To be fair, your original comment didn't really make a point. What kind of cheaper representations do you have in mind? What makes an orthogonal basis of functions too "expensive" a representation for your taste?
> I would say that all non-pathological functions you can think of are there!
I'd argue that most "real functions" that we care to learn (e.g. mappings between high-dimensional data and labels) are pathological. In this sense, we should really care about the completeness of these spaces, perhaps even more than about the well-behaved functions in them.
All physics assumes you are dealing with non-pathological functions, except for some really particular cases. You can do nearly everything in Electromagnetism and nearly all of Quantum Mechanics with non-pathological functions.
Maybe we have different definitions of pathological; I am using it the way a physicist would (i.e. continuous, with continuous derivatives, and so on).
The physics I've done has only dealt with n-particle systems, where n is a number I can count on one hand. Likewise, most of the examples I'd learn in a pure math class are either (a) relatively well-behaved objects with nice properties or (b) "pathological" objects constructed specifically to serve as counterexamples.
The kind of pathological function that I'm referring to is neither of these. For example, what does the manifold of all 1-second clips of the word "the" look like? If the clip is sampled at 60 Hz, each clip already lives in 60-dimensional space. I'm inclined to think that it's some unimaginably complicated manifold that would likely fall into the category of "edge cases" that previous commenters have mentioned, and which it sounds like you're discounting as nitpicking.
I don't know if this aligns with what you mean by a pathological function, and I'm happy to continue having this discussion with a more concrete example of what you mean. :)
No, neural networks cannot "represent any arbitrary function". Find a theorem that you think states otherwise, and then read what the theorem actually states.
You're correct; the statement of the theorem refers to continuous functions.
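For reference, the statement is roughly the Cybenko/Hornik form (from memory, so check the original papers for the exact hypotheses):

    \text{For any continuous } f : K \to \mathbb{R} \text{ on a compact } K \subset \mathbb{R}^n
    \text{ and any } \varepsilon > 0, \text{ there is a one-hidden-layer network } g
    \text{ (with a suitable non-polynomial activation) such that }
    \sup_{x \in K} |f(x) - g(x)| < \varepsilon.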
In practice, however, the input to neural networks is represented by floating-point values, which form a discrete set. So pick whatever arbitrary function you like: there is some continuous function that is equal to it on every floating-point value, and that continuous function can be approximated arbitrarily closely by a neural network.
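To make that concrete, here is a minimal sketch (my own construction, not from the article): a one-hidden-layer ReLU network whose weights are chosen in closed form so that it passes exactly through every sampled input, which is all you need once the inputs form a discrete set.

    import numpy as np

    def relu_interpolant(xs, ys):
        """f(x) = y_0 + sum_i c_i * relu(x - x_i): a one-hidden-layer ReLU
        network that passes exactly through every (x_i, y_i)."""
        xs, ys = np.asarray(xs, float), np.asarray(ys, float)
        slopes = np.diff(ys) / np.diff(xs)        # slope on each interval
        coeffs = np.diff(slopes, prepend=0.0)     # change of slope at each knot
        def f(x):
            hidden = np.maximum(0.0, np.asarray(x, float)[..., None] - xs[:-1])
            return ys[0] + hidden @ coeffs
        return f

    # "Arbitrary" (even discontinuous) target, sampled on a discrete set of inputs:
    xs = np.linspace(-1.0, 1.0, 201)
    ys = np.sign(xs)
    f = relu_interpolant(xs, ys)
    print(np.max(np.abs(f(xs) - ys)))   # ~0: exact on every sampled input

This is of course interpolation on a grid rather than learning, but it shows why "continuous functions on a compact set" is not a real restriction once the inputs are floats in a bounded range.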