Polynomial Activation Functions (2020) (openreview.net)
1 point by peter_d_sherman 37 days ago | 1 comment



Consider the following:

Consider a computationally implemented Neural Network -- one with only a single "Neuron".

This "Neuron" is functionally a boolean function.

That is, if we pass it, say, a cat picture (or its binary, mathematical, or data equivalent), it will return 1 (True -- it is a "cat picture") or 0 (False -- it is NOT a "cat picture").

(A real-world Neuron could instead pass back a decimal value between 0 and 1 (or between 0 and 100%, or between 0 and MAXINT) indicating the degree to which the input is a cat picture, where 0 means it doesn't resemble a cat picture at all and the highest value, whatever it is, means maximum resemblance.)

But for simplicity we'll say that our "Neuron" is basically a yes/no, true/false, 1-or-0 boolean-returning function...
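
To make that concrete, here's a minimal sketch of such a single boolean neuron in Python (the weights, bias, and features are all illustrative, not from the linked paper):

    import numpy as np

    def neuron(x, w, b):
        # Weighted sum of the inputs plus a bias, thresholded to a boolean:
        # return 1 ("it's a cat picture") if the sum is positive, else 0.
        return 1 if np.dot(w, x) + b > 0 else 0

    # Toy usage with a 3-feature input and made-up weights:
    print(neuron(np.array([0.2, 0.9, 0.4]), np.array([1.0, -0.5, 2.0]), -0.1))

Here the hard threshold "positive -> 1, otherwise -> 0" is itself acting as the activation function, which is exactly where the next point goes.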

With me so far?

Whatever data gets sent to it, this single neuron must have an Activation Function (https://en.wikipedia.org/wiki/Activation_function) -- e.g., Sigmoid, ReLU, etc.
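
For reference, two of those standard activation functions are easy to sketch (illustrative textbook definitions, nothing specific to the linked paper):

    import math

    def sigmoid(z):
        # Squashes any real pre-activation into the open interval (0, 1).
        return 1.0 / (1.0 + math.exp(-z))

    def relu(z):
        # Passes positive pre-activations through; clamps negatives to 0.
        return max(0.0, z)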

But, let's say we thought of our data as a Polynomial...

And let's say we thought of our Activation Function as a test of whether that Polynomial equals 0.

In other words, in Mathematics we'll often see examples where a Polynomial is set equal to 0 -- and then solved.

What I am suggesting is: what if it were OK for the result of a Polynomial to be non-zero -- with "zero vs. non-zero" for a given value of x serving as the flag that tells a boolean Activation Function whether to return true or false?

So if the Polynomial, when evaluated, results in 0, that's one boolean value.

If the Polynomial, when evaluated, results in a non-zero value (positive or negative), that's the other, opposite boolean value.
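
A sketch of that polynomial-as-activation idea, as I read it (note: testing a float for exact zero is fragile, so this assumed implementation uses a small tolerance eps):

    import numpy as np

    def poly_activation(x, coeffs, eps=1e-9):
        # Evaluate the polynomial at x (coeffs ordered highest degree first),
        # then map "x is a root" -> 1 and "x is not a root" -> 0.
        value = np.polyval(coeffs, x)
        return 1 if abs(value) < eps else 0

    # Toy usage with p(x) = x^2 - 1, whose roots are x = +1 and x = -1:
    print(poly_activation(1.0, [1.0, 0.0, -1.0]))  # 1 (a root)
    print(poly_activation(0.5, [1.0, 0.0, -1.0]))  # 0 (not a root)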

See, I intuit that traditional Mathematics, by always forcing Polynomials to equal zero, might not have fully grokked the full range of possibilities Polynomials offer...

If we examine them in the context of computational Neural Networks -- specifically as Activation Functions -- it would sure seem that they could serve as various types of Activation Function, the simplest case being one that returns a boolean value: a True or a False, a 1 or a 0...

In other words, what happens -- or what can happen -- if we examine Polynomials not so much as numbers, but as functions; and not so much as functions, but as Activation Functions...

That is,

If you get 0 from your Polynomial, you get something...

But if you don't get 0 from your Polynomial, you get something else -- and it's OK, even desired, to get that something else, because it is information-rich!
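
In code, keeping that "something else" instead of collapsing it to a boolean might look like this (a sketch; both the sign and the magnitude of the non-zero value are where the extra information would live):

    import numpy as np

    def poly_activation_graded(x, coeffs):
        # Return the raw polynomial value: exactly 0 at a root, and a
        # signed non-zero value everywhere else.
        return np.polyval(coeffs, x)

    # Toy usage with p(x) = x^2 - 1:
    print(poly_activation_graded(0.5, [1.0, 0.0, -1.0]))  # -0.75, not just "False"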

I'm going to go out on a limb here...

I'm going to conjecture here on HN that each mathematical Polynomial -- each Polynomial as a mathematical object -- could not only be used as an Activation Function for a computational Neural Network, but also that:

Polynomials might be their own miniature neural networks!

Think of it like this:

If you have a complex waveform (a sum of many simpler waveforms) as data on the one hand, and a Polynomial on the other, then the Polynomial function might act as an "intelligent filter" -- or perhaps a frequency comb of sorts -- through which some data passes under some conditions and other data does not.

Which might point to the Polynomial being, in effect, its own (albeit miniature) Neural Net of sorts!
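
A sketch of that filtering picture (everything here is illustrative, and note that applying a polynomial pointwise to samples is a nonlinearity, not a true frequency filter, so "frequency comb" is only an analogy):

    import numpy as np

    # A complex waveform: the sum of two simpler sinusoids.
    t = np.linspace(0.0, 1.0, 1000)
    signal = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)

    # A made-up "filter" polynomial applied pointwise to each sample:
    # p(s) = s^3 - s is near 0 for samples near -1, 0, and +1,
    # so those samples are suppressed while others pass through.
    filtered = np.polyval([1.0, 0.0, -1.0, 0.0], signal)

    # Threshold: which portions of the waveform "pass" this polynomial?
    passed = np.abs(filtered) > 0.25
    print(f"{passed.mean():.0%} of samples pass the filter")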

And that would be great if it were true... because if that relationship turned out to be an identity (i.e., provably so), then we could substitute "Polynomial" any time we saw "Neural Net", and be able to think about and understand things better!

So Polynomials Vs. Neural Nets...

Could one be proven to actually be the other in disguise?

Or do the two have incompatible aspects?

Minimally speaking, Polynomials can serve as Activation Functions...

And minimally speaking, the smallest Neural Net -- one consisting of a single Neuron -- consists of one Activation Function...
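
Putting those two minimal claims together in one sketch (all of it illustrative): a one-Neuron "net" whose single Activation Function is the root-test of a Polynomial.

    import numpy as np

    def one_neuron_poly_net(x, w, b, coeffs, eps=1e-9):
        # The smallest possible "net": one weighted sum feeding one
        # activation, where the activation asks "did the polynomial hit 0?"
        z = np.dot(w, x) + b
        return 1 if abs(np.polyval(coeffs, z)) < eps else 0

    # Toy usage: fires only when the pre-activation z lands on a root
    # of p(z) = z^2 - 4, i.e. z = +2 or z = -2.
    print(one_neuron_poly_net(np.array([1.0, 1.0]), np.array([1.0, 1.0]),
                              0.0, [1.0, 0.0, -4.0]))  # z = 2 -> 1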

So I'll leave it to future computer scientists and mathematicians to prove or disprove this conjecture...



