TL;DR: an early effort to establish a 'standard language' for representing and symbolically manipulating (e.g., transforming) every possible neural net that could ever be constructed with graphs and tensors.
In principle it should, because these guys are using matrices to represent network structures, and every finite graph (cyclic or acyclic, directed or undirected) can be represented by a matrix.
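To make that concrete, here's a minimal sketch of the adjacency-matrix idea (the 3-node graph is made up for illustration):

```python
import numpy as np

# A directed 3-node graph with edges 0 -> 1, 1 -> 2, 2 -> 0 (a cycle).
# Entry A[i][j] = 1 means there is an edge from node i to node j.
# Cycles pose no problem: they just put 1s on both sides of the diagonal.
A = np.array([
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
])

# An undirected graph is simply a symmetric matrix (A == A.T);
# symmetrizing a directed graph gives its undirected version.
undirected = A + A.T
assert (undirected == undirected.T).all()
```

Weights generalize this immediately: replace the 0/1 entries with real-valued weights and you have a weight matrix instead of a plain adjacency matrix.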
I used a very similar idea to create an indirect coding for neuroevolution that was successful in solving some tasks. It's based on a generalization of K^2 Trees and shares the same ideas of modularity and hierarchy.
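For anyone curious what a K²-tree-style encoding looks like: the idea is that a genome describes the connectivity matrix recursively by quadrants, so large uniform regions compress to a single symbol. This is a hypothetical toy version I wrote for illustration (the `expand` function and genome format are my own, not the actual encoding from that work):

```python
# Toy K^2-tree-style indirect encoding (illustrative, not the real thing).
# A genome node is either a leaf bit (0 or 1) or a list of 4 children
# in order [NW, NE, SW, SE]; expanding it yields a size x size 0/1 matrix.
def expand(node, size):
    if not isinstance(node, list):
        # Leaf: the whole quadrant is uniformly connected (1) or empty (0).
        return [[node] * size for _ in range(size)]
    half = size // 2
    q = [expand(child, half) for child in node]  # NW, NE, SW, SE
    top = [q[0][i] + q[1][i] for i in range(half)]
    bottom = [q[2][i] + q[3][i] for i in range(half)]
    return top + bottom

# Genome for a 4x4 adjacency matrix: NW quadrant fully connected,
# NE and SW empty, SE given by a nested sub-genome (identity pattern).
genome = [1, 0, 0, [1, 0, 0, 1]]
adj = expand(genome, 4)
```

The modularity and hierarchy come from nesting: a sub-genome can describe a whole module of the network, and reusing it at several positions repeats that module.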
I've heard training with some tools is easier than with others. This might let you experiment in one tool and then easily port your commonly formatted model to a more production-like system, for performance or whatever other reason. Compatibility between these things seems like it would be pretty useful.

I stopped trying to learn about machine learning in general because mathematical notation is used everywhere in ML.
I think it's odd that programming languages and code haven't replaced math here. I agree that you need math for ML, but that doesn't mean you can't show what you do with pseudocode.
Math is fine to prove things or simplify a formula, but when it involves computing, why not use code instead? It's okay to use math in physics, but when you deal with computing and data, I think it's a little misplaced.
Not to mention that the language of mathematics is poorly defined. I mean, I would love to learn math just by learning its syntax.
It seems like "arbitrary" notation is also a problem in other areas of mathematics such as linear algebra, perhaps depending on whether the author's background is in math or CS. In linear algebra every textbook has its own favorite notation for matrices, row/column vectors (a special letter such as c for column vectors, Roman letters, Roman letters with an arrow on top) and scalars (Greek letters, or the letter c).
Yup! And half the papers use W_i to mean the weights for layer i.
And then there are those who say screw mathematics completely and use the Hadamard or Kronecker product symbol to denote convolution, which has caused me no end of trouble in the past.
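The confusion is real: the elementwise (Hadamard) product and convolution are genuinely different operations, so overloading the symbol matters. A quick NumPy check (arrays made up for illustration):

```python
import numpy as np

# Two small 1-D signals to compare the operations on.
a = np.array([1, 2, 3])
b = np.array([0, 1, 0.5])

# Hadamard product: elementwise multiplication, same length as the inputs.
hadamard = a * b            # [0, 2, 1.5]

# Convolution: slide one signal over the other and sum the overlaps;
# the "full" output here has length len(a) + len(b) - 1 = 5.
conv = np.convolve(a, b)    # [0, 1, 2.5, 4, 1.5]
```

Different shapes, different values; a paper that writes one symbol for both leaves the reader to guess which is meant from context.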
Very cool to see this sort of thing.