Oh wow! I wasn’t expecting to see this on HN. I’m one of the authors of the paper, and I’m happy to answer any questions. (I’m sorry I didn’t see it earlier, it was posted in the middle of my night.)
First of all, to clear up a possible misunderstanding: this new formula will not help you to compute determinants more quickly. The number of terms still grows a lot faster than polynomial, and if you just want to compute the determinant, there are polynomial-time algorithms for that.
The best theoretical results on the complexity of computing determinants, that I know of, are in [0]. In practice you'd probably use Gaussian elimination, or maybe Bareiss for integers.
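For illustration, here's roughly what "in practice" looks like: a naive Gaussian-elimination determinant with partial pivoting, O(n^3) arithmetic operations. (A quick sketch of mine, not production code and not from the paper.)

```python
def det_gauss(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

print(det_gauss([[1, 2], [3, 4]]))  # close to -2.0
```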
Someone complained that there’s no code in the paper. Since it isn’t really useful for practical computation, I didn’t think code would be useful, but in fact it did start as a Mathematica program when I first discovered it. Adam Goucher (one of my co-authors) wrote a blog post on it [1] within a few days of its discovery, which links to my original Mathematica notebook.
(Edit: ur-whale points out that the link to the Mathematica notebook no longer works. I have now republished it at [2].)
> Someone complained that there’s no code in the paper.
That was me, apparently much to the dislike of a number of HNers :)
Thank you very much for the pointer to the notebook, although it is not easy to access: you seem to need a Wolfram account to get to it.
[EDIT]: "Sorry, you do not have permission to access this item." after I logged into Wolfram ...
> Since it isn’t really useful for practical computation, I didn’t think code would be useful
This is a very common misconception.
There is a large category of people out there (software engineers among others) who find it much easier to read code than abstract math formulas full of weird Greek letters, formulas that typically presuppose what the reader does and doesn't know (things the author considers "trivial" and not worth mentioning in the paper are often a huge obstacle to understanding what goes on).
Code is explicit to the degree that a machine can execute it, and therefore often way easier as a path to understanding an idea.
Publishing code also has the minor benefit of actually exposing things that are claimed to work but often don't - again, because a machine can execute code, whereas a math formula in a PDF does not have that nice property.
Thanks a lot for sharing the code.
It does indeed make it an order of magnitude easier for me to understand it and validate my own theories and assumptions about it.
For context when I was like 6 years old I was writing code in BASIC on a “TRS-80 Model III” to do linear algebra for fun (trying to make an arcade game).
But I never learned algebra until I was forced to at ~20 years old in university.
So my understanding of mathematics was always mechanical (algorithm) and never very abstract.
And it was always extremely hard to read research papers because the Greek letters don’t even mean the same thing in a (statistics / machine learning) paper and in a (physics / calculus) paper. And the authors never properly define all the functions and variables, and assume you are already an expert.
My experience in math unfortunately is that code is just about unpublishable. I'd love to be able to just have relevant code in a paper, but editors will reject it.
They'll also reject you for having too much discussion and explanation. Now that almost all "papers" are PDFs and we don't have to worry about printing costs, I worry that the expected brevity in math is just elitism. Sure, if you're one of the six experts in the sub sub subfield the paper concerns itself with, the brevity is welcome, but for everyone else...
> My experience in math unfortunately is that code is just about unpublishable. I'd love to be able to just have relevant code in a paper, but editors will reject it.
Would they still reject it if all you do is add a reference to a github page?
I'm sorry, I'm having a hard time parsing this paper. It looks like even calculating "partial partitions" is exponential. Have they found a nice polynomial time algorithm to find determinants or is this just an exponential algorithm but with fewer components than what is "traditional"?
FYI, The Bareiss algorithm can calculate the determinant in polynomial time, including bounding the bit complexity to polynomial space [0].
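For reference, here's a minimal sketch of fraction-free Bareiss elimination (my own illustration, not from [0]): every division is exact, so all intermediate values stay integers, and their bit length stays polynomially bounded.

```python
def det_bareiss(matrix):
    """Determinant of an integer matrix via fraction-free Bareiss elimination."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    sign = 1
    prev = 1  # previous pivot; divisor for the exact divisions
    for k in range(n - 1):
        if a[k][k] == 0:
            # zero pivot: find a nonzero one below and swap rows
            for r in range(k + 1, n):
                if a[r][k] != 0:
                    a[k], a[r] = a[r], a[k]
                    sign = -sign
                    break
            else:
                return 0  # whole column is zero => determinant is zero
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # Bareiss condensation step; the division by `prev` is exact
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]
```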
From my read of the paper, this new formula also gives an exponential-time algorithm. It’s not of interest as a practical method for computing determinants. It does however have interesting geometric and combinatorial properties, and they used it to give new bounds for the tensor rank and Waring rank of a determinant respectively. You can think of these ranks loosely as measures of the “complexity” of a determinant (but not in the sense of computational complexity). These ranks are of interest in pure mathematics, but I can’t really say more about them because it’s been too many years since I’ve studied the related math.
The standard formula for determinants of an n×n matrix (equation 1 in the paper) is the sum of n! terms, one term for each permutation of {1..n}.
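For concreteness, that standard (Leibniz) formula translates directly into brute-force code; this is my own sketch, not anything from the paper:

```python
from itertools import permutations
from math import prod

def det_leibniz(m):
    """Determinant as a sum over all n! permutations (equation 1 in the paper)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        # sign of the permutation: parity of the number of inversions
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        sign = -1 if inversions % 2 else 1
        total += sign * prod(m[i][perm[i]] for i in range(n))
    return total
```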
The paper details a method (equation 11) in which the sum is instead over all of what they call "partial partitions of {1..n}", denoted PP(n). I don’t know the size of PP(n), but if their claim is correct then PP(n) has exponentially fewer elements than S(n) (which has n! elements). I don’t see the value of |PP(n)| in the paper, but if it were O(n^42), that would be an example of something superexponentially smaller/faster. (n! is approximately n^n/e^n by Stirling’s formula.)
There’s a bit of a subtlety here, which is that the formula evaluates to zero whenever the partial partition contains a singleton. For example, the partial partitions of {1,2,3} are:

{}
{1}
{2}
{3}
{1} | {2}
{1} | {3}
{2} | {3}
{1} | {2} | {3}
{1, 2}
{1, 3}
{2, 3}
{1, 2} | {3}
{1, 3} | {2}
{1} | {2, 3}
{1, 2, 3}

but most of these contain a part that only has one element. For example {1} | {2, 3} has the singleton part {1}. Only five of them have no singleton part:
{}
{1, 2}
{1, 3}
{2, 3}
{1, 2, 3}
(In fact all five of these consist of just a single part, but that’s just an artifact of using such a small example. If there are more than three elements then we can have things like {1, 2} | {3, 4}.)
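To make this concrete, here's a small Python sketch of mine (the function names are my own, not the paper's) that enumerates the partial partitions of a set and keeps only those with no singleton parts:

```python
from itertools import combinations

def partitions(elems):
    """Generate all set partitions of a list of elements."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for sub in partitions(rest):
        # put `first` into each existing part, or into a new part of its own
        for i in range(len(sub)):
            yield sub[:i] + [[first] + sub[i]] + sub[i + 1:]
        yield [[first]] + sub

def partial_partitions(elems):
    """Partitions of every subset of `elems`."""
    for r in range(len(elems) + 1):
        for subset in combinations(elems, r):
            yield from partitions(list(subset))

pps = list(partial_partitions([1, 2, 3]))
no_singletons = [p for p in pps if all(len(part) > 1 for part in p)]
print(len(pps), len(no_singletons))  # 15 5
```

The 5 survivors are exactly the ones listed above (the empty partial partition, the three pairs, and {1, 2, 3}).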
In general the number of partial partitions of n elements that have no singleton parts is the n’th Bell number[0]. The Bell numbers grow much more slowly than the factorials, but still much faster than any polynomial.
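To see how much more slowly the Bell numbers grow than the factorials, here's a quick comparison using the standard Bell-triangle recurrence (function name is my own):

```python
from math import factorial

def bell_numbers(n):
    """First n+1 Bell numbers, computed via the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n):
        nxt = [row[-1]]  # each row starts with the last entry of the previous row
        for x in row:
            nxt.append(nxt[-1] + x)
        row = nxt
        bells.append(row[0])
    return bells

for k, b in enumerate(bell_numbers(10)):
    print(k, b, factorial(k))
```

Already at n = 10 the ratio n!/B(n) is more than 30, and it keeps growing.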
It's about a formula (another word for algorithm) for computing the determinant, it's not like they shat a new theorem in category theory.
And however higher-mathy that may be, a link to a chunk of python code implementing the formula certainly would not have been detrimental to the paper.
Unless of course all of the other higher math folks would look down upon a despicable math researcher that would dare spoil a perfectly honorable higher math research paper by adding disgustingly usable code to it.
Think of the insult!
People could even get down to running the code just to check that the formula actually works !
Are you familiar with the different stages of research? This is "basic research" -- they found a new determinant formula and analyzed it. Others can pick up their work and continue, for example by creating implementations and comparing their runtimes in practice. I think including a determinant implementation without that additional analysis would actually be detrimental: people might adopt it as the standard implementation, when in practice it will very likely take many implementation tricks to make it competitive.
They're certainly not above writing code. You seem to have missed the appendix, where they wrote some to prove that their bound on the tensor rank of 4x4 determinants over GF2 is exactly 12. https://gitlab.com/apgoucher/det4f2
Also, a formula is not an algorithm. You can oftentimes extract an algorithm from a formula, but the value of a formula in pure math is to be able to write something in a different way. Formulas reveal structure that (hopefully) can be exploited when solving other problems.
> People could even get down to running the code just to check that the formula actually works !
This is an interesting perspective, but you have to take into account that papers are written for some audience in mind -- this one is written for combinatorialists (hence the math.CO designation on the arXiv). It's not too hard of an exercise to plug this formula into Mathematica for this audience. But also, this audience would rather see a proof than computational evidence.
Sure, I'm a mathematician, but what point are you making beyond a potential ad hominem?
I myself include code in my own papers when it's useful. As the author of this paper mentioned (and it's something I'm a bit embarrassed I missed), the asymptotics of this new algorithm do not compete with the usual O(n^3) algorithm, so it's not clear to me that having Python code would benefit anyone. Of course you never know, though I think the underlying concepts in their formula are significantly more difficult to learn than the notation itself, so the code also wouldn't help very much with understanding the formula.
Anyway, given that you're saying I'm in an ivory tower, doesn't that give me some authority to explain why the choice to not include code has nothing to do with it being an "insult" to the community or it "spoiling" the paper with "disgustingly usable code"?
> what point are you making beyond a potential ad hominem?
I guess I'm guilty of the exact same sin most math authors commit: I naively assumed that the readers of my post have the intellectual means to get to the obvious conclusion without crutches.
> so it's not clear to me that having Python code would benefit anyone
Please read my answer (and thanks) to the author where he addresses my complaint and does indeed publish the code (top of thread).
TL;DR: there are many (I'd wager most) people who vastly prefer reading code to reading obtuse abstract math papers written using specialized math jargon.
There is huge value in publishing your findings as code:
. it can be *understood* by people who happen not to live in your ivory tower and might neither be able to parse the language, the symbols, the "what is considered trivial by the author and not worth mentioning in the paper". Code is clear, unambiguous, down to earth and leaves no stone unturned.
. code can be *verified* by a machine, which goes a long way to prove that your entire paper isn't a giant pile of BS.
You jumped to conclusions that the code wasn't worth publishing because it isn't practical/efficient etc... You entirely missed the point. Code explains ideas clearly. Math papers very rarely do.
(Derailing this troll-initiated thread a bit with a serious answer)
> what’s the formula for quicksort
That is expressible! It’s what one would write in a proof assistant or powerful type system such as Coq, Agda, Idris. The formula is the algorithm, the type is the theorem (quicksort has type any-list and maps it to sorted-list, or in code – literally – `quicksort : Ordering(A) -> List(A) -> SortedList(A)`).
>It's about a formula (another word for algorithm)
They are not the same - there are plenty of formulas for which there is no algorithm, and even proofs in computability theory that there can never be one.
Algorithms merely formalize a tiny subset of math as things which are computable via a step by step process, and modern computers grew out of exactly the question of what things are computable. Turns out not much compared to the space of all "things."
A mathematical proof is a better guarantee that something works than a demonstration in code.
The higher math researchers usually pay for it being less accessible, because the first engineering paper that uses their works gets cited a lot more.
But that also leads to cargo-cult engineering, where the first implementation of the paper just gets copied around without understanding (e.g. if there are limitations/assumptions).
This, but unironically. Their formula (11) involves basics (sums and products), the only nontrivial part would be the partial partitions, and for that the first paragraph of chapter 2 explains how to get it from ordinary partitions, which is a standard function.
[0] https://users.cs.duke.edu/~elk27/bibliography/04/KaVi04_2697...
[1] https://cp4space.hatsya.com/2022/07/02/a-combinatorial-proof...
[2] https://www.wolframcloud.com/obj/robin.houston/Published/det...