A New Formula for the Determinant (arxiv.org)
95 points by scentoni on Jan 19, 2023 | 33 comments



Oh wow! I wasn’t expecting to see this on HN. I’m one of the authors of the paper, and I’m happy to answer any questions. (I’m sorry I didn’t see it earlier, it was posted in the middle of my night.)

First of all, to clear up a possible misunderstanding: this new formula will not help you to compute determinants more quickly. The number of terms still grows a lot faster than polynomial, and if you just want to compute the determinant, there are polynomial-time algorithms for that.

The best theoretical results on the complexity of computing determinants that I know of are in [0]. In practice you'd probably use Gaussian elimination, or maybe Bareiss for integers.
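
For the curious, here is a minimal sketch of the fraction-free Bareiss approach in Python (not from the paper, just an illustration, assuming an integer matrix given as a list of lists):

    def bareiss_det(m):
        """Determinant of an integer matrix via fraction-free Bareiss elimination.
        Intermediate values stay integers; O(n^3) arithmetic operations."""
        a = [row[:] for row in m]  # work on a copy
        n = len(a)
        sign, prev = 1, 1
        for k in range(n - 1):
            if a[k][k] == 0:
                # Find a row with a non-zero entry in column k and swap it in.
                for i in range(k + 1, n):
                    if a[i][k] != 0:
                        a[k], a[i] = a[i], a[k]
                        sign = -sign
                        break
                else:
                    return 0  # the whole column is zero, so the determinant is zero
            for i in range(k + 1, n):
                for j in range(k + 1, n):
                    # The division is exact, by the Bareiss identity.
                    a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
            prev = a[k][k]
        return sign * a[n - 1][n - 1]

    print(bareiss_det([[2, 3, 1], [4, 1, 5], [7, 0, 2]]))  # 78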

Someone complained that there’s no code in the paper. Since it isn’t really useful for practical computation, I didn’t think code would be useful, but in fact it did start as a Mathematica program when I first discovered it. Adam Goucher (one of my co-authors) wrote a blog post on it [1] within a few days of its discovery, which links to my original Mathematica notebook.

(Edit: ur-whale points out that the link to the Mathematica notebook no longer works. I have now republished it at [2].)

[0] https://users.cs.duke.edu/~elk27/bibliography/04/KaVi04_2697...

[1] https://cp4space.hatsya.com/2022/07/02/a-combinatorial-proof...

[2] https://www.wolframcloud.com/obj/robin.houston/Published/det...


> Someone complained that there’s no code in the paper.

That was me, apparently much to the dislike of a number of HNers :)

Thank you very much for the pointer to the notebook, although it is not easy to access: you seem to need a Wolfram account to get to it.

[EDIT]: "Sorry, you do not have permission to access this item." after I logged into Wolfram ...

> Since it isn’t really useful for practical computation, I didn’t think code would be useful

This is a very common misconception.

There is a large category of people out there (software engineers among others) who find it much easier to read code than abstract math formulas full of weird Greek letters, formulas that typically presuppose what the reader does and doesn't know. The things the author considers "trivial" and not worth mentioning in the paper are often a huge obstacle for the reader trying to understand what is going on.

Code is explicit to the degree that a machine can execute it, and is therefore often a much easier path to understanding an idea.

Publishing code also has the minor benefit of actually exposing things that are claimed to work but often don't - again because a machine can execute it, whereas a math formula in a PDF does not have that nice property.


Oh, what a pain. Sorry. It looks as though Wolfram Cloud removes published notebooks after 60 days unless you pay extra.

I’ve just republished it at https://www.wolframcloud.com/obj/robin.houston/Published/det...

I’d be interested to know whether the code _does_ make it easier for you to understand.


Thanks a lot for sharing the code. It does indeed make it an order of magnitude easier for me to understand, and to validate my own theory and assumptions about it.

For context when I was like 6 years old I was writing code in BASIC on a “TRS-80 Model III” to do linear algebra for fun (trying to make an arcade game).

But I never learned algebra until I was forced to, at around 20, in university.

So my understanding of mathematics was always mechanical (algorithmic) and never very abstract. And it was always extremely hard to read research papers, because the Greek letters don't even mean the same thing in a statistics/machine-learning paper as in a physics/calculus paper. And the authors never properly define all the functions and variables, and assume you are already an expert.


My experience in math unfortunately is that code is just about unpublishable. I'd love to be able to just have relevant code in a paper, but editors will reject it.

They'll also reject you for having too much discussion and explanation. Now that almost all "papers" are PDFs and we don't have to worry about printing costs, I worry that the expected brevity in math is just elitism. Sure, if you're one of the six experts in the sub sub subfield the paper concerns itself with, the brevity is welcome, but for everyone else...


> My experience in math unfortunately is that code is just about unpublishable. I'd love to be able to just have relevant code in a paper, but editors will reject it.

Would they still reject it if all you do is add a reference to a github page?


Also keep in mind that there are people out there who will take your code and make it 40,363,757,265% more efficient (perhaps not in this case)

https://youtu.be/c33AZBnRHks


I'm a former academic whose research area was matrix theory. I left for the dark side and do software engineering these days.

I'm just jazzed to see a math paper in my wheelhouse on the front page of HN.


I'm sorry, I'm having a hard time parsing this paper. It looks like even calculating "partial partitions" is exponential. Have they found a nice polynomial time algorithm to find determinants or is this just an exponential algorithm but with fewer components than what is "traditional"?

FYI, the Bareiss algorithm can calculate the determinant in polynomial time, while also keeping the bit size of the intermediate values polynomially bounded [0].

[0] https://en.wikipedia.org/wiki/Bareiss_algorithm


From my read of the paper, this new formula also gives an exponential-time algorithm. It's not of interest as a practical method for computing determinants. It does however have interesting geometric and combinatorial properties, and they use it to give new bounds for the tensor rank and the Waring rank of the determinant. You can think of these ranks loosely as measures of the "complexity" of the determinant (but not in the sense of computational complexity). These ranks are of interest in pure mathematics, but I can't really say more about them because it's been too many years since I studied the related math.


A new explicit formula for the determinant that contains superexponentially fewer terms than the usual Leibniz formula


What does "superexponentially fewer" mean?


»More than exponential«.

The standard formula for the determinant of an n×n matrix (equation 1 in the paper) is a sum of n! terms, one term for each permutation of {1..n}.

The paper details a method (equation 11) in which the sum is instead over what they call »partial partitions of {1..n}«, denoted PP(n). I don't know the size of PP(n), but if their claim is correct then PP(n) has superexponentially fewer elements than S(n) (which has n! elements). I don't see the value of |PP(n)| in the paper, but if it were, say, O(n^42), that would be an example of something superexponentially smaller/faster. (By Stirling's formula, n! is approximately n^n/e^n.)
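
To make that concrete, here is a little Python sketch of that standard permutation-sum (Leibniz) formula, just to show where the n! terms come from (my own illustration, not code from the paper):

    from itertools import permutations

    def leibniz_det(a):
        """Determinant via the Leibniz formula: one term per permutation,
        so n! terms in total. For illustration only; hopelessly slow."""
        n = len(a)
        total = 0
        for perm in permutations(range(n)):
            # sign = (-1)^(number of inversions of the permutation)
            inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                             if perm[i] > perm[j])
            term = (-1) ** inversions
            for i in range(n):
                term *= a[i][perm[i]]
            total += term
        return total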


> I don’t know the size of PP(n)

There’s a bit of a subtlety here, which is that the formula evaluates to zero whenever the partial partition contains a singleton. For example, the partial partitions of {1,2,3} are:

  {}
  {1}
  {2}
  {3}
  {1, 2}
  {1, 3}
  {2, 3}
  {1, 2, 3}
  {1} | {2}
  {1} | {3}
  {2} | {3}
  {1, 2} | {3}
  {1, 3} | {2}
  {1} | {2, 3}
  {1} | {2} | {3}
but most of these contain a part that only has one element. For example {1} | {2, 3} has the singleton part {1}. Only five of them have no singleton part:

  {}
  {1, 2}
  {1, 3}
  {2, 3}
  {1, 2, 3}
(In fact all five of these consist of just a single part, but that’s just an artifact of using such a small example. If there are more than three elements then we can have things like {1, 2} | {3, 4}.)

In general the number of partial partitions of n elements that have no singleton parts is the n’th Bell number[0]. The Bell numbers grow much more slowly than the factorials, but still much faster than any polynomial.

[0] https://en.wikipedia.org/wiki/Bell_number
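
If it helps, here is a small self-contained Python sketch (my own naive enumeration, not optimised) that lists the partial partitions of {1..n} by partitioning every subset, drops those with singleton parts, and checks the count against the Bell numbers:

    from itertools import combinations

    def set_partitions(elements):
        """Yield every partition of `elements` as a tuple of disjoint blocks."""
        elements = list(elements)
        if not elements:
            yield ()
            return
        first, rest = elements[0], elements[1:]
        # Choose the block containing `first`, then partition what's left.
        for r in range(len(rest) + 1):
            for others in combinations(rest, r):
                block = frozenset((first,) + others)
                remaining = [x for x in rest if x not in block]
                for partition in set_partitions(remaining):
                    yield (block,) + partition

    def partial_partitions(elements):
        """Yield every partial partition: a partition of any subset of `elements`."""
        elements = list(elements)
        for r in range(len(elements) + 1):
            for subset in combinations(elements, r):
                yield from set_partitions(subset)

    def bell(n):
        """n-th Bell number, via the Bell triangle."""
        row = [1]
        for _ in range(n):
            new = [row[-1]]
            for x in row:
                new.append(new[-1] + x)
            row = new
        return row[0]

    for n in range(1, 7):
        no_singletons = [pp for pp in partial_partitions(range(1, n + 1))
                         if all(len(block) > 1 for block in pp)]
        assert len(no_singletons) == bell(n)
        print(n, len(no_singletons))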


Minor comment: the improvement over the state-of-the-art is in the second order term of the exponent.

Their formula expresses the determinant with B_n = exp(n ln n - n ln ln n - O(n)) terms.

The old formula had exp(n ln n - O(n)) terms.

It's a nice bound, but won't change much for non-theoretical results.
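
To put rough numbers on it, here is a throwaway Python comparison of the two term counts (Bell numbers computed with the Bell triangle) - just an illustration, not from the paper:

    from math import factorial

    def bell(n):
        """n-th Bell number, via the Bell triangle."""
        row = [1]
        for _ in range(n):
            new = [row[-1]]
            for x in row:
                new.append(new[-1] + x)
            row = new
        return row[0]

    for n in (5, 10, 15, 20):
        old, new = factorial(n), bell(n)
        print(f"n={n}: {old} terms (Leibniz) vs {new} terms, ratio ~{old // new}")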


Very interesting, but ... another paper without code.


The formula is literally in equation 11.


I'm a layman, but this seems to be a very approachable presentation of a paper.

Section titles are approachable and it has figures and graphics with geometric ideas.


And equation 12 shows an expansion of the formula given in eq11.


It's about higher math dude.


> It's about higher math dude.

It's about a formula (another word for algorithm) for computing the determinant; it's not like they shat a new theorem in category theory.

And however higher-mathy that may be, a link to a chunk of Python code implementing the formula certainly would not have been detrimental to the paper.

Unless of course all of the other higher math folks would look down upon a despicable math researcher that would dare spoil a perfectly honorable higher math research paper by adding disgustingly usable code to it.

Think of the insult!

People could even get down to running the code just to check that the formula actually works!

Ivory tower much?


Are you familiar with the different stages of research? This is "basic research" -- they found a new determinant formula and analyzed it. Others can pick up their work and continue, for example by creating implementations and comparing their runtimes in practice. I think including a determinant implementation without that additional analysis would be detrimental, since people might adopt it as the standard implementation, when very likely many implementation tricks will be needed in practice to make it competitive.

They're certainly not above writing code. You seem to have missed the appendix, where they wrote some to prove that their bound on the tensor rank of 4x4 determinants over GF2 is exactly 12. https://gitlab.com/apgoucher/det4f2

Also, a formula is not an algorithm. You can oftentimes extract an algorithm from a formula, but the value of a formula in pure math is to be able to write something in a different way. Formulas reveal structure that (hopefully) can be exploited when solving other problems.

> People could even get down to running the code just to check that the formula actually works!

This is an interesting perspective, but you have to take into account that papers are written with some audience in mind -- this one is written for combinatorialists (hence the math.CO designation on the arXiv). It's not too hard an exercise for this audience to plug the formula into Mathematica. But also, this audience would rather see a proof than computational evidence.


> Are you familiar with

Starting a sentence by "Are you familiar with" is a pretty clear cut indication that you _do_ live in an ivory tower.


Sure, I'm a mathematician, but what point are you making beyond a potential ad hominem?

I myself include code in my own papers when it's useful. As the author of this paper mentioned (and it's something I'm a bit embarrassed I missed), the asymptotics of this new algorithm do not compete with the usual O(n^3) algorithm, so it's not clear to me that having Python code would benefit anyone. Of course you never know, though I think the underlying concepts in their formula are significantly more difficult to learn than the notation itself, so the code also wouldn't help very much with understanding the formula.

Anyway, given that you're saying I'm in an ivory tower, doesn't that give me some authority to explain why the choice to not include code has nothing to do with it being an "insult" to the community or it "spoiling" the paper with "disgustingly usable code"?


> what point are you making beyond a potential ad hominem?

I guess I'm guilty of the exact same sin most math authors commit: I naively assumed that the readers of my post have the intellectual means to get to the obvious conclusion without crutches.

> so it's not clear to me that having Python code would benefit anyone

Please read my answer (and thanks) to the author where he addresses my complaint and does indeed publish the code (top of thread).

TL;DR: there are many (I'd wager most) people who vastly prefer reading code to reading obtuse abstract math papers written using specialized math jargon.

There is huge value in publishing your findings as code:

- it can be *understood* by people who happen not to live in your ivory tower and may be unable to parse the language, the symbols, or the things "considered trivial by the author and not worth mentioning in the paper". Code is clear, unambiguous, down to earth and leaves no stone unturned.

- code can be *verified* by a machine, which goes a long way towards proving that your entire paper isn't a giant pile of BS.

You jumped to the conclusion that the code wasn't worth publishing because it isn't practical/efficient etc... You entirely missed the point. Code explains ideas clearly. Math papers very rarely do.


> a formula (another word for algorithm)

Huh? Some differences:

- A formula says nothing about evaluation order, sometimes not even about evaluation (example: where's the algorithm in e^(iπ) + 1 = 0?)

- the computations that an algorithm makes cannot always be expressed in a closed-form expression (example: what’s the formula for quicksort?)

A formula is “a concise way of expressing information symbolically” (https://en.wikipedia.org/wiki/Formula)


(Derailing this troll-initiated thread a bit with a serious answer)

> what’s the formula for quicksort

That is expressible! It's what one would write in a proof assistant or a powerful type system such as Coq, Agda, or Idris. The formula is the algorithm, and the type is the theorem: quicksort takes an ordering on A and a list of A and produces a sorted list, or in code, literally `quicksort : Ordering(A) -> List(A) -> SortedList(A)`.


>It's about a formula (another word for algorithm)

They are not the same - there are plenty of formulas for which there is no algorithm, and even proofs in computability theory that there can never be one.

Algorithms merely formalize a tiny subset of math: the things that are computable via a step-by-step process, and modern computers grew out of exactly the question of what is computable. It turns out not much is, compared to the space of all "things."

Start here to learn more https://en.wikipedia.org/wiki/Computable_function


A mathematical proof is a better guarantee that something works than a demonstration in code.

Higher-math researchers usually pay for being less accessible: the first engineering paper that uses their work tends to get cited a lot more.

But that also leads to cargo-cult engineering, where the first implementation of the paper just gets copied around without understanding (e.g. if there are limitations/assumptions).


The code is trivial and left as an exercise to the reader. /s


This, but unironically. Their formula (11) involves only basic operations (sums and products); the only nontrivial part is the partial partitions, and for those the first paragraph of Section 2 explains how to get them from ordinary partitions, which is a standard function.


> The code is trivial and left as an exercise to the reader. /s

Yeah, the margin was probably not wide enough.


I'd argue that the results of this paper are not meant to be implemented, hence no code required.



