A surprisingly hard CS problem: sums of square roots (2018) (shlegeris.com)
297 points by EvgeniyZh on Jan 24, 2022 | 178 comments



It reminds me of this problem. When you plot the motion of conservative dynamical systems, say this one

https://en.wikipedia.org/wiki/Standard_map

you get these chaotic areas that look like television static on the map. I was reading a 1987 article from Byte magazine that reminded me of a conversation I had when I was working on my PhD, which is that every plot like this you see is wrong, because these are done with finite-precision math and it doesn't take that many iterations for calculation errors to be amplified up to the first digit.

It's significant because the reason we know those chaotic regions are full of unstable periodic orbits is that they are also full of stable periodic orbits, which breed unstable periodic orbits on the separatrices between them. The stable orbits form a hierarchical lattice that constrains the chaotic motion and should appear as visible structure.

There are hints in the literature that it ought to be possible to use variable precision interval math to do parameter scans, make better images, and more accurately understand the chaotic motion. On top of that we know a lot about how the stable periodic orbits relate to each other which would help in making that kind of image.

I haven't seen any evidence that it's been done and I know one reason it is hard is that the scaling of the algorithm would be worse than the usual way of doing things for the same reason the above problem is hard.


Yes. Every plot of a chaotic system is grossly wrong.

But the cool thing is that it is imperceptibly different from another plot that is correct.

This is a consequence of the shadowing theorem, which says that while any finitely computed sequence has unavoidable errors, there is a true orbit arbitrarily close to the numbers you do compute. (Don't hold me to rigorous statements here, it has been many years, but the gist is right.)


My understanding is the shadowing lemma is controversial. For instance it applies to the chaotic orbits but not the stable orbits that are embedded in them.


Wouldn't the KAM theorem apply to the stable orbits?

But for the chaotic orbits there was some good work a few years back by Boekholt & Portegies Zwart. They built an arbitrary-precision N-body integrator and found that the results of ordinary N-body simulations are statistically identical to arbitrary precision integrations: https://arxiv.org/abs/1411.6671


Since your finite-precision initial conditions are probably not on any stable orbits, I'm not sure how much that interferes with the truth of Ted's explanation.


> But how long will we need to look through these sequences of digits before we find the disagreeing digit? It feels intuitively like we should be able to establish some kind of bound on this. Like, maybe we should be able to say “if you add two lists of n numbers, each of which has d digits, then they can’t disagree for more than k * n * d digits” for some k. But no-one’s been able to prove anything like this.

You can write down a completely explicit bound here using the Mahler-Mignotte root separation bound. More generally, for any algebraic expression involving algebraic numbers, you can bound a priori the number of digits you need to check to determine its sign.

When you involve transcendental numbers, things do get much harder though.


So, to state explicitly, given a list of positive integers, a_i, and coefficients, d_i \in {-1,1}, test whether \sum_i d_i sqrt(a_i) <=? 0. Now, construct a polynomial, P(z) = \prod_i (z^2 - a_i), and this gives a (univariate) polynomial so that a Mahler-Mignotte like bound can be used.

I guess there's different levels of bounds you can use (Mahler, Mahler-Mignotte, Davenport-Mahler-Mignotte [0]) but they all involve the discriminant, the degree raised to the degree power (n^n), and maybe some other factors, which put it neatly in a polynomial-size bit representation. One bound puts it in the 2^{-2s^2} range, for bit size s [1].

Why does this not solve it? The problem as stated on cstheory.stackexchange explicitly says the square roots are square roots of integers [2]. What am I missing?

[0] https://arxiv.org/pdf/2005.07843.pdf

[1] http://160592857366.free.fr/joe/ebooks/ShareData/Fundamental... (pg 165, Lecture VI, Section 7, Root separation (pdf pg. 197))

[2] https://cstheory.stackexchange.com/questions/79/problems-bet...

EDIT: I forgot to include the sqrt in the sum equation
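
For concreteness, here's a minimal sympy sketch of the construction above (tiny hypothetical inputs; as the follow-up below explains, this construction alone doesn't settle the problem):

  from sympy import Mul, symbols, degree, discriminant

  z = symbols('z')
  a = [2, 3, 5]                          # hypothetical small inputs a_i
  P = Mul(*[z**2 - ai for ai in a])      # the roots of P are exactly +-sqrt(a_i)
  # Mahler/Mignotte-style root separation bounds are functions of the
  # degree and discriminant of P
  print(degree(P, z), discriminant(P, z))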


I'm wrong. The Mahler-Mignotte only works for pairs of roots and doesn't say anything about the absolute value of the sum, at least in the way I was thinking about it. There may be a way to "fix it up" but not that I see and I suspect folks who've studied this in earnest are aware of Mahler-Mignotte and understand why it can't be used to solve this problem.

Thanks to @devit [0] who has understood why the Mahler-Mignotte tactic doesn't work. Just because you can bound the bit complexity of pairs of roots doesn't mean you can bound the complexity of all the 2^N possible {-1,1} combinations of them. At least, I don't see how it can be done simply.

[0] https://news.ycombinator.com/item?id=30059545


A request: please always link to abstract pages of articles, not directly to PDFs.

https://arxiv.org/abs/2005.07843


You can also go straight to the abstract from PDF links using the "Redirectify" browser add-on for Firefox and Chrome.

https://github.com/imurray/redirectify


May I ask, for what reason, please?


Personally, I prefer not to get surprise PDFs; but that's just personal.

A better reason is that linking to the abstract page lets you navigate easily around the arXiv from there, including to the PDF if you desire; but there is no 1-click way to get from the PDF back to the abstract. (Of course, it's an easy matter of address munging, but even easier is not to have to do the munging.)

A perhaps less satisfying reason is the same reason that one doesn't deeplink directly to an XKCD image, but rather to the XKCD page for the relevant cartoon: a courteous acknowledgement of the source.


These are fine but pretty idiosyncratic. HN doesn't have citation rules so the 'always' seems overstated. People linking papers are already going the extra mile for the benefit of others and we don't really need to berate them about how they're holding their generosity wrong.


I didn't mean to come across as berating, but rather as suggesting a better way to link. I hoped that 'request' and 'please' would set the proper tone, but am certainly open to better ways of wording it. I meant 'always' to indicate that I specifically wasn't just complaining pointlessly about the present case, but rather talking about future links; but I can see how it came across like the scolding 'always' as in a phrase "you always do this."


The PDF contains the abstract right at the top, along with full attribution, and often uses less bandwidth. In most browsers, selecting the title and first author then right-clicking "search" allows a user to find related material on the open web.

My personal preference is the PDF link.

Your request was perfectly polite, I was simply wondering what your rationale was for it.


Multiple people have said they prefer not getting surprise PDFs. Could you elaborate on why? It's never something I've ever thought about. PDFs open in the browser now for most people, so it seems like it shouldn't make much difference?


I was going to add that some people on slow connections might not want to download large PDFs.

However, in today's web, the html page with the abstract (one paragraph) was a 2.5 MB download, while the 15 page paper including a figure was just 800 kB.


That's honestly an incredible indictment of today's web, isn't it?


In my particular case PDFs linked from hacker news (accessed through the nextcloud feed reader) end up in my phone's downloads, which in that way fills up with PDFs I only wanted to read once.


There may be more of a security risk with PDFs, hard to say though. This was likely more of an issue back when using an external program to view them was a requirement.


So people can determine for themselves whether they want to download the PDF, by reading the abstract first. This has always been a point of petty annoyance for me as well.


Does that actually work?

It seems that the degree of the minimal polynomial having the sum of N square roots as a root might be up to 2^N, and if you then apply the bound at https://en.wikipedia.org/wiki/Geometrical_properties_of_poly... (where n = 2^N) you get a bound on the order of at least 2^N digits (more precisely 2^N (N + D)).

So it doesn't seem to lead to a proof of a subexponential number of equal digits, unless the degree of the minimal polynomial is actually subexponential.


Why do you need the bounds for every combination of the N square roots? Isn't it enough to get the minimum distance between the two nearest elements in that list?

If so, why not consider the degree-2N polynomial P(z) = \prod (z^2 - a_i)? This polynomial has degree only 2N and gives you the bound you actually care about: the number of bits needed to sum two numbers in the list. Since you're summing 2N of them instead of just one, you might need on the order of lg(N) more bits in your representation (so 2N + lg(N) bits, say), but this is still well within "polynomial" bits.


Not clear how a lower bound on the absolute value of the difference of any two of the square roots would help give a lower bound on the absolute value of the difference of the two sums of square roots.


Sorry to be obtuse, but I don't understand your hesitation.

If you have a lower bound on the absolute value of the smallest difference of any/all pairs of roots, the lower bound on the sum of N of them is at most adding lg(N) bits.

EDIT:

I'm wrong, you're right. You've hit it on the head. My apologies.

Just because there's bounds on pairwise roots, doesn't mean they then can be bounded when they're all summed together.

In other words, say you have d_0 = |a_0 - a_1| and d_1 = |a_2 - a_3|; you might get into a situation where |d_0 - d_1| requires some exponential number of bits to represent.


I don't see the problem. Once you have enough bits to resolve every pairwise difference you're just reduced to the problem of comparing the sums of two lists of n bit integers and I'm pretty sure that's in P.

If it's not then that should be the main point here, sqrt has nothing to do with it.

EDIT: You also know what the largest possible error is on each pairwise delta so if you add log(N + 1) bits you handle even the worst case where one sum is +N x maxerror and the other -N x maxerror.


Yes, the worst-case complexity is exponential in N, but the wording in the article could lead you to believe that no explicit exponential bound is known, which is false.


Do you have a reference for that claim?


Correcting myself, the bound is worse than exponential (so read "at least exponential in N"), but the point I wanted to make is that it is explicit.

Again, this follows from the general theory of algebraic numbers: the degree and height of a sum, product or root of algebraic numbers can be bounded explicitly (resultants + Mignotte bound for factors), and finally root separation bounds can be applied to the resulting polynomial.


The author says that this problem is in PSPACE. That's not obvious to me because I don't know how you sum arbitrarily long binary numbers in polynomial space.

However if you and he are both right that would suffice to prove P != PSPACE, so this problem is potentially very important. Unfortunately I don't even know what this kind of problem is called, which makes googling a bit difficult.


This is false. PSPACE is in EXPTIME.


Exactly: algebraic numbers, despite not having periodic expansions, are in general "reasonably far" from each other, and especially from rationals.

I guess the problem can be solved using what you say, certainly.

It is only transcendentals that can be "too near" each other, and near rationals (this is Liouville's result, which was improved later on; a specific case is the one you mention).


Rational numbers are algebraic so how are algebraic numbers reasonably far from each other? Algebraic numbers are dense in the real number line.


It is a specific statement by Liouville: if you can approximate a number "very well" using rational numbers, then it must be transcendental.

https://mathworld.wolfram.com/LiouvillesApproximationTheorem...

My statement above may be a bit confusing, though.


They are using a different notion of “measure” than the standard notion of absolute value of the difference. Under the standard measure every number is within epsilon distance of a rational for any positive epsilon. Thank you for the clarification.


Yes, of course. Sorry. It is an asymptotic result, so the meaning of "distance" is very blurry in my statement.

I was replying to the previous comment which seemed to imply that knowledge.


I’ve never seen this before so thanks for the links and clarification. I learned something new.


I remember doing side-by-side plots of conservative Hamiltonian trajectories using a standard Euler method (maybe even RK45), vs a symplectic method (which maintains energy conservation). The RK45 implementation had a very nice symmetric pattern, but one which was completely different from the one in the (correct) symplectic implementation. This was a useful eye-opener for me to not just blindly rely on Matlab's ODE45 or other default solvers...


I would imagine (but note that I'm totally ignorant here) that this bound depends pretty poorly on the degree of the polynomial defining the expression (and pretty reasonably on the coefficients). Then when you sum two algebraic numbers, the degree of the polynomial defining the sum gets way worse (in general as bad as the product of the degrees). I would imagine this is the issue.


The part of this post that I find most wonderful/strange:

> EDIT: I think that Edward Kmett and Paul Crowley might have figured out how to solve this problem in the comments on my Facebook post; see here. I’ll investigate further and update.

> EDIT 2: actually we didn’t solve the problem, but it might still be a good direction for future research.

Possibly ground-breaking math being done in the comments on a Facebook post.



> “This proof shows that you don’t need to be a professional mathematician to understand mathematics and advance the frontier of knowledge,” Pantone says. “That’s the beautiful thing about math, is that anyone can understand the questions.”

Or that math proof posted on 4chan?


It seems less crazy when you think about the fact that actual ground breaking math has been solved on scrap paper and rudimentary writing utensils. Sometimes all it takes is the right person being prompted with the question.


Just wait until you hear about the paper [1] inspired by a Twitter discussion that ended up winning the "Best theme paper" award in the ACL Conference 2020.

[1] Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data: https://aclanthology.org/2020.acl-main.463.pdf


> My guess is that [...] we’re just stuck on proving it because [...] most of the people who would be good at working on this are doing something more useful instead.

Evidently not


Most of the people who would be good at working on this are working on getting people to click ads instead.


I don't know if Dunning-Kruger effects on Facebook posts are anything new/wonderful/strange.

This thread reads like somebody who just heard about the Collatz conjecture and after half an hour is sure they have a solution.


What a strange coincidence: 17 hours ago Edward Kmett tweeted about Dunning-Kruger https://twitter.com/kmett/status/1485464550786883588?s=20


An example of a tricky case: which is bigger, sqrt(1000000) + sqrt(1000018) + sqrt(1000036) + sqrt(1000059) + sqrt(1000083), or sqrt(1000003) + sqrt(1000011) + sqrt(1000048) + sqrt(1000050) + sqrt(1000084)? (They agree to more than 20 decimal digits of precision!)


The second is bigger, but only by about 9 * 10^(-22).

Very good illustration indeed.
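
For anyone who wants to check at home, a quick sketch with mpmath (assuming it's installed) at 40 digits of working precision:

  from mpmath import mp, sqrt

  mp.dps = 40
  a = sum(sqrt(x) for x in [1000000, 1000018, 1000036, 1000059, 1000083])
  b = sum(sqrt(x) for x in [1000003, 1000011, 1000048, 1000050, 1000084])
  print(b - a)   # about 9e-22, so the sums agree to more than 20 decimal places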


I wonder if you can use logs to do this faster:

1. Simplify the sqrt by taking the log of each number, i.e. log(sqrt(x)) = 1/2 * log(x).

2. Since the sum of logs is the log of the product, i.e. log(a) + log(b) = log(ab),

you can simplify the whole expression by multiplying all the numbers, taking the log of the product once (which I presume is much faster), then multiplying by 1/2.

Since logs are strictly increasing, the resulting number is still going to be bigger if it originally was going to be bigger, and now you don't need to have performed all those sqrts.

Now you've reduced the problem to computing one log to arbitrary precision... not sure how one does that, actually...


I think the + and × are backwards from the way that would be helpful. But if you think it would work, try writing it out in detail.


Let me try, for [a, b, c] and [d, e, f]:

1. Take the log of all terms (allowed, because log keeps the sum monotonic):

log(sqrt(a)) + log(sqrt(b)) + log(sqrt(c))

2. pull out the sqrt from the log:

1/2 * log(a) + 1/2 * log(b) +1/2 * log(c)

3. factor out the 1/2:

1/2 * (log(a) + log(b) + log(c))

4. sum of logs can be rewritten as a log of product:

1/2 * log (a * b * c)

5. compute log of (a * b * c), and halve it. Ditto with log of (d * e * f). This should give a number which is proportional to the original sum of sqrt.


> Take the log of all terms (allowed, because log keeps the sum monotonic)

It seems you're employing a + b < c <=> log(a) + log(b) < log(c), which doesn't hold (consider 10, 10, and 20).

(The real rule is a * b < c <=> log(a) + log(b) < log(c), because log(a) + log(b) = log(a * b).)


Ahh, that is where I tripped up! Good to see!


This doesn't work. Even if you assume sqrt(a) + sqrt(b) > sqrt(c) + sqrt(d), that doesn't necessarily mean that log(sqrt(a)) + log(sqrt(b)) > log(sqrt(c)) + log(sqrt(d)). For example, let a = 1, b = 100, c = 25, d = 25. Then

  > sqrt(1) + sqrt(100)
  11.0
  > sqrt(25) + sqrt(25)
  10.0
  > log(sqrt(1)) + log(sqrt(100))
  2.302585092994046
  > log(sqrt(25)) + log(sqrt(25))
  3.2188758248682006


It's hard to solve "generally", sure, but "practically", is there any application where extreme correctness of the algorithm would actually matter? Seems like if two huge lists sum to almost the exact same thing for tons of decimal places, then you can effectively treat them as equal. Sort of like how you only need like 5 or 6 digits of pi to get into orbit around the moon, etc.


If question asked for just returning the sum, then yea, that would be acceptable. However, the question requires the comparison of sums. To decide which one is actually larger, 5-6 digits is not enough, even "practically".


It's not really different. If you were given the sums in the first operation to 5-6 digits and that was an acceptable error margin, then the comparison is within that acceptable error margin too; it's just that this time the error leads to a wrong bit rather than a wrong decimal digit.


(Author here) As far as I know, there are no cases where this actually matters.


In your post you say that a PSPACE algorithm exists. Do you have a reference for that algorithm?

Another commenter here is saying the problem has an explicit exponential or worse lower bound.

If both of those claims are true that would prove P != PSPACE, which would be a very important result.


I think that's basically fair. If we were dealing with the reals I'd assume that there was some uncountable number of pathological examples that just happen to exclude all the numbers we really care about (since God ran out of good numbers and had to scrape the bottom of the barrel to fill in the gaps), but the integers seem safe.


I have a very efficient and "practical" algorithm for determining the score in a sports match. In almost all cases it tells us which team had the higher score -- the only time it fails is when the game is close. In those cases, it can't accurately tell who won.

It SOUNDS like a very practical algorithm, but in reality it's precisely in the "close" cases where people care the most.


We are talking about math here; there are always applications, even ones we don't know about in this day and age.

I am positive the same question could have been asked about basic calculus concepts in the 12th century.


Yeah, I get that, but I'm just saying that if this were a problem that needed solving to accomplish some visual effect in game programming (as a contrived example), then practically speaking it's not a hard problem. It's only a hard problem in a theoretical, general sense.


Here is my attempt at proving that this can be done in P-time. I am using the facts that square roots can be expressed as periodic continued fractions, that there is an upper bound on the period of these fractions, and that this representation is unique. Please tell me if there are any issues! I hope this holds :)

The period of the continued fraction of sqrt(n) is less than k1 * sqrt(n) * log(log(log(n))) = Lm(n). We can find this period in polynomial time (n^3), and each term is smaller than 2*sqrt(n).

Therefore, all square roots up to n are uniquely represented as part of a continued fraction expansion as a0 + 1/(a1 + 1/ (a2 + ...)) ... up to a[Lm(n)].

If we transform this nested fraction into a decimal number, any modification of this fraction will necessarily change the number by at least a[Lm(n)]^(-Lm(n)), which is at least (2*sqrt(n))^(-Lm(n)).

Therefore, accurately computing the square root past the most significant digit of (2*sqrt(n))^(-2*sqrt(n)) will yield a result distinct from any square root up to n.

In the case of a sum of square roots, assuming that n is the largest root, accurately computing the sum past the most significant digit of (2*sqrt(n))^(-Lm(n)) will be sufficient to decide if one sum of square roots is larger than or smaller than another.

(2*sqrt(n))^(-Lm(n)) is reciprocally subexponential, thus the number of digits needing to be computed will grow sub-linearly with n.

Therefore, the problem can be solved in polynomial time.
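
If anyone wants to poke at the period claims, here is a minimal sketch of the standard algorithm for the periodic continued fraction of sqrt(n) (Python 3.8+ for math.isqrt; not part of the argument above, just a way to measure period lengths):

  from math import isqrt

  def sqrt_cf_period(n):
      # continued fraction of sqrt(n) = [a0; a1, ..., ap, a1, ..., ap, ...]
      a0 = isqrt(n)
      if a0 * a0 == n:
          return a0, []            # perfect square: no periodic part
      m, d, a = 0, 1, a0
      period = []
      while a != 2 * a0:           # the period always ends with the term 2*a0
          m = d * a - m
          d = (n - m * m) // d
          a = (a0 + m) // d
          period.append(a)
      return a0, period

  # try a few values of n to see how fast the period length grows
  print(len(sqrt_cf_period(1000003)[1]))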


Can we get a validity test and benchmark comparing this solution to the generally accepted one?



I wonder if comparing powers of sums would help?

Suppose one of the sequences was 3, 7, 15, 30, and consider s = √3 + √7 + √15 + √30.

Then s^2 = 55 + 30 √2 + 6 √5 + 6 √10 + 2 √21 + 2 √105 + 2 √210.

s^3 = 159 √3 + 90 √6 + 151 √7 + 90 √14 + 135 √15 + 105 √30 + 18 √35 + 18 √70.

...

s^30 = 1679613741139712617544067791402275 + 1187666266098277047375460186425450 √2 + 750901847525954802822187362608010 √5 + 530967788407084153582901906991810 √10 + 366402251460399628674629181584238 √21 + 259085516641872377051783075481000 √42 + 163913368573924563188162165414670 √105 + 115904254449237925942667189567670 √210

and so on.

For any given finite sequence of positive integers there is a finite set of square roots of positive integers such that all positive integer powers of the sum of the square roots of the sequence members can be expressed as a linear combination of that finite square root set with non-negative integer coefficients.

With s^n if we approximate each of the square roots by first the nearest integer not larger than the square root, and second by the nearest integer not smaller than the square root, we get a lower bound and upper bound for s^n.

Suppose we do that for the sums of the square roots of the two sequences we want to compare. For each power n, that gives us two integer ranges we can compare. If they do not overlap, we can tell which square root sum is lower.

If they do overlap, we can try a higher power n, so we can try a better square root approximation than nearest integer to get narrow ranges for the n'th powers.

What I'm hoping for is that by going to higher powers we can avoid having to do high precision square root calculations, so it can be done with just large integer arithmetic and regular floating point.
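
If anyone wants to experiment with this, here's a minimal sympy sketch (same example sequence as above) for expanding powers of s symbolically:

  from sympy import sqrt, expand

  s = sqrt(3) + sqrt(7) + sqrt(15) + sqrt(30)
  # matches the s^2 expansion above (term order may differ):
  # 55 + 30*sqrt(2) + 6*sqrt(5) + 6*sqrt(10) + 2*sqrt(21) + 2*sqrt(105) + 2*sqrt(210)
  print(expand(s**2))
  print(expand(s**30))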


If you can immediately come up with an algorithm for a well-studied computer science problem, then it's very likely the approach isn't going to work out.

Notably in your case you've just converted high-precision floating point into even-higher-precision integer math. So the inherent problem here, which is that you might have to look at a very large number of bits of precision to find the differences, hasn't been sidestepped.


I guess you're right in your first sentence, but it often helps to study some (semi-)obvious algorithms and analyze why that approach won't work.


Doesn't help, because if you start with the sum of N square roots you end up (in the general case) with 2^N square roots once you raise it to a power.


I was curious how this would play out and didn't see your comment, so I made a Jupyter notebook of raising various sums of square roots to various powers. Posting in case anyone else finds it interesting: https://gist.github.com/chrisshroba/8f12757ecbcdd394ceccb3e9...


This looks more like a problem with ( my understanding of ) complexity theory in general.

Categorizing problems by "The size of the input" is too imprecise when irrational numbers are a part of the problem.


Irrational numbers are involved in the problem, but not in the size of the input, which is a list of natural numbers. There are a number of reasonable ways to encode such a list, but they'll all scale in more or less the same way and be equivalent when looking at big O notation.


I would say that imprecision is somewhat artificially induced by the fact we so thoroughly use IEEE floats that we tend to assume that they are the only numbers.

When invoking arbitrary-precision floats (or equivalent), pretty much all numerical calculations get a lot harder than we intuit instantly. So much as comparing one number to another without taking a square root goes from O(1) to O(n), for n the length of the representation of the shorter of the two numbers. Our IEEE-based intuition sees that and goes wha?

But you can also build a separate intuition based on the arbitrary representation that would make this easy to both understand and intuit. It's really easy to get into PSPACE with arbitrary precision. Heck, it's pretty easy to get into EXPSPACE and almost any other prefix to "SPACE" you want, because it's easy to stipulate very close numbers and transforms on those very close numbers that require obscene precision to be sure you're correct.

If you work in this sort of mind space all the time, it's perfectly precise to talk about arbitrary-precision numbers.

But the real universe we live in bottoms out at 10^-35 meters or so, our practical ability to be precise bottoms out a dozen or two orders of magnitude sooner than that at a minimum (and often another dozen for macroscopic work), and the practical impact of this is minimal because, in practice, the first Haskell solution in that post is essentially correct in our real universe 99.9% of the time, even though it is glaringly wrong mathematically, because in the real universe we never have hundreds of digits of precision, and that's where our intuition is built.


Good example of this: What is the complexity of the fastest algorithm that prints the nth Fibonacci number?

Novice: "O(n), because you have to iterate from 0..n."

Expert: "O(log n), because we can use matrix exponentiation."

Master: "O(n), because F(n) has O(n) digits to print!"


The Master's argument would imply Ω(n) time.

The matrix exponentiation algorithm would take O(n log n) time, since you have to multiply large numbers, and this takes n log n with the best known algorithms (FFT-based multiplication).

I don't think there are any O(n) algorithms.


There are closed forms for F(n), and even faster ways to get it without needing to compute the sequence.

You're conflating taking an n-th power with requiring n-digit multiplications. This isn't true. The size of the needed numbers is smaller for most problems of this type. And special structure is likely exploitable.

A simple proof is to write the recurrence as a matrix, take powers, diagonalize and read off the result. If I recall, the answer is something like F(n) = (((1+sqrt5)/2)^n - ((1-sqrt5)/2)^n)/sqrt5.

Then you only need to compute part of the first term; the second is small.

Then use the bits of n similar to power mod to compute powers in log n steps.

You only need something like log n precision along the way.

This should reach O(n) easily, perhaps below.

Quick check shows O(log n) steps in standard constant time ops.


You seem to be saying that if we don't want to compute F(n), but just some numbers a, e such that F(n) ~ a*10^e, then we can do it faster than n log n time. That's of course true.

However, that's a different problem than actually computing the n digits of F(n). Even computing the first n digits of ((1+sqrt5)/2)^n probably takes n log n time.

You say "use the bits of n similar to power mod". I suppose you mean repeated squaring. But what happens in the last of the log n steps of that procedure? You multiply two n/2 bit numbers.


No, I am not saying that. I am saying we can compute F(n) exactly in O(log n) time. All the digits. That Fibonacci identity I posted allows computing exactly every single integer digit by rounding at the end, because the other term goes to zero as n goes to infinity, and that other term is always less than 1.

There are papers showing the complexity I claimed is true. For example, [1]. Algorithm 3.7, on page 15, and I quote: "Hence, with constant time arithmetic, the time complexity is O(lg n). The space complexity is also logarithmic in n." It uses the same ideas as the algorithm I suggested.

Also the two algorithms in section 3.8 achieve the same. The algorithm in section 11 achieves the same.

These use, as I used above, as is common for algorithms, what is called constant time arithmetic. This is how pretty much every textbook you will find uses these terms. Otherwise, even simple things like Quick Sort are no longer O(n log n) in the number of items, because as those items grow without bound, if the arithmetic does also, you end up with (often) more factors of n or log n in your final complexity.

For example, here is the bit complexity of quicksort [2], which is O(n log n log n) instead of the usual O(n log n). This concept is used so rarely that I don't think I've ever heard another person state it. People state the O(n log n) complexity, which is the standard for constant time arithmetic.

>You multiply two n/2 bit numbers.

You keep mixing your ideas for complexity. To specify the problem for computing F(n) you need not n bits - you need log n bits. So the input for this problem is not n, it is log n. Thus if you take your claim, and at the end multiply two (log n)/2 sized numbers, what do you get?

For example, if I tell you that you should compute F(1024), you do not need 1024 bits to tell you that. You need 10 = log 1024 bits to specify the problem. To describe the input to compute F(1,000,000) you do not need 1 million bits. You need 20.

Thus you do not multiply out n/2 bit numbers at the last step.

[1] https://arxiv.org/pdf/1803.07199.pdf

[2] https://www.ams.jhu.edu/~fill/papers/BitsQuickxabs.pdf


gmp uses a particular recurrence relation to exactly compute fib(x) in about log(x) steps[0]. Of course, the digits of the numbers it has to multiply also increase. It's fast-ish up to fib(128 million) or so but soon after that you run into the limit of the size of numbers in gmp, which are limited to being less than 2^31 * 32 bits long or something.

[0]: https://gmplib.org/manual/Fibonacci-Numbers-Algorithm
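
For reference, a minimal fast-doubling sketch in Python (not necessarily the exact recurrences gmp uses, but the same O(log n) count of big-integer multiplications):

  def fib_pair(n):
      # returns (F(n), F(n+1)) using the fast-doubling identities
      # F(2k) = F(k) * (2*F(k+1) - F(k)),  F(2k+1) = F(k)^2 + F(k+1)^2
      if n == 0:
          return 0, 1
      a, b = fib_pair(n >> 1)
      c = a * (2 * b - a)
      d = a * a + b * b
      return (d, c + d) if n & 1 else (c, d)

  print(fib_pair(100)[0])   # 354224848179261915075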


> the real universe we live in bottoms out at 10^-35 meters or so

We don't actually know this. It's a plausible speculation in quantum gravity, but we have no evidence either way. This length scale is about 18 orders of magnitude smaller than the smallest scale we can probe with experiments, so we're highly unlikely to get any evidence either way any time soon.


If you assume that mathematical notation (language) doesn't abstract complexity in a uniform fashion (something you can write and capture meaning with a few symbols isn't inherently more or less complex than something you can write with a lot of symbols), it becomes pretty obvious that something isn't necessarily as simple as it may look at the surface.

Having worked in computing for long enough, I'm well aware of this fact in the world of estimating time (which to some degree is estimating complexity) to address a given problem request. I can very easily write down in English a pretty generalized description that you need to: cure any and all forms of cancer. That's a fairly concise request, I can write it in one line, but it hides a mountain of complexity needed to accomplish that. It also doesn't describe what we consider as "cure" or "cancer" which can be ambiguous in some cases.

Much of the same is true with seemingly cute conjectures that seem to capture a behavior that may be incredibly complex to prove (if not impossible, e.g. Godel). The Collatz conjecture comes to mind where Paul Erdos famously said something to the effect of "Mathematics may not be ready for such problems."


Have you read the article?

The author explicitly defines what the question is (and how the question of irrationality of the numbers is resolved there).

As a matter of fact, they explicitly mention that the algorithm in question only requires polynomial memory space to compare sums of N roots of arbitrary numbers.


Why would it be too imprecise?

I agree there are some (many?) problems with some definitions used in complexity theory, but "size of input" is certainly not one of them. It can be defined very precisely. I don't see the slightest problem with how it's framed.


So what is N here? The sequence length, number of digits across all sequence elements, sum of all elements?


N is the sum of the lengths of the elements of all lists, when all elements are written in binary (or any other base, which is the same up to a constant).


It is always fascinating how many problems that are simply stated are difficult to solve. Whenever I see something like this I try and think about what the repercussions would be if an efficient algorithm did exist, and that helps to understand where the complexity is. In this case I believe there would be many problems in computational geometry involving Euclidean shortest paths that would be made trivial by an efficient algorithm here.


This problem is only hard when infinite precision is needed. It's trivial if you allow any tolerance on the scale that could exist in the Universe.


However, it would be interesting to find some pathological examples of pairs of lists whose sums of square roots compare almost equal if only approximated to some reasonable precision, but diverge absurdly if the fully general algorithm is used, if such pairs even exist.


I am reasonably sure that this won't happen. Said another way, a function that returns the minimum delta between any finite pair of lists is a reasonably well-behaved function with respect to the length of the lists and the size of the integers, and doesn't just zoom off infinitely close to zero.

Said yet another way, the ways in which real numbers are dense are spooky and almost totally untied to how rationals work, and I do not believe you can get there from here using square roots.


This isn't true, which is why the problem lies in the complexity class listed in the article. Arbitrarily bad pathologies exist even for simple inputs.


It probably has been tried already by someone, but how about this: all square roots can be written as repeating continued fractions [1]. With a bit of work, continued fractions can also be summed [2] and compared. Wouldn't this take less than exponential time?

[1] http://benpaulthurstonblog.blogspot.com/2012/05/estimating-s...

[2] https://www.jstor.org/stable/1969389


The article points to a Facebook post that considers this method and then is subsequently invalidated [0].

The argument is that the continued fraction representation for sqrt(N) grows as O(lg(N) sqrt(N)), making the representation blow up.

[0] https://www.facebook.com/bshlgrs/posts/10215278471769811?com...

[1] https://mathworld.wolfram.com/PeriodicContinuedFraction.html...


You're assuming that the period will remain of manageable size. I see no reason why this should be the case.

Edit: Also found a more accessible website describing how to do arithmetic with continued fractions: https://perl.plover.com/yak/cftalk/. In case anyone wants to try it out.


In practice I think the problem is linear, in that the required precision is proportional to log maxValue; that would be enough for a certain answer.

It's just that a proof of this is considerably harder, since you can't just assume the fractional part of irrational numbers is random.


Without looking into the details, the problem might be that two n-bit numbers x and y might have continued fractions for sqrt(x) and sqrt(y) which differ very far along. Also, the continued fractions themselves can get more and more expensive to compute as you go along; possibly exponentially more, but I'm not sure.

Also, two continued fractions are not necessarily easy to compare. It's not a positional notation like decimal or binary.


Can't resist mentioning my personal favorite expansion of a sqrt:

sqrt(1-x) = 1- \sum_n C(n)/2^(2n+1) x^(n+1)

where x is in (0,1) and C(n)=binomial(2n,n)/(n+1) is the n'th catalan number.

[Learned this from http://www.math.chalmers.se/~wastlund/coinFlip.pdf]
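
A quick numerical sanity check of that identity (sketch; math.comb needs Python 3.8+):

  from math import comb, sqrt

  def catalan(n):
      return comb(2 * n, n) // (n + 1)

  x = 0.3
  series = 1 - sum(catalan(n) / 2 ** (2 * n + 1) * x ** (n + 1) for n in range(60))
  print(series, sqrt(1 - x))   # the two values agree to double precision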


Continued fraction approximations are not a separable monotonically increasing sequence like decimal approximations are (the approximation goes up and down, and the error range overlaps other nearby fractions of the same size), so you have no idea when you can terminate a partial computation.


That is a really good point. What coding language/library is best at representing fractions as opposed to floating points? Even though there might be overhead in storing the numerator and denominator, it would prove useful for this problem.
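
For what it's worth, Python's standard-library fractions module gives exact rational arithmetic, though rationals alone still can't represent sqrt(2) exactly, so it doesn't resolve this problem by itself:

  from fractions import Fraction

  # exact rational arithmetic: no rounding, ever
  print(Fraction(1, 3) + Fraction(1, 6))           # 1/2
  print(sum(Fraction(1, n) for n in range(1, 6)))  # 137/60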


I'm not sure I understand the problem. If we only need to determine which set produces the larger sum of square roots, why can't we simply compare the sum of the original numbers? The square root of 2.0000001, for example, is larger than the square root of 2. The square root of X will always be larger than the square root of Y if X is larger than Y.

The real problem is calculating the precision of square roots. But the challenge stated in the post can be solved rather easily in a programmatic fashion by simply comparing the initial inputs.


Consider [25] and [9, 9]. 25 > 18, but 5 < 6.


Thanks! I had the same thought as the OP and this is just the counter-example I needed.


You're adding the two square roots together, though.

So for example, which is bigger: [2.00000000000000011, 2.0000000000000003] or [2.0000000000000002, 2.00000000000000021]?


f(x) = sqrt(x) does not increase linearly with x so you can't do that.

Compare the plotted graphs of x (input) vs. sqrt(x) (output)


I'm curious if you could get an algorithm using some sort of factoring.

  (sqrt(a_1) + sqrt(a_2) + ...)*(sqrt(b_1) + sqrt(b_2) + ...) = (sqrt(a_1*b_1) + sqrt(a_1*b_2) + ... + sqrt(a_2*b_1) + sqrt(a_2*b_2) + ...).
So you have a convolution operation on lists of integers which satisfies the following:

  sumOfSqrts(xs) * sumOfSqrts(ys) = sumOfSqrts(convolution(xs, ys))
You could try something where you factor the two lists you are comparing into their "prime lists", remove the duplicates, and then you've reduced it to comparing some countable set of lists that might have properties that make them easier to compare. Of course all of that assumes you can uniquely factor lists under this convolution. I don't think you can if you allow negative numbers in the list. But if you restricted your attention to lists with only positive entries, and factored into only lists with all positive entries, it's possible you have a unique factorization method. I don't have the math skills offhand to tell for sure.
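
Here's a small sketch of the convolution described above (hypothetical helper names, just to make it concrete):

  from math import sqrt

  def sum_of_sqrts(xs):
      return sum(sqrt(x) for x in xs)

  def convolution(xs, ys):
      # sum_of_sqrts(xs) * sum_of_sqrts(ys) == sum_of_sqrts(convolution(xs, ys))
      return [x * y for x in xs for y in ys]

  xs, ys = [2, 3], [5, 7]
  print(sum_of_sqrts(xs) * sum_of_sqrts(ys))      # same value, up to
  print(sum_of_sqrts(convolution(xs, ys)))        # floating-point error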

NB: the article describes PSPACE as being definitely larger than P or NP. But, just like how we don't know (but strongly suspect) NP is bigger than P, we don't know if PSPACE is bigger than P! Complexity theory is hard, so much so that even these relatively simple questions haven't yet been proven.


I must not be understanding the problem here because this seems pretty simple. If someone told me to add two lists of numbers (the square roots) and then take the difference of the two sums, and the sums were potentially too big for the computer to handle, I'd start differencing before I finished the sums. (For example, find the total of sum1 - sum2. While processing values, if the running sum is positive take a number from list 2; if it is negative take a number from list 1.)

That should be linear in the total number of elements in the list.


You understand precisely the first part of the problem ... it looks easy and linear.

But, as the article points out, you may need a very large amount of precision to figure out which way the difference goes if it is very close. This isn't about computing a really big sum. This is about computing enough digits of irrational numbers. If you have to compute an exponential number of tinier and tinier digits, you can still need exponential time even for very small input values.


The issue is that the precision of the square roots needs to increase in order to guarantee a result. Consider an algorithm to generate the worst case scenario that starts with two lists that have sqrt sums that differ by D. Append to the lists numbers x and y such that |sqrt(x) - sqrt(y)| > D and append them so that the previously lesser list is now greater but also the absolute difference decreases each step.


When I was dealing with this the problem was running into machine error: it happens much faster than you think. Especially when you're summing many very small numbers.


OK I get it, rounding error in calculating the square roots.


Stupid questions:

1) It seems like all the difficulty comes from the fact that the representation of the square roots involves a non-terminating sequence of digits requiring high precision. So don't you have this problem with all irrational functions, not just square root? E.g. logarithm, sine, exponentials with an irrational base. (Does it make a difference that sine is bounded?)

2) Do you have the same problem (difficulty being PSPACE) without the square root part at all, as long as you allow the inputs to be arbitrary precision?


For (1): Comparing sums of logarithms is pretty easy, since you can rewrite it as comparing products of integers. So not all sums of irrationals are hard.

For (2): Allowing inputs to arbitrary precision presumably means they are arbitrarily long bit strings? But if you measure complexity in terms of the total number of input bits, summing long bit strings is very easy.
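
To make (1) concrete: comparing sums of logs of positive integers reduces to comparing exact integer products (minimal sketch; math.prod needs Python 3.8+):

  from math import prod

  def compare_log_sums(xs, ys):
      # sum(log(x) for x in xs) vs sum(log(y) for y in ys) is the same
      # comparison as prod(xs) vs prod(ys), and Python ints are exact
      px, py = prod(xs), prod(ys)
      return (px > py) - (px < py)   # -1, 0, or 1

  print(compare_log_sums([2, 9, 10], [5, 6, 7]))   # 180 < 210, so -1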


1) Not all representations of square roots have non-terminating form.

Continued fractions, for instance, reach a repetitive cycle.

Other irrational functions have other patterns.

2) No. If you are just doing sums, the cost is O(N) in time and space where N is the total number of bits in the input. If the inputs are large (i.e. N is big) then you have more time to compute the answer.


> I’m going to update towards thinking that integers and square roots are much scarier, richer objects than I’d thought. I’ve updated to being more scared of real numbers than I used to be—they have all these sketchy properties like “almost none of them have finite descriptions”. Real numbers, and sets, and logical statements, have all started feeling to me like Cthuluesque monstrosities whose appearances are only tolerable because we only look at the pretty parts of them and don’t let ourselves look at the horrors that lurk below.

I don't know much number theory but whenever I poke around at stuff like the Collatz conjecture, it seems like there is something profoundly weird about the interaction of addition/subtraction and multiplication/division. Like each of those pairs defines a totally separate universe and moving between them is irreducibly hard even though the members of both are "just" numbers.

By that I mean you can have a number that you understand perfectly well one side (for example, as a set of factors in the world of multiplication). You move it to the other side and perform a trivial operation (say add one). And now you know nothing about it on the other side any more.

Maybe this is just me knowing very little about the field.


> Maybe this is just me knowing very little about the field.

Can't tell if pun. Either way, it's pretty good.

https://en.wikipedia.org/wiki/Field_(mathematics)


Looks like Presburger Arithmetic [1] versus Peano Arithmetic [2].

[1] - https://en.wikipedia.org/wiki/Presburger_arithmetic

[2] - https://en.wikipedia.org/wiki/Peano_axioms


> there is something profoundly weird about [numbers]

you should hear John Conway (RIP) talk about numbers - https://www.youtube.com/watch?v=1eAmxgINXrE


So I'm 40 minutes into this. For anyone reading: Conway is great, but this is basically him rambling, there really isn't any substance in this video. It's basically the Simpsons scene where grandpa is talking about onions on his belt.


So let’s say we start with computing the square roots to X digits of precision after the decimal point. Then we add up lower and upper bounds, giving us intervals for the sums over the two lists. If those intervals overlap we have to go back and compute e.g. 2X digits of precision, and so on. But in most cases we’d be done after the first iteration.

Seems to me that would be polynomial in the average case. The worst case could of course be really bad. But if you’re telling me you know for sure that it’s exponential, then you must know something about the existence of lists with very close sums. As I understood the OP we don’t know if such lists exist.

So average time complexity is polynomial and worst case is unknown.

EDIT: Doesn’t https://www.sciencedirect.com/science/article/abs/pii/S00200... show that this algorithm is in fact polynomial?
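
A rough sketch of that iterative idea using mpmath (the error bookkeeping is deliberately crude and the cutoff arbitrary; the open question is exactly how large dps may need to get in the worst case):

  from mpmath import mp, mpf, sqrt

  def compare_sqrt_sums(xs, ys, max_dps=10000):
      dps = 30
      while dps <= max_dps:
          mp.dps = dps
          s1 = sum(sqrt(x) for x in xs)
          s2 = sum(sqrt(y) for y in ys)
          diff = s1 - s2
          # crude interval bound: relative error of roughly 10^(2-dps),
          # scaled by the magnitude of the sums
          err = (s1 + s2) * mpf(10) ** (2 - dps)
          if abs(diff) > err:
              return -1 if diff < 0 else 1
          dps *= 2                 # intervals overlap: retry with more precision
      return 0                     # undecided within max_dps

  print(compare_sqrt_sums([25], [9, 9]))   # -1, since sqrt(25) < sqrt(9) + sqrt(9)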


> EDIT: Doesn’t https://www.sciencedirect.com/science/article/abs/pii/S00200... show that this algorithm is in fact polynomial?

No, they give a linear lower bound for the number of digits you need to compute and show that the bound is tight for certain special numbers, but they don't have a polynomial upper bound for the general case.


> comparing sums of square roots is actually kind of a common subtask in eg computational geometry, so a bunch of their problems (including “shortest path through a graph in Euclidean space”!) are as-far-as-we-know extremely hard.

But the question then arises: do we always need to solve these problems to their fullest precision? Can we perhaps leave some ambiguity, and let the system gracefully deal with it? E.g. I don't care if my CAD model has a tiny chip of 1e-12 mm missing, as long as my CAD software doesn't crash on the resulting internal inconsistencies.

See also, e.g.: https://www.cs.purdue.edu/homes/cmh/distribution/papers/Robu...


Yeah, afaik this problem is totally unimportant in practice.


Wait until it ends up being used in some other important field of math.


I thought that any algorithm dealing with real numbers (and floating point) requires the use of some epsilon as a measure of closeness.

In such approach some different numbers will be treated as equals, but qualified with the epsilon.

Thus it could be made practical, as long as epsilon is chosen appropriately to the application domain.


It seems like this should just reduce to the question: given two natural numbers x and y represented by n bits, how many bits are needed to represent the difference sqrt(y) - sqrt(x) so that it is non-zero when x != y? To see this, suppose you compute each sqrt in both lists to k bits, then group together the closest matching pairs from the two lists. Some of these pairs may have a difference of zero. Now increase k until all of the pairs are distinguished (have non-zero differences). At that point you know the answer, you just have to add up the differences.

As for the first question, it should take no more than n bits, because sqrt(x) is a contraction (at least for x >= 1), as is obvious from its graph compared to that of f(x) = x.


> At that point you know the answer, you just have to add up the differences.

What if your sum in this step is zero?


There's a known upper bound for the error on each pair delta caused by the finite number of bits. So that gives upper and lower bounds for the error on the sum of the deltas. Add enough bits to your representation to account for that error. I think that's an additional log N + 1 bits where N is the sequence length. Then if the sum comes out to zero that must be the correct answer (probably impossible unless the two sequences contain the same values).


> Add enough bits to your representation to account for that error.

I've reviewed a few papers where authors respond with this sort of argument. It never ends in a publication. But, I really should have picked on the preceding sentence:

> Now increase k until all of the pairs are distinguished (have non-zero differences).

Do you have an estimate on the bounds, time or space, required for that? Because establishing those bounds is the "hard problem" here.

Also, what do you do about questions like sqrt(2)+sqrt(50) =? sqrt(72) where the sums have differing numbers of terms?

And, it might be worthwhile to play with an example: (55, 77, 83) and (64, 68, 82) -- depending on where you decide to truncate, the difference wobbles above, below and equal to zero, well past the point you've described where the pairwise differences are nonzero (4 bits and the sums appear to be equal). It finally stabilizes after 25 bits -- much greater than log N + 1.


OK, I think I understand the problem now. It's that the actual answer can be very small but non-zero. You can put error bars on your calculations but that doesn't necessarily tell you which side of zero the answer is on. In fact you would need to do the calculation with enough bits to represent the true answer which you don't know a priori. What you can do though is put an upper bound on the size of the difference between the two sums and you can make that bound as small as you want by using more bits.


Doesn't this imply that every algorithm that compares real numbers is in PSPACE?


The predicates < and > are strictly semidecidable, meaning that if two real numbers are equal to each other, then no algorithm is guaranteed to terminate. The complexity is thus RE, which is worse than PSPACE.

But the real numbers which show up in the sum-of-square-roots problem are far from being as general as possible, so the complexity is PSPACE at worst.

The foundations needed to understand general real number computation are listed here: https://news.ycombinator.com/item?id=30057794


Super interesting, thanks a lot for the resources!


Real numbers are slippery, because almost no real numbers are computable, even approximately. This conjecture only applies to numbers that have small descriptions but complicated values.

If you don't have simple names for your real numbers, then the algorithms are trivial, because all the complexity is in writing the input!


No. That said, most are at least that bad.


I don't get it - how is this problem any different from simply comparing two very long integers? Given integers "a" and "b", determine which one is bigger. You don't know in advance, as "a" could be seventy six billion trillion quadrillion digits long and "b" could be even more digits, and you never stop comparing them as there are more and more digits. No need for square roots or sums.


Good question :) The difference is that in that case, the problem might take a long time because the input is very long, but in this case, the problem might take a long time even if the input itself is fairly short.


Why would one try to formulate it as a function of the input length in binary? Does this have practical relevance? Why not think about it in terms of the numbers given in decimal, their count and their size? And what is the input length in this problem - the numbers encoded in binary? Pretty sure it cannot be computed by just looking at how many numbers are given in each list.


Input length in binary is the standard way to calculate computational complexity. Sometimes we handwave that away, but in problems like this where the size of the number matters, we stick to the more strict definition of computational complexity.

Number in binary vs number in decimal doesn't make a big difference because it's just a constant factor.


Thanks, really. I guessed as much. What about the other questions? How do you get to an input length from a list of numbers? Just concatenate all their binary digits? Can you answer any of these?


We could instead specify the length (in characters) of a json list representing the two inputs: "[[1, 17, 8], [2, 6, 9]]" if we wanted to. It's base ASCII (i.e. 128 I guess), but unambiguous, and the same up to a multiplicative constant, right?

That solves needing to figure out an encoding scheme for arbitrary length binary number lists.

Or use a trinary encoding, with the set {"0", "1", ","} to let you list numbers nicely if you want. It doesn't matter that much.


Simply the space it would naively take in memory, ie. the sum of the sizes of the elements. Which is the same thing as the length of all the elements concatenated.


It doesn't really matter. Use 11 for 1, 00 for 0, and 01 for "next number". Most ways you can come up with have the same length up to multiplying by some constant, and the constant does not matter in the analysis.
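
For example (a toy sketch of that encoding):

  def encode(numbers):
      # "11" encodes a 1 bit, "00" encodes a 0 bit, "01" separates numbers
      return "01".join(
          "".join("11" if bit == "1" else "00" for bit in bin(n)[2:])
          for n in numbers
      )

  print(encode([5, 3]))   # '110011' + '01' + '1111'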


>> Suppose that I can find some lists of numbers whose sums of square roots are equal for the first ten million decimal points and then start being different

Would have been nice to add an example of such list in the blog, I wonder how short that list can be.

Also, how important is this in practice? The algorithms using this comparison could handle the 3 cases with a tolerance (larger, smaller, undecided)?


Very short, expressed in the number of items in the lists. Once N is large enough, these two single-item lists do the job:

First list: (N)

Second list: (N + 1)

N = 10^10^15 is large enough for that.

I think you can construct much smaller examples by hand. Since 3 + 4 + 5 = 12, the lists (9k, 16k, 25k) and (144k) have the same sum of square roots (12√k) for all k. Subtract one from one of the numbers, and you'll have a close match, say with error e.

Do the same for a second pair of lists, giving you an error of f. Compute a good rational approximation p/q of e/f, and linearly combine the pairs of lists to get an arbitrarily small difference (edit: that construction works starting with any two pairs of lists with equal sums).

And I don’t think there is an “in practice” for this problem. When do you ever have to make this computation? If ever, does it matter if your code fails on a tiny fraction of all inputs?


I expect not very important in practice, because

1. We've settled on using IEEE 754 floating point numbers, and all their tolerance quirks, in practice, and still the bridges don't fall down

2. The author describes it as a little-studied field where "most of the people who would be good at working on this are doing something more useful instead"

But still, useless pure-maths distractions have a way of yielding results in unexpected fields. Maybe, as the author hints, a solution to this problem would bring us some information-theory insights and some new methods for cryptography or compression or whatever else.


This is not a computer science question. It's a math challenge. We simply rely on computers to do the math faster. I challenge any math person to solve this problem and/or provide a proof. Given that their equations will ultimately be so complex they will fail.

For example, just say 1/3. Easy right? 0.333333334 Wrong. It's impossible for a human or a computer to say the answer. NOT a CS issue. We can't combine random and ultimately repeating numbers. Each time you reduce a number via square root you will eventually reach an infinite response and thus it's impossible to solve, let alone with a binary computer system.

I would love the very first quantum computer to do 1/3. Just use up all the CPU and energy it has in a problem it can't solve. Forever in time.

Look not at the computer as the failure scenario but at math in general. At some point you stop and round. When that happens, an error is introduced when you extend the answer beyond the rounding point. At some point you will find an infinite answer and/or answers. I can write you a computer program to write a single infinite answer. Combining two of them produces the same infinite process.

Here is a better answer. Can we shift the base in a way that allows us to answer random questions like these?


>Each time you reduce a number via square root you will eventually reach an infinite response and thus it's impossible to solve, let alone with a binary computer system.

This is incorrect. All the symbolic math programs in the world provide counterexamples. You don't need to deal with infinitely long decimal expansions to prove a great deal in mathematics, in the same way that when I prove things by hand I don't need to write out infinitely long numbers.

Your mistake is assuming the only way to analyze a sqrt of a number is to write it out in infinite precision. That is not needed. For example, you can prove on a computer that sqrt(19) > sqrt(17) without having to evaluate either side as a decimal number.

For example, in Mathematica (or any symbolic program), entering Sqrt[19] > Sqrt[17] returns True.
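
For the Python-inclined, here's a rough equivalent using SymPy (my choice of tool, not the commenter's); both comparisons come back True without ever printing a decimal expansion:

    from sympy import sqrt

    print(sqrt(19) > sqrt(17))            # True
    print(sqrt(8) + sqrt(2) == sqrt(18))  # True: both sides simplify to 3*sqrt(2)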


You may have missed the point. Although fractions may have infinite representations, the question of which sum of fractions is larger is a much easier problem to solve:

"On the other hand, comparing the sums of fractions is pretty easy, because division is nice and well behaved. So the question is how complicated square roots are."


Computing the value would be a problem but determining which is bigger is purely a CS question (just one that cannot be solved directly with our usual floating point tool: IEEE 754 arithmetic).


I think the formulation "this algorithm is in PSPACE" is very inappropriate in this case. Every "sane" algorithm is in PSPACE, e.g. adding two numbers.

Either prove that the problem is PSPACE-complete (i.e. that any PSPACE problem can be reduced to it), or just say that your algorithm takes exponential time in polynomial space.


A small note: If you allow randomized algorithms, this problem is actually known to be in P^PP^PP^PP which is well within PSPACE in some sense, but still a ridiculously bad bound.

See https://cstheory.stackexchange.com/a/4056.


Would the naive approach "just work" (although slowly) in Python since it has arbitrary precision?


Python doesn't use arbitrary precision for floating point values, it just has arbitrarily long integers.


Oh, derp, you're right.


Python has arbitrary integer precision, that won't help


Python does not have arbitrary precision for floating point
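
It does have arbitrary-precision integers, though, which let you grind out as many square-root digits as you like via math.isqrt; what they don't give you is a rule for how many digits are enough, which is the whole problem. A quick sketch (the lists and digit count are just examples):

    from math import isqrt

    def scaled_sqrt(n, d):
        """Floor of sqrt(n) scaled by 10**d, i.e. the integer part plus d decimals."""
        return isqrt(n * 10 ** (2 * d))

    def compare(xs, ys, d):
        a = sum(scaled_sqrt(x, d) for x in xs)
        b = sum(scaled_sqrt(y, d) for y in ys)
        # Each floor throws away less than 1, so this sign is only
        # trustworthy when |a - b| is much larger than len(xs) + len(ys).
        return a - b

    print(compare([1, 17, 8], [2, 6, 9], 50))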


What does “ compute this as a function of the length of the input encoded in binary” mean? Can someone explain it with the other sample used in the article? Does it mean the input will be provided in binary form instead of decimal? How does that change things?


Whether the input is encoded as binary or decimal doesn't change whether the problem is in PSPACE, NP or P. You need some finite alphabet, and you encode the input as a string over that alphabet. The time complexity of an algorithm for SSS is measured by how long it takes (in the worst case) as a function of the length of the input string. As long as you use a positional notation like binary or decimal for the integers, the complexity class stays the same, so the exact positional notation doesn't matter, and neither does the size of the alphabet. If on the other hand you encode the integers in unary, the input becomes exponentially longer, which can change which complexity class the problem lands in.


You're parsing it wrong.

"How quickly can we compute this [this = the solution to the problem], as a function of the length of the input encoded in binary?"

i.e. it's just wondering what's the computational complexity. The input can be provided in any base, it makes no difference.


> There’s a known, reasonably fast algorithm (in BPP) for checking equality of sums of square roots

What is it?


You just have to take out square factors from each, and combine terms with matching sqrt(n), and that will be unique


That is to say, rewrite each sqrt(N_i) as P_i * sqrt(Q_i) where P and Q are integers and Q contains no squares. So 2 becomes 1 * sqrt(2), 4 becomes 2 * sqrt(1) and 8 becomes 2 * sqrt(2).

Now, you can subtract equal terms from each side of the equation, and if you can reach 0=0 then the numbers are equal. If you're left with something like sqrt(3) = 5 * sqrt(2) then the numbers are unequal.

This stems from a fact that I'll state without proof: square roots of distinct squarefree integers are linearly independent over the rationals. So there's no way to (say) add up multiples of sqrt(2) and have the total come out equal to a multiple of sqrt(3) or sqrt(5).

A number contains no squares (i.e. it is squarefree) if its prime factorization contains no repeated factors. Since this seems to involve factorization, plus a step of matching up numbers from both sides, the computational complexity would seem to be at least that of factorization. The term-matching step is presumably the easier of the two.
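
A sketch of that procedure (trial division for the squarefree decomposition, which is exponential in the bit length, so purely illustrative):

    from collections import defaultdict

    def squarefree_form(n):
        """Return (p, q) with sqrt(n) == p*sqrt(q) and q squarefree (n >= 1)."""
        p, q, d = 1, n, 2
        while d * d <= q:
            while q % (d * d) == 0:
                q //= d * d
                p *= d
            d += 1
        return p, q

    def sums_of_sqrts_equal(xs, ys):
        coeffs = defaultdict(int)
        for x in xs:
            p, q = squarefree_form(x)
            coeffs[q] += p
        for y in ys:
            p, q = squarefree_form(y)
            coeffs[q] -= p
        # Equal iff every radical's coefficient cancels, relying on the
        # linear independence of sqrts of distinct squarefree integers.
        return all(c == 0 for c in coeffs.values())

    print(sums_of_sqrts_equal([2, 8], [18]))  # True: sqrt(2) + sqrt(8) == sqrt(18)
    print(sums_of_sqrts_equal([3], [2]))      # False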


How do you know it's unique?


I don't think there's an elementary proof, but you will see a proof if you study algebraic number theory.


This problem is a perfect example of better is the enemy of good enough. What's striking here is that the simple FP64 solution is for the most part more than good enough for any practical application you would find working in Tech. Anything beyond that is likely unnecessary gold plating.

If I were asked this in a job interview and they didn't accept that answer and started going on about me missing the pure math, that would be a great sign that such a place probably doesn't ship things very often. I would also bring up that no one can solve the traveling salesman problem efficiently either, yet approximately optimal solutions run the world of logistics on a daily basis.

Yet much like you can dedicate an entire supercomputer to calculating the energy of two hydrogen atoms to arbitrary precision, it's a really interesting problem with respect to the pure math. But come on, the guys most likely to bring this question up are relying on 16-bit floating point to train their neural networks.

Exponent.Mantissa.Boom. In practice, I know just about no one anymore who can tell me the format of 32-bit or 64-bit floating point because they're so used to abstracting that away.


No, it's not a perfect example of better being the enemy of good enough. You're just missing the point. Nobody is using this as an interview question.


But the article presents it as an interesting pure math problem, and doesn't claim that the FP64 solution isn't good enough.

I should add that the people who write state of the art "practical" TSP solvers spend a lot of time thinking about the "theoretical" complexity of it too. Turns out it's a good way to engineer those "good enough" algorithms, provide bounds etc


Sure and there are a lot of simple ways to improve the accuracy of the simple approach before you go for the pure math sledgehammer here. It's all a question of how accurate the answer needs to be.

For example, it's pretty easy to come up with an FP64 error bound on the sum, and if the difference between the two sums is greater than that error bound you don't need to do anything more complicated. Where this would get complicated IMO is if you were dealing with 128-bit or larger integers.
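
A sketch of that kind of check (the lists are illustrative, the bound is a deliberately loose worst-case estimate, and it assumes every input fits exactly in a double):

    import math

    U = 2.0 ** -53  # unit roundoff for IEEE 754 binary64

    def sum_sqrt_with_bound(xs):
        # math.sqrt is correctly rounded and math.fsum adds at most one more
        # rounding, so (n + 2) * U * s comfortably over-covers the error,
        # assuming every x is exactly representable (x < 2**53).
        s = math.fsum(math.sqrt(x) for x in xs)
        return s, (len(xs) + 2) * U * s

    a, ea = sum_sqrt_with_bound([1, 17, 8])
    b, eb = sum_sqrt_with_bound([2, 6, 9])
    if abs(a - b) > ea + eb:
        print("first is larger" if a > b else "second is larger")
    else:
        print("too close to call in double precision")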

Edit: I am learning so much about the mindset here and what triggers acute attacks of the heebie jeebies. This place has really changed in the past decade since I joined and not for the better. Yes good, good, more downvotes. Show your love with downvotes. I'm stuck at 2970 karma, only you can help me on the journey to zero. Can we beat -4? Let's find out.


Looking at your original reply, you started bringing in "job interviews" and "companies that ship". I imagine that's why people downvoted you -- this post (and its replies) are about maths, not really about worrying about companies.

The original post wasn't about practical software, or shipping products, or getting a "good enough answer". It was about an interesting (to many people) maths problem.

If you want ycombinator to just be about companies and shipping products, then indeed it "has changed". But many people (myself included) like a good pure maths puzzle, and are interested in if there is a "true answer".


And I admitted it's a cool pure math problem. The challenge in the tech industry is balancing the rigor of a pure but nearly intractable solution with using the theory to pump up an O(n) imperfect solution to acceptable reliability. I find the latter more interesting, YMMV.

But if that isn't as interesting as the original problem, maybe stay in academia? AI itself is an example of tailoring activation and pooling functions to deliver imperfect solutions that are "good enough" to ship. It's unfortunate these two viewpoints are seen as contradictory rather than complementary, but that does echo our political balkanization so I guess I shouldn't be surprised.


Making things hard sometimes leads to insights down the line. For instance, the formula for solving cubic equations seems kind of silly from a numerical point of view -- you can just use a general-purpose numerical algorithm to solve cubics. But the research into solving cubics using restricted operations led to the discovery of the complex numbers, group theory, and the theory of fields, etc. So it was worth doing things the hard way and sticking to the letter of the problem.

Likewise, the ideas people use to solve the problem in the article may have applications elsewhere.

I do sometimes have doubts about whether this is the most efficient way of discovering scientific ideas: Posing puzzles and seeing what tools people throw at them. So I can see the criticisms coming...


Not going to criticize you, I think it's a cool math problem, but I've experienced people throwing problems like this at me at job interviews in the past so it echoed. And in those cases, they weren't looking for an answer, they were looking for the answer they knew about and no other answer would do. I once ended an interview early over one of these questions.

I am relentlessly production focused but that doesn't mean I don't like math. But by the time you get to FP64 with this thing, the probability of finding a fail case seems insanely low and in fact for the most part you can provably bound the error of the sum and therefore likely prove you have found the solution. Anything beyond that is a corner case as the mathematicians love to say. And now I see the criticisms coming.


I think people are downvoting you because you seem like you're attacking a strawman. We all agree that 1) it's an interesting math problem 2) for real-life instances, the FP64 solution is good enough.

For example, when you say "anything beyond that is likely unnecessary gold plating", I know you mean "exact algorithms (like the PSPACE algorithm) are unnecessary for solving real-life instances", but without context it sounds like you're saying "there is no conceivable reason people should care about the exact algorithms"


Back in the day, the downvotes here would flow like a river at the suggestion that an engineering career behooved one to understand Calculus and Linear Algebra, with Statistics and Differential Equations thrown in for good measure if possible, because Peter Thiel said college degrees were worthless and was even offering $100K to some lucky winners to prove his point. Admittedly, I would probably have taken that offer had I been 18 and then gone to college afterwards, but it seemed to deliver mixed results.

The above math comes up again and again when writing software. Pure math like real analysis, graduate-level algebra, number theory and topology, not so much. I'm guessing the rise of AI demonstrated that the previous mindset against math was poppycock all along, so now we've gone to the opposite extreme and made it a religion? I think people are bringing their own biases into that interpretation, and you are right about which biases.

But, also, not my problem. My personal bias is that I recognize this problem as exactly the sort of thing that gets brought up as a gotcha interview question by people who couldn't come up with that exact solution themselves yet understand it just enough to use it to make an interviewee squirm.


Exactly! Write the best, simplest, acceptable working code you can, and then put a comment in there that if someone has an issue with it, they are welcome to improve it...

if (.1 + .2 != .3): reason = "we can't have nice things."


I'm not sure why the square root matters here; it seems like a more general problem of storing an unbounded array of arbitrary-precision digits. Wouldn't you run into the same problem doing simple addition on an array that includes pi to infinite precision?



