
The original "signs of life discovered on Venus" announcement was discussed a couple months ago here: https://news.ycombinator.com/item?id=24463423

There were numerous comments about the promoter's non-scientific background, the sensationalism of the original announcement, and skepticism over the findings. Note that the scientists themselves weren't sensationalizing their findings, but others were.

As this article points out: it was a "big, if true" story that had numerous red flags from the start.




Interestingly, in the old comment thread, and the other ones I read at the time, there was lots of discussion about how the phosphine might have originated non-biologically, but no one suggested a problem with the data analysis. Now it looks like that is what happened.

It also means just about no one read the real paper...a twelfth-order polynomial would certainly be a red flag for me.

Interesting moral here for interpreting science: surprising results don't usually have interesting explanations. It's usually a bug.


It might have been read by people without the proper background. Even though I can grok the math, upon reading that I would just have gone, "oh, interesting, so that's what they resort to in this field".


Lots of people said that the data analysis might fall apart upon looking closely, as has happened many times before.


The 12th-degree polynomial is a red flag as big as the Soviet Union. No one made a comment about that.

I made a comment in that thread, but I didn't notice the 12th-degree polynomial.

Perhaps I'm too optimistic, but I expected someone to read the paper and make a comment about that on HN.


Can you explain for someone not super math savvy?


Polynomial fitting can be used to generate curves which fit any dataset perfectly given enough degrees. Like if you give me a data set with 150 points in it that are apparently randomly distributed throughout the sample space, I can give you a nice high-order polynomial that perfectly passes through all 150 data points.

Making any claims based on that kind of curve fitting is a huge red flag, especially if you don't discuss that.

One of the reasons we would prefer lower-order fits to higher-order fits is that higher-order fits carry a very real risk of overfitting: they hug the data in the interior of the data set but give completely inaccurate results anywhere other than at the data points themselves. Seeing that kind of overfitting in a scientific paper without any justification suggests that the author of the paper is making statements without any basis in science.
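To make that concrete, here is a minimal numpy sketch (the points and the degree are made up purely for illustration): a degree-12 polynomial has 13 coefficients, so it can pass exactly through 13 points of pure noise, and it goes haywire as soon as you step outside them.

    import numpy as np

    rng = np.random.default_rng(0)

    # 13 points of pure noise -- there is no underlying pattern to find.
    x = np.linspace(-1, 1, 13)
    y = rng.normal(size=13)

    # A degree-12 polynomial has 13 coefficients, so it can pass through
    # all 13 points exactly.
    coeffs = np.polyfit(x, y, deg=12)

    print(np.max(np.abs(np.polyval(coeffs, x) - y)))  # ~0: a "perfect" fit to noise
    print(np.polyval(coeffs, 1.05))  # just outside the data, it typically blows up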


Would saying it's like getting too many returns with a greedy regex be a close enough example non-mathy coders could grok?


It's more like someone is asking you to create a regex that matches phone numbers and they give you an example number for you to work with. You have the bright idea of writing the example number verbatim as your regex and voilà. It matches the sample perfectly, job done. In reality that regex can't be applied to anything else.


No.

It's like saying that a rule with as many special cases as data points is not actually a general rule. It is just a collection of special cases that explains nothing.

More precisely, if you give a model enough parameters, you can always have it describe your data well. The question is whether it is likely to predict future data. And the answer is no.
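As a rough sketch of the "predict future data" part (the numbers here are made up): fit half of a noisy sample and check the other half. A 13-parameter fit tends to look better on the points it was fitted to and worse on the ones it wasn't.

    import numpy as np

    rng = np.random.default_rng(7)

    # Noisy samples of a simple linear trend.
    x = np.sort(rng.uniform(-1, 1, 40))
    y = x + rng.normal(scale=0.3, size=x.size)

    xtr, xte = x[::2], x[1::2]   # fit on every other point,
    ytr, yte = y[::2], y[1::2]   # hold out the rest

    for deg in (1, 12):
        c = np.polyfit(xtr, ytr, deg=deg)
        print(deg,
              np.std(ytr - np.polyval(c, xtr)),   # error on the fitted points
              np.std(yte - np.polyval(c, xte)))   # error on the held-out points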


It's more like, high-order polynomials are fundamentally wild animals. You can use a math trick to make them go through specified points, and to early statisticians & data analysts, this seemed like a good way to model a nonlinear trend based on a set of sample points. But that trick doesn't make polynomials tame. And it gets worse the more points you need to interpolate - you have to add a degree to the polynomial every time you want to go through another point. Each new degree makes the polynomial wilder and wilder outside of the points you're interpolating.

We later discovered that the tame functions that usually work well for extrapolating from a sample are things called splines.
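As a rough illustration of the difference (made-up sample points, with scipy's CubicSpline standing in for "a spline"):

    import numpy as np
    from scipy.interpolate import CubicSpline

    rng = np.random.default_rng(1)

    # 13 noisy samples of a gentle trend.
    x = np.linspace(-1, 1, 13)
    y = np.sin(3 * x) + rng.normal(scale=0.1, size=x.size)

    poly_coeffs = np.polyfit(x, y, deg=12)  # degree-12 interpolating polynomial
    spline = CubicSpline(x, y)              # piecewise cubic through the same points

    # Step just past the ends of the sample.
    for x0 in (-1.1, 1.1):
        print(x0, np.polyval(poly_coeffs, x0), spline(x0))

    # The polynomial typically lands far outside the data's ~[-1, 1] range;
    # the spline's extension stays on roughly the same scale as the data.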


I don't think so. You can easily get too many returns without any overfitting at all.

If you want to make it simple, think of using a 12th order curve as being similar to picking whatever 12 points you want to match and then drawing lines from point to point.


I'm not the person you asked, but in general with polynomials, if you need to go much past the second or third order to fit your data, you should probably be fitting with a different set of functions. Take a look at Runge's Phenomenon for an example where higher order polynomials really aren't making the approximation better in a useful way:

https://en.wikipedia.org/wiki/Runge%27s_phenomenon
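You can see it directly with the textbook Runge function 1/(1 + 25x^2); the node counts below are arbitrary:

    import numpy as np

    def runge(x):
        return 1.0 / (1.0 + 25.0 * x**2)

    xs = np.linspace(-1, 1, 1001)  # dense grid to measure the error on

    for n in (5, 9, 13, 17):       # interpolate through n equally spaced points
        x = np.linspace(-1, 1, n)
        coeffs = np.polyfit(x, runge(x), deg=n - 1)
        err = np.max(np.abs(np.polyval(coeffs, xs) - runge(xs)))
        print(f"degree {n - 1}: max error {err:.2f}")  # grows as the degree goes up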


Draw whatever crazy line you want on a whiteboard, and I can find a 12-degree polynomial to transform it into the desired shape for whatever it is I want to prove. It proves nothing because it can prove anything.

EDIT: Also see the last panel of this XKCD: https://xkcd.com/2048/


The idea is that you want to split your data as a sum of three parts:

(a) a smooth function that is not interesting

(b) noise

(c) a big unexpected peak in an interesting part

There is a nice preprint posted a few days ago: https://arxiv.org/abs/2010.09761

They reanalyze the same data as in the phosphine paper.

---

If you look at figure 3, the data is the skyscraper-like line, and they use a polynomial of degree 3 to approximate the signal; that is the curved, unhappy-looking smooth line.

The smooth line is (a). When you subtract this smooth function, you get the other part of figure 3, which is the noise (b). There are some high and low parts, but nothing so high or low that it looks special. So their conclusion is that there is no interesting part (c).

---

If you look at figure 2, top left, they use a slightly smaller interval, but now they use a polynomial of degree 12 instead of a polynomial of degree 3. This is the smooth function (a).

This is a reconstruction of the process in the original paper. They fit the polynomial using the data, but excluding the central part.

The problem with the polynomial of degree 12 is that it has too much freedom, so it fits the actual smooth curve, but it also fits the noise.

The polynomial of degree 3 has to go roughly through the middle of the data, because it can't go up and down too many times. The polynomial of degree 12 can follow the local bumps and fit the noise.

When you subtract the polynomial of degree 12, you get the graph in the third row, with the noise that is (b). It is copied in Figure 2, and it is very similar to the graph in the original paper.

Since the polynomial of degree 12 fits the noise, the residuals are too small, so you underestimate the noise level.

And since the central part was skipped in the fit, in some cases you get a big bump like here. It is bigger than the apparent level of noise, so it looks like an unexpected peak (c).

The problem is that you are comparing the peak with the surrounding noise level, and that noise level is underestimated because the polynomial of degree 12 overfits.

---

They repeat the same kind of analysis in other regions, and they get a few additional fake peaks. These are the other 5 graphs at the top of Figure 3.
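If it helps, here is a toy version of that procedure with purely synthetic numbers (a made-up smooth baseline plus noise, and no real absorption line at all); it only shows the mechanism, it is not the actual spectra:

    import numpy as np

    rng = np.random.default_rng(42)

    # Synthetic "spectrum": smooth baseline + noise, with no real line anywhere.
    x = np.linspace(-1, 1, 200)
    data = 0.5 * x**2 - 0.2 * x + rng.normal(scale=1.0, size=x.size)

    # Exclude the central window (where the line would be) from the fit,
    # as in the reconstruction of the original analysis.
    outside = np.abs(x) > 0.1

    for deg in (3, 12):
        coeffs = np.polyfit(x[outside], data[outside], deg=deg)
        residual = data - np.polyval(coeffs, x)
        noise_level = residual[outside].std()          # apparent noise level (b)
        central_peak = np.abs(residual[~outside]).max()
        print(deg, noise_level, central_peak / noise_level)

    # The degree-12 fit soaks up part of the noise, so the apparent noise level
    # shrinks, and any bump the polynomial leaves in the excluded window can then
    # look significant relative to that (underestimated) noise.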


It means that you're not just adding up some simple curves, you're taking your variable to extremely high powers, in this case all the way up to x^12. The more orders/powers you add to a fit equation, the more artificially close it will get inside your data window (as you sledgehammer it into a nearly arbitrary shape), and the more readily it will shoot off to infinity the moment it gets out of your data window, because those extreme powers of your variable are all fighting each other and have no real connection to the underlying data. What physical process is causing x^10 and x^11 and x^12 curves, or even x^5 and x^6?

See the last example here: https://xkcd.com/2048/


> or even x^5 and x^6?

The highest I know of is Lighthill's (aptly named) eighth power law, which says that the sound power created by a turbulent flow scales with the eighth power of the characteristic turbulent velocity: https://en.wikipedia.org/wiki/Lighthill%27s_eighth_power_law. Does anyone know of something higher?


Another is a simplified model of the interatomic force, the Lennard-Jones potential (https://en.wikipedia.org/wiki/Lennard-Jones_potential), which is F = a/d^12 - b/d^6

The difference is that in my example and in your example there are only a few coefficients to tweak to fit the data. So the shape of the curve is fixed and it is very difficult to overfit the data.

In the paper they used a full polynomial of degree 12, which has 13 coefficients to tweak, and it is very easy to get weird shapes.
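To make the contrast concrete, here is a hypothetical sketch (the data is made up, drawn from the Lennard-Jones form itself plus noise): the two-coefficient model leaves residuals at about the true noise level, while a 13-coefficient polynomial tends to report smaller residuals because it is partly fitting the noise.

    import numpy as np
    from numpy.polynomial import Polynomial
    from scipy.optimize import curve_fit

    def lj(d, a, b):
        # Lennard-Jones-style force law: only two free coefficients.
        return a / d**12 - b / d**6

    rng = np.random.default_rng(3)
    d = np.linspace(1.0, 2.0, 20)
    data = lj(d, 1.0, 1.0) + rng.normal(scale=0.01, size=d.size)

    (a, b), _ = curve_fit(lj, d, data, p0=[1.0, 1.0])
    lj_resid = data - lj(d, a, b)

    poly = Polynomial.fit(d, data, deg=12)   # 13 free coefficients
    poly_resid = data - poly(d)

    # The 2-parameter residual scatter sits near the true noise (0.01); the
    # 13-parameter polynomial's residuals tend to come out smaller.
    print(lj_resid.std(), poly_resid.std())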


Nice example! But yes, I agree. My field is certainly not astronomy but... that method to remove noise seems extremely weird. XKCD-level joke weird. Even in the rebuttal arxiv paper cited here where they use a 3rd degree poly to remove the noise it seems... not to fit very well? Seems strange to use a random fit without any guesstimate of the underlying cause/model of the noise.


Sorry for the very late response...

> they use a 3rd degree poly to remove the noise it seems... not to fit very well

They are not trying to fit the noise; they are trying to fit the smooth signal that is hidden by the noise. In some cases it is difficult to make a formula for the real signal, so you approximate it locally with a polynomial.

The idea is that after you subtract the smooth part, you get only the noise. So you can calculate the expected noise level.

And if the "noise" has a big peak, you can guess there is something strange, like a big absorption line.

See also my other comment: https://news.ycombinator.com/item?id=24985680


Not only are the other replies correct, in that you can 'prove' anything with enough degrees in your poly fit, but any attempt to actually use a 12th-degree fit will probably suffer from math precision errors.

Often it's better to use a spline fit or other interpolation technique once you find yourself needing to go beyond fifth- or sixth-degree polynomial fits.
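One way to see the precision problem (a small sketch; the grid here is arbitrary) is to look at the condition number of the Vandermonde design matrix that a naive monomial-basis fit has to solve:

    import numpy as np

    x = np.linspace(0.0, 10.0, 50)

    # Design matrices for a 3rd-degree and a 12th-degree monomial fit.
    for deg in (3, 12):
        cond = np.linalg.cond(np.vander(x, deg + 1))
        print(deg, cond)

    # The 12th-degree matrix is vastly worse conditioned, which is where the
    # precision loss comes from (and why libraries rescale the data or switch
    # to orthogonal bases such as Chebyshev).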


For an idea of scale: 12th order is enough to fit every transcendental function in the C math library well past the point where a human could tell the difference without doing deeper analysis. Most of them will be off by less than one ulp.


Don't have the funeral yet.


> ... scientists themselves weren't sensationalizing their findings ...

Yeah right.

They were saying things like "please prove us wrong" while doing interview rounds, and (I'm sure) celebrating, loving the limelight, and looking forward to more and more of it, so strictly speaking they were not "sensationalizing their findings".

In a related story, OJ and Casey Anthony did not kill anyone.


If you read the paper, they are very, very careful with their claims. Also, "please prove us wrong" is the default position in science; that's what peer review is all about. I don't see a problem with that.



Somewhat like the scientists you are criticizing, you ascribe a lot of meaning to little data. Support for your claim that they were celebrating and reveling in fame is very thin. I would like to see more evidence before we criticize not just their work but also them as people.


[flagged]


Your sarcasm and hostility are disproportionate to this discussion. I wish you well with whatever you are going through.



