Why Self-Track? The Possibility of Hard-to-Explain Change

doktrin · on Aug 24, 2012

I have to admit I cringed at some of the examples provided, as well as the associated causal evidence.

1. Eating Fruit for Breakfast => Reduced Sleep Quality [in fact, "...I figured out that any breakfast, if eaten early, disturbed my sleep".

2. Removal of Mercury Fillings => Improved Brain Function. The evidence was surprisingly lacking :

"The evidence for causality — removal of mercury amalgam fillings improved my arithmetic score — rests on three things: 1. Four other explanations made incorrect predictions. 2. The improvement, which lasted months, started within a few days of the removal. Long-term improvements (not due to practice) are rare — this is the only one I’ve noticed. 3. Mercury is known to harm neural function (“mad as a hatter”). As far as I’m concerned, that’s plenty."

I am all for self-observation, but I can't help but think that some of the causal links the OP has established may not be particularly well founded.

beagle3 · on Aug 25, 2012

If this was the only writeup Roberts ever produced, the cringing would be well deserved. However, he does have a long history of correctly quantifying changes in himself and identifying their source.

Specifically, his (on its face, ridiculous) shangri-la diet works well for 80% of the people -- probably better than any other diet (Although it should be noted that about 20% of people who try report it does not work at all).

Also, his flax seed, butter and vitamin D results have been replicated consistently by many (though again, not all). You can wait 15 years until these things get verified/shown wrong, or just try them yourself.

And about "well founded" - it's easy to dismiss some guy on the internet (regardless of his track record and academic qualifications, as this particular writeup was not done in the context of these qualifications). But you should be aware of your confirmation bias - e.g., if you ever considered any of the examples I give here http://news.ycombinator.com/item?id=4427910 well founded.

SwellJoe · on Aug 24, 2012

But, but, we always rub the bald guy's head before the game, and we totally play better!

I wrote it down, so it is now science.

rubidium · on Aug 24, 2012

"Professional scientists almost never use this method."

That's because of confirmation bias. All of his examples seem to be:

1) Experience a 1 to 2-sigma variation in normal performance.

2) Look for something different in what you did that day.

3) Keep doing that, and voila, a minor improvement.

You'll find yourself doing some pretty goofy things if you follow his suggestion too regularly.

Edit: misspelling/layout.

npsimons · on Aug 24, 2012

You'll find yourself doing some pretty goofy things if you follow his suggestion to regularly.

Isn't that what science is? It may be biased, and it's very hard to isolate variables, but by the very nature of self-experimentation it's very difficult to do double blind tests. At least some people are trying to live the examined life and look for causes, rather than blaming it on genetics or luck.

kennywinker · on Aug 24, 2012

He gets to the "hypothesis forming" stage, and draws a conclusion. That's fine if you're trying to figure out how to live your life best. The negative consequences of removing his mercury fillings are probably null, he can accept his hypothesis as true and move on. But this is bad science, without question. We don't know that his mental function has improved because of removing fillings... it's a potential hypothesis, sure, but to say "Mercury Damage Revealed by Brain Test" in a headline seems a bit bold for a sample size of n with no control and a bunch of other confounding variables at play.

npsimons · on Aug 24, 2012

Yes, he does seem to pull out the "jump to conclusions" mat a bit much. That being said, I can't help but think that this would only justify more data. Log more variables; have someone review your methodology; etc. I've always wondered, how do you test your own intelligence? If you're writing and scoring the test, doesn't that invalidate scores you get? Ditto for taking the same test more than once. Maybe I'm just not smart or creative enough to think of ways around this (is that a test? :)

beagle3 · on Aug 25, 2012

> I've always wondered, how do you test your own intelligence?

He is not measuring intelligence; He's measuring reaction times (how long he takes to solve a very simple arithmetic problem; or how long he takes to see which letter in a 4 letter string appears twice) and fine balance/motor function (how long he can keep his balance on an unstable platform he has built)

He went over these things in his blog in great detail over the years. He keeps trying new tests, keeps using those that are statistically stable in a "stable" environment (without changing nutrition / sleep / etc, and after a learning period) - and after he has a statistically significant baseline, he measures how e.g. eating more butter or flax oil changes the results of these tests.

Yes, this is drawing conclusions for n=1, but for himself, these conclusions are very scientific - and us humans are not unique enough for them to have wider applicability.

No, you cannot deduce from n=1 to the entire human race. Yes, his conclusions are plausible and most of them are verified by others. Read e.g. gwern.net on vitamin D.

drostie · on Aug 25, 2012

I'm sorry, but science is neither "we measure the things" nor is it "we look for the cause." Yes, those are things which scientists do, but no, that is not the essence of science.

There is no attempt being made to come to any wholesome understanding. That is, a scientist is trying to model nature, and this consists in two parts: (a) develop a model, (b) test it against nature. Seth Roberts has perhaps hit upon a way to develop models, but he doesn't seem to test them. That is, cum hoc, ergo propter hoc -- "with this, therefore because of this" -- is not actually a test for causation, but merely a guess for causation.

To do a proper test, you need variable isolation, and the stability of test results is no guarantee that it is an isolated variable. A good example of an isolated variable is a simple light switch. A terrible example of an isolated variable would be your time of waking up, because that time cannot be isolated from your own thoughts and beliefs -- that is, many people, especially if they're not sleep-deprived, can wake up earlier simply by telling themselves "I'm going to wake up early tomorrow." (I myself usually wake up before my alarm clock goes off.)

Humans in general are not light switches. As has been documented repeatedly in medicine, dual effects of placebo and hypochondria plague us; things we expect to be medicinal often placate us even when they have no medicinal content and you can suddenly feel the symptoms of a malady soon after reading about it on Wikipedia.

So even if you want to say "for himself, these conclusions are very scientific," you're going to have to account for the problem that he is probably going to confirm whatever he already expects. That is, any "follow up" tests after the first guess are already tainted by the fact that Seth knows what's being tested.

beagle3 · on Aug 25, 2012

> There is no attempt being made to come to any wholesome understanding

Do you actually follow Roberts, or is your understand based on this one short summary? Because, e.g. "What Makes Food Fattening" (77-page pdf here http://media.sethroberts.net/about/whatmakesfoodfattening.pd... ) is very much an attempt to come to wholesome understanding, develop a model, test it against nature. (The test is "putting it out in the wild", and the result is "works beautifully for 80% of people who try, not at all for the remaining 20%". He's not funded to test this, nor is he making any money of it - I don't think there's a better route for him to take).

> To do a proper test, you need variable isolation, and the stability of test results is no guarantee that it is an isolated variable

Yep. Except, in the real world, NO published result related to nutrition, and almost no published result regarding medicine, is a proper test with isolated variables. Including almost every guideline your doctor works by.

> that is, many people, especially if they're not sleep-deprived, can wake up earlier simply by telling themselves "I'm going to wake up early tomorrow." (I myself usually wake up before my alarm clock goes off.)

That is true. And yet, a lot of people want diets to work, and they don't. A lot of insomniacs want a placebo to work, and it doesn't. Placebo is powerful, but is not all-powerful. It is stronger with some people, weaker with others.

Discarding results just because they were not the result of a double-blind placebo-controlled test is not rational. Neither is accepting results just because the author thinks that they are double-blind placebo-controlled:

e.g., almost all placebos today are sugar pills; If the tested-against-material has a side effect, such as causing flushing or a dry tongue (though does not produce the wanted outcome -- which is very often the case), this is not in face a double-blind; the patient knows that they did not get a placebo, and the whole test is useless. Yet, this is the gold standard.

Furthermore, if you read the NEJM / BMJ / Nature / Science medical papers, you'd notice that their result are tested on (and are therefore only valid for) a small ethnic group, a small age group, etc. But then, very unscientifically, it is assumed (by others -- usually not by the authors of said paper) to apply much more widely.

> That is, any "follow up" tests after the first guess are already tainted by the fact that Seth knows what's being tested.

Yes, that does not make the results any less true or useful: Butter makes seth faster at arithmetics, in a consistent and statistically significant way. That may not be true for others. And may be entirely a placebo. Nevertheless, if Seth wants to be faster at arithmetics, he can do that by eating butter, regardless of the cause. Is that not science?

drostie · on Aug 26, 2012

(1) Indeed, I am commenting on this particular mode of understanding, where one keeps a journal, waits for a deviation of a test from normal, and then tries to correlate the journal with the deviations after the fact.

(2) "NO published result related to nutrition" is overbroad. Obesity for example is clearly related to nutrition and there are many published results where many variables which could affect obesity are quite well-isolated -- twin studies to isolate genetics, studies of how obesity rates vary based on physical location in various cities, and so forth.

(3) I aim to "discard the results" only insofar as they claim that they have measured an agentic relationship, which these results have indeed claimed to measure. Seth's official conclusion is that this was "data that suggested butter improved my mental function." If the variable is not isolated, then attributing the agency to butter is worthless. It might have been an otherwise random variation at around the same time as he switched to butter -- the data he's describing has a 40ms variance as well as certain long-term patterns and the effect that he's describing is a 30ms improvement, so it's quite possible that in an autoregressive model you would just see one substantial "step down" of 80ms which doesn't get compensated due to hysteresis. Or perhaps the correlation is indeed correct but wrongly attributed -- perhaps Seth has not noticed but he tends to use butter to fry peppers and pork fat to fry meats, and now that he is using butter his diet contains 2% more vegetables. Perhaps now that he uses more butter, he eats more toast and gets 2% more fiber in his diet, without having written that down in his journal.

The problem is precisely the word "makes" in your sentence "butter makes Seth faster at arithmetic." You have absolutely no evidence that it's the butter which is making Seth faster at it. And that is why the conclusion "butter makes Seth faster at arithmetic" is not a scientific conclusion.

beagle3 · on Aug 26, 2012

> Obesity for example is clearly related to nutrition and there are many published results where many variables which could affect obesity are quite well-isolated -- twin studies to isolate genetics, studies of how obesity rates vary based on physical location in various cities, and so forth.

Care to show me, for example, a single nutrition study where "skin color of participant" was a variable controlled for? It is known that cholesterol metabolism (eventually into) Vitamin D is greatly dependent on skin color -- and yet, it does not appear in a single nutrition study, despite the greatly known significance of both serum cholesterol and serum vitamin D.

I'm taking this ad absurdum in an attempt to show you that if you apply this standard rigorously (as one should), then no result really holds up.

And by the way, "obesity is clearly related to nutrition" is only true in the sense that "everything is related to nutrition". There are no results of things that affect obesity that are "quite well isolated" in humans that I'm aware of.

> Or perhaps the correlation is indeed correct but wrongly attributed -- perhaps Seth has not noticed but he tends to use butter to fry peppers and pork fat to fry meats, and now that he is using butter his diet contains 2% more vegetables ...

If you actually followed him, you'd know that's not the case. He is very diligent about isolating variables as much as possible, and thoroughly documenting his day, including food intake, how much TV he watched, etc -- and the butter hypothesis was actually an attempt to strengthen a hypothesis having to deal with animal derived fats (which among other things, included bacon). It wasn't just observational - it was part of a variable isolation experiment, that (within limits) was comparably rigorous to any other non-blinded test (which are far more common than you'd think; in fact, many if not most supposedly double blinded tests aren't).

> You have absolutely no evidence that it's the butter which is making Seth faster at it.

He has much more evidence for than you'd expect from this short posting, if you care to look at it. This conclusion is indeed supported by data, and is scientifically valid.

drostie · on Aug 27, 2012

> Care to show me, for example, a single nutrition study where "skin color of participant" was a variable controlled for?

http://www.ncbi.nlm.nih.gov/pubmed/19656435 .

> I'm taking this ad absurdum in an attempt to show you that if you apply this standard rigorously (as one should), then no result really holds up.

Then you should be willing to do what happens when a reductio ad absurdum fails: give up and admit you are wrong.

> And by the way, "obesity is clearly related to nutrition" is only true in the sense that "everything is related to nutrition".

Uh, no. Obesity is a nutritional disorder, as distinct from other things like having cats, which are neither caused nor hampered by good nutrition. What the hell are you smoking?

> He has much more evidence for than you'd expect from this short posting, if you care to look at it. This conclusion is indeed supported by data, and is scientifically valid.

I searched his website and all I could find was one particular crappy-looking graph with a bunch of discussion about his responses to vague questions, and an ad-hoc explanation (a competing omega-3 deficiency) when the data did not fit his expected pattern.

beagle3 · on Aug 27, 2012

> http://www.ncbi.nlm.nih.gov/pubmed/19656435

Thanks. For some reason, I was unable to find this when I last looked (might have been before this came out).

> Uh, no. Obesity is a nutritional disorder, as distinct from other things like having cats, which are neither caused nor hampered by good nutrition. What the hell are you smoking?

> http://www.medicalnewstoday.com/articles/249289.php

Are antibiotics with no nutritional value considered nutrition these days? There are hundreds of substances that seem to cause obesity independent of nutrition (that is, when isolated as a variable compared to nutrition), most prominently psychiatric drugs, but -- as is shown by this study -- also antibiotics, which are supposed to be "nutritionally inert".

Also, this was not shown in Humans and might not apply, but http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjo... shows that obesity and type-2 like diabetes can result from gut flora. Are gut flora considered nutrition these days?

> I searched his website and all I could find was one particular crappy-looking graph with a bunch of discussion about his responses to vague questions, and an ad-hoc explanation (a competing omega-3 deficiency) when the data did not fit his expected pattern.

He did not yet bother summarizing these results (the way he did for his weight loss theories, in http://media.sethroberts.net/about/whatmakesfoodfattening.pd... - but there's a lot more discussion of it through the years) , I suspect that it is because he does not believe he nailed anything yet. However, he asks people to "try at home", and several of his results (mostly regarding dietary fat, but also about mood and sleep) have been confirmed by other people as well.

He's under no "publish or perish" stress like most papers you see, so he isn't trying to get anything production quality. I've been following him for 7 years now, and while it isn't blinded or double blinded, it's otherwise as good as (and usually better) than observational studies I find in top rated journals.

sinak · on Aug 25, 2012

Check out http://quantified-mind.com for accurate, repeatable cognitive testing. The UI isn't great, but I'm working on that at the moment.

nyan_sandwich · on Aug 25, 2012

It looks like I have to sign up to play. Trivial inconvenience, I know, but now I'm bored and doing other things. Just FYI.

sinak · on Aug 25, 2012

Good to know - will work on that

dminor · on Aug 24, 2012

I'm guessing he has a program generating the test using random numbers.

dbaupp · on Aug 24, 2012

it's very difficult to do double blind tests

Depending on the nature of the experiment, it's not that hard, e.g. http://www.gwern.net/Nootropics#adderall-blind-testing

npsimons · on Aug 24, 2012

Depending on the nature of the experiment, it's not that hard, e.g. http://www.gwern.net/Nootropics#adderall-blind-testing

:) I did almost type "impossible" instead of "very difficult". I'm one of those people who believes nothing is impossible, not even skiing through a revolving door (with short enough skis . . . ).

freshhawk · on Aug 24, 2012

This is what the first part of science is. He's getting as far as one piece of data that might suggest a hypothesis worth testing, although you'd probably want more than one piece of preliminary data or you'd be hunting shadows all the time.

He then stops before getting to that whole "controlled experiment" part that really makes up the central part of doing science.

beagle3 · on Aug 25, 2012

> "controlled experiment" part that really makes up the central part of doing science.

In theory, that's how it is supposed to work. But in practice, it doesn't. See e.g. John Ionaddis work showing that a significant part of published research (including that "controlled experiment" setting you are thinking about) is wrong; or the recent nature paper about failing to reproduce 47 out of 53 celebrated ("controlled experiment") results from the last 10 years, and what some more research about those results came up with.

Roberts is not doing conventional research. But what he does do is (a) cheap, (b) safe, (c) yields results quickly, and (d) individual testable. He also publishes negative results, and admits mistakes.

I think it is actually more useful than 90% of "proper science" done out there. He is not under a "publish or perish" whereas most scientists in the world are, and it shows.

freshhawk · on Aug 26, 2012

Roberts is collected one long detailed case study. Sample size of 1, no controls, no blinding and tiny experiments.

Ghost hunters and anti-vaxxers have better science behind their pet theories.

We expect this type of endeavour to be extraordinarily prone to influence from noise. The entire modern process of science has been built to control for external factors that Roberts is ignoring or outright embracing.

And no one should be surprised when he finds links between brain function and dental amalgam, which is both preposterous and specific to an existing hypothesis common among other pseudosciences. Isn't this just more evidence that the unblinded nature of Roberts anomaly hunting is being skewed by bias?

That doesn't mean that some of the things he has found aren't true but 100+ years of trial and error and deep thought on this single issue by an enormous number of intelligent people taking the most meticulous records humanity has ever produced all around the world have determined that his methods will produce more false correlations than true ones.

The failure to produce previous controlled experiments is alarming. I'm just baffled as to why you think that removing controls and oversight and experimenter blinding and lowering the sample size, TO ONE, as well as taking a single data point for some "experiments" would in any way make it better?

beagle3 · on Aug 27, 2012

> Roberts is collected one long detailed case study. Sample size of 1, no controls, no blinding and tiny experiments.

You might want to actually read what Roberts collects, rather than what you think he collects, because it is apparent that you extrapolate from this post. It's a good way to stroke your ego, but not a good way to have a discussion.

> Isn't this just more evidence that the unblinded nature of Roberts anomaly hunting is being skewed by bias?

No, it isn't, and if you read what he publishes rather than have a knee-jerk reaction to one short summary, you might be able to appreciate that.

> have determined that his methods will produce more false correlations than true ones.

That's actually not true, as far as I know, and I would appreciate if you specified exactly what you are referring to. (saying "everyone knows" is not a good enough answer)

What I do know is that this meticulous record keeping produces bogus results at an alarming rate because of the things it does not consider, like "publish or perish" pressure and the "negative reuslts go in the drawer" effect. This xkcd describes the mechanism http://xkcd.com/882/ , and these extremely well written, peer-reviewed papers, show that most published results (which are golden according to the school of thought you seem to belong to) is actually wrong: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/ http://www.reuters.com/article/2012/03/28/us-science-cancer-... (points to a paywalled nature article). In case you are unable to get a copy of the nature paper: 88% of celebrated cancer results published in the last 10 years CAN NOT BE REPLICATED (that is, are most probably WRONG), despite following the "golden rules of science".

> The failure to produce previous controlled experiments is alarming.

What exactly are you referring to?

> I'm just baffled as to why you think that removing controls and oversight and experimenter blinding and lowering the sample size, TO ONE, as well as taking a single data point for some "experiments" would in any way make it better?

Because I actually live in the real world and not in some fantasy world. I don't think removing controls and oversight is a good idea. However, Roberts has no funding for these experiments because there's no way to make money off them. So he does what he finds interesting, in the best way he can, and lets others try to replicate -- and then collects their results as well. He has stumbled on a diet that seems to be more effective than any other out there (in real world terms, that of keeping weight off years later). It only works for 80% of the people, but for those it works wonders. If you can describe a plausible placebo mechanism for this result, please do - I would love to hear one (and why it seems to skip every other popular diet out there)

I do believe that a cheap, safe, easy to do experiment that is not blinded is infinitely more useful than an expensive one that will never be carried out.

The first is not perfect, but yields results that are easy to later work with, and easy to verify whether they are right or wrong by doing the same test again or other test that would pinpoint the effect. It is a weak result, but it is a result of some kind.

The second yields zero information, since it will never be done.

> lowering the sample size, TO ONE, as well as taking a single data point for some "experiments" would in any way make it better?

You don't have to agree with what Roberts is doing. But you will have to excuse me for pointing out that making comments based on what you infer he is doing or claiming from a ten line summary makes you look like an idiot, even if it makes you feel superior.

freshhawk · on Aug 27, 2012

> That's actually not true, as far as I know, and I would appreciate if you specified exactly what you are referring to. (saying "everyone knows" is not a good enough answer)

Let's take one, specific, example: Blinding. So you think that science added the concept of blinding and double blinded experiments because it didn't result in an improvement in results and a decrease in bias? And a sample size of more than 1? Are you saying there is no evidence that a sample size larger than one leads to better results?

What about my statement that having, rather than not having, basic scientific controls in place results in more trustworthy evidence is "actually not true"

> What I do know is that this meticulous record keeping produces bogus results at an alarming rate because of the things it does not consider, like "publish or perish" pressure and the "negative reuslts go in the drawer" effect. This xkcd describes the mechanism http://xkcd.com/882/ , and these extremely well written, peer-reviewed papers, show that most published results (which are golden according to the school of thought you seem to belong to) is actually wrong: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/ http://www.reuters.com/article/2012/03/28/us-science-cancer-.... (points to a paywalled nature article). In case you are unable to get a copy of the nature paper: 88% of celebrated cancer results published in the last 10 years CAN NOT BE REPLICATED (that is, are most probably WRONG), despite following the "golden rules of science".

> Because I actually live in the real world and not in some fantasy world. I don't think removing controls and oversight is a good idea. However, Roberts has no funding for these experiments because there's no way to make money off them. So he does what he finds interesting, in the best way he can, and lets others try to replicate -- and then collects their results as well.

Which side are you arguing? So science is hard and there are a ton of shitty results accepted as valid. But you seem to be suggesting that the solution to this is good intentions and a scrappy can-do attitude. If the well funded experiments with large sample sizes, double blinding and excellent controls still screw it up most of the time because of some bias or error in procedure then why would I expect removing all controls would result in something even worth looking at?

PS Just to clear something up, in science unreplicated experiements are weak evidence, most are expected to be wrong. replicated experiments are stronger. and theories with multiple lines of evidence from multiple experiments in multiple fields and have emerged as the consensus opinion among experts of how things work are strong. The way you pointed out that unreplicated papers are probably wrong seemed like you thought you had some kind of gotcha. Well the "golden rules of science" are pretty clear on the value to be placed on unreplicated papers in a field overrun by media attention.

beagle3 · on Aug 27, 2012

> Blinding. So you think that science added the concept of blinding and double blinded experiments because it didn't result in an improvement in results and a decrease in bias?

There's theory, and then there's practice. In 3 clinical experiments that I've followed closely, it was officially "double blinded" (and at least one was published as such), but because the placebo was sugar, and the substance or treatment under test was not inert (one caused dry mouth, one caused flushing, and in one laser treatment, people found the real thing painful, unlike the placebo light). So the patients knew about it for sure, and the experimenters had to be idiots to not know.

Do you think the published results eventually reported that? The answer is no. That would have killed FDA approval, and potentially cost hundreds of millions of dollars. Is that scientific fraud? I would say yes, but you'd be surprised at the answers you'd get if you ask the people involved (rather than the theoretical question).

Do you think this situation is uncommon? I don't. I might be biased here by my own experience, but I would assume that when hundreds of millions of $ are on the line, that science takes a back seat. FDA slap-on-wrist fines seem to agree with that.

> Are you saying there is no evidence that a sample size larger than one leads to better results?

No, I was not saying that. I was referring to the "hundreds of years, hundreds of smart people" statement which was vague and did not point to anything in particular.

> But you seem to be suggesting that the solution to this is good intentions and a scrappy can-do attitude. If the well funded experiments with large sample sizes, double blinding and excellent controls still screw it up most of the time because of some bias or error in procedure then why would I expect removing all controls would result in something even worth looking at?

Read what you just wrote. Science's "golden standard" is essentially unattainable: 88% of the most (assumed) useful results about cancer in the last 10 years cannot be replicated (therefore, scientifically wrong , or in your own words, shitty results accepted as valid). Read that again. And again, and let that sink. That's not an accident and that's not just because science is hard. That's a direct result of the way science is practiced these days (The xkcd is the super-short summary of one of the reasons, the Ioanaddis piece is the "scientific" version with many identified reasons).

Guess what - if you do not change how science is practiced, you're guaranteed to get more shitty results. In fact, you're guaranteed to get a higher percentage of them with time (I don't remember if this piece by Ioanaddis touches that, but he did in other places). People in science I've talked to expected a 50% "wrong result" ratio from the golden standard that's supposed to give you good faith. No one expected the 88% failure result, but no one thinks it is exaggerated.

Again, these conclusions (results done the same double blind, N>>1) are going to get shittier and shittier, unless you change something about how you do stuff.

> The way you pointed out that unreplicated papers are probably wrong seemed like you thought you had some kind of gotcha.

Question, if Seth did acceptable double-blind N=30 (assume sufficient) experiments, would have pointed out that "well, that's not scientific because it wasn't replicated?", if so - kudos. But you'd be hard pressed to accept any nutrition or medical result published in the last 30 years, because the vast majority does not comply with that standard. Do you have that response to any scientific news you hear?

If indeed you hadn't replied like that, then you are moving the goalposts. But I'll give you the extremely unlikely benefit that you would have.

> Well the "golden rules of science" are pretty clear on the value to be placed on unreplicated papers in a field overrun by media attention.

That is wishful thinking. For a few years, I read the medical literature and followed how things became standard practice (my SO at the time was a doctor, I'm an engineer). Results such as described by Ioannadis and the Amgen guy become accepted in medicine and nutrition without any attempt at replication. And then pulled back 10-20 years later when people are dying of e.g. Vioxx, or statins (the sh*t has to hit the fan on this one; wait another 10 years), or realize that most of the Prozac data was cooked to look much more significant than it was.

> Which side are you arguing?

I'm arguing medical/nutritional science as practiced today (especially its relationship with "publish or perish" academia and "a bad result will cost us upward of $400M" pharma) has painted itself into a corner. The incentive structure essentially guarantees that the percentage of wrong answers is going up. Replication is part of the "gold standard", but is rarely ever practiced. (The Amgen paper is the only modern systematic attempt at replication that I'm aware of; and of the few casual replication attempts that fail, most don't get published).

It's converging into an insane "what must do this, even though we don't have the facilities, but anything else is rotten, so we'll pretend it's ok and not bother replicate unless someone points out we're wrong".

What I'm arguing is not "science is hard" (it is!) or "science is easy" (it isn't). It is "we need to find a new way to do science". The golden "double blinded, n->inf, multiple replication" is the easiest to be convinced of, if done right, but definitely not the only way to do science, and definitely not often done right (and with the existing incentive structure, will be done less and less so).

I am arguing that we should find other ways to make progress, and I think Roberts is on the right track. Following him for the last 7 years, I know that he produced more useful results (safe, easily testable, showing improvement for a large percentage of people who attempt to follow) than most hundreds-of-million-dollars scientifically-golden experiments. And he did it in his free time, and with no funding.

The way science is done needs a disruption. Double blind n->inf multiple independent confirmation is indeed the easiest way to be convinced of a result, but it is NOT the only way, and it is becoming increasingly rare/impossible to carry out.

What I'm arguing is that the gold standard is actually not achieved; You don't disagree, I think. What we disagree about is how meaningful the results that are coming out are compared to the effort/cost. I believe that the existing tradeoff is not acceptable; you seem to believe there is no tradeoff as there is only one way to do science (it's just that the ideal is not achieved).

What I'm arguing is that Roberts is giving an example about how the way science is done could be disrupted. It's immature, it's not easy to replicate (I'm not as placebo-resistant as Seth is, for example). But it has applicable results, and is cheap and effective.

npsimons · on Aug 24, 2012

Most serious athletes already know about this (in fact, on that same site: http://quantifiedself.com/2011/03/personal-data-visualizatio...). Keeping a record of your resting heart rate (plus blood pressure, temperature, etc) every morning will tell you many things. For example, I managed to get my resting heart rate down to mid 40s when I was training regularly (and heavily). Now that I'm more sedentary, my resting HR is in the mid 50s to low 60s, and I definitely don't feel as good or make as good times. Spikes while in training were also usually indicative of overtraining. It's also mentioned in "The Hacker's Diet" that if you weigh every day, but pay attention to the moving average it can be much more encouraging.

asher_ · on Aug 25, 2012

I love the QS movement, but I sometimes worry for it. It is young, and there is a wide variety of methodologies being used to draw conclusions, some are solid, others are far from it.

This can be expected from a group of people that have no requirements for joining, and this is a strength as well as a weakness. I have noticed that QS has attracted some "alternative [to] medicine" types who seem to be happy to have the validation, as well as some seriously interesting projects from hard core science types.

I am hoping that as this movement evolves, it will mature in the right way. Having amateurs participating is incredibly worthwhile, but guidance and best practices will hopefully develop in the community so that when people want to draw conclusions about causality, they can do it legitimately.

freshhawk · on Aug 26, 2012

I feel the same as you, but the movement is evolving in the alt-med/pseudoscience direction really quickly. Amateurs are collecting tons of cool data (amazing!) and then doing their amateur analysis (Ancient Aliens level depressing!).

Without a very strong internal pressure to get their shit together and act like smart people who understand that some problems require expertise the movement is screwed.

mthoms · on Aug 25, 2012

Are we sure this isn't trolling? I mean come on:

"Sleep and breakfast. I changed my breakfast from oatmeal to fruit because a student told me he had lost weight eating foods with high water content (such as fruit). I did not lose weight but my sleep suddenly got worse. I started waking up early every morning instead of half the time. From this I figured out that any breakfast, if eaten early, disturbed my sleep."

beagle3 · on Aug 25, 2012

It isn't trolling. It's a short summary piece. If you are into these things, his blog and semi-formal papers ("what makes food fattening" and others) are a gold mine.

cunninghamd · on Aug 24, 2012

But HOW do you track all that data? Today, with a smartphone, it seems easier, but even then I’d like something akin to a master spreadsheet that I can store it all in.

Seth: what format are you tracking all that data in?

Of course, now that I type that, I’m worried the answer is “pen and paper.”

mahyarm · on Aug 24, 2012

One of my someday projects to create a generic self tracking app that presents a dashboard for quick entry tracking and visualization and syncs with a website. It will also export to a spreadsheet. It will also allow you to visualize data with graphs and so on. There will be a bunch of presets for various special types of graphs, like with blood pressure and so on. You could also specify zones for numbers so you can show 'red' for a danger zone glucose level for example.

Managing spreadsheets on a smartphone is very kludgey.

raelshark · on Aug 25, 2012

That's basically a project I'm actively developing now - generic self-tracking (manual and via sensors and imported from other platforms), with presets and intuitive visualizations. This is a big gap in most tracking tools that are out there - letting you track anything you want, while also making sense of all the information in a useful way.

My motivation is to create something that's effective for sufferers of obscure chronic illnesses, since most tracking tools out there focus on major well-known illnesses, or simply aren't flexible enough to track weird symptoms that are so obscure that the developers have never heard of them. But the side-product is that the design will therefore be flexible enough to be useful for anyone who wants to track any element of their health.

sinak · on Aug 28, 2012

Sounds really interesting, would you mind sending me an email? Im organizing the SF QS group and it'd be great to have you come and demo if you're local, and even if not we're looking for a tool to help us perform group QS tracking. My email is sina dot khanifar at gmail.

npsimons · on Aug 24, 2012

To some people, it's not much better than "pen and paper", but I find org-mode in Emacs very handy. And yes, I've got it on my smartphone. I've not played with the Android or iPhone org-mode apps, but org-mode keeps everything in plain text, the tables export to CSV and TSV, plus there is already quick and dirty graphing built-in (org-plot/gnuplot).

As for more advanced stuff, I've started looking into statistically analyzing the data, also in org: http://orgmode.org/worg/org-contrib/babel/intro.html

jurjenh · on Aug 24, 2012

In one of the linked pages - http://quantifiedself.com/2012/07/nick-winter-a-lazy-mans-ap... he mentions quantifed mind, but I can't find any site with that name.

EDIT: Ooops, a little googling further reveals http://www.quantified-mind.com

After a little link-chasing I did get to a guide / tools page that may be useful: http://quantifiedself.com/guide/

205guy · on Aug 25, 2012

I find the smart-phone answers encouraging. I had never heard of the quantified self movement before, but it is a neat thing. However, just imagine the power for science when thousands and millions of self-reported health data are aggregated (hopefully anonymously). You could watch viruses spread geographically in real time. You could find "cancer hot spots" much faster. You could "see" genetic abnormalities propagate over generations.

ejain · on Aug 25, 2012

https://zenobase.com/ is another, new self-tracking service. It's quite generic, so not as user-friendly or convenient as more specialized tracking services can be, but it's more flexible, and is an improvement over a plain spreadsheet (or pen and paper).

nevster · on Aug 25, 2012

This is the one I use http://your.flowingdata.com/ You can send direct tweets so it's pretty handy to use from a phone

A1kmm · on Aug 25, 2012

This is a method for hypothesis generation, but is not a valid way to draw inferences.

The reason why scientists don't draw conclusion from single data points (analogies) is that you can come up with flawed conclusions as a result. The changes could be purely random, or due to some other explanation that wasn't thought of.

In order of reliability: 1. Do an experiment where an intervention is assigned at random, and scored by a method that is blind to the intervention (preferably on subjects who are blind to the intervention - but that is hard for some cases). 2. Observe historical data where there have been many instances of each of the variables under study, avoiding correlation between variables with things like time. 3. Use anecdotes.