For the curious, this is known as a genome-wide association study (GWAS). There's been something of an explosion of these studies, as cheap genotyping has given us datasets big enough to detect small effects. Wikipedia has an (excruciatingly technical) (edit: not really) overview here: https://en.wikipedia.org/wiki/Genome-wide_association_study
Unsurprisingly, height is 0.40-0.60 heritable, but did you know that "nostril area" is 0.657 heritable, while cilantro tasting is only 0.087?
The NYT article seems to imply this is the first study of its kind to be done, which is absolutely not the case. The GCTA article linked above cites 18 GWASes on general intelligence, starting in 2011. The abstract on the Nature article focuses on the 15 new SNPs found that affect intelligence, in addition to the 336 already known. It's an incremental advance, not a stunning breakthrough.
HN user gwern has covered this field extensively. Shouldn't be surprised to see him show up in this thread and gently correct some of the confused thinking in the light gray comments.
I would highly recommend reading 'A Brief History of Everyone Who Ever Lived' by Adam Rutherford. IIRC it contains a whole chapter on GWASes (pronounced jee-wazzes).
There are some phenotypes (characteristics) which can clearly be associated with a small number of genes (e.g. autosomal recessive disorders). For many phenotypes such as intelligence, however, it takes a lot more effort to correlate them with underlying genotypes. The fact that intelligence is approximately normally distributed is an indicator that there are many individual effects at play.
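To illustrate the "many individual effects" point, here is a minimal sketch in Python. It is a toy model, not real genetics: the number of loci, the effect sizes, and the allele frequency of 0.5 are all made-up assumptions. It only shows that summing many small additive contributions gives a roughly bell-shaped trait; it says nothing about whether any particular measured score arises that way.

```python
import random
import statistics


def simulate_trait(n_people=2000, n_loci=100, seed=0):
    """Toy additive model: each person's trait is a sum of small per-locus effects."""
    rng = random.Random(seed)
    # Hypothetical effect sizes, all of similar small magnitude.
    effects = [rng.uniform(0.5, 1.5) for _ in range(n_loci)]
    traits = []
    for _ in range(n_people):
        total = 0.0
        for effect in effects:
            # 0, 1 or 2 copies of the effect allele, assumed frequency 0.5.
            copies = (rng.random() < 0.5) + (rng.random() < 0.5)
            total += effect * copies
        traits.append(total)
    return traits


def skewness(xs):
    """Sample skewness; close to 0 for a symmetric, normal-like distribution."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)


def excess_kurtosis(xs):
    """Sample excess kurtosis; close to 0 for a normal distribution."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 4 for x in xs) / len(xs) - 3.0


if __name__ == "__main__":
    traits = simulate_trait()
    print("skewness:", round(skewness(traits), 3))
    print("excess kurtosis:", round(excess_kurtosis(traits), 3))
```

Both statistics should come out near zero, which is what you would expect from a normal distribution. Dropping `n_loci` to 1 or 2 gives a visibly lumpy trait instead.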
ganonm said "The fact that intelligence is approximately normally distributed is an indicator that there are many individual effects at play."
Intelligence test scores are normally distributed by construction: the weighting of questions is adjusted to ensure normality on a sample population. That test scores are normally distributed in samples other than the normative one tells us nothing other than that the test designers succeeded.
Contrast this with height, the measure of which is not defined in terms of a population distribution.
Could you help explain how to interpret the list of effect sizes? I'm having trouble understanding what the numbers mean. For example, black hair is 0.00 heritable. Does that mean one cannot inherit black hair? (Is this black in the sense of asiatic hair?)
> However, if the GCTA estimate was ~0%, then that would imply one of three things: a) there is no genetic contribution, b) the genetic contribution is entirely in the form of genetic variants not included, or c) the genetic contribution is entirely in the form of non-additive effects such as epistasis/dominance
This isn't GxE heritability (i.e. genetics vs. environment), but rather a metric for how SNPs influence additive traits. It's best considered a lower bound to what we would normally consider heritability.
In this case, it's not an additive, multi-factorial trait; it's epistatic. There's a 'black hair' gene that overrides whatever else is happening in your genome. This is often the case with color traits: a dark color overrides anything lighter. It's the classic example of dominance.
> The abstract on the Nature article focuses on the 15 new SNPs found that affect intelligence, in addition to the 336 already known
It's a little ambiguous, but that's not how I read the abstract. The relevant sentence:
> We identify 336 associated SNPs in 18 genomic loci, of which 15 are new.
That seems to be saying that 15 of the 18 genomic loci are new, not that 15 of the 336 SNPs are new. Given that they "identified" 336 SNPs, presumably they all are new?
I thought I'd register to add a few interesting points to this thread.
'Classical' GWAS are a top-down approach, where we try to find strong associations between a genetic locus and a trait (in this case intelligence), and they rely on very stringent p-value corrections due to the number of tests performed. This is why, although height is >0.6 heritable (meaning genetics explain more than 60% of the trait variance in a given population), only 20% of the variance can be attributed to GWAS hits [1], such as the 52 genes in this article. Bear in mind that the height results were obtained with the largest GWAS ever, with more than 250,000 samples and 300 institutes [2].
So where is the missing heritability? This is the starting point of a bottom-up approach, where we try to predict a trait from the genotype. With some clever but simple calculations, we know that most of the heritability can be found in the data (even if it does not pass the stringent significance thresholds). This is what is called 'chip heritability': the heritability explained by all the common genetic variants present on commercial chips taken together. With height, the chip heritability reaches 62.5% [1]. The bottom-up approach is of particular interest for psychiatric disorders, and has already delivered some interesting results, such as a 0.8 genetic correlation between schizophrenia and bipolar disorder [3]. But useful clinical diagnostics based on genetic data are still a long way off.
The HN crowd should be pleased to know that we still have one tough prediction problem to crack. Current results can be improved either with more sophisticated machine learning methods or by integrating various biological annotations to try to unearth the missing heritability that GWAS can't seem to find with the current sample sizes.
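To make the "stringent p-value corrections" concrete, here is a minimal sketch of a Bonferroni-style correction in Python. The figure of one million independent tests is an assumption for illustration (it is the usual back-of-the-envelope number behind the conventional 5e-8 genome-wide threshold), and the function names are hypothetical, not from any GWAS package.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of null SNPs passing an uncorrected cutoff of alpha."""
    return alpha * n_tests


def is_genome_wide_significant(p_value, alpha=0.05, n_tests=1_000_000):
    """True if a single SNP's p-value survives the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    n_snps = 1_000_000  # assumed number of independent tests
    print("uncorrected false positives:", expected_false_positives(0.05, n_snps))
    print("corrected threshold:", bonferroni_threshold(0.05, n_snps))
    print("p = 1e-6 significant?", is_genome_wide_significant(1e-6))
    print("p = 1e-9 significant?", is_genome_wide_significant(1e-9))
```

A SNP with p = 1e-6 would look overwhelming in a single-test setting but fails here, which is why many real but small effects never show up as individual hits.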
Here are a few more links for those interested [4] [5].
> "This is why although height is >0.6 heritable (meaning genetics explain more than 60% of the trait variance in a given population), only 20% of the variance can be attributed to GWAS finds"
I don't see how you can assign numbers like this. As a simple example, say the amount of protein absorbed while growing up affects adult height. Then you can have a certain protein-catabolizing gene that is highly correlated with height when protein in the diet is low, but has little relationship to it when dietary protein is high (since the person absorbs enough protein either way).
So the variance explained (either by a given genetic locus or in total) is not going to be any kind of constant value, and can in fact vary wildly.
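Here is a minimal simulation of that protein example in Python. Everything in it is a made-up assumption for illustration: the gene adds 4 cm per copy only when protein is scarce, the baseline is 165 cm, and the remaining variation is Gaussian noise with a 5 cm standard deviation. It is a sketch of the argument, not a model of real height genetics.

```python
import random
import statistics


def simulate_population(protein_is_scarce, n=4000, seed=1):
    """Return (genotypes, heights) for one environment of the toy model."""
    rng = random.Random(seed)
    genotypes, heights = [], []
    for _ in range(n):
        # 0, 1 or 2 copies of the hypothetical protein-processing allele.
        copies = (rng.random() < 0.5) + (rng.random() < 0.5)
        # The gene only matters when dietary protein is scarce.
        gene_effect = 4.0 * copies if protein_is_scarce else 0.0
        noise = rng.gauss(0.0, 5.0)
        genotypes.append(copies)
        heights.append(165.0 + gene_effect + noise)
    return genotypes, heights


def variance_explained(xs, ys):
    """Squared Pearson correlation between xs and ys."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return (cov * cov) / (var_x * var_y)


if __name__ == "__main__":
    for scarce in (True, False):
        g, h = simulate_population(protein_is_scarce=scarce)
        label = "scarce protein" if scarce else "plentiful protein"
        print(label, "-> variance explained:", round(variance_explained(g, h), 3))
```

The same gene explains a sizeable share of the variance in one environment and essentially none in the other, so a "variance explained" figure is always a statement about a particular population in a particular environment.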
He is probably measuring something called "narrow-sense heritability", which includes just additive genetic effects (Gene A + Gene B = Sum(A, B)) and excludes other things such as gene interaction. Most of the statistical techniques used to estimate narrow-sense heritability assume that environmental and other, non-additive, genetic effects follow a Gaussian distribution. Your example clearly exposes a situation where these studies may not be powerful.
Explaining heritability as more than a "sum of the parts" is a problem that I don't know of a solution for at the genome scale.
Do you know of an example of someone doing one of these heritability calculations, saving the model, then plugging data from a new dataset into the same model? (i.e. the new data did not exist at the time the model was created)
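For reference, the textbook variance decomposition behind the narrow-sense vs. broad-sense distinction can be written as below. The symbols are the standard ones and are not taken from this thread: P is the phenotype, A the additive genetic component, D dominance, I epistatic interaction, and E environment. The decomposition as written assumes no gene-environment covariance or interaction, which is exactly the kind of term the protein example adds.

```latex
\begin{align}
\mathrm{Var}(P) &= \mathrm{Var}(A) + \mathrm{Var}(D) + \mathrm{Var}(I) + \mathrm{Var}(E) \\
h^2 &= \frac{\mathrm{Var}(A)}{\mathrm{Var}(P)} \quad \text{(narrow-sense)} \\
H^2 &= \frac{\mathrm{Var}(A) + \mathrm{Var}(D) + \mathrm{Var}(I)}{\mathrm{Var}(P)} \quad \text{(broad-sense)}
\end{align}
```

Since every variance term is non-negative, h^2 can never exceed H^2, and the two coincide only when dominance and epistasis contribute nothing.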
FWIW, I only read the first couple of sections of your first Wikipedia link (the overview). I didn't find it excruciatingly technical, and have no training in biology other than high school. Anyone who knows words like allele and phenotype from high school can feel free to click. It's interesting.
Ah, somebody's tamed the article considerably since I last read it, a couple years ago. The article for linkage disequilibrium has some of the math I remember being in the GWAS article. https://en.wikipedia.org/wiki/Linkage_disequilibrium
Wikipedia has a fun list of effect sizes here: https://en.wikipedia.org/wiki/Genome-wide_complex_trait_anal...