How significant is it, exactly, to be 1200 mutations away from RATG-13? I'd propose that it is not particularly significant. For one virus I work with, I have 14 variants that differ by 300 to 500 base pairs, and that is from laboratory passaging alone. I have one variant with a 14,000 bp deletion! (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7217056/#mmi144... TABLE S2) That one is noteworthy for exactly that reason, though. And these are dsDNA viruses with comparatively much slower mutation rates. By that standard, 1200 base differences are almost nothing.
You're comparing apples and oranges. 14 mutational sites across a virus with 17k ssDNA genome is not comparable to RATG-13 vs SARS-2, which have not just 1200 mutations different, they're spread out over HUNDREDS of SNPs.
That's the important comparator. And why it will take so long to mutate one into the other by natural mutation rates.
>The two non-canonical arginines don't make it less likely?
Not particularly. CGG exists in MERS 15 times. NL63, 29 times. It even exists twice in a row in Human coronavirus 229E. Throughout all of the known alphacoronaviruses (94 described) CGG exists 1575 times. In betacoronaviruses, it exists more times than my processor can count without hanging, and I believe it tops out at 9,999 events.
Why is it so unlikely that synonymous mutational drift over the course of 70 years of infections in millions of viral generations could create these arginine codons that are not the most optimized but still work in the mammals this virus infects? CGG works. It makes an arginine when this virus infects its host.
Why couldn't it be a recombination event between SARS-2 and one of the known coronaviruses with an extremely similar cleavage site? We know already coronaviruses have recombined with viruses totally outside of their family on occasion: https://www.virology.ws/2016/10/27/genome-recombination-acro...
>You're comparing apples and oranges. 14 mutational sites across a virus with 17k ssDNA genome is not comparable to RATG-13 vs SARS-2, which have not just 1200 mutations different, they're spread out over HUNDREDS of SNPs.
If memory serves, that was a single generation. An RdRp produces about one error per thousand bases, which comes out to 30 mutations per generation across a 30,000-base genome, so we're talking 40 generations away? That hardly seems significant.
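To make that arithmetic explicit, here is a minimal sketch. The three inputs are just the figures quoted in this thread (a roughly 30,000-base genome, one error per thousand bases, 1200 differences), not measured values, and the function names are made up for illustration. Note that it treats every copying error as retained and as moving toward the target sequence.

```python
# Back-of-the-envelope estimate using the figures quoted in the thread.
GENOME_LENGTH = 30_000       # approximate genome size in bases (assumed)
ERRORS_PER_BASE = 1 / 1_000  # the "one error per thousand bases" figure (assumed)
DIFFERENCES = 1_200          # quoted RaTG13 vs SARS-CoV-2 differences


def mutations_per_generation(genome_length: int, errors_per_base: float) -> float:
    """Expected raw copying errors in one genome replication."""
    return genome_length * errors_per_base


def generations_needed(differences: int, per_generation: float) -> float:
    """Generations required if every error is kept and none repeat or revert."""
    return differences / per_generation


per_gen = mutations_per_generation(GENOME_LENGTH, ERRORS_PER_BASE)
gens = generations_needed(DIFFERENCES, per_gen)
print(f"{per_gen:.0f} mutations per generation, {gens:.0f} generations")
```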
>Not particularly. CGG exists in MERS 15 times. NL63, 29 times. It even exists twice in a row in Human coronavirus 229E. Throughout all of the known alphacoronaviruses (94 described) CGG exists 1575 times. In betacoronaviruses, it exists more times than my processor can count without hanging, and I believe it tops out at 9,999 events.
How many of those are in frame for an amino acid? How many of those are two in frame for arginine in a protein, next to each other?
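A rough sketch of the distinction being asked about: counting CGG anywhere in a sequence versus counting it as an in-frame codon, and as two in-frame codons in a row. The two sequences are made-up toy examples, not real coronavirus sequence, and `count_cgg` is a hypothetical helper that assumes the reading frame starts at the first base.

```python
def count_cgg(seq: str) -> dict:
    """Count CGG in any frame, in frame 0, and as adjacent in-frame pairs."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    # Every occurrence, regardless of reading frame.
    anywhere = sum(1 for i in range(len(seq) - 2) if seq[i:i + 3] == "CGG")
    # Split into codons from position 0, dropping any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    in_frame = codons.count("CGG")
    # Two neighbouring codons that are both CGG (an in-frame CGGCGG).
    tandem = sum(1 for a, b in zip(codons, codons[1:]) if a == b == "CGG")
    return {"anywhere": anywhere, "in_frame": in_frame, "tandem": tandem}


# Toy sequences (hypothetical): same CGGCGG run, different frame.
aligned = count_cgg("ATGCGGCGGGCACGGTAA")
shifted = count_cgg("ATGACGGCGGCATAA")
print(aligned)
print(shifted)
```

A plain substring or blastn-style count corresponds to the "anywhere" number, which is why it can be much larger than the in-frame counts.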
>Why is it so unlikely that synonymous mutational drift over the course of 70 years of infections in millions of viral generations could create these arginine codons that are not the most optimized but still work in the mammals this virus infects? CGG works. It makes an arginine when this virus infects its host.
There are many reasons why it's unlikely. The first is pure statistics. But the statistics are almost certainly influenced by millions of years of biology.
>Why couldn't it be a recombination event between SARS-2 and one of the known coronaviruses with an extremely similar cleavage site?
It's absolutely possible and I would never dispute this. The possibility of it isn't a refutation of the possibility of other hypotheses, and I don't think it's in the scientific spirit to discount other viable hypotheses.
> If memory serves, that was a single generation. An rdrp produces one error per thousand bases, which comes out to 30 mutations per generation, so we're talking 40 generations away? That hardly seems significant.
SARS-CoV-2 is mutating at the rate of about 2 changes/month (68,69,70,71), out there in society, circulating in millions of humans. 2/month in the overall population of millions of tiny viruses, among 30,000 letters in each genome.
So, at the fixation rate (~2 fixed mutations/month), with all the many billions of SARS-2 viruses making copies inside all those people, how long would it take to change RaTG-13 into SARS-CoV-2?
Answer: about 50 years (1200 differences at ~2 fixed mutations per month is 600 months). That is roughly 30 years before the world even knew about SARS or MERS or any other pandemic-potential coronavirus. Before we knew these viruses even existed. Before we knew they liked to live in bats (72,73,74,75). And, for the record, they didn’t even build a BSL4 (the kind of lab you really need to handle this kind of virus in animals) in Wuhan until 2016 (76).
And that estimate (50 years) is with all the many mutations that are happening in all the many infected humans during a pandemic situation.
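The 50-year figure is simple division, sketched here with the two numbers used above (1200 differences, about 2 fixed mutations per month). The function name is hypothetical, and the sketch assumes a constant fixation rate along a single lineage.

```python
def years_to_accumulate(differences: int, fixed_per_month: float) -> float:
    """Years needed to fix `differences` mutations at a constant monthly rate."""
    months = differences / fixed_per_month
    return months / 12


# Figures quoted in the thread; both are approximations.
years = years_to_accumulate(differences=1_200, fixed_per_month=2)
print(f"about {years:.0f} years")
```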
We know that with a smaller group of lab animals (or even human subjects), the virus is much slower at “finding” mutations that “stick around” (77,78). You have to picture it kind of like a big room full of millions of slot machines. Each machine is a virus, pulling the lever each time it makes a copy of itself. And you only win a payoff when you’ve found a change in the virus that A) makes it look different, and B) doesn’t screw it up, so it can still survive and do its job (infect people). A lot of these mutations screw the virus up, so they wouldn’t be a payoff. They wouldn’t be a “fixed” mutation.
>>>>>>>>>>>>
We can't just focus on random mutation, we have to think about fixation. Because virus populations don't just evolve in one concerted direction. They evolve /outwards/ in a cloud. It's not simply A to B, it's A to B then back to A then over to C then to D, then back to B, then over to A again, then finally settled on C. A random walk.
That's why the population level data is so important.
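Here is a toy illustration of the random-walk point, assuming a made-up 200-base genome, uniformly random substitutions, and no selection at all. None of those assumptions is realistic; the only thing it shows is that the net distance from the starting sequence is smaller than the number of mutation events, because sites get hit repeatedly and sometimes revert.

```python
import random

BASES = "ACGT"


def random_walk(start: str, steps: int, rng: random.Random) -> str:
    """Apply `steps` random single-base substitutions to `start`."""
    seq = list(start)
    for _ in range(steps):
        pos = rng.randrange(len(seq))
        # Always change the base, possibly back to what it was originally.
        seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return "".join(seq)


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)  # fixed seed so the run is repeatable
start = "".join(rng.choice(BASES) for _ in range(200))
steps = 300
end = random_walk(start, steps, rng)
net = hamming(start, end)
print(f"{steps} mutation events, net distance {net}")
```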
> How many of those are in frame for an amino acid? How many of those are two in frame for arginine in a protein, next to each other?
Who said recombination events only happen in frame? Or that viruses only drift in frame?
> There are many reasons why it's unlikely. The first is pure statistics. But the statistics are almost certainly influenced by millions of years of biology.
I would respond to this if it contained any evidence other than 'statistics say it is so.'
> I don't think it's in the scientific spirit to discount other viable hypotheses.
Who said I'm discounting anything? I said it's just not very likely. Extraordinary claims require extraordinary evidence.
>We can't just focus on random mutation, we have to think about fixation. Because virus populations don't just evolve in one concerted direction. They evolve /outwards/ in a cloud. It's not simply A to B, it's A to B then back to A then over to C then to D, then back to B, then over to A again, then finally settled on C. A random walk.
Fixation relies on selection, which is entirely different in a laboratory environment than in a population with immune systems.
>Who said recombination events only happen in frame? Or that viruses only drift in frame?
No one, but the fact is that two in-frame CGGs next to each other are substantially less likely "to be fixed" than two CGGs out of frame, or in a non-coding region. Almost every gene evolves more slowly than non-coding space, beyond very few specific non-coding regions. CGGCGG is very different than ACGGCGG and we both know that. You get an entirely different peptide out of each one. And selection almost exclusively happens at the protein level.
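To see the frame-shift point concretely, this sketch translates both strings with the standard genetic code, reading from the first base. The compact table is the usual TCAG-ordered encoding of the standard code; `translate` is a hypothetical helper that ignores any trailing partial codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate from the first base, dropping any trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


in_frame = translate("CGGCGG")    # two arginine codons
shifted = translate("ACGGCGG")    # same bases, shifted by one
print(in_frame, shifted)
```

With one extra leading base the same CGGCGG run reads as ACG GCG, so the two arginines become threonine and alanine.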
This is the part that's confusing me. I haven't made any claims other than that it's possible? And I have provided plenty of evidence that it's possible. I'm not sure what claims you think I'm making.
Hi, I want to apologize. I was the one who did not see that you said from the very beginning that you did not put much stock in the lab theory, and I basically ignored that. I didn't see it. I also did not read some other things closely enough. I was attacking some strawmen, some of which I have seen in the real world and others I have not.
I'm sorry, that colored a lot of my responses to you because I made unfair assumptions about what you were saying. I sent you a longer form email to your protonmail about it.
>Fixation relies on selection, which is entirely different in a laboratory environment than in a population with immune systems.
This is actually extremely controversial. There's a great deal of evidence that random walks are more important than selection in fixation events. Does selection play a role? Yes! Definitely! But the evidence is mounting and almost at consensus that random chance is actually what dictates most fixation events. It just can't be /deleterious/ but it does not have to be /helpful/ to fix. The evidence shows that most mutations, on median, are neutral or slightly deleterious. But the ones that are beneficial are so beneficial that the average is neutral-to-net-positive. A lot more of it is actually stochastic than you think! A lot of the transmission between hosts, for example, is stochastic and not selective.
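As a minimal sketch of the drift point, here is a toy Wright-Fisher simulation: a constant-size haploid population, one new mutant with no fitness effect at all, and random sampling each generation. The population size, trial count, and function name are arbitrary choices for illustration. In this idealized model a neutral mutation still fixes with probability of about 1/N purely by chance.

```python
import random


def neutral_fixation_fraction(pop_size: int, trials: int, seed: int = 1) -> float:
    """Fraction of trials in which a single neutral mutant reaches fixation."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        copies = 1  # one new neutral mutant in the population
        while 0 < copies < pop_size:
            freq = copies / pop_size
            # Each offspring picks a parent at random; no selection involved.
            copies = sum(rng.random() < freq for _ in range(pop_size))
        if copies == pop_size:
            fixed += 1
    return fixed / trials


estimate = neutral_fixation_fraction(pop_size=20, trials=2000)
print(f"simulated {estimate:.3f} vs expected {1 / 20:.3f}")
```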
See these reviews/studies from Bloom, Audino, etc:
This is why it's harder to maintain a virus at the proper S/NS ratio in the lab. It becomes too stochastic, with too little selection. So your mad scientist would have to have extremely few viral genera.
You say every gene evolves more slowly than non-coding sequence, which is true. But synonymous mutations occur at the rate one would expect in both.
Are you trying to say it isn't ever going to happen? From what I'm reading, there's actually no reason to believe the CGG in that site is fixed in any way. It's not always CGG. In fact, it rarely is in CoV-2 isolates. Maybe that was a fluke of the sequencer or the isolate?
Wow, now I'm starting to think Yuri just didn't do his homework.
But that's just in the isolates they're analyzing.
"All human coronaviruses analyzed in this study did not use two synonymous codons (CGC, CGG) for arginine as well as CCG for proline and UGA for stop codon at all"
I couldn't find CGG in the cleavage site sequence anywhere in the early pandemic. Not in any of the earliest papers.
I could only find it in clinical sequences from later on in the pandemic, suggesting it may have been a random mutant that fixed /after/ the emergence into humans. Which pokes a big ol hole in the idea that it represents a smoking gun of genetic manipulation:
I shouldn't have taken it as a given that CGGCGG was actually there in the beginning. Looks like it wasn't. It certainly isn't in the refseq.
And the thing I'm asking here is: Are you really saying you think the lab leak is /as probable/ as the zoonotic crossover? Given A) that the CGGCGG wasn't even there when the crossover happened, B) the probabilistic arguments I've made above, and C) the fact that you can't provide any actual evidence of a mechanism? I gave you lots of mechanisms and examples of how it would happen in nature. Why is one not more likely than the other?
For the record, I got the CGG counts above by running blastn against these datasets: https://www.ncbi.nlm.nih.gov/datasets/coronavirus/genomes/