SARS-CoV-2 contains part of a patented genetic sequence (frontiersin.org)
207 points by johndcook on Feb 24, 2022 | 128 comments



> We did not find the 19-nucleotide sequence CTCCTCGGCGGGCACGTAG in any eukaryotic or viral genomes except SARS-CoV-2 with 100% coverage and identity in the BLAST database (Supplementary Tables 1–3).

Huh? I just ran BLAST (blastn, specifically) with that nucleotide sequence and found several eukaryotic genomes containing it in non-MutS-like contexts, aligning sense and anti-sense strands, with 100% coverage and 100% identity. Besides, there is no “BLAST” database (I used the non-redundant -nr- one, for instance; BLAST is just the name of the tool). One possible reason they didn’t find the sequences is that BLAST, by default, only returns the first 100 hits, which in their case might have been prokaryotic ones. I didn’t check viral genomes but might later.

Disclaimer: I know and have used BLAST since my undergrad years, about a decade and a half ago.
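
A minimal sketch of how the hit cap could be raised in a rerun, assuming the NCBI BLAST+ command-line tools are installed. The helper name `build_blastn_command`, the file name `query.fasta`, the database name `nt`, and the cap of 5000 are illustrative assumptions, not anything stated in the paper. Only the argument list is built here; nothing is executed.

```python
def build_blastn_command(query_fasta, database="nt", max_hits=5000):
    """Return an argv list for a short-query blastn search against a remote database."""
    return [
        "blastn",
        "-task", "blastn-short",            # task intended for very short queries
        "-query", query_fasta,              # FASTA file holding the 19-nt sequence
        "-db", database,                    # assumed database name
        "-remote",                          # run the search on NCBI servers
        "-max_target_seqs", str(max_hits),  # raise the cap on reported hits
        "-outfmt", "6",                     # tabular output, one hit per line
    ]

cmd = build_blastn_command("query.fasta")
print(" ".join(cmd))
```

The list can be handed to `subprocess.run` as is. Whether a larger cap changes which taxa show up depends on the database and the query, so this only shows where the knob is.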


That sounds more like an issue in the reporting being unclear about the constraints used for the identification than the actual finding.


Exactly, and that could change the interpretation of their findings completely. I haven’t checked their credentials, but I would be surprised if any of the authors has a background in bioinformatics or computational biology.

BLAST is used on a daily basis by a lot of scientists (its paper is among the most cited of all time), but many of them lack the theoretical knowledge to analyze their results in depth. This paper doesn’t have a single line about methodological details, such as the parameters used in their BLAST runs. The authors don’t even refer to P- or E-values, despite these being prominently displayed on the output table of every run!


I am a molecular biologist but not a virologist. This article is stupid. The furin cleavage site, with almost identical sequences, is present in several coronaviruses ancestral to SARS-CoV-2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7836551/

Serious virologists went over the furin cleavage site in close detail already and none of them seemed very convinced it was anything but natural in 2020.

Also, stick any 13 bp DNA sequence into BLAST and you will find some strange matches if you include the right databases. That these particular 13 base pairs match some bit of the human genome (inverted, mind you) is not really that surprising.


Omg you fell for that article. Look at those dendrograms carefully. They are biased: they load up the dendrograms with "degenerate sequences" with very close/obscure accession numbers, and basically hide the diversity of viral sequences between the furin-bearing coronaviruses and SARS-CoV-2 (fig 5, which completely omits betacoronaviruses, a clade that contains SARS-CoV-2).

Furin cleavage sites are rare in coronaviruses, and even rarer in the closest ones to SARS-CoV-2. It's the case that there was a research proposal to put a furin cleavage site into bat coronaviruses to fuck around and find out precisely because they are rare in most classes but conspicuous (and possibly related to pathogenicity) in others (https://www.documentcloud.org/documents/21066966-defuse-prop...).

(PS: I am also a former molecular biologist who worked in gain of function for non-pathogenic bioengineering, and I used to look at dendrograms to steal ideas from nature to improve enzyme function -- and I was successful: https://jbioleng.biomedcentral.com/articles/10.1186/1754-161...)

Edit: clarified that betacoronaviruses contain SARS-CoV-2, because some armchair analysts don't stop to understand figures


This comment would be better without the first sentence.


At minimum, it needs an explanation of how the linked research (which shows that the cleavage sites are indeed common, as the parent comment explained) is somehow wrong. It doesn’t make any sense to insist they aren’t common in response to a paper showing specific examples of them occurring in nearby SARS-CoV-2 ancestors.


[flagged]


You can't write things like "if indeed you did anything besides read the abstract" on HN.


[flagged]


You badly broke the site guidelines repeatedly in this thread, and you also discredited your own case by giving open-minded readers a strong reason to discount what you're saying. None of that does anybody any good, so please stop doing it.

https://news.ycombinator.com/newsguidelines.html

Obviously we moderate and eventually ban accounts that post like this—we have to, or this place will destroy itself even faster than it already is. But regardless of whether that moves you or not, you should take in the point that by being an asshole in the comments, you're creating the very situation that is presumably frustrating you in the first place, by discrediting the view that you believe to be the truth.

It would be far more in your interest to make your substantive points neutrally, thoughtfully, and respectfully. Then you wouldn't be undermining your own argument so badly. See https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor... for past explanations.


The point was that the furin cleavage site does appear in other Coronaviruses, and therefore it’s not a random appearance in this one.

The specific details of the paper you’re nit-picking aren’t relevant. The sequence either appears elsewhere or it doesn’t, but it clearly appears elsewhere in Coronaviruses.

Therefore, comparing the appearance of the sequence in another Coronavirus to random chance is a flawed comparison.


do you understand that there is no one "the sequence"?

At best you could be referring to it as "a furin cleavage site"/"a sequence".

Moreover, we know that the specific furin cleavage sequence in SARS-CoV-2 occurs nowhere else among coronaviruses.


The next question, to my mind, is "How common is that cleavage site in other viruses, especially other viruses endemic to the origin location of SARS-CoV-2?" With a follow-up of "How possible would it be for multi-virus splicing to generate that sequence from two otherwise-unrelated, possibly damaged sequences ending up head-to-tail in the DNA of a host cell?"

We know of other novel viruses that have resulted from sequences of multiple viruses being spliced together in a living host naturally. I don't know how we'd disambiguate that possibility from human synthesis.


These questions are answered in the op paper


> do you understand that there is no one "the sequence"?

The linked paper is, literally, about a specific sequence. That’s the paper we’re discussing in this comment section.


I'm sorry, that is simply not the case, neither at the protein level nor at the DNA level. Furin cleavage sites have diversity.

The canonical minimal furin cleavage site sequence is RX(R/K)R, which has a highly variable amino acid (X) in one position and a choice between the two basic amino acids (R or K) in another. This variability is sampled across the coronavirus furin cleavage site sequences.
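
To make the variability concrete, here is a minimal sketch that scans a protein string for the R-X-(R/K)-R pattern as quoted above. The function name and the peptides are made up for illustration; none of them is a real coronavirus sequence, and real furin-site prediction looks at more context than four residues.

```python
import re

# R, any residue, R or K, then R. The lookahead lets overlapping hits be reported.
FURIN_MOTIF = re.compile(r"(?=(R[A-Z][RK]R))")

def find_furin_motifs(protein):
    """Return (0-based position, matched 4-mer) pairs for every R-X-(R/K)-R hit."""
    return [(m.start(), m.group(1)) for m in FURIN_MOTIF.finditer(protein)]

# Made-up peptides, for illustration only.
print(find_furin_motifs("AARSKRGG"))   # one hit: RSKR
print(find_furin_motifs("AARRRRRGG"))  # two overlapping hits
print(find_furin_motifs("AAGGSS"))     # no hit
```

Many different four-residue strings satisfy the same pattern, which is the point being made: "a furin cleavage site" is a family of sequences rather than one sequence.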


[flagged]


[flagged]


> Did you look at figure five and wonder why the fuck betacoronaviruses were not in the dendrogram?

So the lack of a single family of coronaviruses in a single figure inverts the entire claim? Come on.

> Peer review is not perfect.

I’m inclined to trust it far more than a random internet commenter who can’t really explain their argument but instead spends more time swearing and writing childish insults like this:

> Use that thing under your cranial bone, and exercise discernment.


...What the poster did is "Peer Reviewing". You just seem upset because the critique wasn't published in some journal.

One of the more toxic attitudes in science I might add.


You're free to call me an armchair analyst and trivialize my 10 years of experience in the biological sciences, sometimes working 100 days straight, 100 hour weeks (paid at 26k USD in an expensive US city, no less), and, more importantly, brutally failing for about half of those years to learn enough to come out with a successful bioengineering project that used the exact skills necessary to perform a critical analysis of the paper.

I'm also free to claim that you're just projecting.

>single family of coronaviruses in a single figure inverts the entire claim

Yes, considering that SARS-CoV-2 is part of the betacoronavirus clade. Did you miss that?


Would it be too much to ask to keep discourse civil here?


[flagged]


Please do not respond to a bad comment by breaking the site guidelines yourself. That only makes everything worse.

https://news.ycombinator.com/newsguidelines.html


That's a lot of pent up anger being thrown around there.


Please do not respond to a bad comment by breaking the site guidelines yourself. That only makes everything worse.

Also, we've had to ask you about this kind of thing frequently in the past, and you said you wouldn't do it any more.

https://news.ycombinator.com/newsguidelines.html


I felt it was an observation not an attack.


I hear you. Personal observations using negative psychological language are likely to come across as attacks no matter what you do, and they're also not particularly substantive, as well as nearly always off topic. Best to avoid.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...

https://hn.algolia.com/?sort=byDate&type=comment&dateRange=a...


> I am a molecular biologist but not a virologist. This article is stupid. The furin cleavage site, with almost identical sequences, is present in several coronaviruses ancestral to SARS-CoV-2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7836551/

It seems the presence of the furin cleavage site in any other Coronavirus would be enough to invalidate this paper’s comparison to pure random chance.

Why would the authors misuse statistics comparing this to random chance when there are published examples of the furin cleavage site appearing elsewhere? It doesn’t make any sense.


Yes, possible DNA sequences do not occur with equal frequencies. The whole point of selection pressure is that it selects for specific sequences. The same sequence occurring in a host species and a parasite is also not surprising. Rare events could result in horizontal transfer of genetics between host and parasite. Such things have been documented before.


>This article is stupid.

Whoa, that says a lot about what kind of scientist you are ...

I have a published paper under the "Frontiers in" brand and can assure you that the people behind it (staff and editors) are among the most professional in what they do. "Frontiers in" does not publish fringe science, they are quite rigorous in their process.

Also the paper is written by quite reputable people and backed by several institutions. If their claims were "stupid" (as you arrogantly put it) they would have spotted it before you, trust me.


There have been some controversies though: https://en.wikipedia.org/wiki/Frontiers_Media#Controversies


They seem to have 123 retractions across their whole editorial offering.

http://retractiondatabase.org/RetractionSearch.aspx#?jou%3df...

That's actually quite a good record for a publisher with 80 or so journals and ~270,000 published papers.

Edit: you're right @flobosg, I was looking at another comment with a similar username and my mind just squashed them together :P.


> I know you stated somewhere else that this is not your domain

You are confusing me with someone else.


Anyone taking a discipline seriously would never link to Wikipedia, with the exception of when Wikipedia, itself, is topical. It is usually a sign of laziness and weak positions on the part of the person linking to the article.


Are you suggesting the editors of Frontiers in Virology are not serious virologists?


Not my domain but I had to search the publisher and journal. Frontiers Media has several editorial related controversies.

https://en.m.wikipedia.org/wiki/Frontiers_Media

The fact is that the state of research, peer review, and publication in academia is in complete shambles, in my professional opinion. The amount of system gaming in research has led me to question most publications and claims I see these days, unfortunately. I know the system, many of the pressures / incentives, and I know a lot of the games played to optimize around these. It's impossible to be at all lazy and give in to any semblance of authoritative findings anymore unless you too just want to play the publication ranking game to stay afloat or get ahead in your career.

We need a massive overhaul of research culture in the US, IMHO.


Wikipedia has never been a source. It is demeaning to pretend it is one, and naive to believe it is a neutral aggregator.


Don't throw the baby out with the bathwater.

While Wikipedia has never been a source, the particular linked wiki page clearly links to multiple controversies in the footnotes. Take a look starting at [37] https://en.m.wikipedia.org/wiki/Frontiers_Media#cite_note-37


When it comes to heavily manipulated "sources" like Wikipedia, the reason I "throw the baby out with the bathwater" (using phrases like that as justification for, well, anything is yet another lazy attempt at reasoning) is that you cannot see what is not presented in the article, usually in the form of others who will contradict the claims of whatever criticism the page is presenting with equal or stronger sources. Wikipedia guards against such open discussion with a lot of dirty tactics, which include collusion between article guardians and admins who will have your account locked and IP address blocked from editing within an hour of a "violation". They do not notify you through email of their activities, so you have to go discovering what they are doing through their Kafkaesque procedures of adjudication, which are decided without any defense from the accused and on evidence that doesn't truly qualify as evidence by any reasonable standard.

A relatively newer policy implemented through their MediaWiki project is a "feature" whereby an edit can be permanently hidden from view, so future editors cannot go back and see what edits were removed or reverted. This goes completely against the original concept of a wiki and bolsters my claims that Wikipedia should never be used as a source because in the most contentious articles they will hide removed edits containing information they don't like.

The result of this is editors who have contradicting opinions from the Wikipedia-approved opinion will stop editing simply because it isn't worth the effort as Wikipedia always wins (because it is Wikipedia's website).

Wikipedia is a dumpster fire full of bias, pettiness, and structured manipulation. Linking there or using those pages demonstrates laziness and a near complete lack of higher reasoning.


Well, it is not lazy to use the phrase, "throw the baby out with the bathwater." That is a well defined, well understood phrase of the English language that means, "Discard something valuable along with something not wanted." [1]

If you throw the baby out with the bath water, you lose the good parts of something as well as the bad parts, because you reject it as a whole instead of just removing what is bad. [2]

The meaning is that you were throwing out the clear links in the wiki article due to them being presented at wikipedia.org. Footnote 37 and onwards.

[1] https://www.dictionary.com/browse/throw-out-the-baby-with-th...

[2] https://www.collinsdictionary.com/dictionary/english/to-thro...


The phrase, as with many others, is used to hand-wave away arguments. For example, you completely ignored basically >90% of my comment, which is what people with really poor positions do to attempt to maintain some position of superiority in their minds.


The points you made don't apply here, because they were already answered - in an answer you completely ignored and were tone-deaf about in order to type again about the dangers of wikipedia. What you have glossed over is that I was talking not about a wikipedia article but the links in the footnote section of that wikipedia article.


I specifically stated and made the case that Wikipedia will only allow what Wikipedia wants published to their site. To conclude some measure of criticism exists without competing claims against the criticism is a naive position. This applies to what you are linking to.

Your points do not acknowledge this central thesis. Wikipedia is garbage; using it as a base of aggregated links is ignorant or naive.

To put it more bluntly: I don't care about the opinions you express regarding this topic, as you are demonstrating very arrogant behavior, particularly in linking to a dictionary dot com article on a common English phrase, as if I've demonstrated some lack of understanding of English and require some assistance there. Then completely ignoring my points. You are terrible at discussion.

We are done.


So this is the very beginning of an argument.

"Serious virologists" isn't proof.

Which virologists think it's natural, and which think it's unnatural?

Now let's see the data and thorough reasoning by each, and see the rebuttals of each for their opposition's reasoning, etc.

Until this is laid out clearly it's all shallow discussion - that no one should blindly trust or believe.

The above is the scientific process, method.

There clearly isn't consensus yet.


> Which virologists think it's natural, and which think it's unnatural?

Literally no one, including the authors of this paper, contends the sequence is "unatural".


Are you meaning to say natural or unnatural?


What do you think? Kinda a pedantic response for a spelling mistake.


Both are 1 letter off from the presumed correct word:

unatural + n == correctly spelled unnatural

unatural - u == correctly spelled natural

How the hell are you supposed to guess as if the context of viral sequencing is somehow easy to parse correctly?


Seriously? Is this the quality of discussion HN is regularly becoming? You want me to make an assumption? Is that how you operate in conversation - assumptions to fill in the gaps when something isn't clear?

You realize that their saying "natural" or "unnatural" completely changes what they're saying to the polar opposite, right?

So now clarity is pedantic?

Seriously?

If HN had subreddits there'd be one similar to /r/WatchRedditDie


It's not an assumption for people who read the article. You'd know what the authors were claiming - natural, unatural, or unnatural.


There are arguments from improbability in the article that rely on implicit assumptions that are obviously wrong once made explicit, akin to a creationist arguing for the improbability of the human genome's complexity as though it arose like a shuffled deck of cards.


> Serious virologists went over the furin cleavage site in close detail already and none of them seemed very convinced it was anything but natural in 2020.

https://www.projectveritas.com/news/military-documents-about...

DoD analysts letter: https://assets.ctfassets.net/syq3snmxclc9/2mVob3c1aDd8CNvVny...


Interestingly, the observation in this paper was surfaced on a substack several months ago. https://arkmedic.substack.com/p/how-to-blast-your-way-to-the...

I do not see the substack author among the paper authors or references to that writeup. I wonder if there was any collaboration or if this was an independent finding?

That substack made the rounds here about 5 weeks ago. While it was heavily critiqued for being a rather poor writeup, there was a fair bit of discussion about the observation itself that should be relevant here: https://news.ycombinator.com/item?id=29938732


This has been known for a while, but I'm not complaining about it. Ever since this whole thing started, several scientists have been finding things which are "interesting to take a look into them further", to put it in some way.

I have a close friend who works w/ population genetics and basically does phylogenies all day. For those who don't know, a phylogeny is an analysis where you study how similar or dissimilar the sequences of known things are, in order to infer the most plausible evolutionary relationships that (could have) happened.

So, this friend of mine has been taking a look at some of the published sequences ever since they came out and his personal conclusion is that there is no way that SARS-CoV-2 came to be in a "natural" way.

Among other interesting things, he claims that the virus does not seem to have a single "origin". I'll try to explain, so while you're figuring out what the "history" of something is (i.e. ancestors, lineage), you usually get some sort of tree-like arrangement where some structural rules are strongly (but not completely) preserved, like if A -> B and B -> C then A -> C, you get the idea. In SARS-CoV-2's case, such rules (or lack thereof) suggest that different regions of the sequence cannot be assumed to have followed the same evolutionary history; which is kind of weird, honestly. This is something that has been observed naturally, but under very specific constraints which are unlikely to apply to SARS-CoV-2 and how it has spread throughout the world.

Anyway, I'm happy that these sorts of studies are finally coming to light and that, for whatever reason, people are now allowed to talk about it. The whole point of science is to engage in rational discussion and build on each other's knowledge in order to attain the truth.

PS: I have degrees in Genomics and Molecular Biology, and have worked for around 15 years in the field.


> same origin

Sure, but that can also indicate it's a virus resulting from natural stitching together of DNA sequences from multiple infections in one living host, like HIV.

How do we disambiguate novel virus arising from multi-infection interference from human synthesis?


>How do we disambiguate novel virus arising from multi-infection interference from human synthesis?

Best you can do is work with probabilities and try to infer which theory is more plausible than others.


> different regions of the sequence cannot be assumed to have followed the same evolutionary history

This is because of recombination, which is extremely common among coronaviruses.

If you look at the phylogenetics of the relatives of SARS-CoV-2 (forget about SARS-CoV-2 for a moment), you get different trees if you look at different genes. Did the Wuhan Institute of Virology engineer all the different viruses found in nature in bats? That's the conclusion your friend would have to draw, based on the logic you've presented.


>> Examination of SEQ ID11652 revealed that the match extends beyond the 12-nucleotide insertion to a 19-nucleotide sequence: 5′-CTACGTGCCCGCCGAGGAG-3′ (nt 2733-2751 of SEQ ID11652), such that the resulting mRNA would have 3′- GAUGCACGGGCGGCUCCUC-5′, or equivalently 5′- CU CCU CGG CGG GCA CGU AG-3′ (nucleotides 23547-23565 in the SARS-CoV-2 genome, in which the four bold codons yield PRRA, amino acids 681–684 of its spike protein). This is very rare in the NCBI BLAST database.

I don't like "this is very rare"

How rare?

It's a database. Query it, and confidently give numbers to support your statement.
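
For readers following the strand bookkeeping in the quoted passage, a minimal sketch that checks the reverse-complement relationship between the two 19-nucleotide strings given in the thread. Only the two sequences come from the quotes; the function and variable names are illustrative.

```python
# Watson-Crick pairing for DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Complement every base, then reverse to read the opposite strand 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

patent_strand = "CTACGTGCCCGCCGAGGAG"  # 5'->3', as quoted for SEQ ID11652
viral_strand = "CTCCTCGGCGGGCACGTAG"   # 5'->3', as quoted for SARS-CoV-2

print(len(patent_strand))                                  # 19
print(reverse_complement(patent_strand) == viral_strand)   # True
print(viral_strand.replace("T", "U"))                      # RNA spelling of the viral strand
```

This only confirms that the two quoted strings are reverse complements of each other. It says nothing about how rare the sequence is in any database.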


> How rare?

Zero other exact matches, in the database.

An n of 1 has infinite variance. The relevant rarity, though, is not "in the database" but rather "in the universe of viral sequences". The database is biased toward sequences that we've sequenced, so at best you can give a really shitty estimate.

Fwiw I believe the story that this was a Stack Overflow copy-paste operation from the Moderna sequence, but I can only ever call this a strong belief[0], with no numbers behind it, unless someone comes forward and admits having done it.

[0] Why strong? Because it follows the scientific method. If the hypothesis is that it's a lab leak, then your prediction is that existing sequences would bleed through. A bit crazy that we only found this now, hell I could have done this BLAST search years ago, but it is what you would expect to find.


I would say statistically it is not rare. I mean it's a sequence of 12 where each item can only be C, T, A or G, from the little I understand about DNA. It would be quite a bad password even though it's 12 characters long.


Statistics in sequence matching depend on underlying base rates; out of the 4^12 (2^24) possible sequences, you will see some never, some many, and many some times.
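
A small sketch of both halves of that statement: the size of the 12-mer space, and how the chance of one particular 12-mer shifts once base frequencies are not uniform. The skewed frequencies are hypothetical numbers chosen for illustration, not measurements from any genome.

```python
import math

def sequence_space(length, alphabet_size=4):
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length

def sequence_probability(seq, base_freqs):
    """Chance of drawing exactly `seq` when every position is sampled independently."""
    return math.prod(base_freqs[base] for base in seq)

uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
skewed = {"A": 0.3, "T": 0.3, "C": 0.2, "G": 0.2}  # hypothetical AT-rich background

print(sequence_space(12))  # 16777216, i.e. 2**24
print(sequence_probability("A" * 12, uniform))
print(sequence_probability("A" * 12, skewed))  # larger than under the uniform model
print(sequence_probability("G" * 12, skewed))  # smaller than under the uniform model
```

Independence between positions is itself a simplification. Real genomes also have codon and dinucleotide biases that this toy model ignores.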


What is the value of this comment?


Imagine that a virus is detected in your body and then you get sued by a patent troll for having unlicensed genetic sequences in your body.


Or GM seeds apparently getting blown into your fields and then getting sued for growing unlicensed crops:

https://www.washingtonpost.com/archive/politics/2001/03/30/f...


The main reason why the EU bans GMO crops.


Federal patent law defines patent infringement as “mak[ing], us[ing], offer[ing] to sell, or sell[ing]” a patented invention. Not an attorney, but seems like it would be quite a stretch to argue that becoming infected with a virus would fall under one of those categories.


Well, a virus replicates by forcing your cells to make more of the virus. Just sayin'.


I swear that's the plot to something I have heard of... wish I could be more helpful and suggest a link.


Wuhan discussing "synthetically derived viruses":

https://web.archive.org/web/20200212011902/http://english.wh...

2017 conference at the Wuhan Institute of Virology with gain of function research being top priority:

http://web.archive.org/web/20200221213643/http://english.whi...

Ecohealth Alliance partnership:

https://web.archive.org/web/20210323171425/http://english.wh...

US Gov from state department:

http://web.archive.org/web/20210116001621/https://www.state....

EcoHealth Alliance Peter Daszak discussing gene editing in coronaviruses in december 2019 - https://www.youtube.com/watch?v=5-Y843FFJvI


My imagination has a lot of ideas about what this means. But I also dropped biology in high school and went into computers, so I'm very poorly armed to assess this.

Can someone tell me what the real implications of this are, and what the totally reasonable explanations might be?


> The presence in SARS-CoV-2 of a 19-nucleotide RNA sequence encoding an FCS at amino acid 681 of its spike protein with 100% identity to the reverse complement of a proprietary MSH3 mRNA sequence is highly unusual. Potential explanations for this correlation should be further investigated.


Sure, the implication is that you can be sued for catching COVID, for patent infringement.


No, the implication is that Moderna was involved in gain of function research that led to the creation of SARS-CoV-2 which explains why they happened to conveniently have a vaccine available right from the start. Which, if true, would be one of the greatest scandals in history.


I just tried to read it and got nothing out of it. The whole thing is above my head, lol.


> Conventional biostatistical analysis indicates that the probability of this sequence randomly being present in a 30,000-nucleotide viral genome is 3.21 × 10^-11

That seems like a pretty small number.


I don't find that calculation convincing, as it ignores that these viruses already have a furin cleavage site that by definition must be pretty similar to the sequence here. So the probability would have to be calculated as "how likely is it that a virus with a related furin cleavage site accumulates the number of mutations necessary to arrive at this particular optimized one". And even then, as this sequence provides a fitness advantage, a naive calculation could be seriously off as well.


The probability is also something of an argument from incredulity anyway: 10^-11, or you know, an occurrence rate of 10 after a trillion attempts.

In an "average" COVID-19 infection course in an adult human, it is estimated that at peak infection a person has 10^9 to 10^11 virion particles in their body alone [1]. Multiply by all the people infected, + all the animals, + parallel gene transfer with other viruses in the ecosystem...

[1] https://www.pnas.org/content/118/25/e2024815118
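
The arithmetic behind that framing, as a rough sketch. It treats every attempt as an independent draw with the same probability, which is a strong simplifying assumption. The function name is illustrative, and the probability is rounded to 10^-11 as in the comment.

```python
def expected_occurrences(probability_per_attempt, attempts):
    """Expected number of hits if each attempt succeeds independently with the same probability."""
    return probability_per_attempt * attempts

p = 1e-11  # order of magnitude used in the comment

print(expected_occurrences(p, 1e12))  # about 10 hits per trillion attempts
for virions in (1e9, 1e11):           # peak virion range quoted for one infected adult
    print(virions, expected_occurrences(p, virions))
```

How many truly independent attempts a population of near-identical virions represents is a separate question that this multiplication does not answer.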


Each virus copy isn’t a different random sequence though. They have to share most of their code to work.


Of course, but if you're going to make arguments about probability in nature, they only mean something compared to the attempt space.


You know, I really know nothing about this stuff, but this use of statistics (across many fields we see the odds of this occurring are X) kinda feels like nonsense. The odds of this occurring is 1. Because it did, and we found it occurred, and then went back and concocted these numbers. It’s like a poker hand you draw, and wow, the odds of getting this hand are minuscule. But the odds of getting a set of cards, if you draw them, is 1.

Going back and finding the “odds” of an event that already happened is mostly meaningless. Extremely unlikely things happen constantly, see any particular game of poker.

A stat that I think would have been interesting is the odds of ANY patented sequence appearing in this genome.


It is, though I'm not certain it means much. In fact it's misstated, since it's the probability that this exact sequence appears in 30,000 uniformly random base pairs and that it appears in a library of 24712 sequences of 3300 uniformly random base pairs (see Figure 2). Which to me doesn't say much. It's like pointing out it's very unlikely it would have been typed by a monkey mashing on a typewriter.

A somewhat more honest question would be to look at what they did, which is look at a length 12 sequence (and its complement) that they apparently found interesting and see if it showed up in a library of 24712 sequences with an average length of 3300.

You can estimate this probability by calculating the average number of matches you'd expect to find. This can be done by taking the number of positions a match could be in, approximately 24712*3300 - 24712*11, times 2 to account for its complement showing up, and multiply this by the chance two length 12 sequences will match, estimated by them at (1/4)^12 (in reality this probability should probably be higher since base pairs aren't uniformly random). Since this is an expected value we can ignore that matches at adjacent positions aren't independent.

This suggests you'd expect to find about 10 matches. So I'm not sure if they simply picked the longest one or if there was just the 1 match. You do get a lowish number of 0.000059 of matching sequences of length 19, that said this chance would be significantly higher when you know there exists a match of length 12, especially if the sequences aren't simply uniformly random.
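
The estimate described above can be sketched in a few lines, under the same assumptions the comment states: uniformly random bases, every library sequence at the average length, and both strands counted. The function name is illustrative.

```python
def expected_matches(n_sequences, seq_length, k, both_strands=True):
    """Expected number of exact k-mer matches in a library of uniformly random sequences."""
    positions = n_sequences * (seq_length - k + 1)  # start positions where a k-mer fits
    strands = 2 if both_strands else 1              # the query or its complement
    return positions * strands * 0.25 ** k          # (1/4)^k chance per position

print(expected_matches(24712, 3300, 12))  # roughly 10
print(expected_matches(24712, 3300, 19))  # far below 1
```

Because this is an expectation, it holds even though overlapping positions are not independent. With non-uniform base composition the per-position probability would differ from (1/4)^k, so the figures are order-of-magnitude guides at best.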


It’s misleading.

Imagine this same analogy for computer programs. Instead of genetic code, consider bits. Some author without understanding of how code works finds a small matching bit sequence in two different executables, then implies that the shared sequence is evidence of some conspiracy.

As a programmer you’d know that the bits of a program aren’t random because they actually do something specific for the program's execution, so it’s not surprising that different programs would share short bit sequences. But someone who doesn’t understand programming might assume it’s legitimate evidence of a conspiracy.

Same thing is happening here. Genetic code isn’t random because it corresponds to specific functions. It’s not surprising that different viruses would end up sharing bits of genetic sequence if they have similar functions.

The author is misusing statistics for random distributions on something that isn’t random. Don’t take them seriously.


In the world of StackOverflow driven software authorship, a particular sequence of source code instructions appearing in source code, on the presence of HTTP requests to that article, coupled with people doing work in that area?

Yeah... I don't need statistics for that one. The bloody tooling speaks, and the chart of human agency/interest speaks for itself. To be honest, I find it amazing the backflips people are doing to justify that there's no possible way that an interested grad student with the right tools would have tried something like this.

I'd have. Of course I avoid doing science in those types of areas because Murphy finds a way.


I'm a total layperson, but looks like virus sheddings are measured as 10^x copies/ml in serum, and 10^5.5 copies is a typical amount for COVID-19. 10^6ml is 1kL, or 1m^3(3x3x3ft). Doesn't look all that small to me.


Headline here on HN seems somewhat misleading, and I fear for how this is likely to be spun. In this context "patented" does not mean "engineered" or "artificial". This is the complement to a genuine human sequence we all have. It's 19 nucleotides (38 bits, at 2 bits per nucleotide) long, so unlikely (but not impossible in a cryptographic sense) to have arisen randomly. We do know viruses will pick up these sequences from their hosts though, so by itself this doesn't actually say much absent some statistics about how common that is.

Basically: please be careful here, a lot of people are going to want to treat this as a smoking gun, and it's not. It sure is interesting though.


This isn't a smoking gun by itself, but there sure are a lot of warm guns lying around this particular topic.


I’m having trouble getting the patent to load on my phone, but my interpretation of the “codon optimized” patented sequence is that although the sequence codes for a naturally occurring protein sequence, the RNA triplets that are actually used are not naturally occurring.

MolBio 102 for software engineers: each amino acid is coded for by 3 RNA bases. However, the number of possible RNA triplets (4^3=64) is greater than the number of amino acids (20), so some amino acids have multiple triplets assigned to them. For instance, both UUA and CUG code for the same amino acid (leucine), but one of them is more efficient to express. The implicit argument is that consistent preferential use of these optimal codons is evidence of genetic engineering rather than natural evolution.
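
To make the redundancy concrete, here is a small Python sketch. The codon table is deliberately partial (just a handful of entries from the standard genetic code), and the two RNA strings are invented for illustration; they are not taken from the patent or from the virus.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TO_AMINO_ACID = {
    "AUG": "Met",
    "UUA": "Leu", "UUG": "Leu", "CUU": "Leu",
    "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
}


def translate(rna: str) -> list[str]:
    """Translate an RNA string into amino acids, three bases at a time."""
    return [CODON_TO_AMINO_ACID[rna[i:i + 3]] for i in range(0, len(rna), 3)]


# Two made-up RNA strings that differ at the base level...
rna_a = "AUGUUAGGU"
rna_b = "AUGCUGGGC"

# ...but encode the same short peptide, because the codons are synonymous.
print(translate(rna_a), translate(rna_b))
```

Codon optimization amounts to choosing, for each amino acid, which of its synonymous triplets to write down; the protein is unchanged, only the nucleotide sequence differs.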


I would love some clarification on what "patented" does mean. For instance, I can find a "patent" here: https://seqdata.uspto.gov/?pageRequest=viewSequence&DocID=US... that categorizes it as an "Artificial Sequence".

Is this more like a naturally occurring sequence that has been documented before, isolated, and then claimed by the team that isolated it?


Could you elaborate on what “complement to a genuine human sequence “ means?


That some gene is patented in itself means nothing. At some point in the nineties the patent offices around the world started accepting patents for discovered gene sequences (i.e. sequences read out of some organism). This meant that patent rights could be granted for any drug targeting the protein the gene codes for, and not just a specific drug that does it. It does not mean that this is an invented DNA sequence. In fact it seems like one that is naturally occurring in humans. The fact that the sequence is patented is irrelevant, and should be ignored. It does not seem like the article makes much fuss about it either.


> "Conventional biostatistical analysis indicates that the probability of this sequence randomly being present in a 30,000-nucleotide viral genome is 3.21×10^−11".

This is just going to feed more conspiracy theory nutters.

First, a 1-in-100-billion chance is absolutely nothing. There have been hundreds of millions of infections in humans, each representing billions upon billions of replications of the virus. Every replication carries the chance of mutation. Given that,

Second, mutations, whether random or not, are subject to selective pressure and incremental progress. A mutation that moves a sequence just slightly in the right direction, coding for a slightly different but close protein, will be selected for. Basically, evolution optimizes some structures (and therefore sequences) by hill climbing, accepting mutations that improve fitness and discarding mutations that don't. It doesn't roll all the dice at once and start over every time.
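
Here is a toy Python sketch of the difference between hill climbing and rolling all the dice at once. Big caveat: it scores candidates against a fixed target string, which is an assumption made purely for illustration; real selection has no target and acts on fitness, not on similarity to a known sequence. The target used is the 19-nucleotide sequence quoted from the paper, and all function names are made up.

```python
import random

BASES = "ACGT"
TARGET = "CTCCTCGGCGGGCACGTAG"  # the 19-nt sequence quoted from the paper


def score(seq: list[str], target: str) -> int:
    """Toy 'fitness': number of positions that agree with the target."""
    return sum(a == b for a, b in zip(seq, target))


def hill_climb(target: str, rng: random.Random, max_steps: int = 100_000):
    """Mutate one position per step; keep the mutation unless it lowers the score."""
    current = [rng.choice(BASES) for _ in target]
    for step in range(max_steps):
        if score(current, target) == len(target):
            return "".join(current), step
        candidate = list(current)
        candidate[rng.randrange(len(target))] = rng.choice(BASES)
        if score(candidate, target) >= score(current, target):
            current = candidate
    return "".join(current), max_steps


final, steps = hill_climb(TARGET, random.Random(0))
print(f"{steps} single-base steps with selection, "
      f"versus {4 ** len(TARGET)} possible 19-mers to draw from blindly")
```

The point is only the order of magnitude: incremental accept/reject search needs a tiny number of steps compared with the size of the space that a one-shot random draw has to hit.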


> hundreds of millions of infections in humans

This article is of course based on an early sequence of SARSCov2, from before the pandemic spread.

Which means you didn't even begin to understand the question before spouting off, and everything you say on this subject can be disregarded. Do better.


> an early sequence of SARSCov2

Which apparently had never replicated before and thus was not subject to evolution...? My point was just to illustrate how many trillions of replications happen for viruses in the wild. 100 billion is absolutely nothing for biology.

The snide ad hominem that makes up the rest of your comment doesn't belong here. If you had just left it out, this would be a conversation and not an unnecessary source of anger and stress.


> snide ad hominem

You came straight out the gate with "conspiracy nutters", then typed two paragraphs of misleading nonsense which did nothing to further the conversation.

This isn't a dialogue, this is me tagging you for others' sake as someone to ignore on this subject. Which is an important subject.


> You came straight out the gate with "conspiracy nutters"

I stand by everything I wrote. It will definitely feed the conspiracy theory people. I was on r/conspiracy last night and saw a link to a similar article that had identified the stop sequence of the Moderna vaccine, part of the non-protein-coding RNA, as a patented sequence. Well, of course the stop sequence could be patented. It's non-coding DNA. The comments were filled with totally bizarre ranting by people who literally had no understanding.

> two paragraphs of misleading nonsense

Wonderful dynamic here. If you want to actually engage with the ideas I wrote above, please do, but I'd rather assert that the entirety of what you have written here is subtractive and unhelpful. And you're literally and by your own admission not trying to "further the conversation" but stamp it out and have literally zero constructive dialog to offer other than "this person is crazy."

> This isn't a dialogue

So good, please don't continue to reply then. There's [-] for you to click on. You're welcome to downvote. It'll be less effort than firing off half-cocked, completely misunderstanding what I wrote, and creating a stinky atmosphere which devalues all of HN.


Thanks for doing that!!!


Having no idea what this means, I checked reddit. I found this comment [0] which I think says this could have happened naturally, but it uses a lot of words that sound like a Star Trek episode, so maybe someone can translate it?

[0]: https://old.reddit.com/r/science/comments/sysag6/msh3_homolo...


It only sounds like Star Trek gibberish because it's not IT jargon. It's some new kids on the block out there, our time has peaked.


Not new kids—more like different kids. Think Biohacker News vs Hacker News.



I looked a while for this. Thank you, I really mean it.


We get stories like this about once every other week. Here's a recent one, with some expert contributions:

https://news.ycombinator.com/item?id=30279180


Does that make them any less accurate or valid? (Sincere question not rhetorical)


They tend to glue together a scientific fact and some vague conspiracy-esque implied conclusions that are anywhere from debatable to entirely nonsense.

In this case, it’s accurate to say that the same sequence found in SARS-CoV-2 is indeed found in a patent. That’s valid and accurate.

The claim that the occurrence is an impossibly unlikely random chance is the flawed part, though. Genetic fragments aren’t actually random because they correspond to actual functions of the virus, and it shouldn’t be surprising that viruses with similar functions have similar genetic code fragments.

It’s like if someone examined the binaries of two executables from different companies, identified a small sequence of bits that was common to both, then tried to imply that they were secretly written by the same author because it’s statistically unlikely for that random sequence of bits to appear twice. As a programmer you’d immediately dismiss the claim because that small sequence of bits could be common code like a standard library part or a common algorithm. But the claim may appear to be proof of a conspiracy to the uneducated.

That’s basically what’s going on with this article, but swap bits for genetic code.
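
A tiny Python sketch of the executable analogy, for the curious. The two byte strings are invented stand-ins for unrelated binaries that happen to contain the same small routine (the shared bytes are a typical-looking x86-64 function prologue and return, used here only as an example), and the standard-library difflib module finds the longest run they have in common.

```python
from difflib import SequenceMatcher

# Invented example: a short routine that both "programs" happen to contain.
SHARED_ROUTINE = bytes.fromhex("554889e55dc3")

# Two made-up, otherwise unrelated byte blobs standing in for executables.
program_a = bytes.fromhex("9090") + SHARED_ROUTINE + bytes.fromhex("cc")
program_b = bytes.fromhex("31c0") + SHARED_ROUTINE + bytes.fromhex("ebfe")

# autojunk=False turns off difflib's heuristic that ignores very common items.
matcher = SequenceMatcher(None, program_a, program_b, autojunk=False)
match = matcher.find_longest_match(0, len(program_a), 0, len(program_b))

common = program_a[match.a:match.a + match.size]
print(f"longest shared run: {match.size} bytes ({common.hex()})")
```

Finding that shared run says nothing about common authorship; it only says both blobs contain code that does the same job.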


Excellent explanation. Thank you.


You should have a look at the linked thread to see what people had to say about a previous, similar story.


No, he shouldn't. There's no point in giving any value to anything argued here in news.ycombinator.com (a computer programming, technology entrepreneur forum) related to a genomics/biology subject. People here only argue to argue, and the level of discourse in these subjects is terrible, on both "sides" (the fact that people take sides is even a demonstration of how terrible the discourse is).

What has to be done with this scientific paper from a scientific journal is to read it, and maybe go to scholar.google.com to look for its references or other sibling papers. And maybe go to science.reddit.com, askscientists.reddit.com, or any other internet forum where people educated in the subject interact, to get some good quality dialogue.

Over here at HN? We are just speculating.


On the other hand, even in this very thread are biologists with many years of experience.


Yes and they are frustrating to see as someone with some understanding of molecular biology. The science is so far very clear cut that there is no evidence for a lab engineered virus and only circumstantial evidence for a lab leak (but plausible).

But people keep falling back on political arguments which have nothing more compelling than an argument from some authority. Programmers can see that the media and govt lie when there is a “hack” which was just a result of poor management, and extend that to literally anything reported on by the news.


This really can't be taken seriously if they don't even bother addressing the BANAL-52 or RmYN02 sequences which very nearly have a functioning FCS:

https://ibb.co/T1YtShy

There is a QTQTN deletion in circulating SARS-CoV-2 at the flanking region at 675-679 which has been observed in circulating human strains at the level of a few percent along with mouse-adapted strains. RmYN02 is closer in nucleotide sequences to those variants of SARS-CoV-2 than to ancestral (the SARS-CoV-2 variants carrying that deletion are not shown in that figure).

If there are variants of RmYN02 which retain the QTQT sequence at 675-679 then that virus would be one amino acid insertion away from a functioning QTQTNSPRAAR FCS sequence. Then flipping another amino acid would give the SARS-CoV-2 sequence. The relative probably would be discovered with a higher level of surveillance and sequencing of sarbecoviruses in wildlife.

And the rest of the argument in this paper is some weird genetic numerology.


I am not a molecular biologist or virologist, far from it. I am reading the manner in which people who are those things are responding. Some quite scientifically, and others dogmatically. The one tidbit of information I had heard about the FCS was that the SARS-COV-2 had the exact 12-nucleotide sequence, no more no less, so it was an exact match for an FCS proposed for study that was rejected by the DoD, DARPA, or some other U.S. government agency. I believe this is not the same in the other SARS-COV viruses, at least not exactly 12, but found somewhere along a sequence. Can someone here clarify this for me as a layman? I'd appreciate it. It is fascinating to me, and as a result I just picked up an Introduction to Genomics book by Arthur Lesk. Thank you.


Excellent news! Now patent lawyers will sue SARS-CoV-2 to oblivion and end this illegal pandemic. /s


If this occurred naturally, shouldn’t that automatically invalidate the patent?


I think the point is that it might not have occurred naturally in this case.


Could part of it have existed naturally in the past, and the patent captured this part? Surely nobody is writing the entire organism from scratch; they're still making tweaks to natural sequences?


Isn't your statement dangerous?


So now we are back to the idea that this coming from a lab is a dangerous idea that shouldn't be talked about?

It was something to consider last summer but now it is a dangerous idea again.

We have become such a total joke intellectually.


No. We should consider all non-impossible hypotheses, weighted by their prior probabilities. What's dangerous is to proceed with sanctions against China without having an airtight case (with multiple cross-consistent supporting data inspected by many parties). Nobody has made an airtight case that the virus was human-engineered (for whatever reason) or leaked (intentionally or accidentally). And it's most likely any evidence that did exist has been scrubbed.

However, that's all speculation, not fact.


Yes. That’s why so few dare think it.


No, it's not that they don't dare to think it, it's that they realize it is stupid.


The patent in question: https://patents.google.com/patent/US9587003B2/en

> 14. The method of claim 13, wherein the route of administration is intramuscular

This does not match how SARS-CoV-2 is administered.


I asked this exact question 20 years ago when I learned artificial (IE, synthetic) sequences could be patented. Eventually somebody would patent something with "prior art" existing in nature.


Do we have to pay license fees now, if we already enjoyed SARS-CoV-2? SCNR


Someone call Moderna. They filed the patent so I’m sure they will clarify


I thought this claim was BS but indeed you're right.

https://patents.google.com/patent/US9587003B2/en


Generally, the reverse complement of a translated sequence is uninteresting. The function is not encoded in the reverse complement.
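
For readers who haven't met the term: the reverse complement is what you read off the opposite DNA strand, in the opposite direction. A minimal Python sketch, using the 19-nucleotide sequence quoted from the paper at the top of the thread (the function name is just for illustration):

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]


seq = "CTCCTCGGCGGGCACGTAG"  # sequence quoted from the paper
rc = reverse_complement(seq)
print(seq, "->", rc)
```

Applying the function twice gives back the original, which is why a search tool like BLAST reports hits on either strand: the same double-stranded stretch can be written down both ways, even though only one reading carries the coding information.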


Some things are so clearly obvious (I don't think "obvious" is the exact word I need) that to consider them property of the intellectual kind is silly.

I'm thinking about certain mathematically expressed concepts and ideas which are nonetheless patented and treated as property nowadays; I don't see why DNA would be any different.

It's silly for anybody to "own" (in the exclusive, royalty-earning way it is done today) a gene sequence.

I think that whoever has it already owns it, and these things can be, and in reality are, non-exclusively owned. Because what is next? "oh you have so and so patented genes hence you must pay the owners royalties?"

I like thinking about who really owns the English (or any natural) language to try and make sense of how it ought to be....

Then again, who wouldn't like to come up with some idea and then keep getting paid for it until they die? This is similar to rent-seeking; it is rent-found.


>oh you have so and so patented genes hence you must pay the owners royalties?"

This is exactly how new plant foods are created. Here is a potato as an example:

https://ctl.cornell.edu/wp-content/uploads/plants/Cornell_po...


[flagged]


The article isn't about the vaccine, it's about the RNA sequence directly in COVID.


Someone call Moderna, they filed the patent so I’m sure they will clarify


David Martin went to EUPACO in 2007 and predicted the 2008 crisis there.

His company M-Cam has copied all the patents in the world and makes intelligence reports based on them.

They made an analysis about all the DNA sequences of SARS-CoV-2, and came to the conclusion that it was manipulated by man.

Someone would have to request the scripts and the patents from them to be able to reproduce their assessment, which was flagged as 'complotist'.

https://un-denial.com/2021/07/20/dr-david-martin-covid-is-a-...



