Knowing nothing about biotech – if Moderna and Pfizer were working from the same sequencing data, why would their resulting vaccine mRNA sequences be different? Even slightly?
Edit: I guess what I'm asking is: presumably these vaccines both target the spike protein. Do both of these sequences express the same protein? Or is there a "close enough!" thing in the immune system, where it can be a little different and still be targeted by the immune system?
The sequence can be changed and optimized for several reasons:
* There are untranslated regions (UTR) that could influence the regulation or stability of the mRNA.
* Since most amino acids are encoded by more than one codon, the coding region for the spike protein can be codon optimized. Altering the codon composition can improve protein expression.
* Likewise, enrichment of G:C content in the mRNA sequence might result in increased mRNA and expressed protein yields in vivo.
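To make the last two points concrete, here is a toy sketch in Python. It assumes a deliberately naive rule (always pick the synonymous codon with the most G/C bases) and a partial codon table covering only four amino acids; `enrich_gc` and the example sequence are made up for illustration. Real codon optimization also weighs codon usage, secondary structure and more, so this is not how either company did it.

```python
# Partial codon table (RNA alphabet), limited to the amino acids used below.
SYNONYMS = {
    "L": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "S": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "P": ["CCU", "CCC", "CCA", "CCG"],
    "K": ["AAA", "AAG"],
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def gc_count(seq):
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return gc_count(seq) / len(seq)


def translate(rna):
    """Translate a coding sequence codon by codon (no start/stop handling)."""
    return "".join(CODON_TO_AA[rna[i:i + 3]] for i in range(0, len(rna), 3))


def enrich_gc(rna):
    """Naive assumption: swap each codon for its most GC-rich synonym."""
    codons = []
    for i in range(0, len(rna), 3):
        amino_acid = CODON_TO_AA[rna[i:i + 3]]
        codons.append(max(SYNONYMS[amino_acid], key=gc_count))
    return "".join(codons)


original = "UUAUCUCCAAAA"        # made-up sequence encoding L-S-P-K
optimized = enrich_gc(original)

print(original, translate(original), gc_fraction(original))
print(optimized, translate(optimized), gc_fraction(optimized))
```

The two RNA strings differ at several positions and in G:C content, yet `translate` returns the same amino acid string for both.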
> Do both of these sequences express the same protein?
In this case both vaccines express exactly the same amino acid sequence.
> Or is there a "close enough!" thing in the immune system, where it can be a little different and still be targeted by the immune system?
It depends on how different the sequence is. If only a few amino acids differ, the immune response should be very similar, because the three-dimensional conformation of the spike protein stays largely the same. This is why the vaccines can be effective against several SARS-CoV-2 variants.
The sequences are different because they are codon optimized differently. See https://en.wikipedia.org/wiki/Codon_usage_bias, especially the "Effect on transcription or gene expression" section.
That is a super interesting article, thank you for posting it!
But, I guess my question is more about why the abstraction of "protein chunks" doesn't fall apart when there are relatively significant "diffs" in the RNA sequence.
The most significant diffs between the two vaccines occur in the untranslated regions flanking the protein coding sequence. Those regions are never translated, so they never show up in the actual spike protein.
Regarding the protein coding region, because of the degeneracy/redundancy of the genetic code, all changes within it are synonymous: the codons differ, but they code for identical amino acids.
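If it helps to see why the abstraction holds, here is a rough sketch in Python. It uses the standard genetic code and a simplified model of translation (start at the first AUG, stop at the first stop codon; real initiation is more involved). The two mRNAs are invented toy sequences, not the vaccine sequences, and `translate` is just a helper name for this example.

```python
from itertools import product

# Standard genetic code, in the conventional U, C, A, G ordering ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Simplified model: read from the first AUG up to the first stop codon."""
    start = mrna.find("AUG")
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Toy mRNAs: 5' UTR + coding sequence + 3' UTR (all invented).
mrna_a = "GGGACC" + "AUGUUUGUUUUUCUUUAA" + "GCUAAA"
mrna_b = "ACUCACC" + "AUGUUCGUGUUCCUGUGA" + "CCUUGAGC"

print(translate(mrna_a))
print(translate(mrna_b))
```

The UTRs differ in both sequence and length, and four of the five sense codons plus the stop codon differ too, but both calls return the same peptide, because only the reading frame between start and stop is translated and every codon change is synonymous.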
That is a fascinating read (and the perfect level of depth in this field for me). How did you happen across it? I'm always looking to add a good source to my RSS feed list.