[R] Covid Mutations: Cumulative?

Leonard Mada |eo@m@d@ @end|ng |rom @yon|c@eu
Tue Jan 31 00:37:30 CET 2023


Dear R-Users,

Has anyone followed the SARS-CoV-2 lineages more closely?

I did a quick check of the SARS-CoV-2 mutations in the list downloaded 
from NCBI (see the GitHub page below). The list seems to contain the 
cumulative mutations only for B.1 => B.1.1, but not beyond the B.1.1 branch:
# B.1 => B.1.1 seems cumulative
diff.lineage("B.1.1", "B.1", data=z)
# but B.1.1 => B.1.1.529 is NOT cumulative anymore;
diff.lineage("B.1.1.529", "B.1.1", data=z)
diff.lineage("B.1.1.529", "BA.2", data=z)
diff.lineage("B.1.1.529", "BA.5", data=z)

# Column id: B(oth) = present in both lineages:
         V   Mutation    P    AA Pos AAi AAm Polymorphism id
899 B.1.1 nsp3:F106F nsp3 F106F 106   F F         TRUE  B
900 B.1.1 RdRp:P323L RdRp P323L 323   P L        FALSE  B
901 B.1.1    S:D614G    S D614G 614   D G        FALSE  B
902 B.1.1    N:R203K    N R203K 203   R K        FALSE  1
903 B.1.1    N:R203R    N R203R 203   R R         TRUE  1
904 B.1.1    N:G204R    N G204R 204   G R        FALSE  1
896   B.1 nsp3:F106F nsp3 F106F 106   F F         TRUE  B
897   B.1 RdRp:P323L RdRp P323L 323   P L        FALSE  B
898   B.1    S:D614G    S D614G 614   D G        FALSE  B
# B.1.1.529 and branches do not have any of the defining mutations of B.1.1;
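
For readers without the GitHub file at hand, the comparison above can be approximated as follows. This is only a minimal sketch, not the actual diff.lineage() from the repository: the function name diff.lineage.sketch and the label "2" (present only in the second lineage) are assumptions made for illustration, and the toy data frame simply reuses the B.1 and B.1.1 rows printed above.

```r
# Hypothetical re-implementation for illustration only;
# NOT the diff.lineage() function from the GitHub file.
# Assumes a data frame with the columns V (lineage) and Mutation.
diff.lineage.sketch <- function(l1, l2, data) {
    m1 <- unique(data$Mutation[data$V == l1])
    m2 <- unique(data$Mutation[data$V == l2])
    mut <- union(m1, m2)
    # "B" = in both lineages, "1" = only in l1, "2" = only in l2 (assumed labels)
    id <- ifelse(mut %in% m1, ifelse(mut %in% m2, "B", "1"), "2")
    data.frame(Mutation = mut, id = id, stringsAsFactors = FALSE)
}

# Toy data: the B.1.1 and B.1 rows from the output above
z.toy <- data.frame(
    V = c(rep("B.1.1", 6), rep("B.1", 3)),
    Mutation = c("nsp3:F106F", "RdRp:P323L", "S:D614G",
                 "N:R203K", "N:R203R", "N:G204R",
                 "nsp3:F106F", "RdRp:P323L", "S:D614G"),
    stringsAsFactors = FALSE)

res.diff <- diff.lineage.sketch("B.1.1", "B.1", data = z.toy)
print(res.diff)
```

If a sub-lineage is cumulative, every mutation of its parent should be labelled "B" and none should be labelled "2".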

I have uploaded the code on GitHub:
https://github.com/discoleo/R/blob/master/Stat/Infx/Cov2.Variants.R

1.) Does anyone have a better picture of what is going on?
The sub-variants should carry cumulative mutations: a sub-lineage 
inherits the mutations of its parent. I infer this both from the naming 
logic of the sub-lineages and from the data posted on the pango 
GitHub page:
https://github.com/cov-lineages/pango-designation/issues/361


2.) Cumulative List

It may be that NCBI kept only the new mutations for each lineage once 
the number of mutations grew.


Does anyone know if there is a full cumulative list?
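
If the file really stores only the new mutations of each lineage, a cumulative list could in principle be rebuilt by walking up the lineage name. The following is a minimal sketch under that assumption. The function names parent.lineage and cumulative.mutations are hypothetical, the lineage names must already be fully expanded (no aliases such as BA), and the toy table is a rearrangement of the B.1 / B.1.1 rows shown earlier, not real NCBI content. Recombinant lineages and reverted mutations are ignored.

```r
# Parent of a fully expanded Pango name: drop the last dot-separated level.
# Returns NA for a root name without any dot (e.g. "B").
parent.lineage <- function(x) {
    if (!grepl(".", x, fixed = TRUE)) return(NA_character_)
    sub("\\.[^.]+$", "", x)
}

# Union of the mutations listed for a lineage and for all its ancestors.
# Assumes a data frame with the columns V (lineage) and Mutation.
cumulative.mutations <- function(lineage, data) {
    mut <- character(0)
    while (!is.na(lineage)) {
        mut <- union(mut, data$Mutation[data$V == lineage])
        lineage <- parent.lineage(lineage)
    }
    mut
}

# Toy table storing ONLY the new mutations per lineage (illustrative layout)
z.new <- data.frame(
    V = c(rep("B.1", 3), rep("B.1.1", 3)),
    Mutation = c("nsp3:F106F", "RdRp:P323L", "S:D614G",
                 "N:R203K", "N:R203R", "N:G204R"),
    stringsAsFactors = FALSE)

res.cum <- cumulative.mutations("B.1.1", data = z.new)
print(res.cum)
```

The walk for "B.1.1" visits B.1.1, B.1 and B, so the result combines the three new B.1.1 mutations with the three inherited from B.1.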

Alternatively, there might be a list or a package with the full lineage 
encoding. There is such a list on the Pango GitHub project, but I was 
hoping to skip at least this step; the synonyms in the NCBI file seem 
even uglier to process.
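
To make the "lineage encoding" step concrete: an alias such as BA.2 has to be expanded to its full name (BA stands for B.1.1.529) before the hierarchy can be read from the name. A minimal sketch, assuming a named character vector of aliases; the vector below is a hypothetical stand-in with a single entry, not the actual Pango list, and expand.lineage is an illustrative name:

```r
# Hypothetical alias table; only the BA entry is filled in here.
# A real table would have to be built from the Pango alias list.
alias <- c(BA = "B.1.1.529")

# Replace the alias prefix (the part before the first dot) by its full name.
# Names whose prefix is not in the alias table are returned unchanged.
expand.lineage <- function(x, alias) {
    prefix <- sub("\\..*$", "", x)              # e.g. "BA" from "BA.2"
    rest <- substring(x, nchar(prefix) + 1)     # e.g. ".2" (or "" if no dot)
    isAlias <- prefix %in% names(alias)
    prefix[isAlias] <- alias[prefix[isAlias]]
    paste0(prefix, rest)
}

res.full <- expand.lineage(c("BA.2", "BA.5", "B.1.1"), alias)
print(res.full)
```

With the expanded names, BA.2 and BA.5 become explicit children of B.1.1.529, which is what a comparison against B.1.1 needs.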


Note:

This question may be better suited to Bioconductor, but I haven't 
found any real Covid packages there.


Thank you very much for any help.


Sincerely,


Leonard


