Choice of Target in the Genomes of Prototypic Strains to Recognize Subgenus of Coronaviruses
Chaley M.B.1, Kutyrkin V.A.2
1Institute of Mathematical Problems of Biology RAS, Branch of the Keldysh Institute of Applied Mathematics RAS, Pushchino, Russia
2Moscow State Technical University n.a. N.E. Bauman, Moscow, Russia
Abstract. A targeted approach is proposed for recognizing the subgenus of a coronavirus from the codon frequency distribution in the N-gene, which encodes the nucleocapsid protein. In this approach, a statistic is used to calculate the deviation of the codon frequency distribution in the N-gene of the analyzed coronavirus genome from the corresponding distributions of 67 prototypic strains, which characterize 23 subgenera in the four coronavirus genera. The smallest deviation, attained for a particular prototypic strain, points to the subgenus to which that strain belongs. The approach proved effective and recognizes the coronavirus subgenus with a significance of at least 99%. Sets of 38 and of 7 codons that provide the required level of efficiency were selected from all codons of the genetic code according to their frequency distributions. The codons in these sets capture the taxonomic structure of the coronavirus subgenera.
Key words: coronavirus subgenus, targeted approach, prototypic strains of coronavirus, coronavirus N-gene, N-gene codon frequency distribution.
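
The recognition scheme outlined in the abstract can be illustrated by a minimal sketch. The abstract does not specify the statistic used, so the symmetric chi-square-like deviation below is only an assumption chosen for illustration, and the names `codon_frequencies`, `deviation`, and `recognize_subgenus`, as well as the layout of the `prototypes` dictionary, are hypothetical. The optional `codons` argument shows how the comparison might be restricted to a selected set of codons; the actual sets of 38 and 7 codons are not reproduced here.

```python
from collections import Counter
from itertools import product

# All 64 codons of the genetic code over the DNA alphabet.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(seq):
    """Relative codon frequencies of an in-frame coding sequence (e.g. an N-gene)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    triplets = (seq[i:i + 3] for i in range(0, usable, 3))
    # Triplets with ambiguous symbols (e.g. N) are skipped.
    counts = Counter(t for t in triplets if t in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid codons")
    return {c: counts[c] / total for c in CODONS}


def deviation(query, proto, codons=CODONS):
    """Assumed statistic: symmetric chi-square-like distance between two distributions."""
    d = 0.0
    for c in codons:
        s = query[c] + proto[c]
        if s > 0:
            d += (query[c] - proto[c]) ** 2 / s
    return d


def recognize_subgenus(n_gene, prototypes, codons=CODONS):
    """Return (subgenus, strain, deviation) for the closest prototypic strain.

    `prototypes` maps a strain name to a pair (subgenus, codon frequency dict).
    """
    query = codon_frequencies(n_gene)
    scored = {
        strain: deviation(query, freqs, codons)
        for strain, (_, freqs) in prototypes.items()
    }
    best = min(scored, key=scored.get)
    return prototypes[best][0], best, scored[best]
```

In use, `prototypes` would hold one entry per prototypic strain, built by applying `codon_frequencies` to that strain's N-gene sequence; the subgenus reported for a query is the one attached to the strain with the smallest deviation.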