Volume 18   Issue 2   Year 2023
Performance Analysis of Cross-Assembly of Metatranscriptomic Datasets in Viral Community Studies

Bukin Yu.S., Bondaryuk A.N., Butina T.V.

Limnological Institute Siberian Branch of the Russian Academy of Sciences, Irkutsk, Russia


Abstract. We compared individual assemblies and cross-assemblies of metatranscriptomic datasets for studying viral communities, using several metatranscriptomes of endemic Baikal mollusks. Compared with individual dataset assemblies, a Hidden Markov Model-based cross-assembly procedure increases the number of viral contigs (or scaffolds) per sample, the number of virotypes identified, and the average scaffold length per sample. The proportion of reads assembled into viral scaffolds, out of the total number of reads in a sample, is also higher in cross-assembly. De novo cross-assembly combined with a Hidden Markov Model-based virus identification algorithm yields a table that gives, for each scaffold, the number of reads from each sample. This table allows samples to be compared by the representation of all viral scaffolds, including those that are not taxonomically identified, i.e. those with no analogues in the NCBI RefSeq database. Cross-assembly thus enables comparative analyses that take the latent diversity of viruses into account. We propose a pipeline for metatranscriptomic data analysis that uses de novo cross-assembly to study viral diversity.
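The per-scaffold read table described in the abstract can be illustrated with a minimal sketch. The scaffold names, sample names, and read counts below are hypothetical, and Bray-Curtis dissimilarity is chosen here only as one common way to compare samples; the abstract does not state which measure the authors use. The point is that an unclassified scaffold contributes to the comparison just like a taxonomically identified one.

```python
# Hypothetical read-count table from a cross-assembly:
# rows are viral scaffolds, columns are samples (all values are made up).
counts = {
    "scaffold_1": {"sample_A": 60, "sample_B": 30, "sample_C": 0},
    "scaffold_2": {"sample_A": 40, "sample_B": 30, "sample_C": 0},
    # No RefSeq analogue: taxonomically unidentified, but still counted.
    "scaffold_3_unclassified": {"sample_A": 0, "sample_B": 40, "sample_C": 50},
}


def relative_abundance(table, sample):
    """Convert one sample's read counts to fractions of its viral reads."""
    total = sum(row[sample] for row in table.values())
    if total == 0:
        return {scaffold: 0.0 for scaffold in table}
    return {scaffold: row[sample] / total for scaffold, row in table.items()}


def bray_curtis(table, sample_x, sample_y):
    """Bray-Curtis dissimilarity on relative abundances (0 = identical, 1 = disjoint)."""
    px = relative_abundance(table, sample_x)
    py = relative_abundance(table, sample_y)
    shared = sum(min(px[s], py[s]) for s in table)
    return 1.0 - shared


if __name__ == "__main__":
    samples = ["sample_A", "sample_B", "sample_C"]
    for i, x in enumerate(samples):
        for y in samples[i + 1:]:
            print(x, y, round(bray_curtis(counts, x, y), 3))
```

Normalizing to relative abundance before comparison is an assumption made here to offset differences in sequencing depth between samples; other normalizations could be substituted without changing the structure of the table.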



Key words: metagenomics, transcriptomics, viruses, viral communities, metagenomic assembly, cross-assembly, metatranscriptomic analysis, viral scaffolds.

Original Article
Math. Biol. Bioinf.
2023;18(2):418-433
doi: 10.17537/2023.18.418
published in Russian


 
