Volume 15   Issue 2   Year 2020
Dotsenko G.S., Dotsenko A.S.

Conserved Peptides Recognition by Ensemble of Neural Networks for Mining Protein Data – LPMO Case Study

Mathematical Biology & Bioinformatics. 2020;15(2):429-440.

doi: 10.17537/2020.15.429.

References

  1. Ijaq J., Chandrasekharan M., Poddar R., Bethi N., Sundararajan V.S. Annotation and curation of uncharacterized proteins – challenges. Frontiers in Genetics. 2015;6:119. doi: 10.3389/fgene.2015.00119
  2. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. Journal of Molecular Biology. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2
  3. Pertsemlidis A., Fondon III J.W. Having a BLAST with bioinformatics (and avoiding BLASTphemy). Genome Biology. 2001;2. Article No. reviews2002. doi: 10.1186/gb-2001-2-10-reviews2002
  4. Tian W., Skolnick J. How well is enzyme function conserved as a function of pairwise sequence identity? Journal of Molecular Biology. 2003;333:863–882.
  5. Yoon B.-J. Hidden Markov models and their applications in biological sequence analysis. Current Genomics. 2009;10:402–415. doi: 10.2174/138920209789177575
  6. Choo K.H., Tong J.C., Zhang L. Recent applications of hidden Markov models in computational biology. Genomics, Proteomics and Bioinformatics. 2004;2:84–96. doi: 10.1016/S1672-0229(04)02014-5
  7. HMMER: Biosequence Analysis Using Profile Hidden Markov Models. http://hmmer.org/ (accessed 01.09.2020).
  8. El-Gebali S., Mistry J., Bateman A., Eddy S.R., Luciani A., Potter S.C., Qureshi M., Richardson L.J., Salazar G.A., Smart A., Sonnhammer E.L.L., Hirsh L., Paladin L., Piovesan D., Tosatto S.C.E., Finn R.D. The Pfam protein families database in 2019. Nucleic Acids Research. 2019;47(Database Issue):D427–D432. doi: 10.1093/nar/gky995
  9. Sigrist C.J.A., de Castro E., Cerutti L., Cuche B.A., Hulo N., Bridge A., Bougueleret L., Xenarios I. New and continuing developments at PROSITE. Nucleic Acids Research. 2013;41(Database Issue):D344–D347. doi: 10.1093/nar/gks1067
  10. Busk P.K., Lange L. Function-based classification of carbohydrate-active enzymes by recognition of short, conserved peptide motifs. Applied and Environmental Microbiology. 2013;79:3380–3391. doi: 10.1128/AEM.03803-12
  11. Busk P.K., Lange M., Pilgaard B., Lange L. Several genes encoding enzymes with the same activity are necessary for aerobic fungal degradation of cellulose in nature. PLoS ONE. 2014;9:e114138. doi: 10.1371/journal.pone.0114138
  12. Busk P.K., Pilgaard B., Lezyk M.J., Meyer A.S., Lange L. Homology to peptide pattern for annotation of carbohydrate-active enzymes and prediction of function. BMC Bioinformatics. 2017;18:214. doi: 10.1186/s12859-017-1625-9
  13. Lu S., Wang J., Chitsaz F., Derbyshire M.K., Geer R.C., Gonzales N.R., Gwadz M., Hurwitz D.I., Marchler G.H., Song J.S., Thanki N., Yamashita R.A., Yang M., Zhang D., Zheng C., Lanczycki C.J., Marchler-Bauer A. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Research. 2020;48(Database Issue):D265–D268. doi: 10.1093/nar/gkz991
  14. Agger J.W., Busk P.K., Pilgaard B., Meyer A.S., Lange L. A new functional classification of glucuronoyl esterases by peptide pattern recognition. Frontiers in Microbiology. 2017;8:309. doi: 10.3389/fmicb.2017.00309
  15. Busk P.K., Lange L. Classification of fungal and bacterial lytic polysaccharide monooxygenases. BMC Genomics. 2015;16:368. doi: 10.1186/s12864-015-1601-6
  16. Hemsworth G.R., Johnston E.M., Davies G.J., Walton P.H. Lytic polysaccharide monooxygenases in biomass conversion. Trends in Biotechnology. 2015;33:747–761. doi: 10.1016/j.tibtech.2015.09.006
  17. Johansen K.S. Lytic polysaccharide monooxygenases: the microbial power tool for lignocellulose degradation. Trends in Plant Science. 2016;21:926–936. doi: 10.1016/j.tplants.2016.07.012
  18. CAZy, carbohydrate-active enzymes database. http://www.cazy.org/ (accessed 01.09.2020).
  19. NCBI protein database. https://www.ncbi.nlm.nih.gov/protein/ (accessed 01.09.2020).
  20. UniProt Database. https://www.uniprot.org/ (accessed 01.09.2020).
  21. Sievers F., Wilm A., Dineen D.G., Gibson T.J., Karplus K., Li W., Lopez R., McWilliam H., Remmert M., Söding J., Thompson J.D., Higgins D.G. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology. 2011;7:539. doi: 10.1038/msb.2011.75
Original Article
published in English

