Volume 11   Issue 1   Year 2016
Smetanin Y.G., Ulyanov M.V., Pestova A.S.

Entropy Approach to the Construction of a Measure of Word Symbolic Diverseness and its Application to Clustering of Plant Genomes

Mathematical Biology & Bioinformatics. 2016;11(1):114-126.

doi: 10.17537/2016.11.114.



  1. Lothaire M. Algebraic Combinatorics of Words. Cambridge (UK): Cambridge Univ. Press; 2002. 455 p. doi: 10.1017/CBO9781107326019
  2. Lind D., Marcus B. An introduction to symbolic dynamics and coding. Cambridge (UK): Cambridge Univ. Press; 1995. 495 p. doi: 10.1017/CBO9780511626302
  3. Shannon C.E. A mathematical theory of communication. Bell Syst. Techn. Journ. 1948;XXVII(3):379-423. doi: 10.1002/j.1538-7305.1948.tb01338.x
  4. Shannon C.E. A mathematical theory of communication. Bell Syst. Techn. Journ. 1948;XXVII(4):623-656. doi: 10.1002/j.1538-7305.1948.tb00917.x
  5. Kolmogoroff A.N. Obscaja teorija dinamiceskih sistem i klassiceskaja mehanika (General theory of dynamical systems and classical mechanics). In: Proceedings of the International Congress of Mathematicians 1954 held at Amsterdam under the auspices of the Wiskundig Genootschap. Groningen: Erven P. Noordhoff N.V.; Amsterdam: North-Holland Publishing Co.; 1957. P. 315-333 (in Russ.).
  6. Khinchin A.Ya. The concept of entropy in the theory of probability. Uspekhi Mat. Nauk. 1953;3(55):3-20 (in Russ.).
  7. Martin N., Inglend Dzh. Matematicheskaia teoriia entropii. Moscow: Mir; 1988. 350 p. (Translation of: Martin N.F., England J.W. Mathematical Theory of Entropy. 1984. 257 p.).
  8. Smetanin Y.G., Ulyanov M.V. Reconstruction of a Word from a Finite Set of its Subwords under the Unit Shift Hypothesis. I. Reconstruction without Forbidden Words. Cybernetics and Systems Analysis. 2014;50(1):148-156. doi: 10.1007/s10559-014-9602-z
  9. Wootton J.C., Federhen S. Analysis of compositionally biased regions in sequence databases. Methods Enzymol. 1996;266:554-571. doi: 10.1016/S0076-6879(96)66035-2
  10. Gusev V.D., Kulichkov V.A., Chupakhina O.M. Complexity analysis of genomes. I. Complexity and classification methods of detected structural regularities. Mol. Biol. (Mosk). 1991;25(3):825-834.
  11. Gusev V.D., Kulichkov V.A., Chupakhina O.M. The Lempel-Ziv complexity and local structure analysis of genomes. Biosystems. 1993;30(1-3):183-200. doi: 10.1016/0303-2647(93)90070-S
  12. Kislyuk O.S., Borovina T.A., Nazipova N.N. Estimation of Redundancy of Genetic Texts by the High Frequency Component of the l-Gram Graph. Biophysics. 1999;44(4):621-630.
  13. Troyanskaya O.G., Arbell O., Koren Y., Landau G.M., Bolshoy A. Sequence complexity profiles of prokaryotic genomic sequences: a fast algorithm for calculating linguistic complexity. Bioinformatics. 2002;18(5):679-688. doi: 10.1093/bioinformatics/18.5.679
  14. Orlov Iu.L. Analiz reguliatornykh genomnykh posledovatel'nostei s pomoshch'iu komp'iuternykh metodov otsenok slozhnosti geneticheskikh tekstov (Analysis of regulatory genomic sequences using computational methods for estimating the complexity of genetic texts): Ph.D. Thesis. Novosibirsk; 2004. 148 p. (in Russ.).
  15. Rudakov K.V., Torshin I.Yu. Selection of informative feature values on the basis of solvability criteria in the problem of protein secondary structure recognition. Doklady Mathematics. 2011;84(3):871-874. doi: 10.1134/S1064562411070064
  16. Smetanin Y., Ul'yanov M. Determining the characteristics of Kolmogorov complexity of time series: an approach based on symbolic descriptions. Business Informatics. 2013;2(24):49-54 (in Russ.).
  17. Smetanin Y., Ul'yanov M. Measure of symbolical diversity: Combinatorics on words as an approach to identify generalized characteristics of time series. Business Informatics. 2014;3(29):40-48 (in Russ.).
  18. Kormen T., Leizerson Ch., Rivest R., Shtain K. Algoritmy: postroenie i analiz. Moscow: Izdatel'skii dom «Vil'iams»; 2005. 1296 p. (Translation of: Cormen T.H., Leiserson C.E., Rivest R.L., Stein C. Introduction to Algorithms. 2nd ed. MIT Press and McGraw-Hill; 2001).
  19. GenBank. (accessed 20 March 2016).
  20. European Nucleotide Archive. (accessed 20 March 2016).
  21. DNA Data Bank of Japan. (accessed 20 March 2016).
published in Russian
