Title: Genome analysis with distance to the nearest dissimilar nucleotide
Author: Vera Afreixo, Carlos A C Bastos, Armando J. Pinho, Sara Pinto Garcia, Paulo J S G Ferreira
Journal: Journal of Theoretical Biology
Volume: 275
Number: 1
Pages: 52-58
Month: April
Year: 2011
DOI: 10.1016/j.jtbi.2011.01.038
Group (before 2015): Signal Processing Laboratory, Transverse Activity on Innovative Biomedical Technologies
Indexed by ISI: Yes


DNA may be represented by sequences of four symbols, but it is often useful to convert those symbols into real or complex numbers for further analysis. Several mapping schemes have been used in the past, but most of them seem to be unrelated to any intrinsic characteristic of DNA. The objective of this work was to study a mapping scheme that is directly related to DNA characteristics, and that could be useful in discriminating between different species.

Recently, we proposed a methodology based on the inter-nucleotide distance, which proved to contribute to the discrimination among species. In this paper, we introduce a new distance, the distance to the nearest dissimilar nucleotide, which is the distance from a nucleotide to the first occurrence of a different nucleotide. This distance is related to the repetition structure of single nucleotides. Using the information resulting from the concatenation of the distance to the nearest dissimilar nucleotide and the inter-nucleotide distance, we found that this new distance brings additional discriminative capabilities. This suggests that the distance to the nearest dissimilar nucleotide might contribute useful information about the evolution of species.
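
The two distances described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on assumptions the abstract does not state: distances are measured forward along the sequence (toward higher positions), a position with no qualifying nucleotide after it gets `None`, and the inter-nucleotide distance is read as the distance to the next occurrence of the same nucleotide. The function names are hypothetical.

```python
from typing import List, Optional


def nearest_dissimilar_distances(seq: str) -> List[Optional[int]]:
    """Distance from each position to the first later position holding a
    different symbol (assumption: forward direction, None if there is none)."""
    n = len(seq)
    out: List[Optional[int]] = [None] * n
    next_diff: Optional[int] = None  # index of the nearest different symbol to the right
    # Scan right to left: inside a run of equal symbols, every position shares
    # the same nearest dissimilar index, so it is carried along unchanged.
    for i in range(n - 2, -1, -1):
        if seq[i + 1] != seq[i]:
            next_diff = i + 1
        out[i] = None if next_diff is None else next_diff - i
    return out


def inter_nucleotide_distances(seq: str) -> List[Optional[int]]:
    """Distance from each position to the next occurrence of the same symbol
    (assumption: forward direction, None if the symbol does not occur again)."""
    out: List[Optional[int]] = [None] * len(seq)
    next_same = {}  # symbol -> index of its nearest occurrence to the right
    for i in range(len(seq) - 1, -1, -1):
        symbol = seq[i]
        if symbol in next_same:
            out[i] = next_same[symbol] - i
        next_same[symbol] = i
    return out


example = "AAACGG"
print(nearest_dissimilar_distances(example))  # [3, 2, 1, 1, None, None]
print(inter_nucleotide_distances(example))    # [1, 1, None, None, 1, None]
```

In the example, the run `AAA` yields nearest-dissimilar distances 3, 2, 1, which shows how this distance reflects the repetition structure of single nucleotides. The same positions have inter-nucleotide distances 1, 1 and no value, so the two sequences carry different information.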