Conference proceedings article

Title Compression of whole genome alignments using a mixture of finite-context models
Author Luís Matos, Diogo Pratas, Armando J. Pinho
Booktitle Proceedings of 9th International Conference on Image Analysis and Recognition, ICIAR 2012
Address Aveiro, Portugal
Volume Aurélio Campilho and Mohamed Kamel (Eds.): Part I, LNCS 7324
Pages 359-366
Month June
Year 2012
Group (before 2015) Signal Processing Laboratory
Indexed by ISI Not known yet
Scope International

Abstract - In recent years, advances in DNA sequencing technology have caused rapid growth in the amount of available genomic sequence data. One such type of data set results from multiple sequence alignments (MSA). In this paper, we propose a method for compressing these data sets using a mixture of finite-context models and arithmetic coding. The method relies on image compression concepts. It was tested on the multiz28way data set and attained a compression rate of around 0.93 bits per symbol on the sequence data, better than the ≈ 1 bit per symbol attained by a recently proposed method.
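
As an illustration of the general idea of mixing finite-context models, the following is a minimal Python sketch. It is not the authors' implementation. The function name `mixture_bits_per_symbol`, the model orders, the estimator parameter `alpha`, the forgetting factor `gamma` and the weight update rule are all assumptions chosen for illustration. The sketch reports the ideal code length in bits per symbol that an arithmetic coder driven by these probabilities would approach; it performs no actual arithmetic coding and ignores the alignment and image-like structure mentioned in the abstract.

```python
import math
from collections import defaultdict


def mixture_bits_per_symbol(seq, orders=(1, 2, 4), alphabet="ACGT",
                            alpha=1.0, gamma=0.9):
    """Estimate bits per symbol for `seq` under a mixture of
    finite-context (order-k Markov) models.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if not seq:
        return 0.0

    index = {s: i for i, s in enumerate(alphabet)}
    n = len(alphabet)

    # One count table per model: context string -> per-symbol counts.
    counts = [defaultdict(lambda: [0] * n) for _ in orders]
    # Start with equal weight on every model.
    weights = [1.0 / len(orders)] * len(orders)
    total_bits = 0.0

    for pos, sym in enumerate(seq):
        s = index[sym]
        probs, contexts = [], []
        for m, k in enumerate(orders):
            # Context = the k previous symbols (shorter near the start).
            ctx = seq[max(0, pos - k):pos]
            c = counts[m][ctx]
            # Additive-smoothing estimate of P(sym | ctx) for this model.
            probs.append((c[s] + alpha) / (sum(c) + alpha * n))
            contexts.append(ctx)

        # Mixture probability of the symbol that actually occurred.
        p_mix = sum(w * p for w, p in zip(weights, probs))
        total_bits += -math.log2(p_mix)

        # Reward models that predicted well; gamma < 1 forgets old performance.
        weights = [(w ** gamma) * p for w, p in zip(weights, probs)]
        norm = sum(weights)
        weights = [w / norm for w in weights]

        # Update every model's counts only after coding the symbol,
        # so a decoder could reproduce the same probabilities.
        for m, ctx in enumerate(contexts):
            counts[m][ctx][s] += 1

    return total_bits / len(seq)
```

Calling `mixture_bits_per_symbol("ACGT" * 200)` on a highly repetitive sequence should give a value well below the 2 bits per symbol of a uniform four-symbol code, since the models quickly learn the repeating pattern.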
