Multidimensional mutual information methods for the analysis of covariation in multiple sequence alignments

BMC Bioinformatics 15: 157, 2014

Several methods are available for detecting covarying positions in a multiple sequence alignment (MSA). If the MSA contains a large number of sequences, information about the proximities between residues derived from covariation maps can be sufficient to predict a protein fold. In many cases, however, the structure is already known, and information on the covarying positions is valuable for understanding the protein's mechanism and dynamic properties. In this study we sought to determine whether a multivariate (multidimensional) extension of traditional mutual information (MI) can serve as an additional tool for studying covariation. The performance of two multidimensional MI (mdMI) methods, designed to remove the effect of ternary/quaternary interdependencies, was tested on a set of 9 MSAs, each containing <400 sequences, and was shown to be comparable to that of the newest methods based on maximum entropy/pseudolikelihood statistical models of protein sequences. However, while all the methods tested detected a similar number of covarying pairs among the residues separated by <8 Å in the reference X-ray structures, there was on average less than 65% overlap between the top-scoring pairs detected by methods based on different principles. Given the large variety in structure and evolutionary history among proteins, it is possible that no single method is best at detecting covariation in all proteins, and that for each protein family the best information is obtained by merging or comparing the results of different methods. This approach may be particularly valuable when the MSA is small or the alignment quality is low, conditions that lead to significant differences in the pairs detected by different methods.
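As an illustration of the quantities the abstract refers to, the following is a minimal Python sketch of traditional two-position MI computed between the columns of an MSA, together with the fraction of top-scoring pairs shared by two methods. It is not the mdMI method of the study: it applies no correction for ternary/quaternary interdependencies, no gap handling, and no sequence weighting. All function names (`mutual_information`, `mi_map`, `top_pairs`, `overlap_fraction`) and the toy alignment are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(msa, i):
    """Return column i of the alignment as a list of characters."""
    return [seq[i] for seq in msa]


def mutual_information(col_a, col_b):
    """Plain two-position MI in bits, from observed frequencies only."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p_ab * log2(p_ab / (p_a * p_b)), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


def mi_map(msa):
    """Score every pair of columns (i < j) of an equal-length alignment."""
    length = len(msa[0])
    return {
        (i, j): mutual_information(column(msa, i), column(msa, j))
        for i, j in combinations(range(length), 2)
    }


def top_pairs(scores, k):
    """Return the k highest-scoring column pairs as a set."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def overlap_fraction(pairs_a, pairs_b):
    """Fraction of pairs shared by two top-scoring sets (hypothetical metric)."""
    largest = max(len(pairs_a), len(pairs_b))
    return len(pairs_a & pairs_b) / largest if largest else 0.0


# Toy alignment (made up): columns 0 and 1 covary perfectly,
# column 2 varies independently, column 3 is fully conserved.
msa = ["ACDA", "ACEA", "GTDA", "GTEA"]
scores = mi_map(msa)
best = top_pairs(scores, 1)
```

In this toy case the only pair with non-zero MI is columns 0 and 1. Comparing the `top_pairs` sets produced by two different scoring functions with `overlap_fraction` gives one simple way to quantify the kind of between-method agreement the abstract describes, under the assumption that both sets have the same size.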

Accession: 054488441

PMID: 24886131

DOI: 10.1186/1471-2105-15-157
