Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments

Journal of Molecular Biology 264(4): 823-838

The relative performances of four strategies for aligning a large number of protein sequences were assessed by referring to corresponding structural alignments of 54 independent families. Multiple sequence alignment of a family was constructed by a given method from the sequences of known structures and their homologues, and the subset consisting of the sequences of known structures was extracted from the whole alignment and compared with the structural counterpart in a residue-to-residue fashion. Gap-opening and -extension penalties were optimized for each family and method. Each of the four multiple alignment methods gave significantly more accurate alignments than the conventional pairwise method. In addition, a clear difference in performance was detected among three of the four multiple alignment methods examined. The currently most popular progressive method ranked worst among the four, and the randomized iterative strategy that optimizes the sum-of-pairs score ranked next worst. The two best-performing strategies, one of which was newly developed, both pursue an optimal weighted sum-of-pairs score, where the pair weights were introduced to correct for uneven representations of subgroups in a family. The new method uses doubly nested iterations to make alignment, phylogenetic tree and pair weights mutually consistent. Most importantly, the improvement in accuracy of alignments obtained by these iterative methods over pairwise or progressive method tends to increase with decreasing average sequence identity, implying that iterative refinement is more effective for the generally difficult alignment of remotely related sequences. Four well-known amino acid substitution matrices were also tested in combination with the various methods. However, the effects of substitution matrices were found to be minor in the framework of multiple alignment, and the same order of relative performance of the alignment methods was observed with any of the matrices.
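As an illustration of the weighted sum-of-pairs objective mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a toy scoring scheme (match +1, mismatch -1, a flat per-column gap penalty of -2) in place of the substitution matrices and gap-opening/-extension penalties used in the study, and the function names `pair_score` and `weighted_sum_of_pairs` as well as the example weights are hypothetical. The idea it shows is only that each pair of aligned sequences contributes its pairwise score multiplied by a pair weight, so that over-represented subgroups can be down-weighted.

```python
from itertools import combinations

GAP = "-"


def pair_score(row_a, row_b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Score two aligned rows column by column.

    Toy scoring (an assumption for illustration): flat per-column gap
    penalty rather than separate gap-opening and gap-extension penalties.
    """
    total = 0.0
    for x, y in zip(row_a, row_b):
        if x == GAP and y == GAP:
            continue  # a column gapped in both rows contributes nothing
        if x == GAP or y == GAP:
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


def weighted_sum_of_pairs(rows, weights=None):
    """Sum of pairwise scores over all row pairs (i < j), each times w[i, j].

    `weights` maps (i, j) index pairs to a weight; missing weights
    (or weights=None) default to 1.0, giving the plain sum-of-pairs score.
    """
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all aligned rows must have the same length")
    weights = weights or {}
    total = 0.0
    for i, j in combinations(range(len(rows)), 2):
        total += weights.get((i, j), 1.0) * pair_score(rows[i], rows[j])
    return total


# Rows 1 and 2 are identical, i.e. an over-represented subgroup.
rows = ["ACD-", "ACDE", "ACDE", "GC-E"]
plain = weighted_sum_of_pairs(rows)
# Hypothetical weight: halve the contribution of the redundant pair (1, 2).
weighted = weighted_sum_of_pairs(rows, {(1, 2): 0.5})
print(plain, weighted)
```

In this toy case the redundant pair of identical sequences contributes less once it is down-weighted, which is the kind of correction for uneven subgroup representation the abstract describes. How the pair weights are derived from the phylogenetic tree is not covered by this sketch.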

Accession: 009416167

PMID: 8980688

DOI: 10.1006/jmbi.1996.0679
