Key2Ann: a tool to process sequence sets by replacing database identifiers with a human-readable annotation

Journal of Integrative Bioinformatics 8(1)

Deducing common properties or degrees of phylogenetic relationship by analyzing a grouping or clustering of sequence sets is a frequently used technique in computational biology. When the results are interpreted by visual inspection, the conclusions in many of these applications depend on meaningful names for the input data. Depending on the aim of the analysis, the sequences should carry names that indicate the function of the genes or gene products, the phylogenetic position, or other properties characterizing the contributing species. However, sequences extracted from databases are most often annotated with identifiers that contain the desired information only implicitly. To solve this problem, we have designed and implemented a tool named Key2Ann, which replaces the database keys in multiple FASTA files with short terms indicating the taxonomic position or other features such as the gene name or the EC number. In addition, properties such as habitat, growth temperature, or the degree of pathogenicity can be encoded for microbial species. For maximum flexibility, the user can control the composition of the names by means of command line parameters. Key2Ann is written in Java and can be downloaded via http://www-bioinf.uni-regensburg.de/downl/Key2Ann.zip. We demonstrate the usage of Key2Ann by discussing three typical examples of phylogenetic analysis.
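The key-replacement idea described above can be illustrated with a minimal sketch. This is not Key2Ann's implementation (the tool itself is written in Java, and its command line parameters are not reproduced here). The function name `rename_headers`, the `fields` parameter, the lookup table `ANNOTATIONS`, and the keys `KEY0001` and `KEY0002` are all hypothetical and were chosen only for illustration. The sketch assumes that the database key is the first whitespace-delimited token of each FASTA header line.

```python
# Hypothetical lookup table mapping a database key to annotation fields.
# Keys and entries are illustrative assumptions, not Key2Ann data.
ANNOTATIONS = {
    "KEY0001": {"species": "E. coli", "gene": "trpA", "ec": "4.2.1.20"},
    "KEY0002": {"species": "B. subtilis", "gene": "trpA", "ec": "4.2.1.20"},
}


def rename_headers(fasta_text, annotations, fields=("species", "gene")):
    """Replace the key in each FASTA header with a name built from `fields`.

    Headers whose key is not found in `annotations` are left unchanged,
    as are all sequence lines.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Assumption: the key is the first token after '>'.
            parts = line[1:].split(None, 1)
            key = parts[0] if parts else ""
            entry = annotations.get(key)
            if entry is not None:
                # Join the selected fields, dropping spaces inside each value.
                name = "_".join(
                    entry[f].replace(" ", "") for f in fields if f in entry
                )
                line = ">" + name
        out.append(line)
    return "\n".join(out)
```

The `fields` tuple plays the role that the abstract assigns to user-controlled name composition: passing `("gene", "ec")` instead of the default would build names from the gene name and EC number rather than from the species and gene name.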

Accession: 054040354

PMID: 21372341

DOI: 10.2390/biecoll-jib-2011-153