Detecting virus integration sites based on multiple related sequencing data by VirTect

BMC Medical Genomics 12(Suppl 1): 19

Because tumors often show a high level of intra-tumor heterogeneity, multiple samples from the same patient, taken at different locations or time points, are often sequenced to study that heterogeneity or tumor evolution. In virus-related tumors, such as those associated with human papillomavirus and hepatitis B virus, integration of the virus genome can be a critical driving event, so it is important to investigate the integration sites. A few algorithms for detecting virus integration sites from high-throughput sequencing data have been developed, but their limited sensitivity and specificity and their high computational cost hinder their application to sequencing of multiple related tumor samples. We develop VirTect to detect virus integration sites simultaneously from data on multiple related samples. The algorithm is based mainly on the joint analysis of short reads that span integration breakpoints across multiple samples. To achieve high specificity and breakpoint accuracy, it uses a local precise sandwich alignment algorithm. Analyses of simulated and real data show that, compared with other algorithms, VirTect is significantly more sensitive and has a similar or lower false discovery rate. VirTect also provides more accurate breakpoint positions and is computationally much more efficient in terms of both memory requirement and running time.
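The idea of jointly analysing breakpoint-spanning reads from several related samples can be illustrated with a small sketch. This is not VirTect's implementation, and it does not attempt the sandwich alignment step; it only shows, under simplifying assumptions, how weak per-sample evidence might be pooled into one call. The function names (`pool_breakpoints`, `_summarize`), the input layout (one `(chromosome, host_position)` pair per chimeric read), and the `window` and `min_total_reads` parameters are all hypothetical.

```python
from collections import defaultdict


def _summarize(chrom, cluster):
    """Collapse one cluster of (position, sample) observations into a call."""
    per_sample = defaultdict(int)
    for _, sample in cluster:
        per_sample[sample] += 1
    positions = sorted(pos for pos, _ in cluster)
    return {
        "chrom": chrom,
        # Use the median observed position as the reported breakpoint.
        "position": positions[len(positions) // 2],
        "total_reads": len(cluster),
        "per_sample": dict(per_sample),
    }


def pool_breakpoints(samples, window=10, min_total_reads=3):
    """Pool breakpoint-supporting reads across related samples.

    samples: dict mapping sample name -> list of (chrom, host_pos),
             one entry per read that spans a host/virus junction.
    window:  maximum gap (bp) between neighbouring positions in a cluster
             (hypothetical default).
    min_total_reads: minimum pooled support needed to report a site
             (hypothetical default).
    """
    by_chrom = defaultdict(list)
    for sample, reads in samples.items():
        for chrom, pos in reads:
            by_chrom[chrom].append((pos, sample))

    calls = []
    for chrom, obs in sorted(by_chrom.items()):
        obs.sort()
        cluster = [obs[0]]
        for pos, sample in obs[1:]:
            if pos - cluster[-1][0] <= window:
                cluster.append((pos, sample))
            else:
                calls.append(_summarize(chrom, cluster))
                cluster = [(pos, sample)]
        calls.append(_summarize(chrom, cluster))

    # Keep only sites whose support, summed over all samples, is sufficient.
    return [c for c in calls if c["total_reads"] >= min_total_reads]


# Toy input: no single sample reaches three reads at chr3, but the pool does.
toy = {
    "T1": [("chr3", 1000), ("chr3", 1002)],
    "T2": [("chr3", 1001)],
    "T3": [("chr8", 500)],
}
calls = pool_breakpoints(toy)
```

In this toy input the chr3 site is reported only because reads from two samples are counted together, while the single chr8 read is filtered out. That is the intuition behind the sensitivity gain from related samples; the actual scoring used by VirTect is not described in the abstract.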

Accession: 066446562

PMID: 30704462

DOI: 10.1186/s12920-018-0461-8
