QAPgrid: a two level QAP-based approach for large-scale data analysis and visualization

PLoS ONE 6(1): e14468

The visualization of large volumes of data is a computationally challenging task that often promises rewarding new insights. There is great potential in the application of new algorithms and models from combinatorial optimization. Datasets often contain "hidden regularities", and a combined identification and visualization method should reveal these structures and present them in a way that helps analysis. While several methodologies exist, including those that use non-linear optimization algorithms, they face severe limitations even when working with only a few hundred objects. We present a new data visualization approach (QAPgrid) that reveals patterns of similarities and differences in large datasets of objects for which a similarity measure can be computed. Objects are assigned to positions on an underlying square grid in a two-dimensional space. We use the Quadratic Assignment Problem (QAP) as a mathematical model to provide an objective function for the assignment of objects to positions on the grid. We employ a Memetic Algorithm (a powerful metaheuristic) to tackle the large instances of this NP-hard combinatorial optimization problem, and we show its performance on the visualization of real data sets. Overall, the results show that the QAPgrid algorithm is able to produce a layout that represents the relationships between objects in the data set. Furthermore, it also represents the relationships between the clusters that are fed into the algorithm. We apply QAPgrid to the 84 Indo-European languages instance, producing a near-optimal layout. Next, we produce a layout of 470 world universities that shows a high degree of correlation with the score used by the Academic Ranking of World Universities, compiled by Shanghai Jiao Tong University, without the need for an ad hoc weighting of attributes. Finally, our Gene Ontology-based study on Saccharomyces cerevisiae demonstrates the scalability and precision of our method as a novel alternative tool for functional genomics.
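The idea of scoring a grid layout with a QAP objective can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact objective or distance measure, so the sketch assumes the standard QAP form (pairwise similarity multiplied by the grid distance between the assigned cells) and Manhattan distance on the grid. It also swaps in a plain pairwise-swap local search instead of the Memetic Algorithm named above. All function names are hypothetical.

```python
import itertools


def grid_distances(side):
    """Manhattan distance between every pair of cells on a side x side grid.

    Cells are numbered row by row: cell index = row * side + column.
    (Manhattan distance is an assumption made for this sketch.)
    """
    cells = [(r, c) for r in range(side) for c in range(side)]
    return [[abs(r1 - r2) + abs(c1 - c2) for (r2, c2) in cells]
            for (r1, c1) in cells]


def qap_cost(similarity, distance, assignment):
    """QAP-style cost: sum over object pairs of similarity * cell distance.

    assignment[i] is the grid cell holding object i. Similar objects placed
    far apart are penalised, so lower cost means a better layout.
    """
    n = len(assignment)
    return sum(similarity[i][j] * distance[assignment[i]][assignment[j]]
               for i in range(n) for j in range(i + 1, n))


def swap_local_search(similarity, distance, assignment):
    """Apply improving pairwise swaps until no swap lowers the cost.

    This is a simple stand-in for a metaheuristic, not a memetic algorithm.
    """
    assignment = list(assignment)
    best = qap_cost(similarity, distance, assignment)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(len(assignment)), 2):
            assignment[i], assignment[j] = assignment[j], assignment[i]
            cost = qap_cost(similarity, distance, assignment)
            if cost < best:
                best, improved = cost, True
            else:
                # Undo the swap when it does not help.
                assignment[i], assignment[j] = assignment[j], assignment[i]
    return assignment, best


if __name__ == "__main__":
    # Toy data: objects 0 and 3 are very similar, as are objects 1 and 2.
    similarity = [[0, 1, 1, 9],
                  [1, 0, 9, 1],
                  [1, 9, 0, 1],
                  [9, 1, 1, 0]]
    distance = grid_distances(2)
    start = [0, 1, 2, 3]  # similar objects start on opposite corners
    layout, cost = swap_local_search(similarity, distance, start)
    print(qap_cost(similarity, distance, start), cost, layout)
```

In this toy instance the starting layout puts each similar pair on diagonal corners of a 2 x 2 grid, and the search moves them into adjacent cells. Recomputing the full cost after every swap is quadratic in the number of objects per evaluation, which is acceptable for a sketch but would be too slow at the scale discussed in the abstract.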

Accession: 055302091

PMID: 21267077