
Testing for Multivariate Normality in Mass Spectrometry Imaging Data: A Robust Statistical Approach for Clustering Evaluation and the Generation of Synthetic Mass Spectrometry Imaging Data Sets






Analytical Chemistry 88(22): 10893-10899



Spatial clustering is a powerful tool in mass spectrometry imaging (MSI) and has been shown to differentiate tumor types, visualize intratumor heterogeneity, and segment anatomical structures. Several clustering methods have been applied to MSI data, but a principled comparison and evaluation of different clustering techniques remains a significant challenge. We propose that testing whether the data within each cluster follow a multivariate normal distribution can be used to evaluate clustering performance for algorithms that assume normally distributed data, such as k-means clustering. Where clustering has been performed using the cosine distance, the data should be converted to polar coordinates before normality testing so that normality is tested in the correct coordinate system. In addition to these evaluations of internal consistency, we demonstrate that the multivariate normal distribution can be used as a basis for statistical modeling of MSI data. This allows the generation of synthetic MSI data sets with known ground truth, providing a means of external clustering evaluation. To demonstrate this, reference data from seven anatomical regions of an MSI image of a coronal section of mouse brain were modeled, and a synthetic data set was generated from this model. R² values from fits to the chi-squared quantile-quantile plots for the seven anatomical regions confirmed that the data acquired from each region were closer to normally distributed in polar space than in Euclidean space. Finally, principal component analysis was applied to a single data set that included both synthetic and real data. No significant differences were found between the two data types, indicating the suitability of these methods for generating realistic synthetic data.
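The three steps named in the abstract (conversion to polar coordinates, an R² score from a chi-squared quantile-quantile plot, and sampling synthetic data from a fitted multivariate normal model) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `to_hyperspherical`, `qq_r2`, and `synthesize` are hypothetical, the hyperspherical angle convention and the plotting positions `(i - 0.5) / n` are assumptions, and the abstract does not say in which coordinate system the synthetic data were drawn.

```python
import numpy as np
from scipy.stats import chi2


def to_hyperspherical(X):
    """Map each row of X (n samples, d >= 2 variables) to
    (radius, angle_1, ..., angle_{d-1}). Angle convention is an assumption."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    # tail_norm[:, i] = sqrt(x_i^2 + x_{i+1}^2 + ... + x_{d-1}^2)
    tail_norm = np.sqrt(np.cumsum((X ** 2)[:, ::-1], axis=1)[:, ::-1])
    out = np.empty_like(X)
    out[:, 0] = tail_norm[:, 0]  # radius = Euclidean norm of the row
    for i in range(d - 2):
        # Equivalent to arccos(x_i / tail_norm_i), but safe for zero rows.
        out[:, i + 1] = np.arctan2(tail_norm[:, i + 1], X[:, i])
    out[:, d - 1] = np.arctan2(X[:, d - 1], X[:, d - 2])
    return out


def qq_r2(X):
    """R^2 of the chi-squared Q-Q plot for the rows of X.

    Values near 1 suggest the rows are close to multivariate normal."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Squared Mahalanobis distance of every sample to the sample mean.
    d2 = np.einsum("ij,jk,ik->i", centered, np.linalg.pinv(cov), centered)
    observed = np.sort(d2)
    # Under normality, d2 is approximately chi-squared with d degrees of freedom.
    probs = (np.arange(1, n + 1) - 0.5) / n
    expected = chi2.ppf(probs, df=d)
    r = np.corrcoef(expected, observed)[0, 1]
    return r ** 2


def synthesize(X, n_samples, rng):
    """Draw synthetic samples from a multivariate normal fitted to X."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n_samples)
```

Under these assumptions, the spectra of one cluster or anatomical region would be scored with `qq_r2(X)` in Euclidean space and with `qq_r2(to_hyperspherical(X))` in polar space; the coordinate system with the higher R² is the one in which the data look closer to multivariate normal.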


Accession: 058985779


PMID: 27641083

DOI: 10.1021/acs.analchem.6b02139

