Selection of differentially expressed genes in microarray data analysis

Pharmacogenomics Journal 7(3): 212-220

One common objective in microarray experiments is to identify a subset of genes that are differentially expressed among experimental conditions, for example, between drug treatment and no drug treatment. Often, the goal is to determine the underlying relationship between poor versus good gene signatures for identifying biological functions or predicting specific therapeutic outcomes. Because of the complexity of studying hundreds or thousands of genes in an experiment, selecting a subset of genes to enhance relationships among the underlying biological structures or to improve prediction accuracy of clinical outcomes has been an important issue in microarray data analysis. Selection of differentially expressed genes is a two-step process. The first step is to select an appropriate test statistic and compute the P-value; the genes are then ranked according to their P-values as evidence of differential expression. The second step is to assign a significance level, that is, to determine a cutoff threshold from the P-values in accordance with the study objective. In this paper, we consider four commonly used statistics, the t-, S- (SAM), U- (Mann-Whitney) and M-statistics, to compute the P-values for gene ranking. To assign the significance level, we consider procedures that control the family-wise error rate or the false discovery rate to select a limited number of genes, and a receiver-operating characteristic (ROC) approach to select a larger number of genes. The ROC approach is particularly useful in genomic/genetic profiling studies. The well-known colon cancer data containing 22 normal and 40 tumor tissues are used to illustrate different gene ranking and significance level assignment methods for applications to genomic/genetic profiling studies. The P-values computed from the t-, U- and M-statistics are very similar.
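
The two-step process described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the abstract does not name the specific error-controlling procedures, so Bonferroni (family-wise error) and Benjamini-Hochberg (false discovery rate) are assumed here as common representatives. The P-values, the function names, and the 0.05 level are all made up for the example.

```python
def bonferroni_select(p_values, alpha=0.05):
    """Indices of genes kept under Bonferroni family-wise error control."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


def benjamini_hochberg_select(p_values, alpha=0.05):
    """Indices of genes kept under the Benjamini-Hochberg step-up FDR rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose P-value is below its rank-specific threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank
    return sorted(order[:k])


# Step 1 (assumed already done): one P-value per gene from a chosen statistic.
p_values = [0.0001, 0.004, 0.019, 0.03, 0.2, 0.6]  # illustrative values only

# Rank genes by P-value, smallest (strongest evidence) first.
ranking = sorted(range(len(p_values)), key=lambda i: p_values[i])

# Step 2: assign a significance cutoff under each error criterion.
fwer_genes = bonferroni_select(p_values)
fdr_genes = benjamini_hochberg_select(p_values)
print("Bonferroni:", fwer_genes)
print("Benjamini-Hochberg:", fdr_genes)
```

On these toy values the family-wise criterion keeps fewer genes than the false discovery rate criterion, which is the usual trade-off between the two kinds of control.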
We discuss the common practice of using the P-value, a false-positive error probability, as the primary criterion and then using the fold-change as a surrogate measure of biological significance for gene selection. The P-value and the fold-change can be shown together in a volcano plot. We also address several issues in gene selection.
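
As a rough sketch of the P-value plus fold-change practice, the coordinates of a volcano plot are conventionally the log2 fold-change (x-axis) and the negative log10 P-value (y-axis). The gene names, mean intensities, P-values, and both thresholds below are hypothetical and chosen only for illustration; the abstract does not state any cutoffs. Mean intensities are assumed to be positive and not yet log-transformed.

```python
import math


def volcano_point(mean_control, mean_treated, p_value):
    """Return (log2 fold-change, -log10 P-value) for one gene."""
    log2_fc = math.log2(mean_treated / mean_control)
    neg_log10_p = -math.log10(p_value)
    return log2_fc, neg_log10_p


def is_selected(mean_control, mean_treated, p_value,
                max_p=0.01, min_abs_log2_fc=1.0):
    """P-value is the primary criterion; fold-change is the secondary filter."""
    log2_fc, _ = volcano_point(mean_control, mean_treated, p_value)
    return p_value <= max_p and abs(log2_fc) >= min_abs_log2_fc


# Hypothetical genes: (mean in control, mean in treated, P-value).
genes = {
    "gene_A": (100.0, 400.0, 0.0001),  # small P-value, large fold-change
    "gene_B": (100.0, 110.0, 0.0001),  # small P-value, small fold-change
    "gene_C": (100.0, 25.0, 0.2),      # large fold-change, weak evidence
}

selected = [name for name, values in genes.items() if is_selected(*values)]
print(selected)
```

Only the gene that clears both thresholds is retained, which corresponds to the upper-left and upper-right corners of a volcano plot.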

Accession: 017005923

PMID: 16940966

DOI: 10.1038/sj.tpj.6500412
