# Four hundred or more participants needed for stable contingency table estimates of clinical prediction rule performance

##### Kent, P.; Boyle, E.; Keating, J.L.; Albert, H.B.; Hartvigsen, J.

#### Journal of Clinical Epidemiology 82: 137-148

#### 2017

**ISSN/ISBN: 1878-5921** PMID: 27847252 DOI: 10.1016/j.jclinepi.2016.10.004

Accession: 057015231

This study aimed to quantify variability in the results of statistical analyses based on contingency tables and to discuss the implications for the choice of sample size in studies that derive clinical prediction rules. Three pre-existing sets of large cohort data (n = 4,062-8,674) were analyzed. In each data set, random samples of sizes from n = 100 up to n = 2,000 were drawn 100 times at each sample size, and the variability in estimates of sensitivity, specificity, positive and negative likelihood ratios, posttest probabilities, odds ratios, and risk/prevalence ratios was calculated for each sample size. Estimates derived from contingency tables from the same data set differed very widely, and statistically significantly, when calculated in samples of fewer than 400 people; typically, this variability stabilized in samples of 400-600 people. Although estimates of prevalence also varied significantly in samples of fewer than 600 people, that relationship explains only a small component of the variability in these statistical parameters. To reduce sample-specific variability, contingency tables should consist of 400 participants or more when used to derive clinical prediction rules or test their performance.
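The repeated-sampling procedure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the cohort is synthetic (not the study data), the prevalence, sensitivity, and specificity values are arbitrary, sampling is done without replacement (the abstract does not say which scheme was used), and the function names `make_cohort`, `table_metrics`, and `resample_spread` are hypothetical. Only sensitivity and specificity are computed, not the full set of statistics the study examined.

```python
import random
import statistics


def make_cohort(n, prevalence, sensitivity, specificity, rng):
    """Build a synthetic cohort of (test_positive, has_condition) pairs.

    All parameter values are illustrative assumptions, not study data.
    """
    cohort = []
    for _ in range(n):
        diseased = rng.random() < prevalence
        if diseased:
            positive = rng.random() < sensitivity
        else:
            positive = rng.random() >= specificity
        cohort.append((positive, diseased))
    return cohort


def table_metrics(sample):
    """Return (sensitivity, specificity) from the 2x2 table of a sample.

    Returns None when a denominator is zero (no cases or no non-cases).
    """
    tp = sum(1 for pos, dis in sample if pos and dis)
    fn = sum(1 for pos, dis in sample if not pos and dis)
    tn = sum(1 for pos, dis in sample if not pos and not dis)
    fp = sum(1 for pos, dis in sample if pos and not dis)
    if tp + fn == 0 or tn + fp == 0:
        return None
    return tp / (tp + fn), tn / (tn + fp)


def resample_spread(cohort, sizes, repetitions, rng):
    """Draw `repetitions` random samples at each size and summarize spread."""
    results = {}
    for size in sizes:
        sens_values, spec_values = [], []
        for _ in range(repetitions):
            # Sampling without replacement is an assumption of this sketch.
            metrics = table_metrics(rng.sample(cohort, size))
            if metrics is None:
                continue  # skip degenerate tables
            sens_values.append(metrics[0])
            spec_values.append(metrics[1])
        results[size] = {
            "sens_min": min(sens_values),
            "sens_max": max(sens_values),
            "sens_sd": statistics.stdev(sens_values),
            "spec_sd": statistics.stdev(spec_values),
        }
    return results


rng = random.Random(42)  # fixed seed so the run is repeatable
cohort = make_cohort(5000, prevalence=0.3, sensitivity=0.8,
                     specificity=0.7, rng=rng)
spread = resample_spread(cohort, [100, 200, 400, 600, 1000, 2000], 100, rng)
for size, summary in spread.items():
    print(size, {key: round(value, 3) for key, value in summary.items()})
```

In a run of this kind, the range and standard deviation of the estimates would be expected to narrow as the sample size grows, which is the pattern the abstract reports. The specific numbers depend entirely on the assumed synthetic parameters and say nothing about where the 400-participant threshold falls in real data.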