How many data sources are needed to determine diabetes prevalence by capture-recapture?

International Journal of Epidemiology 29(3): 536-541

Background: Capture-recapture (CR) methods are increasingly used to estimate the size of human populations, including those with diabetes. Few studies have examined the demographic details needed to match patients on the lists used in these techniques, or to determine the optimum number of lists.
Methods: Six lists of known diabetic patients attending different medical settings during the study year were obtained. The effects on total enumeration after aggregation of these lists were examined using increasing numbers of demographic data items as patient identifiers. The CR estimates of prevalence were obtained using 15 different combinations of two lists. Estimates were obtained after log-linear modelling for interdependence between different combinations of three and four lists, and after combining the six available lists into three logical lists.
Results: For matching patients, adding date of birth to first name and family name as matching criteria increased the total of identified patients from 2500 to 2585 (3% increase), corresponding to a period prevalence of 1.5% (95% CI: 1.41-1.52%). Addition of further identifiers, such as partial postcode, only increased the estimate by a further 15 patients (0.5%), and more detailed matching with full postcode introduced uncertainty. The use of two-list CR yielded widely varying estimates of the total diabetic population from 1379 (95% CI: 435-2273) to 9554 (95% CI: 7291-10 983). Log-linear modelling using different combinations of three and four lists produced estimates of 5074 (95% CI: 4417-5947) and 5578 (95% CI: 4918-7081), respectively, after compensating for statistical interdependence between the lists used. The appropriate condensation of six available lists into three lists for modelling yielded estimates of 5492 (95% CI: 4870-6285), corresponding to a CR-adjusted period prevalence of 3.1% (95% CI: 3.03-3.19%).
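To make the two-list step concrete, the following is a minimal sketch of matching patients on first name, family name and date of birth and then computing a two-list CR estimate. It assumes Chapman's bias-corrected form of the Lincoln-Petersen estimator with a normal-approximation confidence interval; the abstract does not say which two-list estimator or interval the authors used. The function names and the patient records are hypothetical.

```python
import math
from datetime import date


def match_key(record):
    """Build a matching key from (first name, family name, date of birth).

    Names are trimmed and lower-cased so trivial formatting differences
    do not prevent a match. This normalisation is an assumption.
    """
    first, family, dob = record
    return (first.strip().lower(), family.strip().lower(), dob)


def chapman_estimate(list_a, list_b):
    """Two-list capture-recapture estimate of total population size.

    Uses Chapman's estimator and its variance, with a normal-approximation
    95% interval. Returns (estimate, (lower, upper)).
    """
    keys_a = {match_key(r) for r in list_a}
    keys_b = {match_key(r) for r in list_b}
    n1, n2 = len(keys_a), len(keys_b)
    m = len(keys_a & keys_b)  # patients found on both lists

    n_hat = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    variance = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)
                / ((m + 1) ** 2 * (m + 2)))
    half_width = 1.96 * math.sqrt(variance)
    return n_hat, (n_hat - half_width, n_hat + half_width)


# Hypothetical records, for illustration only.
clinic_list = [
    ("Ann", "Lee", date(1950, 1, 2)),
    ("Bob", "Ray", date(1948, 5, 6)),
    ("Cara", "Diaz", date(1962, 7, 8)),
    ("Dev", "Shah", date(1971, 3, 4)),
]
hospital_list = [
    (" ann ", "LEE", date(1950, 1, 2)),
    ("Bob", "Ray", date(1948, 5, 6)),
    ("Eve", "Moss", date(1939, 9, 9)),
]

estimate, (low, high) = chapman_estimate(clinic_list, hospital_list)
print(f"estimated total: {estimate:.1f} (95% CI: {low:.1f}-{high:.1f})")
```

The estimator assumes the two lists are independent. If appearing on one list makes a patient more or less likely to appear on the other, the estimate is biased, which is consistent with the wide spread of two-list results reported here.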
Conclusions: In a Western population, the only demographic data required for matching patients on lists used for CR methods are first name, family name and date of birth, if unique identifiers such as social security numbers are not available. Two lists alone do not produce reliable data, and at least three lists are needed to allow for modelling for 'dependence' between datasets. The use of more than three lists does not substantially alter the absolute value or confidence of enumeration, and multiple lists (if available) should be condensed into three lists for use in CR calculations.
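One way to see why a third list allows for dependence is the simplest three-list case. Under a log-linear model that includes every pairwise interaction between lists but no three-way interaction, the count of patients missed by all three lists has a closed form. The sketch below assumes that particular model; the abstract does not state which log-linear models the authors fitted or selected, and the counts are hypothetical.

```python
def three_list_estimate(counts):
    """Estimate total population size from three overlapping lists.

    `counts` maps (in_a, in_b, in_c) tuples of 0/1 to the number of
    patients with that pattern of list membership, for the seven
    observable patterns. The unobserved (0, 0, 0) cell is estimated
    under the log-linear model with all two-way interactions and no
    three-way interaction.

    Returns (estimated_total, estimated_missing).
    """
    numerator = (counts[(1, 1, 1)] * counts[(1, 0, 0)]
                 * counts[(0, 1, 0)] * counts[(0, 0, 1)])
    denominator = counts[(1, 1, 0)] * counts[(1, 0, 1)] * counts[(0, 1, 1)]
    if denominator == 0:
        raise ValueError("a pairwise-overlap cell is empty; estimate undefined")

    missing = numerator / denominator
    observed = sum(counts.values())
    return observed + missing, missing


# Hypothetical membership counts, for illustration only.
example_counts = {
    (1, 1, 1): 10,
    (1, 1, 0): 20,
    (1, 0, 1): 20,
    (0, 1, 1): 20,
    (1, 0, 0): 40,
    (0, 1, 0): 40,
    (0, 0, 1): 40,
}

total, missing = three_list_estimate(example_counts)
print(f"observed: {sum(example_counts.values())}, "
      f"estimated missed: {missing:.0f}, estimated total: {total:.0f}")
```

With only two lists there are three observed cells, and no interaction term can be estimated. With three lists the pairwise terms become estimable, which is the sense in which at least three lists are needed. In practice, simpler models with fewer interaction terms would be fitted and compared using a Poisson regression routine rather than this closed form.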

Accession: 010762737

PMID: 10869328

DOI: 10.1093/ije/29.3.536
