
Approach to record linkage of primary care data from Clinical Practice Research Datalink to other health-related patient data: overview and implications



European Journal of Epidemiology 34(1): 91-99



Record linkage is increasingly used to expand the information available for public health research. An understanding of record linkage methods and their strengths and limitations is important for robust analysis and interpretation of linked data. Here, we describe the approach used by Clinical Practice Research Datalink (CPRD) to link primary care data to other patient-level datasets, and the potential implications of this approach for CPRD data analysis. General practice electronic health record software providers separately submit de-identified data to CPRD and patient identifiers to NHS Digital, excluding patients who have opted out of contributing data. Data custodians for external datasets also send patient identifiers to NHS Digital. NHS Digital uses the identifiers to link the datasets using an 8-stage deterministic methodology. CPRD subsequently receives a de-identified linked cohort file and provides researchers with anonymised linked data and metadata detailing the linkage process. This methodology has been used to generate routine primary care linked datasets, including data from Hospital Episode Statistics, the Office for National Statistics and the National Cancer Registration and Analysis Service. In June 2018, 10.6 million (M) patients from 411 English general practices were included in record linkage. Of these, 9.1M (86%) were of research quality, of whom 8.0M (88%) had a valid NHS number and were eligible for linkage in the CPRD standard linked dataset release. Linking CPRD data to other sources improves the range and validity of research studies. This manuscript, together with metadata generated on match strength and linkage eligibility, can be used to inform study design and explore potential linkage-related selection and misclassification biases.
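The general idea of stepwise deterministic linkage can be illustrated with a minimal Python sketch. It rests on loud assumptions: the abstract does not list the eight stages, so the four rules in `MATCH_RULES`, the field names, the `link_record` helper, and the sample records are all hypothetical and are not the NHS Digital algorithm. The sketch only shows how rules can be tried from strictest to loosest, with the rank of the first rule that yields a single match recorded as a match-strength indicator.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical match rules, tried strictest first. These are NOT the actual
# eight stages used by NHS Digital; they only illustrate the stepwise idea.
MATCH_RULES = [
    ("nhs_number", "date_of_birth", "sex", "postcode"),
    ("nhs_number", "date_of_birth", "sex"),
    ("nhs_number", "date_of_birth"),
    ("date_of_birth", "sex", "postcode"),
]


def link_record(record: Dict[str, str],
                candidates: List[Dict[str, str]]) -> Optional[Tuple[int, Dict[str, str]]]:
    """Return (match_rank, candidate) for the first rule with exactly one match."""
    for rank, fields in enumerate(MATCH_RULES, start=1):
        # Skip a rule if the record lacks any identifier the rule requires.
        if any(not record.get(f) for f in fields):
            continue
        hits = [c for c in candidates
                if all(c.get(f) == record[f] for f in fields)]
        if len(hits) == 1:  # deterministic: accept only an unambiguous match
            return rank, hits[0]
    return None  # unlinked under every rule


# Made-up records with placeholder identifiers (not real NHS numbers).
external = [
    {"id": "H1", "nhs_number": "A1", "date_of_birth": "1970-01-01",
     "sex": "F", "postcode": "AB1 2CD"},
    {"id": "H2", "nhs_number": "A2", "date_of_birth": "1980-05-05",
     "sex": "M", "postcode": "EF3 4GH"},
]
patient = {"nhs_number": "A2", "date_of_birth": "1980-05-05",
           "sex": "M", "postcode": "ZZ9 9ZZ"}  # postcode differs from H2

result = link_record(patient, external)
unmatched = link_record({"nhs_number": "A9", "date_of_birth": "1990-01-01",
                         "sex": "F", "postcode": "QQ1 1QQ"}, external)
```

In this made-up example the patient fails the strictest rule because the postcode differs, then links to `H2` at rank 2. A rank of this kind is one way match-strength metadata could be used in sensitivity analyses, for example by restricting to higher-ranked matches.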

Accession: 065697606

PMID: 30219957

DOI: 10.1007/s10654-018-0442-4

