Study warns prescription data alone may misclassify diseases in research

The use of prescription records alone to identify whether patients have certain diseases could lead to misleading results in medical research, according to a new UK study analysing health records from more than 400,000 patients.

Led by researchers from Queen’s University Belfast and the Inflammation and Immunity Driver Programme, the study examined how accurately prescription data reflect whether patients actually have diagnosed health conditions. They found that while prescription records can reliably indicate when someone does not have a condition, they are often far less reliable at confirming when someone does.

These findings highlight a key limitation in studies that rely on electronic health records, which are increasingly used to inform healthcare planning, public health policy and clinical research.

For their study, the research team used anonymised GP records from the Optimum Patient Care Research Database (OPCRD). They randomly selected 425,000 patients and compared disease classification based on prescriptions with diagnoses recorded in clinical notes, which were treated as the ‘gold standard’.

The study assessed 18 common chronic conditions, including asthma, diabetes, dementia, depression, epilepsy and heart failure.

The results showed wide variation in how well prescriptions matched confirmed diagnoses. For some conditions, a prescription was a strong signal that a patient really had the disease. This was true for diabetes, where prescriptions almost always corresponded to a confirmed diagnosis. For other conditions, such as heart failure, prescriptions were much less reliable and often appeared in patients who did not have a confirmed diagnosis.

By contrast, not having a relevant prescription was usually a good indicator that a patient did not have the disease. Across all conditions studied, patients without the relevant prescription were very unlikely to have an underlying diagnosis.

Lead author Dr Christian Schnier, Senior Research Fellow at Queen’s University Belfast, said: “These limitations of using specific prescriptions, say, for insulin to identify patients with diabetes are a considerable limitation for medical studies using electronic health records; however, it is important to stress that health care workers are not using prescriptions to decide if a patient has a disease.”

The findings showed that medications are often prescribed for multiple indications, or may be issued before a formal diagnosis is recorded, emphasising the need for researchers to interpret prescription data with caution.

The study also found that relying on prescription data could significantly distort estimates of how common certain diseases are. For example, with some conditions, the apparent disease prevalence based on prescriptions was considerably higher than the prevalence based on diagnostic records. In extreme cases, prescription data suggested disease rates up to 11 times higher than clinical records indicated.

Notably, the reliability of prescription-based classification varied widely between conditions. Diabetes, dementia and obstructive lung diseases, for example, showed relatively strong agreement between prescriptions and diagnoses. Others, such as anxiety and heart failure, showed much weaker links.

Accuracy also differed across patient groups. Age, sex, ethnicity and time period all influenced how well prescriptions predicted disease. Interestingly, however, socioeconomic deprivation had little consistent effect.

Dr Schnier added: “Researchers often ignore that estimates of how well a prescription is associated with a specific diagnosis is crucially dependent on the prevalence of the disease. For example, we estimated that in our study population only 8% of patients with a prescription for a medication associated with heart failure actually had a diagnosis of heart failure. In populations with higher prevalence, say, in a study population of the elderly, a prescription for a medication related to heart failure would have a much higher positive predictive value.”
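
The dependence on prevalence that Dr Schnier describes follows from Bayes' theorem. The short Python sketch below illustrates it. The function name and the sensitivity and specificity figures are hypothetical, chosen only for illustration; they are not estimates from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary marker, using Bayes' theorem.

    All three inputs are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)  # P(diagnosis | prescription)
    npv = true_neg / (true_neg + false_neg)  # P(no diagnosis | no prescription)
    return ppv, npv


# Hypothetical marker characteristics (assumed, not taken from the study).
SENSITIVITY = 0.90
SPECIFICITY = 0.90

for prevalence in (0.01, 0.05, 0.20):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

With the marker's sensitivity and specificity held fixed, the positive predictive value rises as the condition becomes more common, while the negative predictive value stays high when the condition is rare. This is consistent with the pattern reported above, in which the absence of a prescription was a more dependable signal than its presence.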

Electronic health records have become integral to modern health research, especially when studying large populations or long-term trends. However, as this study indicates, researchers must carefully consider potential sources of error when using routine healthcare data.

“For us as medical health researchers,” said Dr Schnier, “this study demonstrates once again that for accurate health research, we ideally need access to both prescription records and records of clinical diagnosis. In the UK, data sources like OPCRD, the SAIL database and OpenPrescribing/CPRD enable us to conduct valuable and valid research.”

The research team is based at leading UK institutions, including Belfast Health & Social Care NHS Trust, Imperial College London, the Observational and Pragmatic Research Institute (OPRI), Queen’s University Belfast, and the University of Oxford.

The study was funded by HDR UK’s Inflammation and Immunity Driver Programme, and is a collaboration between higher education, industry (OPRI), and Health and Social Care Northern Ireland.

Full Citation: Schnier C, Busby J, Sheikh A, Quint JK, Price DB, Heaney LG. Validity of Using Prescription Medications to Classify Disease – A Retrospective Observational Study Using Routinely Collected Electronic Health Records from the UK. Pragmat Obs Res. 2026;17:1-14

https://doi.org/10.2147/POR.S553011 
