Methodological issues for using a common data model of COVID-19 vaccine uptake and important adverse events of interest

August 2022: Study examining whether a common data model could be used to help pool data from across the four UK nations to investigate the national uptake of COVID-19 vaccines, as well as monitor the frequency of rare adverse events of interest (AEIs), such as CVST and anaphylaxis, that may occur post-vaccination.

Methodological issues for using a common data model (CDM) of COVID-19 vaccine uptake and important adverse events of interest (AEIs): the Data and Connectivity COVID-19 Vaccines Pharmacovigilance (DaC-VaP) United Kingdom feasibility study.

Delanerolle, G.; Williams, R.; Stipancic, A.; de Lusignan, S.; et al.

JMIR Formative Research

Published on: 22 August 2022

Available at: http://dx.doi.org/10.2196/37821

Plain English Summary

‘Real world’ studies and clinical trials show that the current COVID-19 vaccines are safe and effective in preventing severe disease and death. However, it is important to monitor safety and effectiveness over time in large populations. This process is known as pharmacovigilance. 

Why did we do this research? 

Two of the possible vaccine side effects being monitored are Cerebral Venous Sinus Thrombosis (CVST) and anaphylaxis. These are known as ‘adverse events of interest’, or AEIs. 

Anaphylaxis is a severe immune reaction to things like food or medication. You can find out more about CVST below. 

These AEIs are very rare. Their rarity makes it difficult for researchers to monitor and predict them. It also makes it difficult for researchers to reliably determine whether COVID-19 vaccination can be linked to these AEIs.

The best way to monitor rare AEIs is to monitor them in the real world on a very large scale. Ideally, this would involve using data collected from across all four UK nations. However, analysing data from England, Northern Ireland, Scotland and Wales together is uncommon because it is challenging.

Monitoring AEIs across the UK

England, Northern Ireland, Scotland and Wales record routinely collected patient data in different ways.

In England, patient data is recorded using the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) terminology. Northern Ireland, Scotland and Wales all use the Read v2 terminology.

Each terminology has its own catalogue of codes. Sometimes it is possible to translate directly between them, linking records of the same thing even though the codes are different.

This process of linking different fields of data together is called mapping. One way to overcome the challenges of mapping data is to use a common data model.

Common data models are tools that can be used to help pool together data from various data sources, even if they use different languages, or ‘coding terminologies’.

In a sense, they act as third-party translators, working to create mappings between the different languages used in data sources.
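
As a rough illustration of this kind of translation, the short Python sketch below maps local codes from two terminologies onto shared concepts and then counts records per shared concept. It is not the DaC-VaP or OMOP implementation. Every code and concept name in it (such as S-0001, R-0001 and CDM-ANAPHYLAXIS) is a made-up placeholder, and the function names are our own assumptions for the example.

```python
# Hypothetical mapping table: (terminology, local code) -> shared concept.
# All codes and concept names are made-up placeholders, NOT real
# SNOMED CT, Read v2 or OMOP identifiers.
LOCAL_TO_COMMON = {
    ("SNOMED CT", "S-0001"): "CDM-ANAPHYLAXIS",
    ("Read v2", "R-0001"): "CDM-ANAPHYLAXIS",
    ("SNOMED CT", "S-0002"): "CDM-CVST",
    ("Read v2", "R-0002"): "CDM-CVST",
}


def to_common_concept(terminology, code):
    """Return the shared concept for a local code, or None if unmapped."""
    return LOCAL_TO_COMMON.get((terminology, code))


def pool_records(records):
    """Count records per shared concept and collect any unmapped records."""
    counts = {}
    unmapped = []
    for record in records:
        concept = to_common_concept(record["terminology"], record["code"])
        if concept is None:
            unmapped.append(record)
        else:
            counts[concept] = counts.get(concept, 0) + 1
    return counts, unmapped


# Illustrative records from different nations (all values are invented).
records = [
    {"nation": "England", "terminology": "SNOMED CT", "code": "S-0001"},
    {"nation": "Wales", "terminology": "Read v2", "code": "R-0001"},
    {"nation": "Scotland", "terminology": "Read v2", "code": "R-0002"},
    {"nation": "Scotland", "terminology": "Read v2", "code": "R-9999"},
]

counts, unmapped = pool_records(records)
print(counts)         # {'CDM-ANAPHYLAXIS': 2, 'CDM-CVST': 1}
print(len(unmapped))  # 1
```

Records that have no mapping are kept in a separate list rather than dropped, so that gaps in the mapping can be reviewed.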

Our research aims

We wanted to see if using a common data model could pool together the relevant healthcare data from across the UK. This would help to monitor AEIs, such as CVST and anaphylaxis, in the real world.

To do this, we decided to use a common data model called OMOP (Observational Medical Outcomes Partnership). This model has been shown to be capable of pooling together data from a diverse range of databases. It does this by translating the data sources into its own standardised language.

How did we carry out this research?

We investigated OMOP’s suitability by seeing whether its language contains the vocabulary necessary to report on CVST and anaphylaxis, and track COVID-19 vaccine pharmacovigilance.

Data sources

We used data from the following assets in each of the four UK nations:

UK nation          Data asset
England            Research Surveillance Centre
Northern Ireland   Honest Broker Service
Scotland           EAVE II
Wales              SAIL Databank

As well as the number of people who experienced anaphylaxis or CVST after having a vaccine, we also looked at:

  • Demographic factors: A person’s age, sex and socio-economic status
  • Vaccine uptake: How many people have had a vaccine
  • Obesity: measured in terms of a person’s Body Mass Index (BMI)
  • Smoking status: Whether a person smokes, used to smoke or has never smoked

Checking the suitability of OMOP

To check whether OMOP had the vocabulary needed to pool together data from our areas of interest, we first used the Observational Health Data Sciences & Informatics (OHDSI) Athena online browser.

The Athena browser allows researchers to search for what codes are available in OMOP. They can then compare them to the local data sources and discuss which terms to use.

What did we find?

We found that most of the variables needed for COVID-19 vaccine pharmacovigilance existed in the OMOP common data model.

There was one exception: socioeconomic status.

In OMOP, socioeconomic status is a very generic concept, whereas in DaC-VaP studies we normally break it down into five groups, based on postcode. These more specific groups can be added in the future by creating custom terms.

Overall, we were able to use OMOP to connect our variables of interest to the various clinical terminologies in the UK. However, there was some variation.

Uptake of different vaccine doses is well recorded in England, but not in the datasets kept by Scotland, Northern Ireland and Wales. For the DaC-VaP project, this is relatively easy to overcome because of collaboration between the data teams in each place.

CVSTs and Anaphylaxis

Overall, we found clear mappings between the SNOMED (England) and Read v2 (NI, Scotland and Wales) coding terminologies, and OMOP.

We found that the vocabulary used to describe both CVST and anaphylaxis in England was much broader than that used in NI, Scotland and Wales.

This means that clinical terminology was often more detailed and varied in how it recorded incidences of CVST and anaphylaxis, such as describing the most likely causes of these events. For example, there were terms used to describe whether anaphylaxis was caused due to an allergic reaction or as an adverse effect to medication.

We identified 56 clinical terms used to code for CVST in England, 47 of which we will use going forward. For the other nations, however, we were only able to identify 15 terms. We also had to be less specific when searching for these terms in order to find them.

Clinical terms for anaphylaxis were slightly more detailed. In England, 60 related terms were identified, of which we will focus on the 10 most relevant codes. In contrast, we identified 9 terms across NI, Scotland and Wales, 4 of which we will continue to use in future studies.

What does this mean?

Our investigation showed that the OMOP common data model is a viable tool for conducting a UK-wide study of COVID-19 vaccine pharmacovigilance. However, some customization of OMOP will be required.

Overall, we found SNOMED to be a more detailed and descriptive clinical terminology compared to Read v2.

How will we use the model going forward?

We have two future aims for an adjusted version of the model.

  1. To report on vaccine uptake rates across the whole of the UK, including how uptake varies depending on a person’s age group, sex, ethnicity, the type of vaccine they received and the number of doses they have received.

  2. To report on rates of post-vaccine cases of CVST and anaphylaxis in England, NI, Scotland and Wales, as well as the UK overall. Here, we will be looking at two time frames: 0-2 days post-vaccination and 3-28 days post-vaccination.

In doing so, we aim to make monthly reports detailing this information.
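
The two time frames mentioned above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the study's code: we assume day 0 is the day of vaccination, that both windows include their end days, and that events before vaccination or more than 28 days afterwards fall outside both windows. The function name risk_window is hypothetical.

```python
from datetime import date


def risk_window(vaccination_date, event_date):
    """Return the post-vaccination time frame an event falls into.

    Assumptions (ours, for illustration): day 0 is the vaccination day,
    windows are inclusive, and anything outside 0-28 days returns None.
    """
    days = (event_date - vaccination_date).days
    if 0 <= days <= 2:
        return "0-2 days"
    if 3 <= days <= 28:
        return "3-28 days"
    return None


# Example with invented dates: an event 3 days after vaccination.
print(risk_window(date(2021, 3, 1), date(2021, 3, 4)))  # 3-28 days
```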

Note

This plain English summary was reviewed by our PPI Leads Antony Chuter and Jillian Beggs.
This research was originally published as a pre-print article. A link to this pre-print can be found below.