AdverseDrugEventPipeline

Drug Side Effect Checker - The Pharmacovigilance Dashboard Prototype

Developed for MSc thesis "Improving Understandability of Drug Safety Data. Design Principles for Drug Safety Dashboards" available here

General info

Background info

Every year, the FDA receives over one million adverse event and medication error reports associated with the use of drugs. The FDA uses these reports to monitor the safety of drugs. Although these reports are a valuable source of information, the reporting system has limitations, including the submission of incomplete, inaccurate, untimely and unverified information. Importantly, the information in the FDA database does not confirm a causal relationship between the drug and the side effect.

To address this limitation, our prototype uses data mining techniques that rely on statistical analysis to evaluate whether there may be a causal relationship between a side effect and a drug (i.e., whether an event is potentially related or unrelated to a drug). However, the results of data mining analysis should be interpreted with caution, as they do not establish that a drug caused a side effect. Likewise, a side effect being evaluated as unrelated does not mean you will certainly not experience it.

More information on the data source: FAERS Database

How is the data analysed?

Disproportionality Analysis (DPA), with calculation of the Proportional Reporting Ratio (PRR), is used as the data mining technique to process the data. The retrieved DPA indicators are used to identify safety signals. These signals can be interpreted as “unexpectedly high reporting associations” and can “signal that there may be a causal association between the particular adverse event and the product”. Source: FDA White Paper

In essence, in DPA an "expected" count is compared with an "observed" count for a product-event combination. PRR can be defined as “the degree of disproportionate reporting of an adverse event for a product of interest compared to this same event for all other products in the database” (Duggirala et al. 2016; Duggirala et al. 2018). The main assumptions here are 1) that the whole database serves as an “expected background” and 2) that there is no association between the events reported and their products. If a disproportionately high number of events is reported for a drug, assumption no. 2 becomes questionable and the event may be (statistically) associated with the drug. The contingency table below illustrates this concept.

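Table 1. PRR Contingency Table. Source: FDA White Paper

|                    | Event Y | All other events | Sums  |
|--------------------|---------|------------------|-------|
| Product X          | a       | b                | a + b |
| All other products | c       | d                | c + d |
| Sums               | a + c   | b + d            | Total |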
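
To make the table concrete, the sketch below computes a PRR from the four cells a, b, c and d using the commonly cited definition, PRR = [a / (a + b)] / [c / (c + d)], together with an approximate 95% confidence interval on the log scale. This is an illustration only: the function names and the example counts are made up, and the exact formulas and signal thresholds applied by VigilApp may differ (see the VigilApp paper).

```python
import math


def prr(a, b, c, d):
    """Proportional Reporting Ratio from the 2x2 contingency table cells.

    a: reports of event Y for product X
    b: reports of all other events for product X
    c: reports of event Y for all other products
    d: reports of all other events for all other products
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for empty rows or c == 0")
    return (a / (a + b)) / (c / (c + d))


def prr_confidence_interval(a, b, c, d, z=1.96):
    """Approximate confidence interval for the PRR (z=1.96 gives ~95%)."""
    if a == 0:
        raise ValueError("the interval is undefined when a == 0")
    # Standard error of ln(PRR), as for a relative risk
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    log_prr = math.log(prr(a, b, c, d))
    return math.exp(log_prr - z * se), math.exp(log_prr + z * se)


# Hypothetical counts, chosen only to illustrate the calculation
value = prr(20, 980, 100, 98900)
low, high = prr_confidence_interval(20, 980, 100, 98900)
print(f"PRR = {value:.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```

A PRR well above 1 means the event is reported proportionally more often for product X than for the rest of the database. As noted above, this is a statistical association and not proof of causation.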



In this work, VigilApp is used to calculate the DPA indicators. See more here: VigilApp

For more information (e.g., the exact formulas used for calculations), refer to VigilApp Paper and Evaluation Criteria

What can I find in the prototype?

This is an academic prototype of an interactive dashboard that visualises information on drugs and their possible side effects. The dashboard has two sections. At the top of the page you will find a general analysis for a selected drug and its most common side effects. Further down the page you can explore an in-depth analysis for a selected drug and one selected side effect.

Events are pre-evaluated as either potentially Related or Unrelated to a drug by the DPA algorithm.

Link to the final prototype developed in this project using Tableau: Prototype

What does this code do?

See "Setup" section.

Illustrations

Below you can find example visualisations of drug safety data in the dashboard prototype.

General Overview of Drug Safety Profile

Dashboard1

Selected in-depth analysis for a selected drug and one selected side effect

Dashboard2

Dashboard3

Technologies

  • Python 3.8.12
  • Pandas 1.3.4
  • Selenium 4.0
  • BeautifulSoup4 4.7.1

Setup

General info

There are two scripts in this project: API_Drug_Data_Extract.ipynb & DPA_Scraper.ipynb. Run DPA_Scraper.ipynb first.

DPA Scraper

DPA_Scraper.ipynb retrieves the DPA result (a drug safety signal classified as either related or unrelated) from VigilApp for each drug and side-effect pair.

All necessary directories are created by the script. The file events.xlsx (available here) must be placed in the script directory. It contains drugs and their most common side effects (obtained from VigilApp by querying for the drug name in basic data extraction or counting). The file references.xlsx (available here) must also be placed in the script directory.

Script outputs:

  • an .xlsx file with the DPA analysis per drug and side-effect pair, in the DA directory
  • the file events_signals.xlsx in the script directory. It contains the drug safety signal information (related/unrelated). This file is required to run API_Drug_Data_Extract.ipynb

API Drug Data Extractor

API_Drug_Data_Extract.ipynb queries the API to get all demographic data (per drug and side-effect pair) visualised in the dashboard.
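
The API is not named here. Assuming it is the public openFDA drug adverse event endpoint (an assumption, not confirmed by this README), a count query for one drug and side-effect pair could be built roughly as below. The helper name build_count_url, the example drug and event, and the choice of patient.patientsex as the counted field are illustrative only; the notebook may build its requests differently.

```python
from urllib.parse import urlencode

# Assumption: the openFDA drug adverse event endpoint is the API in question
OPENFDA_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_count_url(drug, event, count_field="patient.patientsex"):
    """Build a URL that counts reports for one drug and one reaction.

    The results are grouped by `count_field` (here: the patient's sex).
    """
    search = (
        f'patient.drug.medicinalproduct:"{drug}"'
        f' AND patient.reaction.reactionmeddrapt:"{event}"'
    )
    query = urlencode({"search": search, "count": count_field})
    return f"{OPENFDA_EVENT_URL}?{query}"


# Hypothetical drug and side-effect pair
print(build_count_url("aspirin", "nausea"))
```

The sketch only builds the URL and sends no request, so it can be inspected or tested without network access.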

All necessary directories are created by the script. The file events_signals.xlsx (available here) must be placed in the script directory; it is created by running DPA_Scraper.ipynb. An example events_signals.xlsx file can be found in the /required_excel_files directory.

Other mandatory files (in the script directory):

  • country_codes.xlsx
  • demo.xlsx
  • df_empty_template.xlsx
  • references.xlsx
  • reporter_type.xlsx
  • serious.xlsx
  • signal_df.xlsx
  • years.xlsx

    These files can be found in the /required_excel_files directory. They either provide interpretations of FDA database codes or serve as DataFrame templates.
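
Before running API_Drug_Data_Extract.ipynb, it can help to confirm that every mandatory file listed above is present. The following is a minimal sketch of such a check. The helper missing_files is hypothetical and not part of the project's notebooks.

```python
from pathlib import Path

# File names taken from the list above, plus events_signals.xlsx
REQUIRED_FILES = [
    "events_signals.xlsx",
    "country_codes.xlsx",
    "demo.xlsx",
    "df_empty_template.xlsx",
    "references.xlsx",
    "reporter_type.xlsx",
    "serious.xlsx",
    "signal_df.xlsx",
    "years.xlsx",
]


def missing_files(script_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in script_dir."""
    base = Path(script_dir)
    return [name for name in required if not (base / name).is_file()]


# Check the current working directory, e.g. the notebook's folder
missing = missing_files(".")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required files are present.")
```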

Script outputs:

  • demographic data on drugs + side effects (as in events_signals.xlsx) in demo directory

Status

Project is finished. TODO:

  • Rewrite Readme
  • Add .xlsx files

Contact

Created by Malwina Kotowicz (m_kotowicz@hotmail.com) and Cláudio Pires (claudiofmpires@gmail.com) - feel free to contact us!