Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various ETL issues - not ok 1 - number of unique patients who die in the database #72

Open
MattMuffin opened this issue Nov 3, 2021 · 0 comments

Comments

@MattMuffin
Copy link

The query to build the populate the DEATH from MIMIC-III to OMOP looks for patients in two ways:

  1. from the admission table, WHERE deathtime IS NOT NULL, joining then on patients
  2. from the patient table, WHERE dod_ssn IS NOT NULL , and for which the patient has not been selected from the previous subquerry (admission table)

The check calls for 2 counts:

SELECT count(dod) FROM mimiciii.patients;
Result: 15759
SELECT count(death_date) FROM omop.death;
Result: 14849

This gives an error because of the mismatch in result. The issue here is that the count is on the dod field in the table mimiciii.patients , while the ETL is based on either admissions.deathtime or patients.dod_ssn.

If I take patient with subject_id = 263, it shows a dod but dod_ssn is null, resulting in the patient not being included in the DEATH table.

I think there is an issue:

  1. either the check counts the wrong field (dod but not dod_ssn)
  2. either the etl queries on the wrong field (dod_ssn and not dod)

In any case, I think we are comparing apples to oranges as dod does not imply that dod_ssn will be filled.

Please advise.

I'll dig the other errors I received from the etl_checks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant