Issue with Misclassification of diagnosisstrings? #239

Open
vanessailana opened this issue Feb 14, 2024 · 7 comments

Comments

@vanessailana

vanessailana commented Feb 14, 2024

Hello,

I was working with this dataset and noticed that some codes appear to be misclassified.

For example, the diagnosis string "cardiovascular chest pain / ASHD coronary artery disease / other biological bypass graft" is assigned to I25.810, which represents "Atherosclerosis of coronary artery bypass graft(s) without angina pectoris".

However, I am wondering whether I25.73, which seems similar, is actually more appropriate, since its definition is "Atherosclerosis of nonautologous biological coronary artery bypass graft(s) with angina pectoris".

Could there be an issue with misclassification?
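
One way to audit this more broadly is to list which codes are attached to each diagnosis string and then look up every string that shares a given code. The following is only a minimal sketch: it assumes the rows come from the diagnosis table with columns named `diagnosisstring` and `icd9code`, and that `icd9code` may hold several comma-separated codes. The helper names and the second sample row are hypothetical and not taken from the dataset.

```python
from collections import defaultdict


def codes_by_string(rows):
    """Group the distinct codes attached to each diagnosis string.

    `rows` is an iterable of dicts with `diagnosisstring` and `icd9code`
    keys (assumed column names). `icd9code` is assumed to hold zero or
    more comma-separated codes.
    """
    mapping = defaultdict(set)
    for row in rows:
        raw = row.get("icd9code") or ""
        for code in raw.split(","):
            code = code.strip()
            if code:
                mapping[row["diagnosisstring"]].add(code)
    return {string: sorted(codes) for string, codes in mapping.items()}


def strings_with_code(mapping, code):
    """Return every diagnosis string that carries the given code."""
    return sorted(s for s, codes in mapping.items() if code in codes)


# Illustrative rows only. The first pairs the string and code quoted in
# this issue; the second is a made-up placeholder.
sample = [
    {
        "diagnosisstring": "cardiovascular chest pain / ASHD coronary "
                           "artery disease / other biological bypass graft",
        "icd9code": "I25.810",
    },
    {"diagnosisstring": "hypothetical example string", "icd9code": "000.0, X00.0"},
]

mapping = codes_by_string(sample)
flagged = strings_with_code(mapping, "I25.810")
for s in flagged:
    print(s, "->", mapping[s])
```

Running this over the full table instead of `sample` would show every string mapped to I25.810, which makes it easier to review questionable assignments by hand.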

@obadawi
Contributor

obadawi commented Feb 15, 2024 via email

@vanessailana
Author

vanessailana commented Feb 15, 2024

How were the mappings from the custom strings to these codes generated? Is there a paper I can refer to in order to understand how they were derived? I was looking at https://github.com/MIT-LCP/eicu-code/blob/main/notebooks/diagnosis.ipynb, which says that diagnosisstring is the problem documented.

@crista

crista commented Feb 15, 2024

One thing to keep in mind is that the diagnosis strings are not ICD-9 or ICD-10 diagnoses. They were custom-made, and the codes you see were generated by mapping the custom strings to their ICD counterparts. It's possible there are errors or gray areas in the mappings, but I believe they are fairly accurate.

Hello. I'm @vanessailana's advisor. Where can we find information about this mapping? Even if there are no papers/documents, is there a Python script somewhere that did it? Or was it a manual mapping? Thanks.

@obadawi
Contributor

obadawi commented Feb 15, 2024 via email

@crista

crista commented Feb 15, 2024

Thanks! To make sure I understand:

  1. Did each hospital do the mapping on its own?
  2. Were the ICD codes already part of the hospital data when you started this dataset, so that you used them as they were, or did you process the codes further when constructing this dataset?

@obadawi
Contributor

obadawi commented Feb 15, 2024 via email

@crista

crista commented Feb 16, 2024

Thank you for the clarifications!
