Label consumables as diagnostics/medicines/other and vital/essential/neither #1351

Merged

Conversation

tbhallett
Collaborator

@tbhallett tbhallett commented May 16, 2024

Here we introduce a method for labelling consumable items as diagnostic, medicine, or other.

We also introduce new options for availability in the Consumables class that use this information:

  • 'all_diagnostics_available'
  • 'all_medicines_available'
  • 'all_medicines_and_other_available'
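
As a rough illustration of how these options could relate to the item designations, here is a minimal sketch. It is not the code in this PR: the column names (`is_diagnostic`, `is_medicine`, `is_other`), the mapping `OPTION_TO_COLUMNS`, and the helper `items_forced_available` are assumptions made for the example, and the data are invented.

```python
import pandas as pd

# Toy designations table. Column names are assumed for illustration only.
designations = pd.DataFrame(
    {
        "Item_Code": [1, 2, 3, 4],
        "is_diagnostic": [True, False, False, False],
        "is_medicine": [False, True, True, False],
        "is_other": [False, False, False, True],
    }
).set_index("Item_Code")

# Hypothetical mapping: availability option -> designation columns it forces available.
OPTION_TO_COLUMNS = {
    "all_diagnostics_available": ["is_diagnostic"],
    "all_medicines_available": ["is_medicine"],
    "all_medicines_and_other_available": ["is_medicine", "is_other"],
}


def items_forced_available(option: str, table: pd.DataFrame) -> set:
    """Return the item codes that the given option would treat as always available."""
    columns = OPTION_TO_COLUMNS[option]
    mask = table[columns].any(axis=1)  # True if the item has any of the listed designations
    return set(table.index[mask].tolist())


forced = items_forced_available("all_medicines_and_other_available", designations)
```

With the toy table above, the `all_medicines_and_other_available` option covers every item except the single diagnostic.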

Lastly, we update the way availability is updated mid-way through the simulation to use the @property syntax that is used in Equipment and Beddays.
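
For readers unfamiliar with the pattern, the sketch below shows the general idea of a property with a setter that validates the new value and refreshes internal state when it changes. `ConsumablesSketch`, `_refresh`, `refresh_count` and the `"default"` option are hypothetical names for this example; the real Consumables, Equipment and Beddays classes may differ in every detail.

```python
class ConsumablesSketch:
    """Hypothetical stand-in for a class holding an availability setting."""

    _ALLOWED = {
        "default",  # assumed name for the unmodified behaviour
        "all_diagnostics_available",
        "all_medicines_available",
        "all_medicines_and_other_available",
    }

    def __init__(self, availability: str = "default"):
        self._availability = None
        self.refresh_count = 0
        self.availability = availability  # goes through the setter below

    @property
    def availability(self) -> str:
        """Current availability option."""
        return self._availability

    @availability.setter
    def availability(self, value: str) -> None:
        if value not in self._ALLOWED:
            raise ValueError(f"Unknown availability option: {value!r}")
        self._availability = value
        self._refresh()  # recompute anything that depends on the option

    def _refresh(self) -> None:
        # Placeholder for rebuilding internal look-ups; here we only count calls.
        self.refresh_count += 1


cons = ConsumablesSketch()
cons.availability = "all_medicines_available"  # mid-simulation update
```

The point of the design is that a plain assignment is enough to change the option, and the dependent state cannot be left stale by forgetting to call an update method.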

(p.s. @sakshimohan -- you've already seen a preview of this in one of the scripting branches)

tbhallett and others added 30 commits March 4, 2024 12:07
* put the helper function for switching scenario into same file a ScenarioSwitcher class
* put tests for class and helper function together

(next step will be to rename and mock-up extended functionality)
…rios, to get upper limit on RAM requirements
@tbhallett
Collaborator Author

I've used this snippet to create a flag for essential or vital items, using the file linked to above (now also in Dropbox).

from pathlib import Path
import pandas as pd

# Resource file as it stands currently (original designations were done "by hand")
filepath_original = Path('resources/healthsystem/consumables/ResourceFile_Consumables_Item_Designations.csv')

# File linked-to above
path_to_dropbox = Path('/Users/tbh03/SPH Imperial College Dropbox/Tim Hallett/Thanzi la Onse Theme 1 SHARE/')
f = pd.read_csv(path_to_dropbox / '07 - Data' / 'essential_medicine_list_categorisation.csv')

# Find the set of item_codes that are labelled essential ('E') or vital ('V')
essential_or_vital = set(f.loc[f['Therapeutic priority'].isin(['V', 'E']), 'item_code'].values)

# Add column in original_rf dataframe
original_rf = pd.read_csv(filepath_original)
original_rf['is_essential_or_vital'] = original_rf['Item_Code'].isin(essential_or_vital)

# Save the updated rf (over-writing the existing version)
original_rf.to_csv(filepath_original, index=False)

@sakshimohan
Collaborator

Thanks, @tbhallett. Looks good. A couple of questions:

  1. In the regression analysis, probability of availability was significantly associated with consumables being categorised as Vital but not with consumables being categorised as Essential. So in the Lancet paper, the binary label is based on classification of drugs as 'V' alone and not 'V' or 'E'. Should we follow the same logic here or would you rather stick with essential_or_vital?
  2. I know you've assumed that all among the 2016 consumables in ResourceFile_Consumables_Item_Designations.csv which are not in essential_medicine_list_categorisation.csv are not essential_or_vital. But this may not be the case because we just haven't looked up a majority of these consumables in the Essential Medicines List (because our search was limited to the 162 consumables in the HHFA). Is the expectation that we'll eventually complete this data?
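
One way to keep "not yet looked up" distinct from "looked up and not vital" would be a nullable boolean column. This is only a sketch on invented data, not something done in this PR; the frames `rf` and `eml` stand in for the resource file and the categorisation file, reusing the column names from the snippet above.

```python
import pandas as pd

# Invented stand-ins for the two files.
rf = pd.DataFrame({"Item_Code": [10, 11, 12, 13]})
eml = pd.DataFrame({"item_code": [10, 11], "Therapeutic priority": ["V", "E"]})

# Map each item in the resource file to its priority; NaN where it was never looked up.
priority = eml.set_index("item_code")["Therapeutic priority"]
mapped = rf["Item_Code"].map(priority)

# Nullable boolean: True/False for items that were looked up, <NA> otherwise.
rf["is_vital"] = pd.array(
    [pd.NA if pd.isna(p) else p == "V" for p in mapped],
    dtype="boolean",
)

n_not_looked_up = int(rf["is_vital"].isna().sum())
```

Calling `rf["is_vital"].fillna(False)` would then reproduce the current assumption explicitly, while `n_not_looked_up` shows how many items that assumption covers.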

@tbhallett
Collaborator Author

> Thanks, @tbhallett. Looks good. A couple of questions:
>
> 1. In the regression analysis, probability of availability was significantly associated with consumables being categorised as Vital but not with consumables being categorised as Essential. So in the Lancet paper, the binary label is based on classification of drugs as 'V' alone and not 'V' or 'E'. Should we follow the same logic here or would you rather stick with essential_or_vital?

Good spot. Yes, we should follow the same logic here. I've added a column 'is_essential' with the code below, and will update the logic in the Consumables class to use the is_essential designation.

from pathlib import Path
import pandas as pd

# Resource file as it stands currently (original designations were done "by hand")
filepath_original = Path('resources/healthsystem/consumables/ResourceFile_Consumables_Item_Designations.csv')

# File linked-to above
path_to_dropbox = Path('/Users/tbh03/SPH Imperial College Dropbox/Tim Hallett/Thanzi la Onse Theme 1 SHARE/')
f = pd.read_csv(path_to_dropbox / '07 - Data' / 'essential_medicine_list_categorisation.csv')

# Find the sets of item_codes that are labelled essential ('E'), and essential or vital ('V' or 'E')
essential = set(f.loc[f['Therapeutic priority'].isin(['E']), 'item_code'].values)
essential_or_vital = set(f.loc[f['Therapeutic priority'].isin(['V', 'E']), 'item_code'].values)

# Add column in original_rf dataframe
original_rf = pd.read_csv(filepath_original)
original_rf['is_essential'] = original_rf['Item_Code'].isin(essential)
original_rf['is_essential_or_vital'] = original_rf['Item_Code'].isin(essential_or_vital)

# Save the updated rf (over-writing the existing version)
original_rf.to_csv(filepath_original, index=False)

> 2. I know you've assumed that all among the 2016 consumables in ResourceFile_Consumables_Item_Designations.csv which are not in essential_medicine_list_categorisation.csv are not essential_or_vital. But this may not be the case because we just haven't looked up a majority of these consumables in the Essential Medicines List (because our search was limited to the 162 consumables in the HHFA). Is the expectation that we'll eventually complete this data?

I suppose so. But I was perhaps (subconsciously!) assuming that the most essential items will have been in the HHFA (i.e. there would be few essential items not included in the HHFA, and so few missed by our categorisation). Is that a faulty assumption?

@tbhallett tbhallett changed the title Label consumables as diagnostics/medicines/other Label consumables as diagnostics/medicines/other and vital/essential/neither May 28, 2024
@tbhallett
Collaborator Author

@tbhallett: got it wrong. It should have been 'V' only, giving an is_vital column. To do.

@tbhallett
Collaborator Author

tbhallett commented May 28, 2024

argh!

Updated code snippet for generating the ResourceFile we need:

from pathlib import Path
import pandas as pd

# Resource file as it stands currently (original designations were done "by hand")
filepath_original = Path('resources/healthsystem/consumables/ResourceFile_Consumables_Item_Designations.csv')

# File linked-to above
path_to_dropbox = Path('/Users/tbh03/SPH Imperial College Dropbox/Tim Hallett/Thanzi la Onse Theme 1 SHARE/')
f = pd.read_csv(path_to_dropbox / '07 - Data' / 'essential_medicine_list_categorisation.csv')

# Find the set of item_codes that are labelled vital ('V')
vital = set(f.loc[f['Therapeutic priority'].isin(['V']), 'item_code'].values)

# Add column in original_rf dataframe
original_rf = pd.read_csv(filepath_original)
original_rf['is_vital'] = original_rf['Item_Code'].isin(vital)

# Save the updated rf (over-writing the existing version)
original_rf.to_csv(filepath_original, index=False)

@sakshimohan
Collaborator

sakshimohan commented May 28, 2024

Hi @tbhallett. I've now made the update we discussed during our morning call. This update adds the therapeutic code for consumables which were not in the HHFA (and therefore hadn't previously been looked up in the Essential Meds List). As a result, some consumables, mainly mental health, have been classified as vital.

A final note: the path_to_dropbox / '07 - Data' / 'essential_medicine_list_categorisation.csv' file now contains an additional column, Type of consumable, which categorises consumables as either drug or vaccine or other consumable. None of the other consumables, such as blood, oxygen, or IV sets, are included in the Essential Medicines List, so they will automatically be classified as NOT Vital in this analysis. I don't think we need to do anything about this but am just bringing it to your attention.

P.S. I had to add encoding="ISO-8859-1" to f = pd.read_csv(path_to_dropbox / '07 - Data' / 'essential_medicine_list_categorisation.csv') for it to work.
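
If the file's encoding varies between machines or exports, one option is to try UTF-8 first and fall back to ISO-8859-1. The helper below is a hypothetical sketch (`read_csv_with_fallback` is not part of this PR or of pandas), demonstrated on a small temporary file rather than the Dropbox file.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "ISO-8859-1")):
    """Try each encoding in turn and return the first successful read."""
    last_error = None
    for encoding in encodings:
        try:
            return pd.read_csv(path, encoding=encoding)
        except UnicodeDecodeError as err:
            last_error = err  # remember the failure and try the next encoding
    if last_error is None:
        raise ValueError("No encodings were provided")
    raise last_error


# Demonstration: a file saved as ISO-8859-1 that is not valid UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.csv"
    demo_path.write_bytes("item_code,name\n1,Paracétamol\n".encode("ISO-8859-1"))
    demo = read_csv_with_fallback(demo_path)
```

Note that ISO-8859-1 accepts any byte sequence, so it should come last in the list; placed first, it would silently mis-decode a UTF-8 file.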

@tbhallett
Collaborator Author

I couldn't help but add a "drug_or_vaccine" column to the resourcefile based on that.

drug_or_vaccine = set(f.loc[f['Type of consumable'] == 'drug or vaccine', 'item_code'].values)
original_rf = pd.read_csv(filepath_original)
original_rf['is_drug_or_vaccine'] = original_rf['Item_Code'].isin(drug_or_vaccine)
original_rf.to_csv(filepath_original, index=False)
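
For reference, the two flags from the snippets above could be derived in a single pass. This is a sketch on invented data: the function name `add_designation_flags` is hypothetical, while the column names and the 'V' and 'drug or vaccine' values are taken from the snippets in this thread.

```python
import pandas as pd


def add_designation_flags(rf: pd.DataFrame, eml: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the resource file with is_vital and is_drug_or_vaccine columns."""
    out = rf.copy()
    vital = set(eml.loc[eml["Therapeutic priority"] == "V", "item_code"])
    drug_or_vaccine = set(
        eml.loc[eml["Type of consumable"] == "drug or vaccine", "item_code"]
    )
    # Items absent from the categorisation file are flagged False in both columns.
    out["is_vital"] = out["Item_Code"].isin(vital)
    out["is_drug_or_vaccine"] = out["Item_Code"].isin(drug_or_vaccine)
    return out


# Invented data standing in for the two files.
toy_rf = pd.DataFrame({"Item_Code": [1, 2, 3]})
toy_eml = pd.DataFrame(
    {
        "item_code": [1, 2],
        "Therapeutic priority": ["V", "E"],
        "Type of consumable": ["drug or vaccine", "other consumable"],
    }
)
flagged = add_designation_flags(toy_rf, toy_eml)
```

Working on a copy and returning it keeps the step testable without touching the CSV on disk; writing the result back with `to_csv` can then be a separate, explicit step.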

@tbhallett tbhallett merged commit d3fb411 into master May 29, 2024
57 checks passed
@tbhallett tbhallett deleted the hallett/consumables-designate-as-diagnostics-or-other branch May 29, 2024 14:53
@tbhallett tbhallett moved this from Ready to merge to Done in PR priorities May 30, 2024
3 participants