Cervical cancer #1287

Draft
wants to merge 49 commits into
base: master

andrew-phillips-1
Collaborator

Here's a first draft of this module. I need to do another search to try to find additional data for calibration.

I couldn't upload the draft write-up as it was asking me to use the command line and Git LFS.

@tbhallett tbhallett added this to In progress in PR priorities via automation Mar 4, 2024
@tbhallett tbhallett moved this from In progress to Ready for EM review in PR priorities Mar 21, 2024
@mnjowe
Collaborator

mnjowe commented Apr 2, 2024

Thanks @andrew-phillips-1 for this first draft. I think it looks good. I will add a few comments/suggestions for your consideration.

@andrew-phillips-1
Collaborator Author

Thanks @mnjowe


# ----- SCHEDULE LOGGING EVENTS -----
# Schedule logging event to happen immediately
sim.schedule_event(CervicalCancerLoggingEvent(self), sim.date + DateOffset(months=0))

Why is the logging event scheduled immediately when the polling event starts a month later? Are we also interested in logging the default (initial) values?

sim.schedule_event(CervicalCancerLoggingEvent(self), sim.date + DateOffset(months=0))

# ----- SCHEDULE MAIN POLLING EVENTS -----
# Schedule main polling event to happen immediately

Suggested change
- # Schedule main polling event to happen immediately
+ # Schedule main polling event to happen after a month

Comment on lines +88 to +89
Types.REAL,
"probabilty per month of oncogenic hpv infection",

Suggested change
- Types.REAL,
- "probabilty per month of oncogenic hpv infection",
+ Types.REAL,
+ "probability per month of oncogenic hpv infection",

Comment on lines +92 to +93
Types.REAL,
"probabilty per month of incident cin1 amongst people with hpv",

Suggested change
- Types.REAL,
- "probabilty per month of incident cin1 amongst people with hpv",
+ Types.REAL,
+ "probability per month of incident cin1 amongst people with hpv",

Comment on lines +96 to +97
Types.REAL,
"probabilty per month of incident cin2 amongst people with cin1",

Suggested change
- Types.REAL,
- "probabilty per month of incident cin2 amongst people with cin1",
+ Types.REAL,
+ "probability per month of incident cin2 amongst people with cin1",

Comment on lines +100 to +101
Types.REAL,
"probabilty per month of incident cin3 amongst people with cin2",

Suggested change
- Types.REAL,
- "probabilty per month of incident cin3 amongst people with cin2",
+ Types.REAL,
+ "probability per month of incident cin3 amongst people with cin2",

Comment on lines +104 to +105
Types.REAL,
"probabilty per month of incident stage1 cervical cancer amongst people with cin3",

Typo: change "probabilty" to "probability" on lines 105, 109, 113, 117 and 121.

Comment on lines +596 to +597
def on_hsi_alert(self, person_id, treatment_id):
    pass

Are you planning to do something here in the next draft? If not, I think we can remove the function.

Comment on lines +331 to +332
# this was not assigned here at outset because baseline value of hv_inf was not accessible - it is assigned
# st start of main polling event below

Why is the baseline value of hv_inf not accessible, given that the Hiv module is included in the dependencies section and hv_inf is initialised in the Hiv module? What was the error?

Comment on lines +669 to +687
# this was done here and not at outset because baseline value of hv_inf was not accessible

given_date = pd.to_datetime('2010-02-03')

if self.sim.date < given_date:

    women_over_15_nhiv_idx = df.index[(df["age_years"] > 15) & (df["sex"] == 'F') & ~df["hv_inf"]]

    df.loc[women_over_15_nhiv_idx, 'ce_hpv_cc_status'] = rng.choice(
        ['none', 'hpv', 'cin1', 'cin2', 'cin3', 'stage1', 'stage2a', 'stage2b', 'stage3', 'stage4'],
        size=len(women_over_15_nhiv_idx), p=p['init_prev_cin_hpv_cc_stage_nhiv']
    )

    women_over_15_hiv_idx = df.index[(df["age_years"] > 15) & (df["sex"] == 'F') & df["hv_inf"]]

    df.loc[women_over_15_hiv_idx, 'ce_hpv_cc_status'] = rng.choice(
        ['none', 'hpv', 'cin1', 'cin2', 'cin3', 'stage1', 'stage2a', 'stage2b', 'stage3', 'stage4'],
        size=len(women_over_15_hiv_idx), p=p['init_prev_cin_hpv_cc_stage_hiv']
    )

I think we can do this in initialise_population. I'm interested to know why the value of hv_inf is not accessible in initialise_population when Hiv is included in the list of dependencies.


df.ce_selected_for_via_this_month = False

eligible_population = df.is_alive & (df.sex == 'F') & (df.age_years > 30) & (df.age_years < 50) & \

Suggested change
- eligible_population = df.is_alive & (df.sex == 'F') & (df.age_years > 30) & (df.age_years < 50) & \
+ eligible_population = df.is_alive & (df.sex == 'F') & df.age_years.between(30, 50, inclusive="neither") & \

| df.ce_ever_treated)

# -------------------------------- SCREENING FOR CERVICAL CANCER USING XPERT HPV TESTING AND VIA---------------
# A subset of women aged 30-50 will receive a screening test

Should the boundaries (30 and 50 years) be included? The selection on line 720 excludes them.
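
To make the boundary question concrete, here is a minimal sketch of how the strict comparisons and `Series.between` treat ages 30 and 50. The ages are invented for illustration and are not taken from the model; the `inclusive="both"` and `inclusive="neither"` options assume pandas 1.3 or later.

```python
import pandas as pd

# Hypothetical ages around the two boundaries.
ages = pd.Series([29, 30, 31, 49, 50, 51], name="age_years")

# Strict comparisons, as in the draft: 30 and 50 are excluded.
strict = ages[(ages > 30) & (ages < 50)].tolist()

# Equivalent selection with between(); the boundaries are still excluded.
neither = ages[ages.between(30, 50, inclusive="neither")].tolist()

# Selection that matches the comment "women aged 30-50" if the
# boundaries are meant to be included.
both = ages[ages.between(30, 50, inclusive="both")].tolist()
```

Which of the two behaviours is intended is a modelling decision; the sketch only shows that the draft code and the code comment currently describe different age ranges.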

self.sim.schedule_event(
    InstantaneousDeath(self.module, person_id, "CervicalCancer"), self.sim.date
)
df.loc[selected_to_die, 'ce_date_death'] = self.sim.date

Is the date of death not already recorded in demography?

Comment on lines +991 to +993
# Ignore this event if the person is no longer alive:
if not df.at[person_id, 'is_alive']:
    return hs.get_blank_appt_footprint()

@tbhallett is this not already handled by the HealthSystem? If so, I think we would save some computational time by removing it here and in all the other HSIs.

Comment on lines +1128 to +1150
if random_value <= p['prob_cure_stage1'] and df.at[person_id, "ce_date_treatment"] == self.sim.date:
    df.at[person_id, "ce_hpv_cc_status"] = 'none'
    df.at[person_id, 'ce_current_cc_diagnosed'] = False
else:
    df.at[person_id, "ce_hpv_cc_status"] = 'stage1'

if random_value <= p['prob_cure_stage2a'] and df.at[person_id, "ce_date_treatment"] == self.sim.date:
    df.at[person_id, "ce_hpv_cc_status"] = 'none'
    df.at[person_id, 'ce_current_cc_diagnosed'] = False
else:
    df.at[person_id, "ce_hpv_cc_status"] = 'stage2a'

if random_value <= p['prob_cure_stage2b'] and df.at[person_id, "ce_date_treatment"] == self.sim.date:
    df.at[person_id, "ce_hpv_cc_status"] = 'none'
    df.at[person_id, 'ce_current_cc_diagnosed'] = False
else:
    df.at[person_id, "ce_hpv_cc_status"] = 'stage2b'

if random_value <= p['prob_cure_stage3'] and df.at[person_id, "ce_date_treatment"] == self.sim.date:
    df.at[person_id, "ce_hpv_cc_status"] = 'none'
    df.at[person_id, 'ce_current_cc_diagnosed'] = False
else:
    df.at[person_id, "ce_hpv_cc_status"] = 'stage3'

Are we assuming that patients recover from cervical cancer on the same day they receive treatment?

Comment on lines +1273 to +1282
# Schedule another instance of the event for one month
hs.schedule_hsi_event(
    hsi_event=HSI_CervicalCancer_PalliativeCare(
        module=self.module,
        person_id=person_id
    ),
    topen=self.sim.date + DateOffset(months=1),
    tclose=None,
    priority=0
)

@tbhallett don't we have a frequency argument for HSI events? It could be useful here.

Comment on lines +1295 to +1296
self.repeat = 30
super().__init__(module, frequency=DateOffset(days=self.repeat))

How about changing days to months, i.e.

Suggested change
- self.repeat = 30
- super().__init__(module, frequency=DateOffset(days=self.repeat))
+ self.repeat = 1
+ super().__init__(module, frequency=DateOffset(months=self.repeat))
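
As a rough illustration of what the suggestion changes, the sketch below steps a date forward twelve times with each offset. The start date and the helper `recurrences` are invented for this example; only `pd.Timestamp` and `pd.DateOffset` come from pandas.

```python
import pandas as pd

# Hypothetical start date for the repeating event.
start = pd.Timestamp("2010-01-01")


def recurrences(offset, n):
    """Return the dates of n successive repeats, starting after `start`."""
    dates, current = [], start
    for _ in range(n):
        current = current + offset
        dates.append(current)
    return dates


# A fixed 30-day step drifts away from calendar months over a year.
every_30_days = recurrences(pd.DateOffset(days=30), 12)

# A one-month step stays aligned with the first day of each month.
every_month = recurrences(pd.DateOffset(months=1), 12)
```

After twelve repeats the 30-day version ends on 27 December 2010, while the monthly version ends on 1 January 2011, so the two schedules gradually fall out of step with each other.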

self.repeat = 30
super().__init__(module, frequency=DateOffset(days=self.repeat))

def apply(self, population):

I think using groupby could make computing the statistics below more efficient.
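
A minimal sketch of the groupby idea, assuming the statistics are counts of living women by `ce_hpv_cc_status`. The statistics actually computed in the logging event are not shown in this thread, so both the toy data frame and the choice of a simple count are assumptions.

```python
import pandas as pd

# Toy population table; the column names mirror those used in the module,
# the values are invented.
df = pd.DataFrame({
    "is_alive": [True, True, True, False, True],
    "sex": ["F", "F", "F", "F", "M"],
    "ce_hpv_cc_status": ["none", "hpv", "hpv", "stage1", "none"],
})

# Restrict once to the population of interest...
alive_women = df[df.is_alive & (df.sex == "F")]

# ...then count every status in a single pass instead of one filter per status.
counts = alive_women.groupby("ce_hpv_cc_status").size().to_dict()
```

With a plain string column, only statuses that actually occur appear in the result; a categorical column would need extra handling to report zero counts.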

Comment on lines +1475 to +1476
# warnings.warn(UserWarning(f"Couldn't find priority ranking for TREATMENT_ID \n"
# f"{hsi_event.TREATMENT_ID}"))

I think this should be uncommented

@@ -40,7 +40,7 @@
from tlo.util import create_age_range_lookup

logger = logging.getLogger(__name__)
- logger.setLevel(logging.INFO)
+ logger.setLevel(logging.CRITICAL )
mnjowe commented Apr 2, 2024

Is there a reason for setting this to CRITICAL? If we don't want INFO logs from the Hiv module, we can configure the cervical cancer analyses to allow logging.INFO only from the cervical cancer module.
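
A minimal sketch of the principle, using the standard-library `logging` module only. The logger names and the helper `apply_levels` are assumptions made for illustration; the project configures its own logging through `configure_logging` and its `custom_levels` argument, so the real analysis script would look different.

```python
import logging


def apply_levels(levels):
    """Set each named logger to the requested level (hypothetical helper)."""
    for name, level in levels.items():
        logging.getLogger(name).setLevel(level)


# Done once in the analysis script, leaving the module source files untouched.
# The logger names below are assumed, not confirmed by this thread.
apply_levels({
    "tlo.methods.hiv": logging.WARNING,
    "tlo.methods.cervical_cancer": logging.INFO,
})

hiv_logger = logging.getLogger("tlo.methods.hiv")
cc_logger = logging.getLogger("tlo.methods.cervical_cancer")
```

Keeping the levels in the analysis script means other users of the Hiv module still get its INFO logs by default.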

@@ -20,7 +20,7 @@
from tlo.util import random_date

logger = logging.getLogger(__name__)
- logger.setLevel(logging.INFO)
+ logger.setLevel(logging.CRITICAL)

Same here: we can configure this in the cervical cancer analyses. This should be left as it was.

Suggested change
- logger.setLevel(logging.CRITICAL)
+ logger.setLevel(logging.INFO)

@@ -16,7 +16,7 @@
from tlo.progressbar import ProgressBar

logger = logging.getLogger(__name__)
- logger.setLevel(logging.INFO)
+ logger.setLevel(logging.CRITICAL)

Same here.

Suggested change
- logger.setLevel(logging.CRITICAL)
+ logger.setLevel(logging.INFO)

@@ -82,7 +82,7 @@ def __init__(self, *, start_date: Date, seed: int = None, log_config: dict = Non
self.rng = np.random.RandomState(np.random.MT19937(self._seed_seq))

def configure_logging(self, filename: str = None, directory: Union[Path, str] = "./outputs",
- custom_levels: Dict[str, int] = None, suppress_stdout: bool = False):
+ custom_levels: Dict[str, int] = None, suppress_stdout: bool = True):

I think we can also do this in the analyses file.

Suggested change
- custom_levels: Dict[str, int] = None, suppress_stdout: bool = True):
+ custom_levels: Dict[str, int] = None, suppress_stdout: bool = False):

Comment on lines +231 to +232
# print(stats_dict)


Suggested change
- # print(stats_dict)

Comment on lines +177 to +180
# todo: not sure what is wrong with this assert as I am fairly certain the intended assert is true

# assert set(sim.modules['SymptomManager'].who_has('vaginal_bleeding')).issubset(
# df.index[df.ce_cc_ever])

This test itself is fine. It is failing because of how the test test_check_progression_through_stages_is_blocked_by_treatment has been configured.

Comment on lines +360 to +368
sim.population.props.loc[population_of_interest, "ce_hpv_cc_status"] = 'stage1'

# force that they are all symptomatic
sim.modules['SymptomManager'].change_symptom(
    person_id=population_of_interest.index[population_of_interest].tolist(),
    symptom_string='vaginal_bleeding',
    add_or_remove='+',
    disease_module=sim.modules['CervicalCancer']
)

This will put all females over 15 in stage 1 with cancer symptoms, but it will not automatically mark everyone as having ever had cervical cancer in the code. Hence the check
assert set(sim.modules['SymptomManager'].who_has('vaginal_bleeding')).issubset(df.index[df.ce_cc_ever])
is likely to fail.
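
The mismatch can be sketched without the simulation. The dictionary below is a toy stand-in for the population table, and setting `ce_cc_ever` in the test set-up is only one possible way to make the two properties consistent, not necessarily the fix the authors would choose.

```python
# Toy stand-in for the population table: person_id -> properties.
# Property names mirror the module; the values are invented.
population = {
    0: {"ce_hpv_cc_status": "none", "ce_cc_ever": False},
    1: {"ce_hpv_cc_status": "hpv", "ce_cc_ever": False},
    2: {"ce_hpv_cc_status": "stage2a", "ce_cc_ever": True},
}


def ever_had_cancer(pop):
    """Return the ids of people flagged as ever having had cervical cancer."""
    return {pid for pid, props in pop.items() if props["ce_cc_ever"]}


# Mimic the test set-up: force everyone into stage1 and make them symptomatic.
symptomatic = set()
for pid, props in population.items():
    props["ce_hpv_cc_status"] = "stage1"
    symptomatic.add(pid)

# The status changed but the ce_cc_ever flag did not, so the subset check fails.
subset_holds_before = symptomatic.issubset(ever_had_cancer(population))

# One possible remedy: also set the flag when forcing the status.
for props in population.values():
    props["ce_cc_ever"] = True

subset_holds_after = symptomatic.issubset(ever_had_cancer(population))
```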
