Add in new use cases for the flags chapter #133

Open
NathanKelly-ONS opened this issue Jan 26, 2024 · 0 comments

The flags chapter (in the spark-analysis section and in raw-notebooks/flags) currently has only one use case for flags (age differences). I'd like to see some additional use cases. For example, if somebody wanted to analyse whether incidents are more common in summer than in winter, we could add "occurs_in_summer" and "occurs_in_winter" flags.

Or we could look at some of the cost variables, for example a flag for whether an incident costs more than £X.

Relatively simple flags, but I think they'd cover some of the use cases people would be looking at flags for.
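
As a rough sketch of what these two ideas could look like in PySpark: the column names `incident_date` and `total_cost`, the £500 threshold standing in for "£X", and the choice of June to August / December to February as the seasons are all placeholders I've assumed for illustration, not names or definitions taken from the chapter's data.

```python
# Assumed season definitions (meteorological seasons); adjust as needed.
SUMMER_MONTHS = (6, 7, 8)
WINTER_MONTHS = (12, 1, 2)
# Hypothetical stand-in for the "£X" threshold.
COST_THRESHOLD = 500


def add_incident_flags(df, date_col="incident_date", cost_col="total_cost"):
    """Return df with three extra boolean flag columns.

    date_col and cost_col are hypothetical column names; date_col is
    assumed to hold a date or timestamp.
    """
    # Imported here so the constants above can be reused without Spark.
    from pyspark.sql import functions as F

    month = F.month(F.col(date_col))
    return (
        df
        # True when the incident month falls in the assumed summer months.
        .withColumn("occurs_in_summer", month.isin(*SUMMER_MONTHS))
        # True when the incident month falls in the assumed winter months.
        .withColumn("occurs_in_winter", month.isin(*WINTER_MONTHS))
        # True when the cost exceeds the placeholder threshold.
        .withColumn("cost_over_threshold", F.col(cost_col) > COST_THRESHOLD)
    )
```

Rows with a missing date or cost would get a null flag rather than False here, so the chapter might want to show how to handle that explicitly. The boolean columns could also be cast to integers with `.cast("int")` if 0/1 flags are preferred.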

To give a realistic example of a use case for flags: back when I was working on the COVID infection survey, we ran a logistic regression to see whether people were more likely to be hospitalised for COVID, respiratory illnesses or cardiovascular illnesses in winter than in summer. That meant creating a set of flags: covid_in_summer_2021, covid_in_summer_2022, covid_in_winter_2021, covid_in_winter_2022 (repeated for respiratory/cardiovascular). At the time my PySpark knowledge was horrendous, so I did something incredibly inefficient (I think it was a for loop doing a bunch of groupbys and then joins), which is why I think we could really help people who are in the same situation I was in back then.
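
For the chapter, one possible alternative to the loop of groupbys and joins is to build every flag as a column expression and add them all in a single `select`. This is only a sketch under assumptions: `condition` and `admission_date` are hypothetical column names, the season months are my own choice, and "winter 2021" is taken to mean winter months falling in calendar year 2021, which may not match how the survey defined it.

```python
from itertools import product

CONDITIONS = ("covid", "respiratory", "cardiovascular")
# Assumed season definitions; the survey's real definitions may differ.
SEASON_MONTHS = {"summer": (6, 7, 8), "winter": (12, 1, 2)}
YEARS = (2021, 2022)


def flag_name(condition, season, year):
    """Build a flag column name such as 'covid_in_summer_2021'."""
    return f"{condition}_in_{season}_{year}"


def add_seasonal_flags(df, condition_col="condition", date_col="admission_date"):
    """Return df with one boolean flag per condition/season/year combination.

    condition_col and date_col are hypothetical column names.
    """
    # Imported here so flag_name can be used without a Spark installation.
    from pyspark.sql import functions as F

    month = F.month(F.col(date_col))
    year = F.year(F.col(date_col))

    flags = [
        (
            (F.col(condition_col) == condition)
            & month.isin(*SEASON_MONTHS[season])
            # Assumption: the year is the calendar year of the date itself.
            & (year == yr)
        ).alias(flag_name(condition, season, yr))
        for condition, season, yr in product(CONDITIONS, SEASON_MONTHS, YEARS)
    ]
    # All twelve flags are added in one projection: no groupbys or joins.
    return df.select("*", *flags)
```

Because each flag is just an expression over existing columns, Spark can evaluate them all in one pass over the data, which is the main contrast with the join-based approach.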
