Simulation docs #32
base: master
Codecov Report
Patch coverage has no change and project coverage change:
Additional details and impacted files
@@ Coverage Diff @@
## master #32 +/- ##
==========================================
- Coverage 83.41% 82.66% -0.75%
==========================================
Files 30 30
Lines 651 623 -28
==========================================
- Hits 543 515 -28
Misses 108 108
@JarnoRFB can you please look over this PR :)
Sorry for neglecting this for so long. Please address the comments that make sense to you, otherwise just go ahead and merge 👍
docs/3_data_simulation.rst
Outdated
------------------
An endemic timeseries is an timeseries of case counts (as defined in :ref:`outbreak-detection-formalization`.) that occurs naturally and varies between diseases, time, space, sex, age, and other dimensions. The important distinction is that there is no influence in the observed case counts due to an outbreak event.

A simulation is usually a time-dependent, linear model that produces realistic case counts and can be set to mimic different types of disease dynamics. To achieve realism in epidemiological simulations, you would usually incorporate different effects into that model such as seasonality and trend e.g., a repetition of a certain pattern after one year with increasing case numbers over time. Finally, some distribution is used to make the outcome of the model non-deterministic. Since case counts are whole numbers, a Poisson or negative binomial distribution is used on top of the linear model to introduce some randomness in the observed case counts. These kinds of algorithms are referred to as ``seasonal_noise``.
A simulation is usually a time-dependent, linear model that produces realistic case counts and can be set to mimic different types of disease dynamics. To achieve realism in epidemiological simulations, you would usually incorporate different effects into that model such as seasonality and trend e.g., a repetition of a certain pattern after one year with increasing case numbers over time. Finally, some distribution is used to make the outcome of the model non-deterministic. Since case counts are whole numbers, a Poisson or negative binomial distribution is used on top of the linear model to introduce some randomness in the observed case counts. These kinds of algorithms are referred to as ``seasonal_noise``.
A simulation is usually a time-dependent, linear model that produces realistic case counts and can be set to mimic different types of disease dynamics. To achieve realism in epidemiological simulations, you would usually incorporate different effects into that model such as seasonality and trend e.g., a repetition of a certain pattern after one year with increasing case numbers over time. Finally, some distribution is used to make the outcome of the model non-deterministic. Since case counts are positive integers, a Poisson or negative binomial distribution is used on top of the linear model to introduce some randomness in the observed case counts. These kinds of algorithms are referred to as ``seasonal_noise``.
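For readers of this thread, the seasonal-noise idea in the paragraph under review could be sketched roughly as below. This is only an illustration, not epysurv's implementation: the function name `simulate_endemic`, all parameter names and defaults, and the choice of a log link with a single yearly sine term are assumptions made for this example.

```python
import numpy as np


def simulate_endemic(n_weeks, baseline=1.5, trend=0.002, amplitude=0.6,
                     period=52, seed=0):
    """Hypothetical helper (not part of epysurv): weekly endemic case counts."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_weeks)
    # Linear predictor: intercept + linear trend + one seasonal harmonic.
    # Assumption: the predictor lives on the log scale so the mean stays positive.
    log_mu = baseline + trend * t + amplitude * np.sin(2 * np.pi * t / period)
    mu = np.exp(log_mu)
    # Poisson noise on top of the deterministic mean gives integer counts
    # (zero is a possible draw).
    return rng.poisson(mu)


counts = simulate_endemic(104)  # two years of weekly counts
```

Swapping `rng.poisson` for `rng.negative_binomial` with a suitable parametrization would add overdispersion, which is the second distribution the paragraph mentions.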
docs/3_data_simulation.rst
Outdated
Epidemic Timeseries
-------------------
Once we have created a model to simulate an endemic timeseries, we can introduce outbreak events by randomly increasing the case count at certain timepoints :math:`t`. A common approach is to have a chain of switching states (usually produces by a Markov chain) that use senseful transition probabilities to move into or leave the state of an outbreak. Alternatively, one can assign certain timepoints to be in an outbreak to make such timeseries more comparable or tests edge cases of outbreak detection algorithms.
Once we have created a model to simulate an endemic timeseries, we can introduce outbreak events by randomly increasing the case count at certain timepoints :math:`t`. A common approach is to have a chain of switching states (usually produces by a Markov chain) that use senseful transition probabilities to move into or leave the state of an outbreak. Alternatively, one can assign certain timepoints to be in an outbreak to make such timeseries more comparable or tests edge cases of outbreak detection algorithms.
Once we have created a model to simulate an endemic timeseries, we can introduce outbreak events by randomly increasing the case count at certain timepoints :math:`t`. A common approach is to have a chain of switching states (usually produces by a Markov chain) that use sensible transition probabilities to move into or leave the state of an outbreak. Alternatively, one can assign certain timepoints to be in an outbreak to make such timeseries more comparable or tests edge cases of outbreak detection algorithms.
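The state-switching approach described in that paragraph might look like the following two-state Markov chain. Again, this is a sketch under stated assumptions rather than the module's code: `simulate_outbreak_states`, `add_outbreaks`, the transition probabilities, and the Poisson-distributed excess cases are all hypothetical choices for illustration.

```python
import numpy as np


def simulate_outbreak_states(n_steps, p_enter=0.05, p_leave=0.3, seed=0):
    """Hypothetical helper: boolean outbreak indicator per timepoint."""
    rng = np.random.default_rng(seed)
    states = np.zeros(n_steps, dtype=bool)  # start outside an outbreak
    for t in range(1, n_steps):
        if states[t - 1]:
            # In an outbreak: stay unless we leave with probability p_leave.
            states[t] = rng.random() >= p_leave
        else:
            # Not in an outbreak: enter with probability p_enter.
            states[t] = rng.random() < p_enter
    return states


def add_outbreaks(endemic_counts, states, mean_extra=5.0, seed=1):
    """Hypothetical helper: add excess cases only at outbreak timepoints."""
    rng = np.random.default_rng(seed)
    extra = rng.poisson(mean_extra, size=len(endemic_counts)) * states
    return np.asarray(endemic_counts) + extra


states = simulate_outbreak_states(104)
epidemic = add_outbreaks(np.full(104, 4), states)
```

Passing a hand-written boolean array as `states` instead of the simulated one corresponds to the alternative mentioned in the text, where fixed timepoints are assigned to an outbreak.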
@@ -7,7 +7,7 @@
# -- Project information -----------------------------------------------------

project = "epysurv"
-copyright = "2019, Rüdiger Busche"
+copyright = "2020, Rüdiger Busche"
I think you should add yourself to the authors ;)
docs/index.rst
Outdated
@@ -20,7 +20,8 @@ This package was originally developed at the Robert Koch Institute in the `Signa

1_quickstart.rst
2_outbreak_detection.rst
3_user_guide.rst
3_data_simulation.rst
I think there could be a better name for this, as we are not simulating the data, but the "epidemic", "outbreak", ... . Maybe "Simulating epidemiological dynamics" or just "Simulation"? Is there maybe an accepted term for this in the community?
docs/3_data_simulation.rst
Outdated
@@ -0,0 +1,15 @@
Data Simulations
==================
Usually, public-health-related data can not be disclosed publicly which makes the development and systematic comparison of outbreak detection algorithms hard. Luckily, using simulations of hypothetical disease spreads is a valid alternative to real data. Thus, epysurv includes a module that allows you to simulate simple, univariate endemic and epidemic timeseries, i.e., with and without outbreak.
Is there maybe a review paper that backs some of the claims made here that we could reference, as in the outbreak detection section?
Based on comments from JarnoRFB in #32, I have updated a few phrases and rewritten the introduction, as its claims were not sufficiently backed up. Looking at the literature, I found that simulations are also beneficial because the outbreak properties are known, which in certain situations is what makes evaluation possible in the first place.
- Flake8 has moved to GitHub: see https://pre-commit.com/hooks.html
- Changed exclude since it is regex and not glob: pre-commit/pre-commit#1702
- pre-commit assumes immutable ref. Thus, this commit changes black's ref to a fixed version number. Using an unpinned latest version is not encouraged.
The latest major version of sphinx-bibtex is incompatible with this library's sphinx version. This commit restricts the version number: executablebooks/jupyter-book#1137 (comment)
The latest release of jinja2 was not compatible with this library's sphinx version. This commit restricts the version number: readthedocs/readthedocs.org#9038 (comment)
isort is used in a pre-commit hook but is not listed in requirements-dev.txt. This commit adds it so that it is available during development.
Building the docs with a Jupyter notebook in them requires pandoc. The README is updated to include an installation instruction. Using conda is suggested in this Stack Overflow question: https://stackoverflow.com/questions/62398231/building-docs-fails-due-to-missing-pandoc
test_simulations.py was throwing dtype errors because n_cases was sometimes int_32 and sometimes int_64. With this commit, the data type is cast so that the unit test passes.
Imports were not sorted correctly according to isort, which is now fixed.
The docstring of simulate was misspelled and was missing an "s". This commit fixes it.
Add docs for simulation module and change copyright year to make docs up to date