Simulation docs #32

Open
wants to merge 11 commits into
base: master

@aauss (Collaborator) commented Oct 6, 2020

Add docs for simulation module and change copyright year to make docs up to date

codecov bot commented Oct 7, 2020

Codecov Report

Patch coverage has no change and project coverage change: -0.75 ⚠️

Comparison is base (7ea780a) 83.41% compared to head (5bc5142) 82.66%.

❗ Current head 5bc5142 differs from pull request most recent head 06b7f30. Consider uploading reports for the commit 06b7f30 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #32      +/-   ##
==========================================
- Coverage   83.41%   82.66%   -0.75%     
==========================================
  Files          30       30              
  Lines         651      623      -28     
==========================================
- Hits          543      515      -28     
  Misses        108      108              
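
The percentages in the table follow directly from the hits and lines columns. A minimal sketch of that arithmetic (the helper name `coverage_percent` is made up for illustration and is not part of Codecov):

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage, rounded to two decimals."""
    return round(100 * hits / lines, 2)


# Numbers taken from the coverage diff above.
base = coverage_percent(hits=543, lines=651)  # master (7ea780a)
head = coverage_percent(hits=515, lines=623)  # PR head (5bc5142)
delta = round(head - base, 2)

print(base, head, delta)
```

Misses stay at 108 on both sides, so the drop comes from 28 covered lines leaving the code base rather than from new uncovered lines.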

see 5 files with indirect coverage changes


@aauss (Collaborator, Author) commented Nov 18, 2020

@JarnoRFB can you please look over this PR :)

@JarnoRFB (Owner) left a comment

Sorry for neglecting this for so long. Please address the comments that make sense to you, otherwise just go ahead and merge 👍

------------------
An endemic timeseries is a timeseries of case counts (as defined in :ref:`outbreak-detection-formalization`) that occurs naturally and varies between diseases, time, space, sex, age, and other dimensions. The important distinction is that the observed case counts are not influenced by an outbreak event.

A simulation is usually a time-dependent, linear model that produces realistic case counts and can be set to mimic different types of disease dynamics. To achieve realism in epidemiological simulations, you would usually incorporate different effects into that model such as seasonality and trend e.g., a repetition of a certain pattern after one year with increasing case numbers over time. Finally, some distribution is used to make the outcome of the model non-deterministic. Since case counts are whole numbers, a Poisson or negative binomial distribution is used on top of the linear model to introduce some randomness in the observed case counts. These kinds of algorithms are referred to as ``seasonal_noise``.
Owner

Suggested change
- A simulation is usually a time-dependent, linear model that produces realistic case counts and can be set to mimic different types of disease dynamics. To achieve realism in epidemiological simulations, you would usually incorporate different effects into that model such as seasonality and trend e.g., a repetition of a certain pattern after one year with increasing case numbers over time. Finally, some distribution is used to make the outcome of the model non-deterministic. Since case counts are whole numbers, a Poisson or negative binomial distribution is used on top of the linear model to introduce some randomness in the observed case counts. These kinds of algorithms are referred to as ``seasonal_noise``.
+ A simulation is usually a time-dependent, linear model that produces realistic case counts and can be set to mimic different types of disease dynamics. To achieve realism in epidemiological simulations, you would usually incorporate different effects into that model such as seasonality and trend e.g., a repetition of a certain pattern after one year with increasing case numbers over time. Finally, some distribution is used to make the outcome of the model non-deterministic. Since case counts are positive integers, a Poisson or negative binomial distribution is used on top of the linear model to introduce some randomness in the observed case counts. These kinds of algorithms are referred to as ``seasonal_noise``.
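
For readers of this thread, the kind of model the paragraph describes can be sketched in a few lines. This is an illustration under stated assumptions, not epysurv's implementation: the function names, the log link, the single yearly sine term, and all parameter values are made up here, and only the Python standard library is used.

```python
import math
import random


def poisson_sample(rng: random.Random, mean: float) -> int:
    """Draw one Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def expected_cases(week, intercept=1.5, trend=0.002, amplitude=0.6, period=52):
    """Hypothetical endemic mean: baseline + linear trend + one yearly harmonic.

    The linear predictor is exponentiated (log link) so the mean stays positive.
    """
    season = amplitude * math.sin(2 * math.pi * week / period)
    return math.exp(intercept + trend * week + season)


def simulate_endemic(n_weeks, seed=0):
    """Return weekly case counts with Poisson noise around the seasonal mean."""
    rng = random.Random(seed)
    return [poisson_sample(rng, expected_cases(week)) for week in range(n_weeks)]


cases = simulate_endemic(n_weeks=104)
print(cases[:10])
```

Swapping the Poisson draw for a negative binomial draw would allow overdispersed counts; the deterministic part of the sketch would stay the same.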


Epidemic Timeseries
-------------------
Once we have created a model to simulate an endemic timeseries, we can introduce outbreak events by randomly increasing the case count at certain timepoints :math:`t`. A common approach is to have a chain of switching states (usually produces by a Markov chain) that use senseful transition probabilities to move into or leave the state of an outbreak. Alternatively, one can assign certain timepoints to be in an outbreak to make such timeseries more comparable or tests edge cases of outbreak detection algorithms.
Owner

Suggested change
- Once we have created a model to simulate an endemic timeseries, we can introduce outbreak events by randomly increasing the case count at certain timepoints :math:`t`. A common approach is to have a chain of switching states (usually produces by a Markov chain) that use senseful transition probabilities to move into or leave the state of an outbreak. Alternatively, one can assign certain timepoints to be in an outbreak to make such timeseries more comparable or tests edge cases of outbreak detection algorithms.
+ Once we have created a model to simulate an endemic timeseries, we can introduce outbreak events by randomly increasing the case count at certain timepoints :math:`t`. A common approach is to have a chain of switching states (usually produces by a Markov chain) that use sensible transition probabilities to move into or leave the state of an outbreak. Alternatively, one can assign certain timepoints to be in an outbreak to make such timeseries more comparable or tests edge cases of outbreak detection algorithms.
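
The switching-state idea from this paragraph can be sketched as a two-state Markov chain. Everything below is hypothetical and only meant to illustrate the mechanism: the function names, the transition probabilities `p_start` and `p_end`, and the uniform number of extra cases are assumptions, not epysurv's API or defaults.

```python
import random


def simulate_outbreak_states(n_steps, p_start=0.05, p_end=0.4, seed=0):
    """Two-state Markov chain: False = endemic state, True = outbreak state."""
    rng = random.Random(seed)
    states, in_outbreak = [], False
    for _ in range(n_steps):
        if in_outbreak:
            # Stay in the outbreak with probability 1 - p_end.
            in_outbreak = rng.random() >= p_end
        else:
            # Enter an outbreak with probability p_start.
            in_outbreak = rng.random() < p_start
        states.append(in_outbreak)
    return states


def add_outbreak_cases(endemic_cases, states, max_extra=10, seed=1):
    """Add between 1 and max_extra extra cases at every outbreak timepoint."""
    rng = random.Random(seed)
    return [
        count + (rng.randint(1, max_extra) if in_outbreak else 0)
        for count, in_outbreak in zip(endemic_cases, states)
    ]


endemic = [5] * 52  # placeholder endemic series with a constant case count
states = simulate_outbreak_states(len(endemic))
epidemic = add_outbreak_cases(endemic, states)
print(sum(states), "outbreak weeks")
```

The alternative mentioned in the paragraph, assigning outbreak timepoints by hand, corresponds to passing a hand-written `states` list to `add_outbreak_cases` instead of a simulated one.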

@@ -7,7 +7,7 @@
# -- Project information -----------------------------------------------------

project = "epysurv"
- copyright = "2019, Rüdiger Busche"
+ copyright = "2020, Rüdiger Busche"
Owner

I think you should add yourself to the authors ;)

docs/index.rst Outdated
@@ -20,7 +20,8 @@ This package was originally developed at the Robert Koch Institute in the `Signa

1_quickstart.rst
2_outbreak_detection.rst
3_user_guide.rst
3_data_simulation.rst
Owner

I think there could be a better name for this, as we are not simulating the data, but the "epidemic", "outbreak", ... . Maybe "Simulating epidemiological dynamics" or just "Simulation"? Is there maybe an accepted term for this in the community?

@@ -0,0 +1,15 @@
Data Simulations
==================
Usually, public-health-related data cannot be disclosed publicly, which makes the development and systematic comparison of outbreak detection algorithms hard. Luckily, simulations of hypothetical disease spread are a valid alternative to real data. Thus, epysurv includes a module that allows you to simulate simple, univariate endemic and epidemic timeseries, i.e., without and with outbreaks.
Owner

Is there maybe a review paper that backs some of the claims made here that we could reference, as in the outbreak detection section?

Based on comments from JarnoRFB in #32, I have updated a few
phrases and rewritten the introduction, as it was too unfounded.
Looking at the literature, I found that simulations are also
beneficial because their outbreak properties are known, which makes
evaluation possible in the first place in certain situations.

- Flake8 has moved to GitHub: see https://pre-commit.com/hooks.html
- Changed exclude since it is a regex and not a glob:
  pre-commit/pre-commit#1702
- pre-commit assumes an immutable ref. Thus, this commit changes
  black's ref to a fixed version number. Using an unpinned latest version
  is not encouraged.

A major version of sphinx-bibtex is incompatible with this library's
sphinx version. This commit restricts the version number:
executablebooks/jupyter-book#1137 (comment)

The latest release of jinja2 was not compatible with
this library's sphinx version. This commit restricts the version number:
readthedocs/readthedocs.org#9038 (comment)

isort is used in pre-commit hook but does not come with the
requirements-dev.txt. This commit adds it to have it available during
development.

Building the docs with a Jupyter notebook in it requires pandoc. The
README is updated to include an installation instruction. Using
conda is suggested in this stackoverflow question:
https://stackoverflow.com/questions/62398231/building-docs-fails-due-to-missing-pandoc

test_simulations.py was throwing dtype errors since sometimes, n_cases
was int_32 and sometimes int_64. With this commit, the data type is cast
to pass the unit test.

Imports were not sorted correctly according to isort which was fixed.

The docstring of simulate contained a typo: it was missing an "s". This
commit fixes it.