Data license? #300

Open
chartgerink opened this issue May 16, 2024 · 1 comment
@chartgerink
Member

chartgerink commented May 16, 2024

Currently epiparameter provides original data, but that data does not carry a license. The MIT license covers the code, but applying it to data is generally not recommended, as it is not intended for that (just like CC licenses are not intended for code).

I don't want to discuss whether data can be copyrighted to begin with (I can if desired 😊); I do think that not having a license makes it less certain how the data can be freely reused.

I would recommend using a separate license for the data, specifically a Public Domain Dedication. This is maximally permissive and also the standard for (meta)data that is meant to be reused and recompiled. CC BY becomes problematic when compiling databases, as the record of the origins of all data points quickly outgrows the actual data (especially problematic in larger databases).

I also mentioned this in the Collaboratory forum for the GREP database: https://collab-forum.who.int/t/epiparameter-hackathon-discussion-thread/186/2?u=chartgerink

@joshwlambert joshwlambert added this to the v0.2.0 milestone May 23, 2024
@joshwlambert
Member

@chartgerink and I met to discuss the best approach for adding a data license to the {epiparameter} R package. Here I will summarise our discussion and the key points.

  • The {epiparameter} R package requires a data license. This is currently lacking and, given the reasons @chartgerink outlines in the original issue description, this should be rectified.
  • We should choose an open and permissive license to allow others to freely use and adapt the data that we distribute with the package. The leading contender is CC0, as this places the database in the public domain.
  • Most epidemiological parameters extracted from published work can be treated as individual data points/facts and are thus free to compile into a database. In other words, the extraction of parameters from the literature can happen independently of the license of the publication. However, parameters extracted from pre-compiled sources/databases need to be evaluated based on the original license (e.g., providing correct attribution to their sources for CC BY 4.0). We propose to provide this attribution on the {epiparameter} pkgdown website for all the sources that require it. (We need to check whether any of the parameters currently in the database fall into this category.)

Projects
Status: In Progress

2 participants