The project PRost is part of the Methodological Network initiative on user interfaces to the Eurostat online database.
Quick start
Launch a notebook running both R and Python, with packages already installed to access the Eurostat database!
Examples
Run your own script in a notebook, as in the examples below:
- a quick and dirty notebook reproducing the tutorial for the R eurostat package,
- an empty R test notebook with the eurostat, rdbnomics, restatapi and TSsdmx packages to retrieve data from the Eurostat database,
- a notebook to test the flagr package,
- ...
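As a rough sketch of the kind of script one might run in a Python cell of such a notebook, the snippet below only builds a request URL for the Eurostat REST API; it does not send the request. The base URL, the helper name `build_query_url`, and the example dataset code and filter values are assumptions made for illustration and should be checked against the Eurostat web-services documentation.

```python
from urllib.parse import urlencode

# Assumed base URL of the Eurostat dissemination API; verify it against the
# official web-services documentation before relying on it.
BASE_URL = "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data"


def build_query_url(dataset, filters=None, fmt="JSON", lang="EN"):
    """Return a request URL for one dataset, with optional dimension filters.

    This helper is hypothetical (not part of any package listed above).
    """
    params = [("format", fmt), ("lang", lang)]
    for dimension, codes in sorted((filters or {}).items()):
        # A dimension may be filtered on several codes: repeat the parameter.
        if isinstance(codes, str):
            codes = [codes]
        params.extend((dimension, code) for code in codes)
    return f"{BASE_URL}/{dataset}?{urlencode(params)}"


# Illustrative query only: dataset code and filter values are examples.
url = build_query_url("nama_10_gdp", {"geo": ["DE", "FR"], "time": "2020"})
print(url)
```

The resulting string can then be passed to whatever HTTP client is available in the notebook image.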
Notes
This contribution advocates widening the use of Open Source Software (OSS), "beyond just R", to:
- support new modes for production of official statistics,
- create new ways to share official statistics in a constantly evolving data ecosystem.
While R is currently the leading OSS within the statistical community, and the most widespread in statistical organisations, it is believed that one should not focus on an isolated OSS. Instead, it should be possible to implement statistical methods in whichever OSS fits best and to integrate them seamlessly into the statistical production system.
Today's technological solutions, such as flexible APIs (e.g., the Eurostat REST API), interactive notebooks (e.g., Jupyter notebook) and virtualised containers (e.g., docker), can support an approach where algorithms are delivered as portable, scalable, harmonised and encapsulated services, regardless of the software used.
The notebooks run on the binder platform, which automatically turns the Dockerfile in this repository into an interactive notebook. The current Dockerfile is an extension of the Jupyter Data Science Stack.
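To illustrate how a repository is addressed on binder, here is a minimal sketch that assembles a launch URL. It assumes the public mybinder.org instance and its `/v2/gh/<owner>/<repo>/<ref>` URL scheme, with the `filepath` query parameter used to open a given notebook. The function name and the owner, repository and notebook values are placeholders, not this project's actual coordinates.

```python
from urllib.parse import quote, urlencode


def binder_launch_url(owner, repo, ref="master", notebook=None,
                      host="https://mybinder.org"):
    """Build a launch URL for a GitHub repository on a BinderHub instance.

    Hypothetical helper; the URL scheme is an assumption based on the
    public mybinder.org service.
    """
    url = f"{host}/v2/gh/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    if notebook:
        # Ask binder to open this file once the image has been built.
        url += "?" + urlencode({"filepath": notebook})
    return url


# Placeholder coordinates, for illustration only.
print(binder_launch_url("some-org", "some-repo",
                        notebook="notebooks/demo.ipynb"))
```

Opening such a URL triggers the image build from the repository's Dockerfile (if not already cached) and then starts the notebook server.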
About
status | since 2018 – on-going |
contributors | |
license | EUPL |
- EU open data initiatives: pan-European public data infrastructure.
- Eurostat database: online catalog and bulk download facility.
- Eurostat web-services: access to JSON and unicode data, the REST API with its query builder.
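The JSON returned by the web-services follows, as far as we understand, the JSON-stat layout: dimension names in `id`, category positions under `dimension`, and observations in `value`, indexed in row-major order. The sketch below flattens such a payload into records. The function name is hypothetical and the toy payload is invented for illustration; its numbers are not real Eurostat figures.

```python
from itertools import product


def jsonstat_to_records(dataset):
    """Flatten a JSON-stat style dataset into one dict per observation."""
    dims = dataset["id"]
    # For each dimension, list its category codes ordered by position.
    codes = []
    for dim in dims:
        index = dataset["dimension"][dim]["category"]["index"]
        if isinstance(index, dict):
            index = sorted(index, key=index.get)
        codes.append(list(index))
    values = dataset["value"]
    records = []
    # Values are stored in row-major order: the last dimension varies fastest,
    # which is also the order in which itertools.product iterates.
    for flat, combo in enumerate(product(*codes)):
        if isinstance(values, dict):
            value = values.get(str(flat))  # sparse form, keyed by position
        else:
            value = values[flat]           # dense form, plain list
        if value is not None:
            records.append({**dict(zip(dims, combo)), "value": value})
    return records


# Toy payload with made-up values (one observation deliberately missing).
toy = {
    "id": ["geo", "time"],
    "size": [2, 2],
    "dimension": {
        "geo": {"category": {"index": {"DE": 0, "FR": 1}}},
        "time": {"category": {"index": {"2019": 0, "2020": 1}}},
    },
    "value": {"0": 1.0, "1": 2.0, "3": 4.0},
}
records = jsonstat_to_records(toy)
for record in records:
    print(record)
```

A list of flat records like this can be loaded directly into a data frame in either Python or R.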
Software resources and services
- R package eurostat to access open data from Eurostat.
- Jupyter notebook docker stack, in particular the R stack and the Data Science stack. Note also the list of existing images, get started and how-to.
- Binder environment to run Jupyter notebooks. See the how-to.
- A cool notebook showing how to represent Eurostat NUTS data over a map using the Python package eurostat-api-client.
- Boettiger C. and Eddelbuettel D. (2018): An introduction to Rocker: Docker containers for R, The R Journal, 9(2):527-536.
- Grazzini J., Museux J.-M. and Hahn M. (2018): Empowering and interacting with statistical produsers: A practical example with Eurostat data as a service, in Proc. Conference of European Statistics Stakeholders, doi:10.5281/zenodo.3240557.
- Beaulieu-Jones B.K. and Greene C.S. (2017): Reproducibility of computational workflows is automated using continuous analysis, Nature Biotechnology, 35:342–346, doi:10.1038/nbt.3780.
- Lahti L., Huovari J., Kainu M., and Biecek P. (2017): Retrieval and analysis of Eurostat open data with the eurostat package, The R Journal, 9(1):385-392.
- Marwick B., Boettiger C., and Mullen L. (2017): Packaging data analytical work reproducibly using R (and friends), The American Statistician, doi:10.1080/00031305.2017.1375986.
- Piccolo S.R. and Frampton M.B. (2016): Tools and techniques for computational reproducibility, Gigascience, 5(1):30, doi:10.1186/s13742-016-0135-4.
- Boettiger C. (2015): An introduction to Docker for reproducible research, ACM SIGOPS Operating Systems Review, Special Issue on Repeatability and Sharing of Experimental Artifacts, 49(1):71-79, doi:10.1145/2723872.2723882.
- How to Dockerize an R Shiny App.
- Generating Dockerfiles for reproducible research with R.
- Dockerfile basics and best practices.