Global Publishing Output from Corresponding Authors 2014 - 2018

subugoe/oa2020cadata

Research compendium for a dataset about corresponding author country affiliations indexed in the Web of Science 2014 - 2018

Lifecycle: experimental

Overview

This repository provides a dataset about corresponding author country affiliations indexed in the Web of Science 2014 - 2018. The data were obtained from the in-house database of the German Competence Center for Bibliometrics.

To start with, read the Data Descriptor, an overview of the dataset and analytical work.

This repository is organized as a research compendium. A research compendium contains data, code, and text associated with it. The R Markdown files in the analysis/ directory provide details about the data analysis, particularly about how the Web of Science in-house database from the German Competence Center for Bibliometrics was interfaced, and the Data Descriptor. The data/ directory contains all aggregated data. Because of the proprietary nature of the Web of Science, no raw data and no access to the in-house database can be shared.

Analysis files

The analysis/ directory contains reports written in R Markdown, including the Data Descriptor.

The analytical steps for obtaining the data from the Web of Science in-house database maintained by the German Competence Center for Bibliometrics (WoS-KB), and for enriching the data, are also provided as R Markdown reports. These are the files starting with 00 in the analysis/ folder.

Data files

The data/ directory contains the resulting datasets stored as comma-separated value files.
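As a minimal sketch of how the aggregated data could be loaded, the following reads every CSV file found in data/ into a named list of data frames. The helper name read_compendium_data is hypothetical and not part of this compendium, and no specific file names are assumed.

```r
# Hypothetical helper (not part of the compendium): read all CSV files
# from a directory into a named list of data frames.
read_compendium_data <- function(data_dir = "data") {
  csv_files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
  datasets <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)
  # Name each list element after its file, without the .csv extension
  names(datasets) <- tools::file_path_sans_ext(basename(csv_files))
  datasets
}

# Run from the root of the cloned repository
datasets <- read_compendium_data("data")
str(datasets, max.level = 1)
```

Each element of the list can then be inspected with head() or summary() before further analysis.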

Reproducibility notes

This repository follows the concept of a research compendium that uses the R package structure to port data and code.

Because access to the data infrastructure of the German Competence Center for Bibliometrics (WoS-KB) is restricted, there are different levels of reproducibility. Everyone will be able to reproduce the analysis in the data descriptor, the main document of this research compendium written in R Markdown. Users with access to the WoS-KB data infrastructure will also be able to replicate the R code and SQL queries locally, or on the script server.

Data descriptor

Clone the GitHub repository with all data and code.

git clone https://github.com/subugoe/oa2020cadata.git

Open an R session in the directory of this package and install the R package dependencies using a package snapshot from the date this package was built:

devtools::install_deps(repos = list(CRAN = 'http://mran.revolutionanalytics.com/snapshot/2019-09-08/'))

To replicate the data descriptor:

rmarkdown::render("analysis/paper.Rmd")

Binder

The project was made Binder-ready using the holepunch package. Binder lets you run the data descriptor in the cloud from your web browser.

Users with access to the German Competence Center for Bibliometrics data infrastructure

If you have access to the German Competence Center for Bibliometrics data infrastructure, you can replicate how the data were obtained from the Web of Science. These steps are described in the R Markdown documents whose names start with 00 in the analysis/ folder.

To get started, follow the steps described above to download the research compendium including the necessary R packages. Next, add your database login credentials to your .Renviron file and save it. You can open your .Renviron file from R with usethis::edit_r_environ().

kb_user="najko"
kb_pwd="12345"

Reload your R session.
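After reloading, it can help to confirm that R actually picked up the credentials. The check below is a minimal sketch: the helper name missing_credentials is hypothetical, and it only tests that the two variables named above (kb_user and kb_pwd) are non-empty, not that they are valid logins.

```r
# Hypothetical helper: return the names of environment variables that
# are unset or empty in the current R session.
missing_credentials <- function(vars = c("kb_user", "kb_pwd")) {
  values <- Sys.getenv(vars, unset = "", names = FALSE)
  vars[!nzchar(values)]
}

missing <- missing_credentials()
if (length(missing) > 0) {
  message("Not set in .Renviron: ", paste(missing, collapse = ", "))
} else {
  message("Database credentials found.")
}
```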

The Oracle database driver needed to access the remote database is included in this repository.

To replicate the R Markdown documents, call rmarkdown::render().

When working on the script-server, please note that you need to render the documents with knitr::knit(), because pandoc is not available.
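The choice between the two rendering functions can be sketched as follows. This is an illustration only: the wrapper render_report is hypothetical, it assumes the rmarkdown package is installed even where pandoc is not, and it assumes the reports are the .Rmd files starting with 00 in analysis/.

```r
# Hypothetical wrapper: use rmarkdown::render() when pandoc is available,
# otherwise fall back to knitr::knit().
render_report <- function(path, use_pandoc = rmarkdown::pandoc_available()) {
  if (use_pandoc) {
    rmarkdown::render(path)
  } else {
    knitr::knit(path)
  }
}

# Assumed naming pattern: R Markdown files starting with 00 in analysis/
reports <- list.files("analysis", pattern = "^00.*\\.Rmd$", full.names = TRUE)
for (report in reports) {
  render_report(report)
}
```

Note that knitr::knit() produces a Markdown (.md) file rather than a final HTML or PDF document, since that conversion step is the part that needs pandoc.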

Limitations

A Docker container with all source code and data needed to reproduce this research compendium would reduce the set-up effort described above. Unfortunately, access to the German Competence Center for Bibliometrics requires a VPN tunnel that, at least in my local setup, is not accessible from within the Docker container. Furthermore, Docker is not available for users on the script-server.

License

Re-used data terms:

This work uses Web of Science data by Clarivate Analytics provided by the German Competence Center for Bibliometrics for research purposes.

Crossref asserts no claims of ownership to individual items of bibliographic metadata and associated Digital Object Identifiers (DOIs) acquired through the use of the Crossref Free Services. Individual items of bibliographic metadata and associated DOIs may be cached and incorporated into the user's content and systems.

ISSN-Matching of Gold OA Journals (ISSN-GOLD-OA) 3.0 and Country Geocodes obtained from Google are made available under CC-BY.

The documentation and data descriptor including the figures are made available under CC-BY 4.0.

Source Code: MIT (Najko Jahn, 2019)

Contributors

in alphabetical order

  • Design and Conceptualization: Colleen Campbell, Kai Geschuhn, Najko Jahn
  • Data Analysis: Najko Jahn
  • Writing and Review: Colleen Campbell, Anne Hobert, Najko Jahn, Birgit Schmidt, Niels Taubert

Contributing

This data analytics work has been developed using open tools. There are a number of ways you can help make it better:

  • If you don’t understand something, please let me know and submit an issue.
  • Feel free to add new features or fix bugs by sending a pull request.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Acknowledgment

This work is supported by the Federal Ministry of Education and Research of Germany (BMBF) in the framework Quantitative Research on the Science Sector (Project: "OAUNI Entwicklung und Einflussfaktoren des Open-Access-Publizierens an Universitäten in Deutschland", Förderkennzeichen: 01PU17023A).

Contact

Najko Jahn, Data Analyst, SUB Göttingen. najko.jahn@sub.uni-goettingen.de