Commit

Merge branch 'dev'
ablaette committed Jul 28, 2020
2 parents 914e425 + 14b3567 commit e907ada
Showing 56 changed files with 5,844 additions and 1,706 deletions.
3 changes: 0 additions & 3 deletions .travis.yml
@@ -5,9 +5,6 @@ matrix:
include:
- os: linux
dist: trusty
- os: osx
osx_image: xcode9.1
brew_packages: pkg-config glib pcre gsl

addons:
apt:
2 changes: 0 additions & 2 deletions CRAN-RELEASE

This file was deleted.

31 changes: 14 additions & 17 deletions DESCRIPTION
@@ -1,29 +1,27 @@
Package: GermaParl
Type: Package
Title: Download and Augment the Corpus of Plenary Protocols of the German Bundestag
Version: 1.3.0
Date: 2020-04-04
Authors@R: c(person(given = "Andreas", family = "Blaette", role = c("aut", "cre"), email = "andreas.blaette@uni-due.de"))
Version: 1.5.1
Date: 2020-07-27
Authors@R: c(
person(given = "Andreas", family = "Blaette", role = c("aut", "cre"), email = "andreas.blaette@uni-due.de"),
person("Christoph", "Leonhardt", role = "ctb")
)
Depends:
R (>= 3.5.0)
Imports:
polmineR,
cwbtools (>= 0.1.2),
data.table,
RCurl,
topicmodels,
methods,
jsonlite,
RcppCWB
cwbtools (>= 0.3.0),
zen4R
Suggests:
topicmodels,
knitr,
rmarkdown,
testthat
LazyData: yes
Description: Data package with the 'GermaParl' Corpus of Parliamentary Debates (German Bundestag)
prepared in the 'PolMine Project'. The package includes a small subset of the corpus as a demo
and for testing purposes. The package includes functionality to load the full corpus from the
open science 'Zenodo' repository and some auxiliary functions to enhance the corpus.
Description: Data package to disseminate the 'GermaParl' corpus of parliamentary debates of
the German Bundestag prepared in the 'PolMine Project'. The package includes a small subset of
the corpus for demonstration and testing purposes. The package includes functionality to
download the full corpus and supplementary data from the open science repository 'Zenodo'.
URL: https://github.com/polmine/GermaParl
BugReports: https://github.com/polmine/GermaParl/issues
License: GPL-3
@@ -33,5 +31,4 @@ Collate:
'GermaParl.R'
'download.R'
'lda.R'
'speeches.R'
RoxygenNote: 7.0.2
RoxygenNote: 7.1.1
29 changes: 4 additions & 25 deletions NAMESPACE
@@ -1,35 +1,14 @@
# Generated by roxygen2: do not edit by hand

export(germaparl_add_s_attribute_speech)
export(germaparl_download_corpus)
export(germaparl_download_lda)
export(germaparl_encode_lda_topics)
export(germaparl_get_doi)
export(germaparl_get_version)
export(germaparl_is_installed)
export(germaparl_load_topicmodel)
importFrom(RCurl,getURL)
importFrom(RCurl,url.exists)
importFrom(RcppCWB,cqp_get_registry)
importFrom(RcppCWB,cqp_is_initialized)
export(germaparl_load_lda)
importFrom(cwbtools,corpus_install)
importFrom(cwbtools,cwb_corpus_dir)
importFrom(cwbtools,cwb_registry_dir)
importFrom(cwbtools,registry_file_parse)
importFrom(cwbtools,registry_file_write)
importFrom(cwbtools,s_attribute_encode)
importFrom(data.table,":=")
importFrom(data.table,as.data.table)
importFrom(data.table,data.table)
importFrom(data.table,setcolorder)
importFrom(data.table,setkeyv)
importFrom(data.table,setnames)
importFrom(data.table,setorderv)
importFrom(jsonlite,fromJSON)
importFrom(methods,slot)
importFrom(polmineR,as.speeches)
importFrom(polmineR,decode)
importFrom(polmineR,partition)
importFrom(polmineR,s_attributes)
importFrom(polmineR,size)
importFrom(polmineR,use)
importFrom(topicmodels,topics)
importFrom(utils,download.file)
importFrom(zen4R,ZenodoManager)
46 changes: 45 additions & 1 deletion NEWS.md
@@ -1,4 +1,48 @@
# GermaParl v1.3.0
# GermaParl 1.5.1

- Functions included in older versions of the package that used functions from the RcppCWB package had been dropped. An unnecessary declaration of RcppCWB in the 'Imports:' section of the DESCRIPTION file has been removed.
- The data objects `germaparl_by_lp` and `germaparl_by_year` were included as `data.table` objects, making the presence of the `data.table` package necessary. To reduce the number of packages imported from and to avoid an error that emerged on Windows, these tables are included as `data.frame` objects.
- The documentation of the data objects `germaparl_by_lp` and `germaparl_by_year` now includes an explanation of what is reported in rows and columns.
- The `germaparl_by_year` table now includes the columns `unknown_total` and `unknown_share` with the total number of tokens that cannot be lemmatized, and their share, respectively. On this basis, an error in the calculation of the aggregate unknown share for all years can be corrected.
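
The last item can be illustrated with a small sketch in R. The numbers and the `size` column are made up for illustration; only `unknown_total` and `unknown_share` are named in the entry above. The sketch assumes the earlier error was of the kind shown here: an aggregate share should be computed from totals, because averaging the yearly shares ignores that years differ in size.

```r
# Toy data: the values and the `size` column name are illustrative assumptions,
# not figures from the actual `germaparl_by_year` table.
by_year <- data.frame(
  year = c(1996, 1997),
  size = c(1000, 3000),          # hypothetical total number of tokens per year
  unknown_total = c(100, 150)    # tokens that could not be lemmatized
)
by_year$unknown_share <- by_year$unknown_total / by_year$size

# Unweighted mean of the yearly shares: treats every year as equally large.
mean_of_shares <- mean(by_year$unknown_share)

# Aggregate share computed from totals: weights each year by its size.
aggregate_share <- sum(by_year$unknown_total) / sum(by_year$size)
```

With these toy values the two quantities differ, which is why having `unknown_total` in the table matters for the aggregate figure.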


# GermaParl 1.5.0

- The package will not depend on the polmineR package any more. Higher-level functions of the polmineR package have been replaced by lower-level functions.
- The functions `germaparl_encode_speeches()` and `germaparl_encode_lda_topics()` have been moved to the (GitHub-only) [polmineR.misc package](https://github.com/PolMine/polmineR.misc). These are higher-level functions that rely on polmineR classes and methods. Keeping them in the GermaParl package would require making polmineR a dependency of GermaParl. But as GermaParl is designed to become a dependency of polmineR, we prevent a circular dependency by removing the functions. What is more, both functions have been designed to augment GermaParl, but their essence is more generic. In the long run, a cwbtools.misc package (to be created) might be the most logical place for generic functionality to augment corpora.



# GermaParl 1.4.2

- Most functions now include an argument `sample` that defaults to `FALSE`. If set as
`TRUE`, functionality to retrieve information from the corpus or to modify the corpus
will be applied to the smaller GERMAPARLSAMPLE corpus rather than the GERMAPARL corpus.
- The sample workflow of the overall package documentation object will now rely on the
GERMAPARLSAMPLE corpus rather than the full GERMAPARL corpus.
- A Rmarkdown document in the data-raw folder explains how the topic model for the sample
corpus has been prepared.
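
How a `sample` argument of this kind can select the corpus is sketched below. The helper `corpus_id()` is hypothetical and not part of the package; only the corpus names GERMAPARL and GERMAPARLSAMPLE and the default `sample = FALSE` come from the entry above.

```r
# Hypothetical helper, not part of the package: map `sample` to a corpus ID.
corpus_id <- function(sample = FALSE) {
  # Accept a single, non-missing logical value only.
  stopifnot(is.logical(sample), length(sample) == 1L, !is.na(sample))
  if (sample) "GERMAPARLSAMPLE" else "GERMAPARL"
}
```

A function with a `sample` argument could then call `corpus_id(sample)` once and use the returned ID for all subsequent corpus access.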


# GermaParl 1.4.1

- The 'topicmodels' package has been turned into a suggested package and has been
moved from the 'Depends' section to the 'Suggests' section in the DESCRIPTION
file.
- Rework of the documentation, including examples.

# GermaParl 1.4.0

- To meet CRAN requirements, the corpus is not stored within the package as in
previous versions, but in a system corpus directory. The same applies to
supplementary data such as LDA topic models fitted on GERMAPARL.
- The core of the functionality of `germaparl_download_corpus()` to download the
corpus has been moved to cwbtools (v0.2.0). The `germaparl_download_corpus()`
function is now a convenience wrapper for `cwbtools::corpus_install()` that
ensures that the correct DOI (argument `doi`) is passed to `corpus_install()`.
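
The wrapper pattern described in the last item might look roughly like the following sketch. It is not the package's implementation: the function name, the `installer` argument (added here so the sketch can be tried without a download) and the placeholder DOI are assumptions. Only `cwbtools::corpus_install()` and its `doi` argument are taken from the entry above.

```r
# Hypothetical sketch, not the package's implementation.
# The DOI is a placeholder and would have to be replaced by the real one.
download_corpus_sketch <- function(doi = "10.5281/zenodo.0000000",
                                   installer = cwbtools::corpus_install,
                                   ...) {
  # The wrapper's only job: fix the `doi` argument and pass everything else on.
  installer(doi = doi, ...)
}
```

Because R evaluates default arguments lazily, passing a stub such as `installer = function(doi, ...) doi` exercises the wrapper without touching cwbtools or the network.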


# GermaParl 1.3.0

- The GermaParl corpus is now downloaded from a storage location at Zenodo. The
`germaparl_download_corpus()` function has been reworked accordingly. It now
