qualtdict: Generating Variable Dictionaries and Labelled Data Exports of Qualtrics Surveys #572

Open
12 of 29 tasks
lyh970817 opened this issue Feb 2, 2023 · 18 comments
lyh970817 commented Feb 2, 2023

Submitting Author Name: Yuhao Lin
Submitting Author Github Handle: @lyh970817
Repository: https://github.com/lyh970817/qualtdict
Version submitted: 0.0.0.9000
Submission type: Standard
Editor: @maurolepore
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

  • Paste the full DESCRIPTION file inside a code block below:
Package: qualtdict
Title: Generating Variable Dictionaries and Labelled Data Exports of Qualtrics
    Surveys
Version: 0.0.0.9000
Authors@R:
    person("Yuhao", "Lin", , "yuhao.lin@kcl.ac.uk", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0001-6357-5731"))
Description: Provides functions that generate variable dictionaries from
    'Qualtrics' <https://www.qualtrics.com/about/> surveys and labelled
    survey data based on the dictionary. This package is built upon the R
    package 'qualtRics' <https://github.com/ropensci/qualtRics/> which
    provides access to 'Qualtrics' survey data and metadata via the 'Qualtrics' API
    <https://api.qualtrics.com/>.
License: MIT + file LICENSE
URL: https://github.com/lyh970817/qualtdict
BugReports: https://github.com/lyh970817/qualtdict/issues
Imports:
    crul,
    dplyr,
    glue,
    haven,
    magrittr,
    openNLP,
    purrr,
    qualtRics,
    rlang,
    sjlabelled,
    slowraker,
    SnowballC,
    stringi,
    stringr,
    tibble,
    tidyr,
    xml2
Suggests:
    covr,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0),
    vcr (>= 0.6.0)
VignetteBuilder: 
    knitr
Config/testthat/edition: 3
Config/testthat/start-first: dict_generate, dict_validate, get_survey_data
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

Qualtrics is an online survey and data collection software platform. While the qualtRics R package implements data retrieval from the Qualtrics platform, this package 'qualtdict' further processes its output to generate variable dictionaries and labelled data designed to be used for data analyses directly.

  • Who is the target audience and what are scientific applications of this package?

The target audience is those who use the Qualtrics survey platform to collect data. This package generates variable dictionaries and labelled data designed to be used for data analyses directly.

No, but the similar qualtRics R package retrieves a broader range of data from Qualtrics than this package uses. The output formats from qualtRics are much less user-friendly. For example, it retrieves survey metadata in a nested-list, JSON-like format, while this package rearranges the essential parts of this metadata (retrieved using qualtRics) into a publishable variable dictionary in a table format that can be visually inspected in, for example, Excel.
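
To illustrate the difference described above, here is a minimal sketch in base R. The nested `metadata` list is a made-up stand-in, not the actual structure returned by qualtRics, and `flatten_question()` is a hypothetical helper rather than a function from qualtdict.

```r
# Hypothetical nested, JSON-like survey metadata (NOT the real qualtRics structure).
metadata <- list(
  QID1 = list(
    text = "How satisfied are you?",
    choices = list(`1` = "Low", `2` = "Medium", `3` = "High")
  ),
  QID2 = list(
    text = "Would you recommend us?",
    choices = list(`1` = "Yes", `2` = "No")
  )
)

# Hypothetical helper: build one row per question/level combination.
flatten_question <- function(qid, question) {
  data.frame(
    qid = qid,
    question = question$text,
    value = as.integer(names(question$choices)),
    label = unlist(question$choices, use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

# Stack the per-question tables into a single dictionary-style table.
dict <- do.call(rbind, Map(flatten_question, names(metadata), metadata))
rownames(dict) <- NULL
print(dict)
```

A flat table like `dict` can be written to a spreadsheet and inspected row by row, which is the kind of output the answer above has in mind.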

Yes.

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

  • Explain reasons for any pkgcheck items which your package is unable to pass.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@ropensci-review-bot
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Checks for qualtdict (v0.0.0.9000)

git hash: d31c0887

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 86%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type        package     ncalls
internal    base        179
internal    qualtdict   118
internal    utils       5
internal    stats       1
imports     magrittr    70
imports     rlang       8
imports     glue        7
imports     qualtRics   3
imports     tibble      3
imports     openNLP     2
imports     sjlabelled  2
imports     xml2        2
imports     stringi     1
imports     tidyr       1
imports     crul        NA
imports     dplyr       NA
imports     haven       NA
imports     purrr       NA
imports     slowraker   NA
imports     SnowballC   NA
imports     stringr     NA
suggests    covr        NA
suggests    knitr       NA
suggests    rmarkdown   NA
suggests    testthat    NA
suggests    vcr         NA
linking_to  NA          NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

list (66), length (9), names (7), c (6), unique (6), unlist (6), args (4), ifelse (4), is.null (4), max (4), min (4), paste0 (4), all (3), is.na (3), rownames (3), as.matrix (2), colnames (2), factor (2), for (2), grep (2), is.character (2), levels (2), seq_along (2), split (2), structure (2), table (2), vapply (2), which (2), any (1), as.logical (1), character (1), class (1), data.frame (1), do.call (1), if (1), is.function (1), is.logical (1), labels (1), lapply (1), mode (1), numeric (1), q (1), readRDS (1), return (1), sum (1), suppressWarnings (1), tempdir (1), vector (1)

qualtdict

item_or_level_qid (10), rep_level_qid (10), suf_level_qid (9), null_na (7), not_applicable_qid (6), questiontext_qid (6), suf_item_rep_level_qid (6), suf_item_suf_level_qid (6), collapse (5), file_upload_qid (5), rep_level (3), retry (3), calc_keyword_scores (2), check_item (2), check_json (2), check_names (2), easyname_gen (2), label_to_sfx (2), paste_narm (2), qid_recode (2), recode_json (2), rep_item (2), sbs_qid (2), suf_level_suf_item_qid (2), suf_text_qid (2), timing_qid (2), add_text (1), add_text_mc (1), checkarg_isfunction (1), checkarg_isname (1), checkarg_isqualtdict (1), convert_html (1), dict_generate (1), dict_validate (1), get_survey_data (1), is_onetoone (1), order_name (1), suf_nmlabel_qid (1), text (1), which_not_onetoone (1)

magrittr

%>% (70)

rlang

abort (7), hash (1)

glue

glue (7)

utils

txtProgressBar (4), getFromNamespace (1)

qualtRics

fetch_description (1), fetch_survey (1), metadata (1)

tibble

tibble (2), enframe (1)

openNLP

Maxent_POS_Tag_Annotator (1), Maxent_Word_Token_Annotator (1)

sjlabelled

set_label (1), set_labels (1)

xml2

read_html (1), xml_text (1)

stats

setNames (1)

stringi

stri_count_words (1)

tidyr

unite (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 10 files) and
  • 1 authors
  • 1 vignette
  • no internal data file
  • 17 imported packages
  • 3 exported functions (median 25 lines of code)
  • 110 non-exported functions in R (median 10 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure                   value  percentile  noteworthy
files_R                   10     59.0
files_vignettes           1      68.4
files_tests               7      86.4
loc_R                     1152   71.7
loc_vignettes             118    30.8
loc_tests                 1014   87.2
num_vignettes             1      64.8
n_fns_r                   113    79.3
n_fns_r_exported          3      12.9
n_fns_r_not_exported      110    85.5
n_fns_per_file_r          6      75.4
num_params_per_fn         5      69.6
loc_per_fn_r              11     32.3
loc_per_fn_r_exp          25     55.9
loc_per_fn_r_not_exp      10     31.3
rel_whitespace_R          17     70.0
rel_whitespace_vignettes  25     21.4
rel_whitespace_tests      1      14.7
doclines_per_fn_exp       43     54.1
doclines_per_fn_not_exp   0      0.0         TRUE
fn_call_network_size      57     69.0

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

check-standard.yaml
test-coverage.yaml

GitHub Workflow Results

id          name           conclusion  sha     run_number  date
4076045888  R-CMD-check    success     d31c08  11          2023-02-02
4076045893  test-coverage  success     d31c08  11          2023-02-02

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following check_fail:

  1. no_import_package_as_a_whole

Test coverage with covr

Package coverage: 85.98

Cyclocomplexity with cyclocomp

No functions have cyclocomplexity >= 15

Static code analyses with lintr

lintr found the following 1 potential issues:

message                                          number of times
Avoid library() and require() calls in packages  1


Package Versions

package   version
pkgstats  0.1.3
pkgcheck  0.1.1.11


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

@maurolepore
Member

Dear @lyh970817, FYI I'm still searching for a handling editor. It shouldn't take much longer. Thanks for your patience.

@lyh970817
Author

> Dear @lyh970817, FYI I'm still searching for a handling editor. It shouldn't take much longer. Thanks for your patience.

Thank you so much!

@maurolepore
Member

@ropensci-review-bot assign @maurolepore as editor

@ropensci-review-bot
Collaborator

Assigned! @maurolepore is now the editor

@maurolepore
Member

maurolepore commented Feb 11, 2023

Dear @lyh970817 I'm delighted to announce that I'll be the handling editor of this submission.

Semantic tags for my comments

To help you track my comments I tagged them with "ml" and numbered sequentially: ml01, ml02, and so on. Comments following bullets are for you to consider -- you may or may not respond to them. Comments following check-boxes are requests for some action -- please respond.

Reviewers

  • ml01. Can you please suggest three reviewers? Following our guidelines I'll use one at most, but I would like your view of the types of expertise needed to review qualtdict.

Checks

Here I list a few things that caught my attention. They are not blockers but the sooner we address them the better.

Package Dependencies

  • ml02. Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.

goodpractice and other checks

  • ml03. R CMD check generated the following check_fail: no_import_package_as_a_whole
  • ml04. Avoid library() and require() calls in packages: 1 time

@lyh970817
Author

lyh970817 commented Feb 11, 2023

Thank you so much for taking time to review this. These are my responses.

ml01. Unfortunately I'm not sure I can name any specific reviewers. Expertise-wise, I think someone with a psychology or social science background would be helpful, as qualtdict is centred around creating a variable dictionary that gives analysts an intuitive overview of survey data. The usefulness of such a dictionary is probably best judged by someone who analyses such data on a daily basis (in contrast to a data engineer who implements APIs for such data).

ml02. R CMD check seems to fail unless I import some of the packages that I don't actually call directly. For instance, without importing haven:

Error in `set_labels_helper(x = .dat, labels = labels, force.labels = force.labels, 
    force.values = force.values, drop.na = drop.na, var.name = NULL)`: Package 'haven' required for this function. Please install it.

ml03. I use dplyr, purrr and stringr extensively so I import them as a whole. Should I still import functions from them (which will be many) individually?

ml04. I think it comes from this line in the tests:

library(vcr) # *Required* as vcr is set up on loading

which is mandatory for vcr to work.

@maurolepore
Member

maurolepore commented Feb 12, 2023

  • ml02. Following your example with the haven package I saw you need to import haven::read_xpt because the sjlabelled package needs it. That surprises me. Usually each package must import any external function it needs, and not ask users to do it. Do you know why that's the case? Also I see haven is listed in .pre-commit.config.yaml -- which I'm not familiar with. So likely there is a good explanation and I just happen to have never encountered a case like this. It would be good to articulate an explanation because reviewers might be surprised too.

  • ml03. Yeah, AFAIK best practice is to either namespace each function each time you call it or import each function individually. For example, each time use something like dplyr::filter() or import it once with usethis::use_import_from("dplyr", "filter") then use it each time just like filter().

  • ml04. I see. Thanks!

  • ml05. When tests run I see a lot of printed output. Please suppress it so that reviewers can see a succinct test report. If the output is not generated from an R condition (e.g. messages, warnings, or errors) it may be hard to suppress. See capture.output() -- you may need to implement a way to capture the output and maybe implement a quietly argument you can set to TRUE during tests.

  • ml06. The test results I see show many warnings. Please address them if you don't expect them or suppress them if you do expect them. If you expect them it's best to make them go away so that you don't develop the habit of ignoring them and risk missing an important one that you don't expect.

[ FAIL 0 | WARN 591 | SKIP 0 | PASS 4 ]
  • ml07. Can you please make your project an RStudio project? Most R developers/contributors work in RStudio. Without an .Rproj file launching the project is hard, and I would like reviewers to enter your package as smoothly as possible. You may use usethis::use_rstudio(). And later it may help to lower the entry-barrier for contributors.
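
For ml05 and ml06, one possible pattern is sketched below in base R. `noisy_step()` and its `quiet` argument are hypothetical stand-ins, not qualtdict functions. The idea is only that printed output can be switched off or captured, and that an expected warning can be muffled explicitly so that unexpected ones stay visible.

```r
# Hypothetical function standing in for package code that prints progress.
noisy_step <- function(x, quiet = FALSE) {
  if (!quiet) {
    cat("Processing", length(x), "items\n")
  }
  if (anyNA(x)) {
    warning("missing values dropped")
  }
  sum(x, na.rm = TRUE)
}

# Option 1: silence the output at the source with the `quiet` argument.
total <- noisy_step(c(1, 2, 3), quiet = TRUE)

# Option 2: capture printed output that cannot be switched off.
printed <- capture.output(total2 <- noisy_step(c(1, 2, 3)))

# Muffle only the warning we expect; any other warning still surfaces.
total3 <- withCallingHandlers(
  noisy_step(c(1, NA, 3), quiet = TRUE),
  warning = function(w) {
    if (grepl("missing values dropped", conditionMessage(w), fixed = TRUE)) {
      invokeRestart("muffleWarning")
    }
  }
)
```

Inside testthat tests, `expect_warning()` or `suppressWarnings()` may be a simpler fit than a hand-written handler, depending on whether the warning is part of the behaviour under test.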

@lyh970817
Author

ml02. I believe this is because in sjlabelled, haven is a package in the Suggests field. The function it calls from haven is not actually haven::read_xpt, but I needed to import an arbitrary function from haven for the set_labels function to see and load it.

Please see the DESCRIPTION file for sjlabelled: https://github.com/strengejacke/sjlabelled/blob/master/DESCRIPTION.

Package: sjlabelled
Type: Package
Encoding: UTF-8
Title: Labelled Data Utility Functions
Version: 1.2.0.3
Authors@R: c(
    person("Daniel", "Lüdecke", role = c("aut", "cre"), email = "d.luedecke@uke.de", comment = c(ORCID = "0000-0002-8895-3206")),
    person("David", "Ranzolin", role = "ctb", email = "daranzolin@gmail.com"),
    person("Jonathan", "De Troye", role = "ctb", email = "detroyejr@outlook.com")
    )
Maintainer: Daniel Lüdecke <d.luedecke@uke.de>
Description: Collection of functions dealing with labelled data, like reading and 
    writing data between R and other statistical software packages like 'SPSS',
    'SAS' or 'Stata', and working with labelled data. This includes easy ways 
    to get, set or change value and variable label attributes, to convert 
    labelled vectors into factors or numeric (and vice versa), or to deal with 
    multiple declared missing values.
License: GPL-3
Depends:
    R (>= 3.4)
Imports:
    insight,
    datawizard,
    stats,
    tools,
    utils
Suggests:
    dplyr,
    haven (>= 1.1.2),
    magrittr,
    sjmisc,
    sjPlot,
    knitr,
    rlang,
    rmarkdown,
    snakecase,
    testthat
URL: https://strengejacke.github.io/sjlabelled/
BugReports: https://github.com/strengejacke/sjlabelled/issues
RoxygenNote: 7.2.1
VignetteBuilder: knitr

And the specific lines where haven is loaded: https://github.com/strengejacke/sjlabelled/blob/548fa397bd013ec7e44b225dd971d19628fdc866/R/set_labels.R#L317.
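
The behaviour described above can be sketched roughly as follows. This is an assumption about the general pattern, not the sjlabelled source: `needs_optional()` is a hypothetical function and the package names are placeholders.

```r
# Hypothetical function that relies on an optional ("Suggests") package.
needs_optional <- function(pkg) {
  # requireNamespace() returns FALSE instead of erroring when pkg is missing.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      sprintf("Package '%s' required for this function. Please install it.", pkg),
      call. = FALSE
    )
  }
  TRUE
}

# "stats" ships with R, so the check passes.
ok <- needs_optional("stats")

# A package that is not installed triggers an error like the one quoted earlier.
msg <- tryCatch(
  needs_optional("surelyNotInstalledPkg123"),
  error = function(e) conditionMessage(e)
)
```

Packages listed under Suggests by a dependency are not installed automatically alongside it, so a downstream package that relies on such a code path has to make sure the optional package is available itself.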

What would be the best way to deal with this?

ml05-7. I was able to capture the outputs when drafting the package so I should be able to do that in the tests. The warnings are not intended and are due to package versions. I will resolve these and create an RStudio project and then update this comment. Thank you so much!

@maurolepore
Member

ml02. Thanks for explaining. The best solution will likely vary for each of the "unused" packages.

In the case of haven, the file you showed me has a single call of the type haven::<some function>, so it might be worth looking at the source code of that function to see if you can re-implement it and remove the dependency on haven.

https://github.com/strengejacke/sjlabelled/blob/548fa397bd013ec7e44b225dd971d19628fdc866/R/set_labels.R#L325

More generally, I think a great explanation of the trade-offs in dependencies is that of Jim Hester in his talk "It depends": https://www.youtube.com/watch?v=mum13N7CGUI . So as long as you understand those trade-offs you would be able to make an informed decision for each "unused" package and justify your decision if the reviewers ask.

@maurolepore
Member

Dear @lyh970817,
Just checking. Would you be available to address the comments ml05-ml07? We can also put this submission on hold if you need more time. Let me know.

@lyh970817
Author

> Dear @lyh970817,
>
> Just checking. Would you be available to address the comments ml05-ml07? We can also put this submission on hold if you need more time. Let me know.

Yes, sorry - would just need a couple more days to address these. Thanks.

@maurolepore
Copy link
Member

@ropensci-review-bot put on hold

@ropensci-review-bot
Collaborator

Submission on hold!

@ropensci-review-bot
Collaborator

@maurolepore: Please review the holding status

@maurolepore
Member

maurolepore commented Apr 8, 2024

@lyh970817, how would you like to proceed?

  1. Resume the submission.
  2. Continue on hold.
  3. Withdraw the submission.

The holding status will be revisited every 3 months, and after one year the issue will be closed.
-- https://devdevguide.netlify.app/softwarereview_policies.html#policiesreviewprocess
