mulea #3328

Closed
10 tasks done
stitam opened this issue Mar 8, 2024 · 18 comments

@stitam

stitam commented Mar 8, 2024

Dear Bioconductor Team,

With this issue we would like to submit the mulea package. mulea is a comprehensive overrepresentation and functional enrichment analyser R package which reads ontologies (gene and protein sets) in a standardised GMT (Gene Matrix Transposed) format.

Kind regards,
Tamás

Update the following URL to point to the GitHub repository of
the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

  • I understand that by submitting my package to Bioconductor,
    the package source and all review commentary are visible to the
    general public.

  • I have read the Bioconductor Package Submission
    instructions. My package is consistent with the Bioconductor
    Package Guidelines.

  • I understand Bioconductor Package Naming Policy and acknowledge
    Bioconductor may retain use of package name.

  • I understand that a minimum requirement for package acceptance
    is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
    Passing these checks does not result in automatic acceptance. The
    package will then undergo a formal review and recommendations for
    acceptance regarding other Bioconductor standards will be addressed.

  • My package addresses statistical or bioinformatic issues related
    to the analysis and comprehension of high throughput genomic data.

  • I am committed to the long-term maintenance of my package. This
    includes monitoring the support site for issues that users may
    have, subscribing to the bioc-devel mailing list to stay aware
    of developments in the Bioconductor community, responding promptly
    to requests for updates from the Core team in response to changes in
    R or underlying software.

  • I am familiar with the Bioconductor code of conduct and
    agree to abide by it.

I am familiar with the essential aspects of Bioconductor software
management, including:

  • The 'devel' branch for new packages and features.
  • The stable 'release' branch, made available every six
    months, for bug fixes.
  • Bioconductor version control using Git
    (optionally via GitHub).

For questions/help about the submission process, including questions about
the output of the automatic reports generated by the SPB (Single Package
Builder), please use the #package-submission channel of our Community Slack.
Follow the link on the home page of the Bioconductor website to sign up.

@bioc-issue-bot
Collaborator

Hi @stitam

Thanks for submitting your package. We are taking a quick
look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: mulea
Type: Package
Encoding: UTF-8
Title: mulea - an R package for enrichment analysis using 
    multiple ontologies and empirical FDR correction
Version: 0.99.10
Date: 2016-04-08
Authors@R: c(
    person("Cezary", "Turek", role =  c("aut", "ctb"),
        comment = c(ORCID = "0000-0002-1445-5378")),
    person("Márton", "Ölbei", role = c("aut", "ctb"),
        comment = c(ORCID = "0000-0002-4903-6237")),
    person("Tamás", "Stirling", email = "stirling.tamas@gmail.com",
        role = c("aut", "cre"),
        comment = c(ORCID = "0000-0002-8964-6443")),
    person("Gergely", "Fekete", role = c("aut"),
        comment = c(ORCID = "0000-0001-9939-4860")),
    person("Ervin", "Tasnádi", role = c("aut"),
        comment = c(ORCID = "0000-0002-4713-5397")),
    person("Leila", "Gul", role = c("aut")),
    person("Balázs", "Bohár", role = c("aut"),
        comment = c(ORCID = "0000-0002-3033-5448")),
    person("Balázs", "Papp", role = c("aut"),
        comment = c(ORCID = "0000-0003-3093-8852")),
    person("Wiktor", "Jurkowski", role = c("aut"),
        comment = c(ORCID = "0000-0002-7820-1991")),
    person("Eszter", "Ari", role = c("aut", "ctb"), 
        comment = c(ORCID = "0000-0001-7774-1067")))
Description: Traditional gene set enrichment analyses are 
    typically limited to a few ontologies and do not account for 
    the interdependence of gene sets or terms, resulting in 
    overcorrected p-values. To address these challenges, 
    we introduce mulea, an R package offering comprehensive 
    overrepresentation and functional enrichment analysis. mulea employs 
    an innovative empirical false discovery 
    rate (eFDR) correction method, specifically designed for 
    interconnected biological data, to accurately identify significant 
    terms within diverse ontologies. Beyond conventional tools, 
    mulea incorporates a wide range of 
    ontologies encompassing Gene Ontology, 
    pathways, regulatory elements, genomic locations, and protein domains. 
    This flexibility empowers researchers to tailor enrichment analysis 
    to their specific questions, such as identifying enriched 
    transcriptional regulators in gene expression data or 
    overrepresented protein domains in protein sets. To facilitate 
    seamless analysis, mulea provides 
    gene sets (in standardized GMT format) for 27 model organisms, 
    covering 16 databases and various identifiers. Additionally, 
    the muleaData ExperimentData Bioconductor package simplifies 
    access to these 879 pre-defined ontologies. Furthermore, 
    mulea's architecture allows for easy integration of 
    user-defined ontologies, expanding its applicability 
    across diverse research areas.
biocViews: Annotation, DifferentialExpression, GeneExpression, 
    GeneSetEnrichment, GO, GraphAndNetwork, MultipleComparison, Pathways, 
    Reactome, Software, Transcription, Visualization
License: MIT + file LICENSE
Depends:
    R (>= 4.0.0)
Imports:
    devtools,
    data.table (>= 1.13.0),
    dplyr,
    fgsea (>= 1.0.2),
    ggplot2,
    ggraph (>= 2.0.3),
    magrittr (>= 2.0.3),
    methods,
    parallel (>= 4.0.2),
    plyr (>= 1.8.4),
    Rcpp,
    readr,
    rlang,
    scales,
    stats,
    stringi,
    tibble,
    tictoc,
    tidygraph,
    tidyverse
Suggests:
    knitr,
    rmarkdown,
    testthat (>= 3.1.4)
LinkingTo:
    Rcpp
VignetteBuilder: knitr
URL: https://github.com/ELTEbioinformatics/mulea
BugReports: https://github.com/ELTEbioinformatics/mulea/issues
RoxygenNote: 7.3.1
Roxygen: list(markdown = TRUE)
Config/testthat/edition: 3

bioc-issue-bot added the "1. awaiting moderation" label (submitted and waiting clearance to access resources) on Mar 8, 2024
@stitam
Author

stitam commented Mar 8, 2024

Some comments:

We have already submitted muleaData which is the data package for mulea: #3291

mulea has an upstream which is archived and will not be updated in the future.

BiocCheck::BiocCheck() returns the following NOTES:

NOTE: Update R version dependency from 4.0.0 to 4.3.0.

I would rather not do this because then R CMD check will fail on oldrel-1 (ELTEbioinformatics/mulea#24). The package does not require R 4.3.0. (I have turned off GitHub workflows for the Bioconductor submission process)

NOTE: Cannot determine whether maintainer is subscribed to the Bioc-Devel mailing list (requires admin
      credentials). Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel

I am registered to both the mailing list and the support forum.

lshep added the "pre-check passed" label (pre-review performed and ready to be added to git) on Mar 20, 2024
@bioc-issue-bot
Collaborator

Your package has been added to git.bioconductor.org to continue the
pre-review process. A build report will be posted shortly. Please
fix any ERROR and WARNING in the build report before a reviewer is
assigned or provide a justification on why you feel the ERROR or
WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting
up remotes to push to git.bioconductor.org. All changes should be
pushed to git.bioconductor.org moving forward. It is required to push a
version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your GitHub SSH keys for git.bioconductor.org
access. To manage keys and future access you may want to activate your
Bioconductor Git Credentials Account.

bioc-issue-bot added the "pre-review" label (on bioconductor git and access to on demand build but not assigned reviewer until build report clean) and removed the "1. awaiting moderation" (submitted and waiting clearance to access resources) and "pre-check passed" (pre-review performed and ready to be added to git) labels on Mar 21, 2024
@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 22.04.3 LTS): mulea_0.99.10.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/mulea to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@lshep
Contributor

lshep commented Mar 27, 2024

Please fix the ERROR in the build report before the package can move forward in review.

@stitam
Author

stitam commented Apr 2, 2024

Hi @lshep it seems I cannot interact with the upstream (Permission denied (publickey)). Can you please check if everything looks fine on your end? Is it possible that because of an earlier submission (which was not done by me) my GitHub account is not on the whitelist? FYI, I'm following this guide: https://contributions.bioconductor.org/git-version-control.html#new-package-workflow. SSH seems fine on my end.

@lshep
Contributor

lshep commented Apr 2, 2024

Everything is fine on our end. Please see the note above: you should activate your account and then, if need be, add additional SSH keys.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 0417dfeadc43a7a1b52933236f2525e38af17be7

@stitam
Author

stitam commented Apr 2, 2024

Thanks @lshep, my Bioconductor Git Credentials account was not activated.

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 22.04.3 LTS): mulea_0.99.11.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/mulea to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 7514578d23c591d0ff75460cf5cad3b0d4a5e832

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 22.04.3 LTS): mulea_0.99.12.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/mulea to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@stitam
Author

stitam commented Apr 2, 2024

A quick note about the warning: library() is necessary for making the package's C++ code available on the worker nodes when using multiple threads. I've tried to implement a solution without library() but could not find one. We're happy to receive suggestions.
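
For readers following along, one pattern sometimes used to avoid attaching a package on workers is to load its namespace there and call functions with `::`. The sketch below is not mulea's code: `stats` stands in for the package with compiled code, `workerTask` is a hypothetical name, and whether this would remove the specific warning reported for mulea is not verified here.

```r
library(parallel)

# Hypothetical worker task. It calls the function as pkg::fun, so the
# package is loaded on the worker without being attached via library().
workerTask <- function(x) {
    stats::median(x)
}

cl <- makeCluster(2, type = "PSOCK")

# Load (but do not attach) the namespace on every worker up front.
# Loading a namespace also loads any shared library the package registers.
invisible(clusterEvalQ(cl, requireNamespace("stats", quietly = TRUE)))

inputs <- list(c(1, 2, 3), c(10, 20, 30, 40))
medians <- parLapply(cl, inputs, workerTask)

stopCluster(cl)
```

Strictly speaking, `pkg::fun` already triggers namespace loading on first use, so the `clusterEvalQ()` call only makes that step explicit and moves its cost out of the first task.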

lshep added the "2. review in progress" label (assign a reviewer and a more thorough review of package code and documentation taking place) and removed the "pre-review" label (on bioconductor git and access to on demand build but not assigned reviewer until build report clean) on Apr 2, 2024
@bioc-issue-bot
Collaborator

A reviewer has been assigned to your package for an in-depth review.
Please respond accordingly to any further comments from the reviewer.

@DarioS

DarioS commented Apr 5, 2024

I have trialled the package and noted some issues to be addressed.

  • The package lacks sufficient integration with existing Bioconductor infrastructure. It involves enrichment analysis but does not make use of GSEABase data structures. Also, the package depends on data structures such as data.frame, which has a Bioconductor equivalent, DataFrame.
> library(S4Vectors)
> d <- DataFrame(x = 1:5)
> is(d, "data.frame")
[1] FALSE

Also, the code uses the parallel package instead of BiocParallel. For example, cl <- makeCluster(spec=nthread, type="PSOCK"). Please refer to Parallel Recommendations.
Please greatly improve interoperability with existing Bioconductor conventions.

  • The vignette imports an undocumented data set from the inst directory and uses eval = FALSE. No code chunks are allowed to use eval = FALSE. Please see the Package Data chapter for how to best include data sets.
  • Some of the steps in the vignette are long and seem like they should be functions. For example,
# if there are duplicated Gene.symbols keep the first one only
geo2r_result_tab_filtered <- geo2r_result_tab %>% 
    # grouping by Gene.symbol to be able to filter
    group_by(Gene.symbol) %>%
    # keeping the first row for each Gene.symbol from rows with the same 
    # Gene.symbol
    filter(row_number()==1) %>% 
    # ungrouping
    ungroup() %>% 
    # arranging by logFC in descending order
    arrange(desc(logFC)) %>%
    select(Gene.symbol, logFC)
  • Functions and variables need to use camelCase rather than snake_case format. See R Code.
  • Namespace file has both selective and complete imports for a particular package.
import(magrittr)
importFrom(magrittr,"%<>%")
importFrom(magrittr,"%>%")
  • Don't create empty vectors or lists and incrementally grow them. For example,
create_random_db <- function() {
  DB <- list()
  for (cat_i in seq_len(10)) {
    DB_cat_values <- c()

Refer to Vectorize.
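
As a rough illustration of the last point, the loop-and-grow pattern can usually be replaced by `lapply()`, which allocates the result list once. Only the first lines of `create_random_db` are quoted above, so everything in this sketch is assumed: the name `createRandomDb`, its parameters, and the idea that each category holds a random sample of identifiers.

```r
# Hypothetical stand-in for create_random_db(); the real function body is
# not shown in this thread, so the contents of each category are assumed.
createRandomDb <- function(nCategories = 10, poolSize = 30, setSize = 5) {
    pool <- paste0("gene", seq_len(poolSize))
    # lapply() builds the whole list in one go instead of appending
    # to an initially empty list inside a for loop.
    db <- lapply(seq_len(nCategories), function(i) sample(pool, setSize))
    names(db) <- paste0("cat", seq_len(nCategories))
    db
}

set.seed(1)
db <- createRandomDb()
```

When the element type and length are known in advance, `vapply()` gives the same benefit with an added type check.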

@stitam
Author

stitam commented Apr 5, 2024

Many thanks @DarioS for reviewing the package, we'll address these ASAP.

It was unclear to us whether Bioconductor requires camel case or whether snake case is also accepted, so we decided to go with snake case and only harmonised this for user-facing functions. Please advise: for a successful review, 1. should we move to camel case? 2. should we harmonise internal functions as well?

Regarding your observation on the namespace file: If there is complete import there is no need to include selective import as well, this is the issue, right?

@DarioS

DarioS commented Apr 6, 2024

camelCase for user-facing functions is sufficient. Yes, import either completely or selectively, depending on how many functions.
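
A small sketch of what two of the review points might look like together: the vignette's de-duplication step wrapped in a function, with a camelCase user-facing name. The function name `keepFirstPerGene` and the toy table are hypothetical, and base R is used here in place of the vignette's dplyr pipeline.

```r
# Hypothetical helper: keep the first row per Gene.symbol, then sort by
# logFC in descending order and return only the two columns of interest.
keepFirstPerGene <- function(resultTab) {
    # duplicated() flags every repeat after the first occurrence
    firstRows <- resultTab[!duplicated(resultTab$Gene.symbol),
                           c("Gene.symbol", "logFC")]
    firstRows <- firstRows[order(firstRows$logFC, decreasing = TRUE), ]
    rownames(firstRows) <- NULL
    firstRows
}

# Toy input with one duplicated gene symbol ("A")
toyTab <- data.frame(
    Gene.symbol = c("A", "B", "A", "C"),
    logFC = c(0.5, 2.0, 3.0, -1.0),
    adj.P.Val = c(0.01, 0.02, 0.03, 0.04),
    stringsAsFactors = FALSE
)
filtered <- keepFirstPerGene(toyTab)
```

Here the second "A" row (logFC 3.0) is dropped because only the first occurrence is kept, matching the `filter(row_number()==1)` logic quoted in the review.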

@stitam
Author

stitam commented May 27, 2024

Thank you Bioconductor Team for all your work! It was not feasible for us to implement these changes so we decided to publish the package on CRAN instead of Bioconductor. The package has been released: https://cran.r-project.org/web/packages/mulea/index.html.

stitam closed this as completed on May 27, 2024
bioc-issue-bot removed the "2. review in progress" label (assign a reviewer and a more thorough review of package code and documentation taking place) on May 27, 2024