Update eurobis package #12

Open
3 of 7 tasks
salvafern opened this issue Mar 24, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@salvafern
Collaborator

salvafern commented Mar 24, 2022

An update of this package is planned for spring 2022.

The current way the package works is:

  1. Via getEurobisData() you provide an aphiaid, mrgid, dasid and dates.
  2. A WFS URL to the GeoServer is generated.
  3. If the call returns more than 20K records, these are downloaded in batches (very slow).
  4. The data are loaded into your R environment (i.e. into memory, so this won't work with large amounts of data).
  5. A list of the datasets, retrieved via IMIS, is provided together with the data.
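
As a rough illustration of steps 2 and 3, the sketch below builds paged WFS 2.0.0 GetFeature URLs. It is not the package's actual code: the endpoint, the layer name, the helper names `build_wfs_url()` and `page_starts()`, and the example identifiers are all assumptions made for illustration. Only the standard WFS parameters (`count`, `startIndex`) and GeoServer's `viewParams` vendor parameter (`key:value` pairs separated by `;`) are taken as given.

```r
# Hypothetical endpoint and layer name, for illustration only.
base_url  <- "https://example.org/geoserver/wfs"
type_name <- "Example:occurrences"

# Build one WFS 2.0.0 GetFeature URL for a single page of results.
build_wfs_url <- function(view_params, start_index = 0, count = 20000) {
  # Format values without scientific notation (100000 must not become "1e+05").
  values <- vapply(
    view_params,
    function(v) format(v, scientific = FALSE, trim = TRUE),
    character(1)
  )
  # GeoServer's viewParams takes "key:value" pairs separated by ";".
  vp <- paste0(names(view_params), ":", values, collapse = ";")
  query <- c(
    service      = "WFS",
    version      = "2.0.0",
    request      = "GetFeature",
    typeNames    = type_name,
    outputFormat = "csv",
    count        = format(count, scientific = FALSE, trim = TRUE),
    startIndex   = format(start_index, scientific = FALSE, trim = TRUE),
    viewParams   = utils::URLencode(vp, reserved = TRUE)
  )
  paste0(base_url, "?", paste0(names(query), "=", query, collapse = "&"))
}

# Start offsets of the pages needed to cover n_records records.
page_starts <- function(n_records, page_size = 20000) {
  if (n_records <= 0) return(numeric(0))
  seq(0, n_records - 1, by = page_size)
}

# One URL per batch for a query expected to return 45000 records.
# The aphiaid and mrgid values are arbitrary example identifiers.
urls <- vapply(
  page_starts(45000),
  function(s) build_wfs_url(list(aphiaid = 126436, mrgid = 2350), start_index = s),
  character(1)
)
```

Each URL can then be downloaded independently, which is what makes caching or resuming an interrupted download possible.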

In addition, there are functions to load the grids per species, which help to visualize the distribution of a species more quickly: getEurobisGrid()

This GitHub issue lists the tasks that can be looked at:

  • Improve getting large amounts of data: Currently, large amounts of data are near impossible to download. We have to look into adding pagination and caching to improve this. See e.g. Slow requests when trying to access a large volume of data #8
  • Use of imis package: Check if the use of vlizBe/imis is needed or if there's a better alternative. This dependency causes many installation problems, see Failing installation eurobis package (R 4.0.2) #11
  • Metadata: Currently the package returns the getEurobisData() output as a list with 3 tables, including metadata. While it is great to provide metadata, I wonder if this is the best way for the users. I propose to include specific functions that provide a list of the datasets used by a specific query, as well as definitions for each column.
  • Use EMODnetWFS?: The package uses the EMODnet-Biology web services. So far it just uses the plain URL, but we can look into using EMODnet/EMODnetWFS instead, or at a higher level eblondel/ows4R. Problem: querying by Marine Regions mrgid requires the viewParams WFS query option, which is not implemented yet in EMODnetWFS, and we don't know how to use it via ows4R. See Test ows4R to query EMODnet Biology WFS data #15
  • Change names of functions?: Currently the functions are named like getEurobisData() instead of get_eurobis_data(). I propose to change the naming to use underscores, as this is more widely used in R. For example, we can follow what the obis package does and call it: eurobis_occurrences()
  • Add CI: There are currently no automatic tests. We should add tests and a CI pipeline via GitHub Actions. See Add automatic testing and github actions #13 and Add package website #14
  • Publication: Once fully developed, this package should be submitted to CRAN and maybe submitted to rOpenSci for review.
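
For the first task, one possible shape for caching paged downloads is sketched below. This is only an idea under stated assumptions, not a proposed final design: `fetch_page_cached()` and `fetch_all_pages()` are hypothetical names, and `fetch_fun` stands in for whatever function actually performs the request and returns a data frame.

```r
# Fetch one page, reusing a copy cached on disk when one exists.
# `fetch_fun` is a placeholder for the function that performs the request.
fetch_page_cached <- function(url, fetch_fun, cache_dir = tempdir()) {
  # Derive a file name from the URL. A real implementation would probably
  # hash the URL instead, since long URLs give very long file names.
  key  <- gsub("[^A-Za-z0-9]", "_", url)
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))
  }
  page <- fetch_fun(url)
  saveRDS(page, path)
  page
}

# Download all pages in turn and bind them into one data frame.
# Pages fetched before an interruption are read from the cache on retry.
fetch_all_pages <- function(urls, fetch_fun, cache_dir = tempdir()) {
  pages <- lapply(urls, fetch_page_cached,
                  fetch_fun = fetch_fun, cache_dir = cache_dir)
  do.call(rbind, pages)
}
```

Note that binding all pages still loads everything into memory. Handling very large requests would additionally need the pages to stay on disk, which this sketch does not address.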
@salvafern added the enhancement (New feature or request) label Mar 24, 2022
@LennertSchepers
Member

Nice summary, thanks!

  • Should the data be consistent with what comes out of the EMODnet Bio download toolbox (and have the option for the 3 versions: https://www.emodnet-biology.eu/emodnet-data-format)?
  • metadata: What metadata is needed? What is returned in the download toolbox? How is this retrieved via the download toolbox?
  • vlizBE/imis: I think we should maintain a package that retrieves imis data (and loads it into R). But I'm not sure if this should be done for this task -> it will probably depend on the previous question regarding what metadata is needed/wanted
  • Publication: depends on how mature we think the package will be after the update. (we decide this after the update)

@salvafern
Collaborator Author

  • Yes, of course. But keep in mind that the full occurrences and parameters option does not return the data in exactly the same format in the download toolbox and in the web services: the download toolbox splits the EMOF and occurrences into two tables; the web services return everything in one table. We can either split the data on the client side to be closer to the download toolbox, or leave it as is. I would say we should not transform the data on the client side and stick to the web services, or add a helper function to split the data.
  • metadata: I just checked the download toolbox and it does not return any metadata.
  • vlizBE/imis: the problem is that any issue in the imis package cascades to the eurobis package. I would rather keep both packages separate and explain how to use them together in a vignette.
  • Publication: agree :)
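
A helper to split the single web-service table into two tables could look roughly like the sketch below. The function name `split_occurrences_emof()` and the column names in the example are assumptions for illustration; the real column names would have to be taken from the web-service output.

```r
# Split one combined table into an occurrence table and an EMOF table.
# `id_col` names the occurrence identifier, `emof_cols` the measurement columns.
split_occurrences_emof <- function(data, id_col, emof_cols) {
  occ_cols <- setdiff(names(data), emof_cols)
  # One row per occurrence: drop measurement columns, then drop repeated rows.
  occurrences <- unique(data[, occ_cols, drop = FALSE])
  # One row per measurement, keeping the identifier so the tables can be re-joined.
  emof <- data[, c(id_col, emof_cols), drop = FALSE]
  # Rows where every measurement column is missing carry no EMOF record.
  has_value <- rowSums(!is.na(emof[, emof_cols, drop = FALSE])) > 0
  emof <- emof[has_value, , drop = FALSE]
  rownames(occurrences) <- NULL
  rownames(emof) <- NULL
  list(occurrences = occurrences, emof = emof)
}

# Made-up example data with hypothetical column names.
combined <- data.frame(
  occurrenceid     = c("a", "a", "b"),
  scientificname   = c("Gadus morhua", "Gadus morhua", "Solea solea"),
  measurementtype  = c("length", "weight", NA),
  measurementvalue = c("10", "200", NA)
)
parts <- split_occurrences_emof(
  combined,
  id_col    = "occurrenceid",
  emof_cols = c("measurementtype", "measurementvalue")
)
```

Keeping this as an optional helper leaves the default output identical to what the web services return.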

salvafern added a commit that referenced this issue Apr 4, 2022
salvafern added a commit that referenced this issue Apr 4, 2022
salvafern added a commit that referenced this issue Apr 6, 2022
salvafern added a commit that referenced this issue Apr 8, 2022
salvafern added a commit that referenced this issue Apr 12, 2022
salvafern added a commit that referenced this issue May 10, 2022