Issues with ZenodoManager$depositRecordVersion(): API access / record information lost #142

Closed
ablaette opened this issue Dec 21, 2023 · 4 comments

Comments

@ablaette
ablaette commented Dec 21, 2023

I would like to deposit a new version of a record using the $depositRecordVersion() method of the ZenodoManager class. The short explanation in the vignette is straightforward and great, and I greatly appreciate your systematic work to expose the abilities of the API. It has great potential for our workflows, but I encountered the following set of issues.

This is the initial sample code I used:

library(zen4R) # I use 0.9.9000 from branch '126-zenodo-invenio-rdm'

zenodo <- ZenodoManager$new(
  token = Sys.getenv("ZENODO_ACCESS_TOKEN"),  # available via .Renviron
  logger = "DEBUG" # or "INFO"
)
myrec <- zenodo$getDepositionByDOI("10.5281/zenodo.7949074") # latest deposition of GermaParl corpus

# some modifications
myrec$setVersion("v2.0.1")
myrec$setPublicationDate(Sys.Date()) 
myrec$prereserveDOI(FALSE) # necessary?

myrec2 <- zenodo$depositRecordVersion(
  myrec,
  delete_latest_files = TRUE,
  publish = FALSE
)

Resulting in:

[zen4R][INFO] ZenodoManager - Creating new version for record '7949074/versions/latest' (concept DOI: '10.5281/zenodo.3735140')
-> POST /api/deposit/depositions/7949074/versions/latest/actions/newversion HTTP/1.1
-> Host: zenodo.org
-> Accept-Encoding: deflate, gzip
-> Cookie: 5569e5a730cade8ff2b54f1e815f3670=55bec0eaecfcbf5692fa89fa4aad17e3
-> Accept: application/json, text/xml, application/xml, */*
-> User-Agent: zen4R_0.9.9000
-> Content-Type: application/json
-> Authorization: <your_token>
-> Content-Length: 2
->

{}

<- HTTP/1.1 404 NOT FOUND
<- server: nginx
<- date: Thu, 21 Dec 2023 07:27:14 GMT
<- content-type: application/json
<- transfer-encoding: chunked
<- vary: Accept-Encoding
<- access-control-allow-origin: *
<- access-control-expose-headers: Content-Type, ETag, Link, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
<- permissions-policy: interest-cohort=()
<- x-frame-options: sameorigin
<- x-xss-protection: 1; mode=block
<- x-content-type-options: nosniff
<- content-security-policy: default-src 'self' fonts.googleapis.com *.gstatic.com data: 'unsafe-inline' 'unsafe-eval' blob: zenodo-broker.web.cern.ch zenodo-broker-qa.web.cern.ch maxcdn.bootstrapcdn.com cdnjs.cloudflare.com ajax.googleapis.com webanalytics.web.cern.ch
<- strict-transport-security: max-age=31556926; includeSubDomains
<- referrer-policy: strict-origin-when-cross-origin
<- set-cookie: csrftoken=eyJhbGciOiJIUzUxMiIsImlhdCI6MTcwMzE0MzYzNCwiZXhwIjoxNzAzMjMwMDM0fQ.ImtPSTJuZTQ4RzV0a1k2SzgxdUpkdFdHUDVrdXlFcmFyIg.Dk2xBvMMQ_shaBiG1ObSjYpr1LyBZ1fkVgcfR-kQVDiJUBP72HPWOwswXpfOVmvgrQvtCCUGKg0fR4X7Q4pjXw; Expires=Thu, 28 Dec 2023 07:27:14 GMT; Max-Age=604800; Secure; Path=/; SameSite=Lax
<- content-encoding: gzip
<-
[zen4R][ERROR] ZenodoManager - Error while creating new version: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.

Based on the documentation, I explored the API with curl from the terminal and realized that the API does not accept "/api/deposit/depositions/7949074/versions/latest/actions". I can work around this by overriding the link of the latest version before the call:

myrec$links$latest <- "https://zenodo.org/api/records/7949074"
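
A minimal sketch of why the workaround above changes the request path, assuming the record id is taken from the "latest" link and that the original link ended in "/versions/latest" (inferred from the log message above, not checked against the zen4R source). The helper name `newversion_url` is hypothetical and not part of zen4R:

```r
# Hypothetical helper (not part of zen4R): extract the numeric record id
# from a "latest" link and build the new-version endpoint from it.
newversion_url <- function(latest_link, base = "https://zenodo.org/api") {
  # capture the digits that directly follow "/records/" or "/depositions/"
  m <- regmatches(
    latest_link,
    regexec("/(records|depositions)/([0-9]+)", latest_link)
  )[[1]]
  if (length(m) < 3) stop("no record id found in: ", latest_link)
  # m[3] is the second capture group, i.e. the numeric id only
  sprintf("%s/deposit/depositions/%s/actions/newversion", base, m[3])
}

# The assumed original link and the overridden link give the same endpoint:
newversion_url("https://zenodo.org/api/records/7949074/versions/latest")
newversion_url("https://zenodo.org/api/records/7949074")
```

Both calls produce a path without the "/versions/latest" segment, which is the form of the request that returned 201 CREATED in the second log.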

Running the code with the HACK on the link of the latest version ...

library(zen4R)
library(fs)
library(magrittr)

zenodo <- ZenodoManager$new(
  token = Sys.getenv("ZENODO_ACCESS_TOKEN"), 
  logger = "DEBUG" # or "INFO"
)
myrec <- zenodo$getDepositionByDOI("10.5281/zenodo.6546810")

myrec$setVersion("v2.0.0")
myrec$setPublicationDate(Sys.Date()) 
myrec$prereserveDOI(FALSE)

myrec$links$latest <- "https://zenodo.org/api/records/6546810" # !!!!!! HACK !!!!

myrec2 <- zenodo$depositRecordVersion(
  myrec,
  delete_latest_files = TRUE,
  publish = FALSE
)

... now yields:

[zen4R][INFO] ZenodoManager - Creating new version for record '6546810' (concept DOI: '10.5281/zenodo.3822638')
-> POST /api/deposit/depositions/6546810/actions/newversion HTTP/1.1
-> Host: zenodo.org
-> Accept-Encoding: deflate, gzip
-> Cookie: 5569e5a730cade8ff2b54f1e815f3670=55bec0eaecfcbf5692fa89fa4aad17e3; csrftoken=eyJhbGciOiJIUzUxMiIsImlhdCI6MTcwMzE0MzYzNCwiZXhwIjoxNzAzMjMwMDM0fQ.ImtPSTJuZTQ4RzV0a1k2SzgxdUpkdFdHUDVrdXlFcmFyIg.Dk2xBvMMQ_shaBiG1ObSjYpr1LyBZ1fkVgcfR-kQVDiJUBP72HPWOwswXpfOVmvgrQvtCCUGKg0fR4X7Q4pjXw
-> Accept: application/json, text/xml, application/xml, */*
-> User-Agent: zen4R_0.9.9000
-> Content-Type: application/json
-> Authorization: <your_token>
-> Content-Length: 2
->

{}

<- HTTP/1.1 201 CREATED
<- server: nginx
<- date: Thu, 21 Dec 2023 07:32:03 GMT
<- content-type: application/json
<- content-length: 3970
<- etag: "7"
<- x-ratelimit-limit: 1000
<- x-ratelimit-remaining: 995
<- x-ratelimit-reset: 1703143983
<- retry-after: 59
<- permissions-policy: interest-cohort=()
<- x-frame-options: sameorigin
<- x-xss-protection: 1; mode=block
<- x-content-type-options: nosniff
<- content-security-policy: default-src 'self' fonts.googleapis.com *.gstatic.com data: 'unsafe-inline' 'unsafe-eval' blob: zenodo-broker.web.cern.ch zenodo-broker-qa.web.cern.ch maxcdn.bootstrapcdn.com cdnjs.cloudflare.com ajax.googleapis.com webanalytics.web.cern.ch
<- strict-transport-security: max-age=31556926; includeSubDomains
<- referrer-policy: strict-origin-when-cross-origin
<- access-control-allow-origin: *
<- access-control-expose-headers: Content-Type, ETag, Link, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
<- strict-transport-security: max-age=15768000
<- x-request-id: 06230b16baa2fc5b6b4429f5bdc4af0a
<-
[zen4R][INFO] ZenodoRequest - Fetching https://zenodo.org/api/user/records?q=recid:10417081&size=10&page=1&allversions=1
-> GET /api/user/records?q=recid:10417081&size=10&page=1&allversions=1 HTTP/1.1
-> Host: zenodo.org
-> Accept-Encoding: deflate, gzip
-> Cookie: 5569e5a730cade8ff2b54f1e815f3670=55bec0eaecfcbf5692fa89fa4aad17e3; csrftoken=eyJhbGciOiJIUzUxMiIsImlhdCI6MTcwMzE0MzYzNCwiZXhwIjoxNzAzMjMwMDM0fQ.ImtPSTJuZTQ4RzV0a1k2SzgxdUpkdFdHUDVrdXlFcmFyIg.Dk2xBvMMQ_shaBiG1ObSjYpr1LyBZ1fkVgcfR-kQVDiJUBP72HPWOwswXpfOVmvgrQvtCCUGKg0fR4X7Q4pjXw
-> Accept: application/json, text/xml, application/xml, */*
-> User-Agent: zen4R_0.9.9000
-> Authorization: <your_token>
->
<- HTTP/1.1 200 OK
<- server: nginx
<- date: Thu, 21 Dec 2023 07:32:03 GMT
<- content-type: application/json
<- transfer-encoding: chunked
<- vary: Accept-Encoding
<- x-ratelimit-limit: 1000
<- x-ratelimit-remaining: 994
<- x-ratelimit-reset: 1703143984
<- retry-after: 60
<- permissions-policy: interest-cohort=()
<- x-frame-options: sameorigin
<- x-xss-protection: 1; mode=block
<- x-content-type-options: nosniff
<- content-security-policy: default-src 'self' fonts.googleapis.com *.gstatic.com data: 'unsafe-inline' 'unsafe-eval' blob: zenodo-broker.web.cern.ch zenodo-broker-qa.web.cern.ch maxcdn.bootstrapcdn.com cdnjs.cloudflare.com ajax.googleapis.com webanalytics.web.cern.ch
<- strict-transport-security: max-age=31556926; includeSubDomains
<- referrer-policy: strict-origin-when-cross-origin
<- access-control-allow-origin: *
<- access-control-expose-headers: Content-Type, ETag, Link, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
<- strict-transport-security: max-age=15768000
<- x-request-id: 19359d63d4648134c8f1bb8cd53ca62c
<- content-encoding: gzip
<-
[zen4R][INFO] ZenodoManager - Successfully fetched list of depositions (user records)!
[zen4R][WARN] ZenodoManager - No record for id '10417081'!
[zen4R][INFO] ZenodoManager - Successful new version record created for concept DOI '10.5281/zenodo.3822638'
-> POST /api/records HTTP/1.1
-> Host: zenodo.org
-> Accept-Encoding: deflate, gzip
-> Cookie: 5569e5a730cade8ff2b54f1e815f3670=55bec0eaecfcbf5692fa89fa4aad17e3; csrftoken=eyJhbGciOiJIUzUxMiIsImlhdCI6MTcwMzE0MzYzNCwiZXhwIjoxNzAzMjMwMDM0fQ.ImtPSTJuZTQ4RzV0a1k2SzgxdUpkdFdHUDVrdXlFcmFyIg.Dk2xBvMMQ_shaBiG1ObSjYpr1LyBZ1fkVgcfR-kQVDiJUBP72HPWOwswXpfOVmvgrQvtCCUGKg0fR4X7Q4pjXw
-> Accept: application/json, text/xml, application/xml, */*
-> User-Agent: zen4R_0.9.9000
-> Content-Type: application/json
-> Authorization: <your_token>
-> Content-Length: 2313
->

{
"stats": [
{
"downloads": 25895,
"unique_downloads": 16230,
"views": 9258,
"unique_views": 7139,
"version_downloads": 1647,
"version_unique_downloads": 1388,
"version_unique_views": 3504,
"version_views": 4547
}
],
"revision": 2,
"submitted": true,
"state": "done",
"status": "published",
"recid": "6546810",
"owners": [
{
"id": 80803
}
],
"modified": "2022-05-14T01:50:11.263852+00:00",
"metadata": {
"title": "GermaParl Sample Corpus",
"publication_date": "2023-12-21",
"description": "<p>The GermaParlSample Corpus is a small subset of the GermaParl corpus that has been prepared in the PolMine Project (http://polmine.github.io). The intended usage of the sample corpus is to explore the data format that has been linguistically annotated (using the TreeTagger) and imported into the Corpus Workbench (CWB), and to test functionality for automatic data retrieval from Zenodo. See the GermaParl documentation website (http://polmine.github.io/GermaParl) for further information.</p>\n\n<p>The purpose of GermaParlSample is to have a lightweight resource at Zenodo for testing purposes. The only reason why access to GermaParlSample v0.1.1 is limited is to have a version with restricted access, so that required cookies can be tested in the test suite of the R package 'cwbtools'. If you do not have access, you are not missing anything, v0.1.1 is identical with v0.1.0.</p>",
"access_right": "restricted",
"creators": [
{
"name": "Blätte, Andreas",
"affiliation": "University of Duisburg-Essen",
"orcid": "0000-0001-8970-8010"
}
],
"keywords": [
"corpus, Bundestag, parliamentary protocols"
],
"version": "v2.0.0",
"resource_type": {
"title": "Dataset",
"type": "dataset"
},
"relations": {
"version": [
{
"index": 2,
"is_last": true,
"parent": {
"pid_type": "recid",
"pid_value": "3822638"
}
}
]
},
"prereserve_doi": true
},
"doi_url": "https://doi.org/10.5281/zenodo.6546810",
"created": "2022-05-13T15:41:00.568217+00:00",
"conceptrecid": "3822638",
"conceptdoi": "10.5281/zenodo.3822638"
}

<- HTTP/1.1 201 CREATED
<- server: nginx
<- date: Thu, 21 Dec 2023 07:32:03 GMT
<- content-type: application/json
<- content-length: 3000
<- etag: "4"
<- x-ratelimit-limit: 1000
<- x-ratelimit-remaining: 993
<- x-ratelimit-reset: 1703143984
<- retry-after: 60
<- permissions-policy: interest-cohort=()
<- x-frame-options: sameorigin
<- x-xss-protection: 1; mode=block
<- x-content-type-options: nosniff
<- content-security-policy: default-src 'self' fonts.googleapis.com *.gstatic.com data: 'unsafe-inline' 'unsafe-eval' blob: zenodo-broker.web.cern.ch zenodo-broker-qa.web.cern.ch maxcdn.bootstrapcdn.com cdnjs.cloudflare.com ajax.googleapis.com webanalytics.web.cern.ch
<- strict-transport-security: max-age=31556926; includeSubDomains
<- referrer-policy: strict-origin-when-cross-origin
<- access-control-allow-origin: *
<- access-control-expose-headers: Content-Type, ETag, Link, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
<- strict-transport-security: max-age=15768000
<- x-request-id: b60d3bf03fe478f61fb79c4c27272137
<-
[zen4R][INFO] ZenodoManager - Successful record deposition
[zen4R][INFO] ZenodoManager - Deleting files copied from latest record

So is zen4R not yet aligned with the latest version of the API?

I was able to overcome this issue, but then I realized that significant parts of the record metadata are lost in the new record, such as resource type, creator, language, keywords and communities.
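
One way to pin down which fields go missing is to compare the metadata of the old and the new record. The following is only a sketch: `lost_fields` is a hypothetical helper, and it assumes that each record's metadata is available as a plain named list (for example via something like `myrec$metadata`, which I have not verified for this branch). The sample values are illustrative.

```r
# Hypothetical helper: names of metadata fields that are filled in the old
# record but missing or empty in the new one. Both arguments are named lists.
lost_fields <- function(old_meta, new_meta) {
  is_empty <- function(x) is.null(x) || length(x) == 0
  Filter(
    function(f) !is_empty(old_meta[[f]]) && is_empty(new_meta[[f]]),
    names(old_meta)
  )
}

# Illustrative input only, loosely modelled on the request body shown above
old_meta <- list(
  title    = "GermaParl Sample Corpus",
  keywords = list("corpus, Bundestag, parliamentary protocols"),
  creators = list(list(orcid = "0000-0001-8970-8010"))
)
new_meta <- list(
  title    = "GermaParl Sample Corpus",
  keywords = list()
)

lost_fields(old_meta, new_meta)
```

With this input the helper reports "keywords" and "creators", because the first is empty and the second is absent in the new list.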

Finally, I can use ZenodoManager$uploadFile() for small files, but for larger files I hit a rate limit that I cannot get around for my 1-3 GB files.

So I find the functionality of zen4R very, very useful, but I cannot use it as of now, unfortunately.

@eblondel
Owner

See #127 and specifically for record version deposition see #140

@eblondel
Owner

The Zenodo team has informed us that the new API release is still not stable. A large part of the migration has been done in #127, on a dedicated dev branch that is still in progress, but some parts are still missing.

@ablaette
Author

Thanks a lot for your quick reply, and my apologies that I failed to see #140. I understand the challenge you describe, so I will leave it at saying that I find your work incredibly useful. zen4R is a crucial building block for our RDM!

@eblondel
Owner

The method has been migrated through #133.
