New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix restricted dois problem #1724
fix restricted dois problem #1724
Conversation
8b4e4c7
to
7eba358
Compare
ed51f3a
to
ab061d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! This safely skips registering DOIs with DataCite on restricted records, and hides DOI metadata when users have changed the permissions. It also allows for DOIs to be registered when a user changes a restricted record to public or when an embargo expires (though maybe that should be tested just to be sure).
# Set metadata | ||
doc = self.serializer.dump_obj(record) | ||
self.client.api.update_doi(metadata=doc, doi=pid.pid_value, url=url) | ||
if hide: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be able to update a doi even if it is is hidden? Also, how do I restore a hidden doi? Currently, if I change from restricted to public then the update method will update the metadata but not make the doi findable again, unless I miss something...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, good catch. it should be that the hided should be put back to "findable" after a change from "restricted" to "public". i will have a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's going to be a tricky one, since I don't think we want to run show_doi
for every single public DOI update.
I can think of two solutions:
- I can add a
state
option to theupdate_doi
function in the datacite library. Then I could pass along all the updates to DataCite in one call - We can modify pidstore so it has a
hidden
state. Then we could check locally whether the state of the DOI needs to be changed.
Let me know what you think or if you want to chat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I think that should work too. Just make sure DataCite doesn't complain about having the event as "publish" for a published doi. Looks like the tests need to updated too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If DataCite indeed doesn't complain about the "publish" events that should be good. Otherwise, we need some kind of state, maybe the caller passes the old record value, to call the publish/show doi only on the change of permission.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How i tested it: DataCite doesn't complain.
b07ef89
to
a4402bc
Compare
if hide: | ||
self.client.api.hide_doi(doi=pid.pid_value) | ||
else: | ||
doc = self.serializer.dump_obj(record) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We reviewed this with @zzacharo and we think that when you change from restricted to public, here we are not calling .register()
, so the local PID will still have the status PIDStatus.NEW
or PIDStatus.RESERVED
.
We are missing a test that will test the change from restricted to public and viceversa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the pid register method is already called, on the publish. that is independent of the way datacite is handling the restriction access setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this call. That will not be called if the record is restricted. Isn't this suppose to register the "doi" PID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that call won't happen until the record is public. The "doi" pid doesn't need to be registered until the DOI is actually sent to DataCite. So every time there are changes to the record it will try to register the DOI, but it will only get registered if the record is public. A test is probably a good idea though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this call. That will not be called if the record is restricted. Isn't this suppose to register the "doi" PID?
what do you mean by "register" the doi PID?
as Tom said, the doi should only be set to R in pidstore_pid
if it has been registered at datacite. and that isn't the case if the record is set to restricted
. this means that the call in the register_or_update task will register the pid DOI at datacite after the change from restricted
to public
We reviewed this with @zzacharo and we think that when you change from restricted to public, here we are not calling
.register()
, so the local PID will still have the statusPIDStatus.NEW
orPIDStatus.RESERVED
. We are missing a test that will test the change from restricted to public and viceversa.
is this test what you had in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean using the DatacitProvider
directly without going through the service :) It is not the usual/recommended way but as an admin you might do that and I would rather have an extra check to register the local PID doi before going to datacite. The normal flow via services is working correctly no problem with that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then yes you are right, we have to add a if to register the pid if it has not been registered at that point. which makes the whole code really messy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the pid.register()
method already checks if a pid is registered otherwise do nothing.... Maybe I am overthinking it but I was just triggered by the fact that such a check was present in the register method..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be honest, in my point of view it is wrong to register the DOI pid in pidstore_pid
within the provider class. the provider should only take care what's associated with the provider. so the DataCitePIDProvider.register
should not know about the internal behavior and should only be called if it has been registered internally. But that is another discussion.
yes pid.register
takes care of knowing if it has already bin registered but this is not called in DataCitePIDProvider.update
as you pointed out. So if a record is set to RESTRICTED
the DOI PID will not be set to R in pidstore_pid
. If someone is using the DataCitePIDProvider.update
method then directly the DOI will be registered at DataCite and will remain K
in pidstore_pid
. if the record is then set to public
the code will mostly work, because the public_doi
method of the datacite rest client does more or less the same as the update does, it uses the event: publish
to register the doi.
the thing is, is it a problem if the doi is in state K
in pidstore_pid
for this scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the /doi link won't work if the identifier is in state K
. I guess we could add a check to pidstore and then register if needed, though it's a lot of extra calls to take care of someone who's not really following the normal flow.
self.client.api.update_doi(metadata=doc, doi=pid.pid_value, url=url) | ||
if hide: | ||
self.client.api.hide_doi(doi=pid.pid_value) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way could be to delegate the state to DataCite: we could first read/get from DataCite, check if it is hidden or not, and therefore add the event: publish
or not. We are in a celery task so it could work.
However, we need to take into account that DataCite might not respond, and what to do in that case.
At the same time, I think that it would make sense to have an update
method that receive the old and the new version of the record, so the implementation could use the diff to make choices. It feels like a much bigger change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which concerns do you have with having always the event: publish
there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in my opinion it would complicate the whole process if it has first to be checked on DataCite if the record is hidden. The state management will be more complex and we need at least one request to get the information. Yes one request is not that much but scaled on multiple InvenioRDM instances with multiple records DataCite has to handle a lot more requests.
fe042f8
to
54bc272
Compare
8a82551
to
3714002
Compare
768b222
to
2175536
Compare
...mantic-ui/js/invenio_rdm_records/src/deposit/fields/AccessField/components/MetadataAccess.js
Outdated
Show resolved
Hide resolved
...mantic-ui/js/invenio_rdm_records/src/deposit/fields/AccessField/components/MetadataAccess.js
Outdated
Show resolved
Hide resolved
...mantic-ui/js/invenio_rdm_records/src/deposit/fields/AccessField/components/MetadataAccess.js
Outdated
Show resolved
Hide resolved
...mantic-ui/js/invenio_rdm_records/src/deposit/fields/AccessField/components/MetadataAccess.js
Outdated
Show resolved
Hide resolved
...mantic-ui/js/invenio_rdm_records/src/deposit/fields/AccessField/components/MetadataAccess.js
Outdated
Show resolved
Hide resolved
acc0383
to
b3aeaf0
Compare
92e3ccf
to
eb8bcef
Compare
@@ -21,7 +21,7 @@ | |||
class PIDManager: | |||
"""RDM PIDs Manager.""" | |||
|
|||
def __init__(self, providers, required_schemes=None): | |||
def __init__(self, providers, required_schemes=[]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any reason why we changed this to mutable argument? it is considered not to be a best practice
eb8bcef
to
2e0a53a
Compare
* fix bug for not existing docker-compose, use docker version instead * reorder steps to first setup and run backend and then setup and run frontend Co-authored-by: jrcastro2 <jrcastro9515@gmail.com> Co-authored-by: anikachurilova <nyuta.churilova@gmail.com>
2e0a53a
to
f213917
Compare
this PR will close inveniosoftware/product-rdm#178
still open Problem:
the DOI is shown on the landing page, because the PID exists in the databasethe DOI widget in the frontend has to be grayed out if the access is set to restrictedATTENTION:
NOTE:
restricted
topublic
and mint the DOI with this step. No further implementation or work from the user would be necessaryUI:
after grace period of a public record ended, the restriction button is disabled + the popup message