-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migration: Fixing import issues / Speed up repeated unknown pod detection #8257
Conversation
spec/federation_callbacks_spec.rb
Outdated
diaspora_id = Fabricate.sequence(:diaspora_id) | ||
expect(Person).to receive(:find_or_fetch_by_identifier).with(diaspora_id) | ||
.and_raise(DiasporaFederation::Discovery::DiscoveryError) | ||
.and_return(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Layout/MultilineMethodCallIndentation: Align .and_return
with .with
on line 236.
Reminder: To enable/disable local caching on your dev machine you can toggle it (without changing configuration files) with: I also added a statement to the spec helpers to clear any caches between each test. |
e480452
to
c114386
Compare
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Layout/TrailingEmptyLines: Final newline missing.
c114386
to
4f663f7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a full review yet, just a few comments about some basic stuff which probably changes a lot of the rest of the PR anyway.
app/models/person.rb
Outdated
@@ -330,19 +330,31 @@ def exported_key | |||
serialized_public_key | |||
end | |||
|
|||
# discovery (webfinger) | |||
# discovery (webfinger), returns nil if not found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As already mentioned in #8264 (comment), this method works as intended and shouldn't be changed. It's important this raises (forwards) the DiscoveryError
, because multiple reasons:
- It preserves the original cause of what failed. When this just returns nil, it fails later with a different error and you don't immediately see that it just couldn't fetch the person.
- In most cases, raising is fine anyway, because without the person you can't continue with whatever should have been done. The archive-import is a bit of a special case, where you still want to import other stuff, and not stop the whole import process, but that problem should be solved at the import, not here.
- Caching here is also not a good idea, because if a received message can't be processed because the person couldn't be fetched (because maybe the other pod was just only restarted), sidekiq retries the receive a few minutes later, if this then still has the cached result, the whole retry would be pointless. The caching makes sense for the import (if the same person is tried to be fetched only seconds apart), but again, this should be solved for the import only, not here.
- Different errors are handled differently. As said above, received messages which fail because of the person not being able to be fetched are retried again a few times. With this (and all the related changes), this doesn't work anymore, because if this returns
nil
, andfetch_public_key
inconfig/initializers/diaspora_federation.rb
also returningnil
then (instead of raising the expectedDiscoveryError
), the receive job fails with aPublicKeyNotFound
, which isn't retried (because if a person can be fetched, because noDiscoveryError
was thrown, but doesn't have a valid key for whatever reason, a retry would fail again, so no need to retry). - There are probably more things that rely on
DiscoveryError
being raised (and this method not being cached), this was just what immediately came to my mind.
So there needs to be a way to fix the problems of the import so they only affect the import. Maybe move more logic into the validation step and remove entities that can't be imported, so stuff doesn't happen twice? Maybe a cache that is only populated during an import?
@@ -19,7 +21,9 @@ def fetch_parent(data) | |||
break entity_class::PARENT_TYPE if entity_class.const_defined?(:PARENT_TYPE) | |||
} | |||
entity = Diaspora::Federation::Mappings.model_class_for(type).find_by(guid: data.fetch(:parent_guid)) | |||
data[:parent] = Diaspora::Federation::Entities.related_entity(entity) | |||
raise NoParentError if entity.nil? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same question as in #8010 (comment) still exists. In theory this shouldn't happen, because in the validate step it already tries to fetch all the entities here
diaspora/lib/archive_validator/own_relayable_validator.rb
Lines 15 to 17 in f85f167
def by_guid(klass, guid) | |
klass.find_or_fetch_by(archive_author_diaspora_id, guid) | |
end |
using
diaspora/lib/diaspora/federated/fetchable.rb
Lines 9 to 17 in f85f167
def find_or_fetch_by(diaspora_id, guid) | |
instance = find_by(guid: guid) | |
return instance if instance.present? | |
DiasporaFederation::Federation::Fetcher.fetch_public(diaspora_id, to_s, guid) | |
find_by(guid: guid) | |
rescue DiasporaFederation::Federation::Fetcher::NotFetchable | |
nil | |
end |
which already returns nil when not being able to be fetched, and if the parent is nil, it's not valid:
diaspora/lib/archive_validator/relayable_validator.rb
Lines 16 to 18 in a3196a1
def validate | |
self.valid = parent_present? | |
end |
which then should remove all invalid entities during the validation phase, so they aren't tried to be actually imported:
diaspora/lib/archive_validator/collection_validator.rb
Lines 9 to 13 in f85f167
collection.keep_if do |item| | |
subvalidator = entity_validator.new(archive_hash, item) | |
messages.concat(subvalidator.messages) | |
subvalidator.valid? | |
end |
If this doesn't work as expected, it also means a lot of the things are done twice, which isn't intended (and could also lead to longer runtimes). If you have steps to reproduce this and can debug why entities that don't exist still end up at this point and aren't filtered out earlier already, it would be good to know and should be fixed there instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The part: find_by(guid: data.fetch(:parent_guid)) makes the problem here, not detected by the validator.
But we already talked that it seems to be inconvenient to fetch everything in the validation part.
@SuperTux88 I give up. Sorry - I didn't understand the validation process in depth so that I can check all needed authors, parents and entries and fix tests and implement all as a validation. |
9e2d729
to
93a776c
Compare
Fixing raising NotFetchable error Fixing not found a parent on entities
93a776c
to
6f23839
Compare
If I understand correctly, you since had a call with @SuperTux88 to solve this point. So what is the status of that pull request now? |
My status is, that first idea was to push checks into the validation part, which did not succeed because of needed code changes. Another (more inportant) point was that fetching everything during validation is not what a validation process should do.
After some code attempts, I implemented the smallest possible solution in this PR.
To get forward and have one working branch I also rebased the upload-migration file branch to this work.
Status is: it just works. (Until someone else Codes a „better“ solution).
… Am 14.09.2021 um 22:15 schrieb Fla ***@***.***>:
@SuperTux88 I give up. Sorry - I didn't understand the validation process in depth so that I can check all needed authors, parents and entries and fix tests and implement all as a validation.
If I understand correctly, you since had a call with @SuperTux88 to solve this point. So what is the status of that pull request now?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
@SuperTux88 As discussed, I have tried to make the smallest possible working solution without rebuilding the validation part. This solution gives what it should give. |
…and diaspora#8017 Just test for generic NotFetchable error, which also includes the root author failing to be fetched.
…and diaspora#8017 Just test for generic NotFetchable error, which also includes the root author failing to be fetched.
I ended up merging #8010 because it had separate commits for the two different problems and also had a test for both problems. But I merged in the test from here since this was more generic (tests generic fetch problem and not only discovery errors). Thanks for the work on this anyway since we finally found an answer for the original question in that PR 🍪 |
Awesome to finally see this merged! |
Thanks for your analysis - to finally found a solution. |
It handles some migration issues listed in #8256, please check @cmrd-senya, @jhass, @Flaburgan .
(Import stops when posts comments or likes don't have a parent/author),
It handles #8019 (caching person fetches from remotes)
For this reason I changed the Person.find_or_fetch_by_identifier function. It now won't raise a NotFetchable exception, it just returns nil or the person object. Some callers rely on nil (which was never returned), some other rely on exception. IMHO this function should return an existing person from local store or by fetching, but never raise an exception.
Now, Person.find_or_fetch_by_identifier returns a person, or nil. Callers are adapted to this behaviour.
@cmrd-senya : I left over a failing test, which I can't fix, maybe you have on eye on this?
@jhass : can you check import - times with your archive file?