Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migration: Fixing import issues / Speed up repeated unknown pod detection #8257

Closed
wants to merge 1 commit into from

Conversation

tclaus
Copy link
Member

@tclaus tclaus commented Jun 23, 2021

It handles some migration issues listed in #8256, please check @cmrd-senya, @jhass, @Flaburgan .
(Import stops when posts comments or likes don't have a parent/author),
It handles #8019 (caching person fetches from remotes)

For this reason I changed the Person.find_or_fetch_by_identifier function. It now won't raise a NotFetchable exception, it just returns nil or the person object. Some callers rely on nil (which was never returned), some other rely on exception. IMHO this function should return an existing person from local store or by fetching, but never raise an exception.
Now, Person.find_or_fetch_by_identifier returns a person, or nil. Callers are adapted to this behaviour.

@cmrd-senya : I left over a failing test, which I can't fix, maybe you have on eye on this?
@jhass : can you check import - times with your archive file?

diaspora_id = Fabricate.sequence(:diaspora_id)
expect(Person).to receive(:find_or_fetch_by_identifier).with(diaspora_id)
.and_raise(DiasporaFederation::Discovery::DiscoveryError)
.and_return(nil)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/MultilineMethodCallIndentation: Align .and_return with .with on line 236.

@tclaus
Copy link
Member Author

tclaus commented Jun 23, 2021

Reminder: To enable/disable local caching on your dev machine you can toggle it (without changing configuration files) with:
$: rails dev:cache

I also added a statement to the spec helpers to clear any caches between each test.

@Flaburgan Flaburgan added this to the 0.8.0.0 milestone Jun 28, 2021
@tclaus tclaus changed the title Fixing import migration issues Migration: Fixing import issues / Speed up repeated unknown pod detection Jun 28, 2021
@tclaus tclaus force-pushed the 8256_fixing_migration_issues branch from e480452 to c114386 Compare July 1, 2021 07:00
end
end
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/TrailingEmptyLines: Final newline missing.

Copy link
Member

@SuperTux88 SuperTux88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a full review yet, just a few comments about some basic stuff which probably changes a lot of the rest of the PR anyway.

@@ -330,19 +330,31 @@ def exported_key
serialized_public_key
end

# discovery (webfinger)
# discovery (webfinger), returns nil if not found
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already mentioned in #8264 (comment), this method works as intended and shouldn't be changed. It's important this raises (forwards) the DiscoveryError, because multiple reasons:

  • It preserves the original cause of what failed. When this just returns nil, it fails later with a different error and you don't immediately see that it just couldn't fetch the person.
  • In most cases, raising is fine anyway, because without the person you can't continue with whatever should have been done. The archive-import is a bit of a special case, where you still want to import other stuff, and not stop the whole import process, but that problem should be solved at the import, not here.
  • Caching here is also not a good idea, because if a received message can't be processed because the person couldn't be fetched (because maybe the other pod was just only restarted), sidekiq retries the receive a few minutes later, if this then still has the cached result, the whole retry would be pointless. The caching makes sense for the import (if the same person is tried to be fetched only seconds apart), but again, this should be solved for the import only, not here.
  • Different errors are handled differently. As said above, received messages which fail because of the person not being able to be fetched are retried again a few times. With this (and all the related changes), this doesn't work anymore, because if this returns nil, and fetch_public_key in config/initializers/diaspora_federation.rb also returning nil then (instead of raising the expected DiscoveryError), the receive job fails with a PublicKeyNotFound, which isn't retried (because if a person can be fetched, because no DiscoveryError was thrown, but doesn't have a valid key for whatever reason, a retry would fail again, so no need to retry).
  • There are probably more things that rely on DiscoveryError being raised (and this method not being cached), this was just what immediately came to my mind.

So there needs to be a way to fix the problems of the import so they only affect the import. Maybe move more logic into the validation step and remove entities that can't be imported, so stuff doesn't happen twice? Maybe a cache that is only populated during an import?

@@ -19,7 +21,9 @@ def fetch_parent(data)
break entity_class::PARENT_TYPE if entity_class.const_defined?(:PARENT_TYPE)
}
entity = Diaspora::Federation::Mappings.model_class_for(type).find_by(guid: data.fetch(:parent_guid))
data[:parent] = Diaspora::Federation::Entities.related_entity(entity)
raise NoParentError if entity.nil?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same question as in #8010 (comment) still exists. In theory this shouldn't happen, because in the validate step it already tries to fetch all the entities here

def by_guid(klass, guid)
klass.find_or_fetch_by(archive_author_diaspora_id, guid)
end

using

def find_or_fetch_by(diaspora_id, guid)
instance = find_by(guid: guid)
return instance if instance.present?
DiasporaFederation::Federation::Fetcher.fetch_public(diaspora_id, to_s, guid)
find_by(guid: guid)
rescue DiasporaFederation::Federation::Fetcher::NotFetchable
nil
end

which already returns nil when not being able to be fetched, and if the parent is nil, it's not valid:

def validate
self.valid = parent_present?
end

which then should remove all invalid entities during the validation phase, so they aren't tried to be actually imported:

collection.keep_if do |item|
subvalidator = entity_validator.new(archive_hash, item)
messages.concat(subvalidator.messages)
subvalidator.valid?
end

If this doesn't work as expected, it also means a lot of the things are done twice, which isn't intended (and could also lead to longer runtimes). If you have steps to reproduce this and can debug why entities that don't exist still end up at this point and aren't filtered out earlier already, it would be good to know and should be fixed there instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part: find_by(guid: data.fetch(:parent_guid)) makes the problem here, not detected by the validator.
But we already talked that it seems to be inconvenient to fetch everything in the validation part.

@tclaus
Copy link
Member Author

tclaus commented Jul 20, 2021

@SuperTux88 I give up. Sorry - I didn't understand the validation process in depth so that I can check all needed authors, parents and entries and fix tests and implement all as a validation.

@tclaus tclaus force-pushed the 8256_fixing_migration_issues branch from 9e2d729 to 93a776c Compare August 29, 2021 16:35
Fixing raising NotFetchable error
Fixing not found a parent on entities
@Flaburgan
Copy link
Member

@SuperTux88 I give up. Sorry - I didn't understand the validation process in depth so that I can check all needed authors, parents and entries and fix tests and implement all as a validation.

If I understand correctly, you since had a call with @SuperTux88 to solve this point. So what is the status of that pull request now?

@tclaus
Copy link
Member Author

tclaus commented Sep 15, 2021 via email

@tclaus
Copy link
Member Author

tclaus commented Sep 18, 2021

@SuperTux88 As discussed, I have tried to make the smallest possible working solution without rebuilding the validation part. This solution gives what it should give.
With this in mind - could you check this again and possible approve (and merge) this?
It will be let the #8274 working.

SuperTux88 pushed a commit to cmrd-senya/diaspora that referenced this pull request Sep 18, 2021
…and diaspora#8017

Just test for generic NotFetchable error, which also includes the root
author failing to be fetched.
SuperTux88 pushed a commit to cmrd-senya/diaspora that referenced this pull request Sep 18, 2021
…and diaspora#8017

Just test for generic NotFetchable error, which also includes the root
author failing to be fetched.
@SuperTux88
Copy link
Member

I ended up merging #8010 because it had separate commits for the two different problems and also had a test for both problems. But I merged in the test from here since this was more generic (tests generic fetch problem and not only discovery errors).

Thanks for the work on this anyway since we finally found an answer for the original question in that PR 🍪

@Flaburgan
Copy link
Member

Awesome to finally see this merged!

@tclaus
Copy link
Member Author

tclaus commented Sep 18, 2021

Thanks for your analysis - to finally found a solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants