Improve metadata update handling in repository_migration script. #387
We improve performance and make it possible to index without blobs by patching the Catalog so that it updates only the relevant metadata instead of the full list of metadata.
The Catalog always updates all available metadata when an object is reindexed with `update_metadata` enabled. There is no way to update only specific metadata values of an object: updating metadata means that all metadata is recalculated for the object. This process takes a lot of time and is mostly unnecessary. In addition, it requires access to the blobs of documents, because some metadata properties rely on blob information (filesize). To speed up the migration and to be able to run it without blobs, we patch the catalog to update only the relevant metadata items.
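The idea can be illustrated with a minimal, self-contained sketch. It does not use the real Catalog API and is not the patch from this pull request: `ToyCatalog`, `Document`, `write_metadata` and `only_metadata` are hypothetical names invented for the example. It only shows how a temporary patch can restrict the update to selected metadata columns so that blob-dependent values such as filesize are never recomputed.

```python
from contextlib import contextmanager


class ToyCatalog:
    """Hypothetical stand-in for a catalog; NOT the real Catalog API."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.records = {}  # uid -> {column name: stored value}

    def write_metadata(self, uid, obj):
        # Default behaviour: recompute every metadata column.
        record = self.records.setdefault(uid, {})
        for name in self.columns:
            record[name] = getattr(obj, name)()


class Document:
    """Hypothetical content object whose filesize depends on its blob."""

    def __init__(self, title, blob=None):
        self._title = title
        self._blob = blob

    def Title(self):
        return self._title

    def filesize(self):
        # Raises TypeError when the blob is missing (blob is None).
        return len(self._blob)


@contextmanager
def only_metadata(catalog, names):
    """Temporarily make the catalog update only the given columns."""

    def patched(uid, obj):
        record = catalog.records.setdefault(uid, {})
        for name in names:
            record[name] = getattr(obj, name)()

    # Shadow the class method with an instance attribute.
    catalog.write_metadata = patched
    try:
        yield catalog
    finally:
        # Remove the instance attribute so the class method is used again.
        del catalog.write_metadata


catalog = ToyCatalog(["Title", "filesize"])
catalog.records["doc-1"] = {"Title": "Old title", "filesize": 2048}
doc = Document("New title")  # no blob available during the migration

with only_metadata(catalog, ["Title"]):
    catalog.write_metadata("doc-1", doc)
```

Inside the `with` block only `Title` is recomputed, so the stored `filesize` is left untouched and the missing blob is never accessed. Outside the block the toy catalog falls back to recomputing every column, which would fail for this document.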