OAK-10778 - Support downloading from Mongo in parallel. #1435

Merged 24 commits on May 13, 2024

Commits on Apr 25, 2024

  1. Support downloading from Mongo in parallel. Adds new boolean system property: oak.indexer.pipelined.mongoParallelDump.

    nfsantos committed Apr 25, 2024
    e903c7a
  2. Add missing license headers.

    nfsantos committed Apr 25, 2024
    a074ea0
  3. 5f58aa9
  4. Retrieve from system properties the version of the Mongo docker image to be used for tests.

    nfsantos committed Apr 25, 2024
    65387ce
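
The first commit of Apr 25 introduces a boolean system property. A minimal sketch of how such a flag could be read follows. The class and method names are hypothetical, and the default of `false` when the property is unset is an assumption (it is simply what `Boolean.getBoolean` returns), not something the commit message states.

```java
/** Hypothetical helper; only the property name comes from the commit message. */
final class ParallelDumpConfig {
    static final String PARALLEL_DUMP_PROPERTY = "oak.indexer.pipelined.mongoParallelDump";

    /**
     * Returns true only if the system property is set to "true" (case-insensitive).
     * Assumption: an unset property means the parallel download is disabled.
     */
    static boolean isParallelDumpEnabled() {
        return Boolean.getBoolean(PARALLEL_DUMP_PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println("parallel dump enabled: " + isParallelDumpEnabled());
    }
}
```

Under this sketch the flag would be switched on by passing `-Doak.indexer.pipelined.mongoParallelDump=true` to the JVM.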

Commits on Apr 26, 2024

  1. 45623cf
  2. Improve documentation and replace use of var keyword by explicit type declaration where this makes code more clear.

    nfsantos committed Apr 26, 2024
    934bea6

Commits on Apr 29, 2024

  1. 3440733
  2. 55b1157
  3. c412f2f

Commits on Apr 30, 2024

  1. b2684bd
  2. Reduce frequency of logging progress messages in the downloader from every 10k to every 20k.

    nfsantos committed Apr 30, 2024
    051cd51

Commits on May 2, 2024

  1. When downloading the full range of values in _modified, use $gte(0) and $lte(Long.MAX_VALUE) instead of $exists(_modified). $exists also checks for the property being equal to null, which cannot be verified just by looking at an index, because indexes in MongoDB do not contain null values. Using $exists requires retrieving the full document from the column store, which dramatically slows down the traversal.

    nfsantos committed May 2, 2024
    f5fd28c
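
To make the two query shapes in the May 2 commit concrete, here is a sketch that builds both filter documents as plain Java maps. The class and method names are hypothetical, and the maps only mirror the JSON shape of the queries; this is not the code from the PR. With the MongoDB Java driver, the range form would presumably be written with `Filters.gte` and `Filters.lte` rather than hand-built maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the two filter shapes; not the PR's code. */
final class ModifiedFilterShapes {

    /** Range form from the commit: {_modified: {$gte: 0, $lte: Long.MAX_VALUE}}. */
    static Map<String, Object> fullRange() {
        Map<String, Object> bounds = new LinkedHashMap<>();
        bounds.put("$gte", 0L);
        bounds.put("$lte", Long.MAX_VALUE);
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_modified", bounds);
        return filter;
    }

    /** Form being replaced: {_modified: {$exists: true}}. */
    static Map<String, Object> existsForm() {
        Map<String, Object> condition = new LinkedHashMap<>();
        condition.put("$exists", true);
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_modified", condition);
        return filter;
    }

    public static void main(String[] args) {
        System.out.println(fullRange());
        System.out.println(existsForm());
    }
}
```

Both filters are meant to select documents that carry a `_modified` value; per the commit message, the range form is preferred because it can be answered from the index alone.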

Commits on May 6, 2024

  1. 3fd5c3c

Commits on May 7, 2024

  1. Improve error handling when parallel download fails with some exception. Ensures that both download threads are shut down gracefully.

    Small refactoring.
    nfsantos committed May 7, 2024
    63254c2
  2. 70f431e
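
The error-handling commit of May 7 describes shutting down both download threads when one of them fails. One common way to get that behavior is sketched below with a two-thread executor. All names are hypothetical, and the PR may well implement this differently; the sketch only illustrates the idea of failing fast and then stopping the surviving task.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: run two download tasks, stop both if either fails. */
final class ParallelDownloadSketch {

    static void runBoth(Callable<Void> first, Callable<Void> second) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ExecutorCompletionService<Void> completion = new ExecutorCompletionService<>(pool);
        try {
            completion.submit(first);
            completion.submit(second);
            // Take results in completion order so a failure is seen as soon as it happens.
            for (int i = 0; i < 2; i++) {
                try {
                    completion.take().get();
                } catch (ExecutionException e) {
                    throw new IllegalStateException("Download task failed", e.getCause());
                }
            }
        } finally {
            // Interrupts any task still running; a no-op if both already finished.
            pool.shutdownNow();
            // The 10 second bound is an arbitrary choice for this sketch.
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }
}
```

Because the cleanup sits in a `finally` block, the surviving task is interrupted and awaited whether the failure comes from the first or the second task.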

Commits on May 8, 2024

  1. Apply review comments.

    nfsantos committed May 8, 2024
    a1bc963

Commits on May 10, 2024

  1. 8aceb22
  2. Add a test for Pipelined strategy where the mongo filter does not match any documents.

    Add additional comments.
    nfsantos committed May 10, 2024
    9e1a1a3
  3. Make NodeDocumentCodec thread safe and simplify logic of switch statement to reduce its size.

    nfsantos committed May 10, 2024
    d76d5fe
  4. 0d6f607
  5. Address review comments.

    nfsantos committed May 10, 2024
    7dad006
  6. Simplify logic.

    nfsantos committed May 10, 2024
    d7d2f12
  7. Fix: Download only documents with _modified, also when doing a column traversal (retry on connection errors set to false). Documents without a _modified field are not needed to build the FFS, and this way there is no need for a null check when calling getModified on the documents.

    nfsantos committed May 10, 2024
    1dcb944

Commits on May 13, 2024

  1. d3f2377