Add endpoints to read pages from older crawl WACZs into database #1562

Merged: 15 commits, Mar 19, 2024

Commits on Mar 13, 2024

  1. Stream crawl pages synchronously with remotezip

    And remove async streaming/zip parsing methods
    tw4l committed Mar 13, 2024
    d3f7e61
  2. a0a72cc
  3. Fix linting

    tw4l committed Mar 13, 2024
    9d0cef6
  4. Touchups

    tw4l committed Mar 13, 2024
    49d3674
  5. 5ee7a3f
  6. Add ZipInfo type annotations

    tw4l committed Mar 13, 2024
    978191d
  7. Remove aiostream dependency

    tw4l committed Mar 13, 2024
    81a8365
  8. Bump CURR_DB_VERSION

    tw4l committed Mar 13, 2024
    c75ffe4
  9. 6745b77
  10. Add tests

    tw4l committed Mar 13, 2024
    d1db0b4
  11. Write pages to db in batches

    tw4l committed Mar 13, 2024
    1f00003
  12. Linting fixes

    tw4l committed Mar 13, 2024
    ee764dc

Commits on Mar 14, 2024

  1. c6a887b

Commits on Mar 18, 2024

  1. 720ba0c
  2. Fix typo

    tw4l committed Mar 18, 2024
    5fa5544