Rework Gem2Go spider (was riskommunal) #241

Merged
merged 5 commits into from
Apr 29, 2024

Commits on Apr 28, 2024

  1. Rename riskommunal to gem2go

    It seems that GEM2GO is the product name.
    nblock committed Apr 28, 2024
    a4f9d7b
  2. 63602b8
  3. gem2go: remove broken icon url

    The icon URL requires parameters and those links are broken in the feed
    XML. Remove it instead of writing a broken URL.
    nblock committed Apr 28, 2024
    a8c44b1
  4. gem2go: scrape date either overview or article page

    Their "CMS" seems to render article dates on the overview page, the
    article page or somewhat randomly on either of them.
    nblock committed Apr 28, 2024
    642ab17
  5. gem2go: add support for just another version

    Sites use different versions of the same "CMS". Extract the "news"
    container first and for each container scrape the article URL and the
    publication date.
    nblock committed Apr 28, 2024
    f99dd12
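
The approach described in commits 4 and 5 (extract each "news" container first, then scrape the article URL and the publication date per container, falling back to the article page when the overview carries no date) might look roughly like the sketch below. It is not the code from this pull request: it uses only the Python standard library rather than the spider's real scraping framework, and the markup, the `news` and `date` class names, the `%d.%m.%Y` date format, and the `fetch_article` callback are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical, well-formed overview markup; real GEM2GO pages differ by version.
OVERVIEW = """
<div>
  <div class="news"><a href="/news/1">First</a><span class="date">28.04.2024</span></div>
  <div class="news"><a href="/news/2">Second</a></div>
</div>
"""

# Hypothetical article pages keyed by URL, standing in for follow-up requests.
ARTICLES = {
    "/news/1": "<div><h1>First</h1></div>",
    "/news/2": '<div><h1>Second</h1><span class="date">27.04.2024</span></div>',
}


def find_date(node):
    """Return the first parseable date found below node, or None."""
    for span in node.iter("span"):
        if span.get("class") == "date" and span.text:
            try:
                # Assumed date format (day.month.year).
                return datetime.strptime(span.text.strip(), "%d.%m.%Y").date()
            except ValueError:
                continue
    return None


def scrape(overview_html, fetch_article):
    """Yield (url, date) per news container, preferring the overview date."""
    root = ET.fromstring(overview_html)
    for container in root.iter("div"):
        if container.get("class") != "news":
            continue
        link = container.find(".//a")
        if link is None or not link.get("href"):
            continue
        url = link.get("href")
        date = find_date(container)
        if date is None:
            # Fall back to the article page when the overview has no date.
            date = find_date(ET.fromstring(fetch_article(url)))
        yield url, date


results = list(scrape(OVERVIEW, ARTICLES.__getitem__))
```

Scoping both lookups to one container keeps each URL paired with its own date, which seems to be the point of extracting the container first when sites render the date in different places.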