Rework Gem2Go spider (was riskommunal) #241

Merged
merged 5 commits into from Apr 29, 2024

@nblock (Member) commented Apr 28, 2024

  • Rename from Riskommunal to Gem2Go
  • Support multiple versions of the same "CMS"
  • Remove broken icon URL

GEM2GO seems to be the product name, hence the rename.

The icon URL requires parameters, and those links end up broken in the feed
XML. Remove the icon instead of writing a broken URL.

Their "CMS" seems to render article dates on the overview page, on the
article page, or somewhat randomly on either of them.

Sites run different versions of the same "CMS". Extract the "news"
containers first, then scrape the article URL and the publication date
from each container.
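
The container-first approach described above can be sketched roughly as follows. This is an illustration only, not the spider's actual code: the container paths, the `date` class name, the dd.mm.yyyy date format and the sample markup are all assumptions, and it uses the standard library's ElementTree on well-formed markup to stay self-contained.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical container paths, one per assumed "CMS" version.
CONTAINER_PATHS = [
    ".//div[@class='news-item']",
    ".//li[@class='newsentry']",
]


def parse_date(text):
    """Parse an assumed dd.mm.yyyy date; return None if absent or malformed."""
    if not text:
        return None
    try:
        return datetime.strptime(text.strip(), "%d.%m.%Y").date()
    except ValueError:
        return None


def extract_articles(markup):
    """Return (article URL, publication date or None) per news container."""
    root = ET.fromstring(markup.strip())
    articles = []
    for path in CONTAINER_PATHS:
        for container in root.findall(path):
            link = container.find(".//a[@href]")
            if link is None:
                continue  # container without an article link
            date_node = container.find(".//span[@class='date']")
            date_text = date_node.text if date_node is not None else None
            # A None date means the overview page did not render it; the
            # caller would have to look it up on the article page instead.
            articles.append((link.get("href"), parse_date(date_text)))
    return articles


# Made-up overview page mixing two assumed container layouts.
SAMPLE = """
<html><body>
  <div class="news-item"><a href="/news/1">One</a><span class="date">28.04.2024</span></div>
  <div class="news-item"><a href="/news/2">Two</a></div>
  <ul><li class="newsentry"><a href="/news/3">Three</a><span class="date">29.04.2024</span></li></ul>
</body></html>
"""

for url, published in extract_articles(SAMPLE):
    print(url, published)
```

Working per container keeps each URL paired with its own date, and a missing date on the overview page surfaces as `None` rather than shifting the dates of the following articles.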
@nblock self-assigned this Apr 28, 2024
@nblock requested a review from Lukas0907 April 28, 2024 16:08
@nblock merged commit 2e6372c into master Apr 29, 2024
5 checks passed
@nblock deleted the next branch April 29, 2024 06:31