Double image description #1536

Open
Inbefortus opened this issue Sep 5, 2021 · 4 comments · May be fixed by #1816
Assignees
Labels
bug question wikimedia Direct impact on Wikimedia content scraping
Milestone

Comments

@Inbefortus

ZIM: 2021-08 German Wikipedia

Some images are shown with two descriptions where there should be only one (not every image is affected).

http://library.kiwix.org/wikipedia_de_all_maxi/A/Belgien
http://library.kiwix.org/wikipedia_de_all_maxi/A/Nationalpark_Eifel

Screenshot_20210905-093214_Samsung Internet
Screenshot_20210905-093145_Samsung Internet
Screenshot_20210905-094031_Samsung Internet

@kelson42
Collaborator

kelson42 commented Sep 5, 2021

Interesting, probably a bug in MWoffliner linked to some kind of change in HTML output.

@kelson42 kelson42 added bug wikimedia Direct impact on Wikimedia content scraping question labels Sep 5, 2021
@stale

stale bot commented Nov 9, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

@kelson42
Collaborator

@pavel-karatsiuba Can you please explain and fix this edge scenario?

@kelson42 kelson42 added this to the 1.13.0 milestone Mar 23, 2023
@pavel-karatsiuba pavel-karatsiuba linked a pull request Mar 23, 2023 that will close this issue
@pavel-karatsiuba
Contributor

In some cases, the API returns the same description for the image and for its parent element.
I have added a fix that checks whether the two texts are identical. If they are, only one of them is shown.
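
For readers following along, the check described above could look roughly like the sketch below. This is not the code from the linked pull request: `normalizeCaption` and `dedupeCaptions` are hypothetical names, and comparing the texts after collapsing whitespace is an assumption about what "the same texts" means here.

```typescript
// Hypothetical helpers for illustration only; not MWoffliner's actual code.

// Collapse runs of whitespace and trim, so that captions differing only in
// spacing or line breaks are treated as the same text (an assumption).
function normalizeCaption(text: string): string {
  return text.replace(/\s+/g, ' ').trim();
}

// Given the caption texts collected for one image (for example, the image's
// own description and the description of its parent element), return each
// distinct non-empty text once, in the original order.
function dedupeCaptions(captions: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const caption of captions) {
    const key = normalizeCaption(caption);
    if (key === '' || seen.has(key)) {
      continue; // skip empty texts and repeats
    }
    seen.add(key);
    result.push(key);
  }
  return result;
}
```

With this approach, two identical descriptions collapse to one, while genuinely different texts for the image and its parent are both kept.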

@stale stale bot removed the stale label Mar 23, 2023
@kelson42 kelson42 modified the milestones: 1.13.0, 1.14.0 May 3, 2023
@VadimKovalenkoSNF VadimKovalenkoSNF self-assigned this Sep 18, 2023
@kelson42 kelson42 modified the milestones: 2.1.0, 1.14.0, 2.0.0 Sep 18, 2023

4 participants