On the same site, different recognition of encoding #391

Open
butaford opened this issue Dec 24, 2023 · 2 comments
Labels
bug (Something isn't working), upstream

Comments

@butaford

Hello. RSSTT is installed in Docker. The latest versions do not display Cyrillic text correctly. On version 2.4 I get this:
[screenshot: pic-20231224-184840]

On version 2.2 it looks like this:
[screenshot: pic-20231224-184812]

Links to check:
Displays correctly in all versions:
http://iptvin.ru/component/jcomments/?task=rss&object_id=1000853&object_group=com_content&tmpl=component
Displays correctly only up to version 2.2:
http://iptvin.ru/component/jcomments/?task=rss&object_id=1000707&object_group=com_content&tmpl=component

Sorry for my bad English. Best regards

@Rongronggg9
Owner

Rongronggg9 commented Dec 24, 2023

The latter feed seems to contain bytes that are invalid in UTF-8, which makes feedparser fall back to other encodings. Theoretically, this is either an upstream issue or a fault of the website, but I will try to work around it until it is fixed upstream.

Before v2.3, feeds were decoded by aiohttp before being passed to feedparser.
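
To make the failure mode concrete, here is a minimal sketch using only the Python standard library. It is not the code from RSStT, aiohttp, or feedparser. The sample bytes are made up rather than taken from the iptvin.ru feed, the helper names are hypothetical, and `latin-1` is only a stand-in for "some other encoding", since the thread does not say which one feedparser actually picks.

```python
from typing import Optional

# Made-up stand-in for a feed body: valid UTF-8 Cyrillic text.
good = "<title>Привет</title>".encode("utf-8")
# Splice in 0xFF, a byte that can never occur in valid UTF-8,
# right before the closing tag (b"</title>" is 8 bytes long).
raw = good[:-8] + b"\xff" + good[-8:]


def decode_strict(data: bytes) -> Optional[str]:
    """Strict UTF-8 decoding: a single bad byte rejects the whole document."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_fallback(data: bytes) -> str:
    """Assumed fallback: reinterpret every byte as latin-1 (never fails)."""
    return data.decode("latin-1")


def decode_lenient(data: bytes) -> str:
    """Stay with UTF-8 and replace only the undecodable bytes with U+FFFD."""
    return data.decode("utf-8", errors="replace")


print(decode_strict(raw))          # strict decoding gives up on the whole body
print(repr(decode_fallback(raw)))  # Cyrillic turns into unrelated Latin characters
print(decode_lenient(raw))         # Cyrillic survives; only the stray byte is replaced
```

Under these assumptions, one invalid byte is enough to make strict UTF-8 decoding fail, and a single-byte fallback encoding then garbles every Cyrillic character. Decoding leniently before handing the text to the parser limits the damage to the bad byte itself.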

@butaford
Author

Thank you. Thank you. Thank you 😘
Works correctly with your edits: https://github.com/Rongronggg9/feedparser/tree/fix/encoding-confidence


Rongronggg9 added the bug and upstream labels on Mar 13, 2024