Report RSS URL Direct Parsing Issue #408

Open
bdim404 opened this issue Oct 21, 2023 · 3 comments

Comments

@bdim404

bdim404 commented Oct 21, 2023

I have encountered an issue when using the feedparser library to parse RSS directly from a URL. For example:

>>> feedparser.parse('https://hackernewsrss.com/feed.xml').keys()
dict_keys(['bozo', 'entries', 'feed', 'headers', 'bozo_exception'])
>>> d = feedparser.parse('https://hackernewsrss.com/feed.xml')
>>> d['feed']['title']

This results in the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/feedparser/util.py", line 113, in __getitem__
    return dict.__getitem__(self, key)
KeyError: 'title'
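
The key list above includes `bozo_exception`, which feedparser records when fetching or parsing went wrong, so the missing `title` may be a symptom of an earlier failure rather than the problem itself. A minimal sketch for surfacing that error follows; `diagnose` and `diagnose_url` are hypothetical helper names, not part of feedparser.

```python
def diagnose(result):
    """Summarize a feedparser result (or any dict shaped like one)."""
    feed = result.get("feed", {})
    has_error = "bozo_exception" in result
    return {
        # feedparser sets `bozo` to a truthy value when something went wrong
        "bozo": bool(result.get("bozo")),
        # repr() keeps the exception type visible, e.g. URLError(...)
        "error": repr(result["bozo_exception"]) if has_error else None,
        # `status` is only present when an HTTP response was received
        "status": result.get("status"),
        # .get() avoids the KeyError raised by d['feed']['title']
        "title": feed.get("title"),
    }


def diagnose_url(url):
    """Parse `url` with feedparser and summarize the outcome."""
    # Third-party import kept local so diagnose() also works on plain dicts.
    import feedparser

    return diagnose(feedparser.parse(url))
```

Calling `diagnose_url('https://hackernewsrss.com/feed.xml')` and reading the `error` field should show whether the request itself failed before any XML was parsed.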

However, when I download the XML file to my local system and parse it using feedparser's local file reading method, it works correctly, as shown below:

>>> d = feedparser.parse(r'./a.xml')
>>> d['feed']['title']
'Hacker News: New Comments'
>>> d['feed']['links']
[{'rel': 'alternate', 'type': 'text/html', 'href': 'https://news.ycombinator.com/newcomments'}]

I believe this may be a potential bug, as it should be possible to parse content directly from an RSS URL. I would appreciate it if this issue could be addressed. Thank you!
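
As a stopgap, the download-then-parse steps above can be combined in one script by fetching the bytes separately and handing them to `feedparser.parse`, which also accepts the document content itself. This is only a sketch, assuming the standard library's `urllib.request` can reach the feed from the same machine; `fetch_bytes` and `parse_from_url` are hypothetical helper names, and the User-Agent string is an arbitrary choice.

```python
import urllib.request


def fetch_bytes(url, timeout=10.0):
    """Download the raw document at `url` and return it as bytes."""
    # The explicit User-Agent is an assumption; some servers treat the
    # default urllib one differently.
    request = urllib.request.Request(url, headers={"User-Agent": "feed-check/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def parse_from_url(url):
    """Fetch the feed with urllib, then let feedparser parse the bytes."""
    # Third-party import kept local so fetch_bytes() works without it.
    import feedparser

    return feedparser.parse(fetch_bytes(url))
```

If `fetch_bytes` raises here, the problem is likely in the network or TLS setup rather than in the parsing step.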

@carltongibson

Works for me:

>>> import feedparser
>>> d = feedparser.parse('https://hackernewsrss.com/feed.xml')
>>> d.keys()
dict_keys(['bozo', 'entries', 'feed', 'headers', 'etag', 'href', 'status', 'encoding', 'version', 'namespaces'])
>>> d["feed"].keys()
dict_keys(['title', 'title_detail', 'subtitle', 'subtitle_detail', 'links', 'link', 'language', 'updated', 'updated_parsed', 'published', 'published_parsed', 'sy_updateperiod', 'image'])
>>> d["feed"]["title"]
'Hacker News RSS Feed'
>>>

@bdim404
Author

bdim404 commented Mar 19, 2024

Ok, I got it! Thanks for your reply!

@alexscheelmeyer

I had a similar problem with 6.0.11. After downgrading to 6.0.3, the issue no longer occurs.
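
For anyone comparing results across versions, it helps to confirm which feedparser release is actually installed in the interpreter being used. A small sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("feedparser:", installed_version("feedparser"))
```

A specific release can then be pinned for comparison, for example with `pip install "feedparser==6.0.3"`.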
