HTML entities stripped from item title #243

Open
autonome opened this issue Nov 22, 2017 · 4 comments

@autonome

Eg: "&gt;"

I'm parsing https://groups.google.com/forum/feed/mozilla.dev.platform/topics/rss.xml?num=50

The item titled "Intent to unship: <a> as <area> in image maps" has those entities encoded as &lt; and &gt; respectively.

However, in the code example below, the entities are missing from item.title, as is detectable from the length:

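const request = require('request');
const FeedParser = require('feedparser');

let url = 'https://groups.google.com/forum/feed/mozilla.dev.platform/topics/rss.xml?num=50';
let req = request(url);
let feedparser = new FeedParser();

req.on('response', function (res) {
  // `this` is the request stream; pipe it into the parser
  this.pipe(feedparser);
});

feedparser.on('readable', function () {
  let item;
  // read() returns null once the buffer is drained
  while ((item = this.read())) {
    console.log(item.title.length);
  }
});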

@danmactough
Owner

@autonome Thanks for opening this issue.

This stripping is being done intentionally -- however, I can't actually remember why. 😬 Presumably, the idea was to avoid handing people an XSS injection foot-gun.

Note that the un-stripped title is available: item['rss:title']['#'].

{ title: 'Intent to unship:  as  in image maps',
  description: 'Hi, In bug 1317937 I intend to unship the feature of <a> elements acting the same way as <area> elements in image maps. This functionality was specced in HTML 4, but no other browser implemented it and was removed from HTML 5. Timothy (:tnikkel) tried to do it before, but it got blocked on',
  summary: 'Hi, In bug 1317937 I intend to unship the feature of <a> elements acting the same way as <area> elements in image maps. This functionality was specced in HTML 4, but no other browser implemented it and was removed from HTML 5. Timothy (:tnikkel) tried to do it before, but it got blocked on',
  date: 2017-11-08T23:50:27.000Z,
  pubdate: 2017-11-08T23:50:27.000Z,
  pubDate: 2017-11-08T23:50:27.000Z,
  link: 'https://groups.google.com/d/msg/mozilla.dev.platform/JUB5K-sz6ek/F4hQWdDRBQAJ',
  guid: 'https://groups.google.com/d/topic/mozilla.dev.platform/JUB5K-sz6ek',
  author: 'Emilio Cobos Álvarez',
  comments: null,
  origlink: null,
  image: {},
  source: {},
  categories: [],
  enclosures: [],
  'rss:@': {},
  'rss:title':
   { '@': {},
     '#': 'Intent to unship: <a> as <area> in image maps' },
  'rss:link':
   { '@': {},
     '#': 'https://groups.google.com/d/msg/mozilla.dev.platform/JUB5K-sz6ek/F4hQWdDRBQAJ' },
  'rss:description':
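
For anyone who wants the un-stripped title but still cares about the XSS concern, here is a minimal sketch of one way to combine the two. It assumes the item shape shown in the dump above; `rawTitle` and `escapeHtml` are hypothetical helpers written for illustration and are not part of feedparser.

```javascript
// Hypothetical helper (not part of feedparser): prefer the un-stripped
// title stored under 'rss:title', falling back to the normalized title.
function rawTitle(item) {
  const raw = item['rss:title'] && item['rss:title']['#'];
  return typeof raw === 'string' ? raw : item.title;
}

// Hypothetical helper: escape the characters that are significant in
// HTML text and attribute contexts before inserting a string into markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Sample item trimmed down from the dump above.
const item = {
  title: 'Intent to unship:  as  in image maps',
  'rss:title': { '@': {}, '#': 'Intent to unship: <a> as <area> in image maps' },
};

console.log(rawTitle(item));
console.log(escapeHtml(rawTitle(item)));
```

Escaping at the point of output keeps the original text intact for non-HTML uses (logging, plain-text display) while still making it safe to drop into a page.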

@autonome
Author

autonome commented Nov 23, 2017 via email

@danmactough danmactough self-assigned this Dec 9, 2017
@danmactough danmactough added the bug label Dec 9, 2017
@danmactough
Owner

related to #165

danmactough added a commit that referenced this issue Dec 11, 2017
Added option `strip_html` to restore old behavior.

Resolves #165, #243
@autonome
Author

autonome commented Jan 4, 2018

Wonderful, thanks @danmactough!

danmactough added a commit that referenced this issue Jul 15, 2018
Added option `strip_html` to restore old behavior.

Resolves #165, #243