Add another parsing method #53

MaurizioRicci · 2018-09-23T11:29:35Z

Often the desired information on a web site is grouped under a class name. For example, on some sites the list of torrents is a list of divs with a particular class. So in some cases it would be better to find elements by ID, class name, or type instead of parsing the page with the standard HTML parser and using various flags or variables to remember the state during parsing.
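To make the "flags to remember the state" point concrete, here is a minimal sketch of that style using the standard `html.parser` module. The markup, the `torrent` class name, and the `TorrentDivParser` name are assumptions made up for illustration; they are not taken from any real plugin or site.

```python
from html.parser import HTMLParser

# Hypothetical markup: each result is a <div class="torrent"> wrapping a link.
SAMPLE_HTML = """
<div class="header">Results</div>
<div class="torrent"><a href="/t/1">Ubuntu ISO</a></div>
<div class="ad">Buy now</div>
<div class="torrent"><a href="/t/2">Debian ISO</a></div>
"""


class TorrentDivParser(HTMLParser):
    """Collects links inside <div class="torrent"> by tracking state flags."""

    def __init__(self):
        super().__init__()
        self.in_torrent_div = False  # flag: inside a matching div
        self.in_link = False         # flag: inside an <a> within that div
        self.current = None          # result being built
        self.results = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and "torrent" in (attrs.get("class") or "").split():
            self.in_torrent_div = True
        elif tag == "a" and self.in_torrent_div:
            self.in_link = True
            self.current = {"link": attrs.get("href"), "name": ""}

    def handle_data(self, data):
        if self.in_link:
            self.current["name"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            self.in_link = False
            self.results.append(self.current)
            self.current = None
        elif tag == "div" and self.in_torrent_div:
            # Simplification: assumes no nested <div> inside a result.
            self.in_torrent_div = False


parser = TorrentDivParser()
parser.feed(SAMPLE_HTML)
parser.close()
results = parser.results
```

Even for this small case, three pieces of state have to be kept in sync across three callbacks, and nested divs would need a depth counter on top of that.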

The question is, what about adding another parsing method? Something like jQuery, maybe pyquery or BeautifulSoup:
https://pythonhosted.org/pyquery/
https://www.crummy.com/software/BeautifulSoup/bs4/doc/

What do you think? @Chocobo1 @sledgehammer999 @Piccirello @zeule @ngosang @hannsen

hannsen · 2018-09-24T13:32:54Z

What do you propose, that they distribute BeautifulSoup with Qbit?

MaurizioRicci · 2018-09-24T13:35:54Z

@hannsen Yes, I was thinking of something like that. I don't think it would require too much effort, and it may help a lot of people. What do you think?

ngosang · 2018-09-24T16:10:39Z

I think beautifulsoup4 is perfect for this, but we have to include the package in the qbittorrent repository so the user doesn't have to install external packages. I can do it but I think @sledgehammer999 will oppose...

MaurizioRicci · 2018-09-24T16:16:39Z

I understand, never mind. Mine was just a suggestion.

nindogo · 2018-12-01T19:26:50Z

> Often desired information in a web site, are grouped under class name. For example in some site a list of torrent are a list of div with a particular class. So in some case it would be better to find elements by ID, ClassName or by Type instead of parsing page with the standard html parser and using various flags or variable to remember the state during parsing.

Could you please share a couple of sites with this issue?

nindogo

MaurizioRicci · 2018-12-01T22:42:33Z

I was thinking about all the sites whose content is inside a table. It would be easy to find all tr elements and then iterate over the td elements inside each tr, instead of parsing the whole page with flags. My request was intended as a suggestion, in case it could be done easily. I don't want to give you hard work.

nindogo · 2018-12-02T15:42:44Z

Hi,

I have found that for many of those, the re module can usually help.
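As a rough illustration of the `re` approach for table-based pages, the sketch below pulls the `td` cells out of each `tr`. The sample markup and the `table_rows` helper are hypothetical, and the patterns assume simple, non-nested tables; regular expressions are not a general HTML parser.

```python
import re
from html import unescape

# Hypothetical results table; real sites will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Size</th><th>Seeds</th></tr>
  <tr><td><a href="/t/1">Ubuntu ISO</a></td><td>2.1 GB</td><td>120</td></tr>
  <tr><td><a href="/t/2">Debian &amp; friends</a></td><td>650 MB</td><td>45</td></tr>
</table>
"""

ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.S | re.I)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.S | re.I)
TAG_RE = re.compile(r"<[^>]+>")


def table_rows(html):
    """Return a list of rows, each a list of cell texts with tags stripped."""
    rows = []
    for row_html in ROW_RE.findall(html):
        cells = [
            unescape(TAG_RE.sub("", cell)).strip()
            for cell in CELL_RE.findall(row_html)
        ]
        if cells:  # header rows use <th>, so they yield no <td> cells
            rows.append(cells)
    return rows


rows = table_rows(SAMPLE_HTML)
```

This stays short, but it depends on the exact shape of the markup and would need adjusting for nested tables or unclosed tags.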

hannsen · 2018-12-02T16:54:47Z

Yeah, I used regex a lot, too. It's faster than parsing but not very readable, though neither is the standard HTML parser.

imDMG · 2019-01-26T19:29:39Z

Right now I am writing (or trying to write) a wrapper module for HTMLParser: https://github.com/imDMG/HTMLSelector
For now, some basic operations work.
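For readers wondering what a selector-style wrapper over `HTMLParser` might look like, here is a minimal sketch. It is not the HTMLSelector API; the `Node`, `TreeBuilder`, `parse`, and `find_all` names are all hypothetical, and it only supports lookup by tag name and class.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = attrs
        self.children = []  # mix of Node and str, in document order

    def text(self):
        """Concatenate all text inside this node, in document order."""
        return "".join(
            c if isinstance(c, str) else c.text() for c in self.children
        )

    def find_all(self, tag=None, class_=None):
        """Return descendant nodes matching the tag and/or class name."""
        found = []
        for child in self.children:
            if isinstance(child, str):
                continue
            classes = (child.attrs.get("class") or "").split()
            if (tag is None or child.tag == tag) and \
               (class_ is None or class_ in classes):
                found.append(child)
            found.extend(child.find_all(tag, class_))
        return found


class TreeBuilder(HTMLParser):
    """Builds a simple tree of Node objects from HTMLParser callbacks."""

    def __init__(self):
        super().__init__()
        self.root = Node("[document]", {})
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


# Hypothetical results table used only to exercise the sketch.
SAMPLE_HTML = """
<table>
  <tr class="row"><td class="name">Ubuntu ISO</td><td class="seeds">120</td></tr>
  <tr class="row"><td class="name">Debian ISO</td><td class="seeds">45</td></tr>
</table>
"""

doc = parse(SAMPLE_HTML)
names = [td.text().strip() for td in doc.find_all("td", class_="name")]
rows = [[td.text() for td in tr.find_all("td")] for tr in doc.find_all("tr")]
```

Building the tree once moves the state tracking into one place, so plugin code can ask for elements by tag or class without its own flags.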

MaurizioRicci · 2019-01-27T11:23:31Z

I like your module. It looks like pyquery or similar, plus it's based on HTMLParser, which is a standard module.

ngosang added the enhancement label Sep 24, 2018

Jaid mentioned this issue Oct 29, 2018

Provide modern Python tools for search engine plugins qbittorrent/qBittorrent#9781

Open

ngosang mentioned this issue Apr 7, 2019

Drop support for Python2 #84

Closed

16 tasks

ngosang mentioned this issue May 16, 2021

Creating plugins with inheritance #167

Open
