Adding a strip kwarg to get() and getall() #249

Open
bblanchon opened this issue Aug 22, 2022 · 1 comment · May be fixed by #260


bblanchon commented Aug 22, 2022

Hi,

Thank you very much for this excellent library ❤️

I've been using Parsel for a while and I constantly find myself calling .strip() after .get() or .getall().
I think it would be very helpful if Parsel provided a built-in mechanism for that.

I suggest adding a strip kwarg to get() and getall().
It would be a boolean value, and when it's true, Parsel would call strip() on every match.

Example with get():

# Before
author = selector.css("[itemprop=author] [itemprop=name]::text").get()
if author:
    author = author.strip()

# After
author = selector.css("[itemprop=author] [itemprop=name]::text").get(strip=True)

Example with getall():

# Before
authors = [author.strip() for author in selector.css("[itemprop=author] [itemprop=name]::text").getall()]

# After
authors = selector.css("[itemprop=author] [itemprop=name]::text").getall(strip=True)
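Until something like this exists in Parsel, the proposed behaviour can be approximated with two small helpers. This is only a sketch under stated assumptions: `get_stripped` and `getall_stripped` are hypothetical names, not part of Parsel, and `FakeSelection` is a stand-in for the object returned by `selector.css(...)` so that the snippet runs without any dependency. The helpers rely only on `get()` and `getall()`, so they should accept a real selection as well.

```python
from typing import List, Optional, Protocol


class SupportsGet(Protocol):
    """Anything exposing get()/getall(), e.g. the result of selector.css(...)."""

    def get(self, default: Optional[str] = None) -> Optional[str]: ...

    def getall(self) -> List[str]: ...


def get_stripped(selection: SupportsGet, default: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: like get(), but strips the first match.

    The default is returned untouched when nothing matches.
    """
    value = selection.get()
    return default if value is None else value.strip()


def getall_stripped(selection: SupportsGet) -> List[str]:
    """Hypothetical helper: like getall(), but strips every match."""
    return [value.strip() for value in selection.getall()]


class FakeSelection:
    """Stand-in for a real selection, used only to make this demo runnable."""

    def __init__(self, values: List[str]) -> None:
        self._values = list(values)

    def get(self, default: Optional[str] = None) -> Optional[str]:
        return self._values[0] if self._values else default

    def getall(self) -> List[str]:
        return list(self._values)


if __name__ == "__main__":
    names = FakeSelection(["\n  Jane Doe  ", "\tJohn Smith\n"])
    print(get_stripped(names))     # Jane Doe
    print(getall_stripped(names))  # ['Jane Doe', 'John Smith']
```

With a real selection the call would look like `get_stripped(selector.css("[itemprop=name]::text"))`. A built-in `strip=True` would still be nicer, since it keeps the call chain intact.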

Alternatively, we could change the ::text pseudo-element to support an argument, like ::text(strip=1).
That would be extremely handy too and probably more flexible than my original suggestion, but also more difficult to implement.

I know I could strip whitespace with re() and re_first(), but that is overkill and hides the intent.
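To illustrate why the regex route obscures the intent, here is a sketch of the kind of pattern it would need. The pattern is an assumption chosen for illustration, and it is exercised with the standard `re` module rather than through Parsel, so it shows the idea rather than Parsel's exact matching behaviour.

```python
import re

# Hypothetical pattern: skip leading whitespace, capture the core lazily,
# then require only whitespace up to the end of the string.
STRIP_PATTERN = re.compile(r"^\s*(.*?)\s*$", re.DOTALL)


def strip_with_regex(text: str) -> str:
    """Return text without surrounding whitespace, using a regex."""
    match = STRIP_PATTERN.match(text)
    return match.group(1) if match else text


raw = "\n    Jane Doe\n  "

# Both approaches give the same result, but only one states what it does.
assert strip_with_regex(raw) == raw.strip() == "Jane Doe"
```

A reader has to decode the pattern to learn that it only trims whitespace, whereas `.strip()` or a `strip=True` flag says so directly.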

Best regards,
Benoit

felipeboffnunes pushed a commit to felipeboffnunes/parsel that referenced this issue Oct 28, 2022
felipeboffnunes linked a pull request Oct 28, 2022 that will close this issue
bblanchon (author) commented

PRs #260 and #127 have gone stale.
Will either of them ever get merged?
I can't imagine I'm the only person calling .strip() on scraped strings.
