Can you build it without lxml? #378

Open
urain39 opened this issue Oct 11, 2021 · 2 comments

urain39 commented Oct 11, 2021

MechanicalSoup is a really nice package that I have been using, but it still requires a C compiler to build lxml on *nix systems.

This can be a problem when porting to platforms without a C compiler, such as Android or some minimal Linux distributions.

Currently I use a script to build MechanicalSoup without lxml:

#!/bin/sh

# Remove lxml in requirements.txt
sed -i '/lxml/d' requirements.txt

# Use `html.parser` instead of `lxml` as the default parser
sed -i "s@{'features': 'lxml'}@{'features': 'html.parser'}@g" mechanicalsoup/*.py

# Fix examples and tests
sed -i "s@\\(BeautifulSoup(.\\{1,\\}\\)'lxml'\\(.*)\\)@\1'html.parser'\2@g" examples/*.py tests/*.py

It works well, so I think dropping lxml is not a big problem...
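For readers less familiar with sed's basic regular expressions, the third substitution in the script can be sketched in Python roughly as follows. The function name `swap_parser` and the sample lines are made up for illustration; only the pattern itself is taken from the script.

```python
import re

# Python translation of the sed pattern used for examples/*.py and tests/*.py:
#   group 1: "BeautifulSoup(" plus at least one further character
#   literal: 'lxml' (single-quoted only)
#   group 2: the rest of the call up to a closing parenthesis
PATTERN = re.compile(r"(BeautifulSoup\(.+)'lxml'(.*\))")


def swap_parser(line: str) -> str:
    """Hypothetical helper: rewrite 'lxml' to 'html.parser' in one source line."""
    return PATTERN.sub(r"\1'html.parser'\2", line)


print(swap_parser("soup = BeautifulSoup(html, 'lxml')"))
# soup = BeautifulSoup(html, 'html.parser')

# Lines that do not match the pattern are returned unchanged.
print(swap_parser("import lxml"))
# import lxml
```

Because the pattern contains a literal single-quoted `'lxml'`, a call written with double quotes would be left untouched, so it may be worth searching the patched tree for any remaining `lxml` references afterwards.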

Collaborator

moy commented Oct 12, 2021

lxml became the default in 83b2cdd; see the commit message for a bit more info. Essentially, the problem is:

  • We really want a default value for the parser, because we want reproducibility (not having a default would mean that the same program using MechanicalSoup would behave differently depending on what is installed on the machine where it runs).
  • lxml is rather clearly the best default in terms of features and performance, so although it adds a dependency we chose it as default. And therefore it has to be a dependency, otherwise the default configuration would be broken.

I believe the simplest option for people who need to run MechanicalSoup on machines without a C compiler is to patch it, as you did. We could also try to publish a variant such as MechanicalSoup-lite, where lxml would not be a dependency, but that would probably be a lot of trouble for a very small number of users. Until now, I don't think we have ever received this request (apart from your message, obviously).
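
The reproducibility concern in the first bullet can be illustrated with a small sketch. `pick_parser` is a hypothetical helper, not part of MechanicalSoup or Beautiful Soup; it shows the kind of "use whatever happens to be installed" fallback that a fixed default is meant to avoid.

```python
import importlib.util


def pick_parser(preferred: str = "lxml", fallback: str = "html.parser") -> str:
    """Hypothetical helper: return `preferred` if that module is importable.

    Otherwise return `fallback`. The result depends on the machine, which is
    exactly the non-reproducible behaviour described above.
    """
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


# On a machine with lxml installed this prints "lxml"; elsewhere it prints
# "html.parser", so the same program may parse pages differently.
print(pick_parser())
```

With a fixed default and lxml declared as a dependency, this branch never exists: every installation uses the same parser, at the cost of requiring lxml everywhere.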

Author

urain39 commented Oct 12, 2021

@moy Thanks for the reply. I think the idea of releasing a lite version is good enough.
