Selecting a form that only has a class attribute #376

Open
SilverStrings024 opened this issue Jun 18, 2021 · 6 comments
@SilverStrings024

SilverStrings024 commented Jun 18, 2021

I'm trying to select a form that has only a class attribute, and I keep getting a "LinkNotFoundError". I've inspected the page and I know I have the correct class name, but the selection still fails, and I can't find this case covered in the docs. I could find the form with BS4 instead, but then I wouldn't have a way to select it.

Could I get the form with BS4, add an id attribute to it, and then select it by that id?

I'd really appreciate any help, thank you!

@hemberger
Contributor

Thanks for your interest in MechanicalSoup! You should be able to select the form based only on the class attribute, assuming you can construct a unique selector. It would look something like this:

browser.select_form('form[class="value"]')

If this doesn't work, would you be able to share any part of your code or the HTML you are trying to parse?

@SilverStrings024
Author

Thank you very much for the swift response!!

I will try this and report back; again, thank you!

@SilverStrings024
Author

SilverStrings024 commented Jun 18, 2021

@hemberger Sorry, but it's still not working. I just found out that the site I'm trying to scrape uses React, and I'm pretty sure that's the cause, so I think something like dryscrape might work.

Edit: I tried integrating dryscrape into my project, and it does work for this situation, but it doesn't have the methods I need, so now I need to integrate dryscrape and MechanicalSoup with one another. If I get that working, should I submit a PR or just make a completely new project?

@hemberger
Contributor

It looks like dryscrape is no longer actively maintained as of 2017, so I think we'd probably want to avoid officially supporting it. However, if you get it working, please feel free to link to your code in case there are others who share a similar problem.

You might also consider trying a robust tool like Selenium, which does support JavaScript. Good luck!

@SilverStrings024
Author

Thank you very much!

@SilverStrings024
Author

I'm going to attempt to update dryscrape and keep it maintained. Once it's stable, would it be useful to code something that integrates dryscrape and MechanicalSoup? I only ask because I really like the way MechanicalSoup does things, and I want to help it adapt to JavaScript-heavy web pages so it can be even more useful than it currently is.

Thanks, I hope your day goes well!
