XPath query is buggy #296

Open
shner-elmo opened this issue May 12, 2024 · 3 comments
Comments

@shner-elmo

Hey, so I'm trying to locate a table inside the HTML using an XPath, and it's not working well:
when I select the first element with [1], it returns a list of two elements instead of just one (I tested it in Chrome and it works correctly there).

This is the code that I used to initialize it:

import parsel

html = '....'
sel = parsel.Selector(html)

And the bug:
[image: screenshot of the query and its output]

@Gallaecio
Member

What you do on Chrome does not matter, because Chrome does not work on the raw HTML response, but on the DOM.

I bet there are 2 tables that are each the first table within their parent. (//table)[1] probably does what you want.

@shner-elmo
Author

shner-elmo commented May 22, 2024

> What you do on Chrome does not matter, because Chrome does not work on the raw HTML response, but on the DOM.

I don't understand: how is the DOM different from the HTML? Is it because some JS may have modified it?

If that's the case, it should be the same thing, because the HTML that I opened in Chrome was a local file (file://...) that I saved from a website.

@kmike
Member

kmike commented May 22, 2024
