Send selected DOM gives incorrect validation results #42

Open
cstrobbe opened this issue Oct 2, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@cstrobbe

cstrobbe commented Oct 2, 2023

When sending the DOM to the W3C Validator, the validator claims that the page has no title element for pages that evidently have such an element.

To Reproduce

  1. Open a page such as Athlean-X or KU Leuven in Google Chrome.
  2. Press F12 to open the devtools, press Ctrl F to open the search function and enter the XPath expression //title to verify that the page actually contains a title element.
  3. Open the ARC Toolkit and press "Send selected DOM" without first selecting any part of the DOM (so the entire DOM gets validated).
  4. Check the validator's list of errors: it contains the error "Element head is missing a required instance of child element title." (and other errors that don't make sense for the page being validated). This issue has been present at least since May 2023.

Expected behavior

Only real errors are reported ...

Version information

  • Browser and version: Google Chrome Version 117.0.5938.132 (Official Build) (64-bit) on Windows 10.
  • ARC Toolkit version: 5.5.3.
@ferllings
Member

I noticed that for some actions it defaults the selected DOM to <body>.
That might be the case here.

@extra808

This problem still occurs. There are actually three errors: "Error: Element head is missing a required instance of child element title.", "Error: Stray start tag html.", and "Fatal Error: Cannot recover after last error. Any further errors will be ignored." Since all three errors occur on line 1, they prevent the selected DOM node from being validated at all.

The problem is caused by the "wrapper" code ARC uses to place the selected DOM node within a minimal HTML page. It tries to include a newline (line feed) character after each element, but the \n notation hasn't been converted to actual newlines, so the backslash and the n are still literally part of the strings:

<!DOCTYPE html>\n<html lang='en'>\n<head>\n<title>ARC Toolkit Node Validation</title>\n</head>\n<body>\n[begin selected DOM Node]
…
[end selected DOM Node]\n</body>\n</html>

The selected DOM node code does contain newlines.

The HTML validator doesn't care about newlines; they only make the code more readable. So to make validation work again, removing the \n sequences is sufficient. If you can figure out how to insert the desired newlines, that would be even better.
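
To illustrate both options, here is a minimal sketch. ARC Toolkit's actual source is not shown in this thread, so the names `brokenPrefix`, `brokenSuffix`, `stripLiteralNewlines`, `toRealNewlines`, and `wrapNode` are hypothetical, and the guess that the wrapper strings end up with a doubly escaped `\\n` is an assumption based on the validator input quoted above.

```typescript
// Hypothetical reconstruction, not ARC Toolkit's real code.
// Assumption: the wrapper strings contain a doubly escaped "\\n",
// which is a backslash followed by the letter n, not a line feed.
const brokenPrefix: string =
  "<!DOCTYPE html>\\n<html lang='en'>\\n<head>\\n" +
  "<title>ARC Toolkit Node Validation</title>\\n</head>\\n<body>\\n";
const brokenSuffix: string = "\\n</body>\\n</html>";

// Option 1: drop the literal backslash-n sequences entirely.
function stripLiteralNewlines(s: string): string {
  return s.split("\\n").join("");
}

// Option 2: turn each literal backslash-n into a real line feed.
function toRealNewlines(s: string): string {
  return s.split("\\n").join("\n");
}

// Wrap the serialized node in the minimal page, using real line feeds.
function wrapNode(selectedHtml: string): string {
  return toRealNewlines(brokenPrefix) + selectedHtml + toRealNewlines(brokenSuffix);
}
```

Either option removes the stray backslash-n text in front of the `<html>` start tag, which is presumably what triggers the three line 1 errors. If the wrapper is defined directly in source code, writing the string literals with a single `\n` escape would avoid the need for any conversion.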

@ferllings
Member

@extra808 Thanks for the detailed comment. That should definitely help.
