Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #510: Make XPointer work #511

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Fix #510: Make XPointer work #511

wants to merge 1 commit into from

Conversation

tomschr
Copy link
Collaborator

@tomschr tomschr commented May 23, 2019

This PR fixes #510.

Jing fails as it depends on Xerces. As Xerces does not support the xpointer() scheme, we need to give jing a "complete" document. The document is created by xmllint.

Currently it's an ugly implementation as it appends ".resolved" to the PROFILED_MAIN variable. Probably we need to adapt the code in other locations too.

For the time being, it's just fixes the validate subcommand.


Addition: what's about the following suggestion?

  • daps validate
    would use the current validation. Nothing changed.
  • daps validate --xpointer
    would enable the different resolution path with XPointers (resolve them with xmllint and validate with jing).

Jing fails as it depends on Xerces. As Xerces does not support
the xpointer() scheme, we need to give jing a "complete"
document. The document is created by `xmllint`.

Currently it's an ugly implementation as it appends ".resolved"
to the `PROFILED_MAIN` variable. Probably we need to adapt the
code in other locations too.

For the time being, it's just the validate subcommand.
@tomschr tomschr requested a review from fsundermeyer May 23, 2019 14:18
@ghost
Copy link

ghost commented May 27, 2019

daps validate --xpointer

@tomschr What is the point of adding an option here instead of just making this default behavior? Getting validation failures unless you run with an extra option sounds rather inconvenient.

@tomschr
Copy link
Collaborator Author

tomschr commented May 27, 2019

daps validate --xpointer

@tomschr What is the point of adding an option here instead of just making this default behavior? Getting validation failures unless you run with an extra option sounds rather inconvenient.

It is. Actually I consider it a hack. The whole problem are our tools that we use:

Tool Supported XPointer Schemes Validation Support RELAX NG
xmllint xpointer() and element() partly
jing element() full

Whichever component you use, you will run into problems. 😢

IMHO, the "default behavior" should be the validation with jing. I guess, not all projects need or want XPointers, so they can stick with the Jing validation. The "extended validation" is mainly needed for SLE.

Unfortunately, if we really want to support XPointer (and we already use it) I don't see a better solution yet (except the "option" to fix the tools, but you know how this will end...). Unfortunately, we have some implementation issues:

  • Creating a bigfile with xmllint and validating it with jing would technically work. However, if there are validation errors, you will loose the filenames (as it's only one file). That makes it harder to spot any errors.
  • Validating the document with xmllint alone won't work as validation with RELAX NG is only partially supported. 😢
  • Validating the document with jing alone does work unless nobody uses the xpointer() scheme.

So how to solve this? That's why I've suggested this additional option (we could also named it --full).

@ghost
Copy link

ghost commented May 27, 2019

IMHO, the "default behavior" should be the validation with jing. I guess, not all projects need or want XPointers, so they can stick with the Jing validation. The "extended validation" is mainly needed for SLE.

Is there any real downside to just applying the XPointer fix in all cases? (Besides the extra second that it may take.)

I am absolutely not in favor of making life harder for everyone by adding extra parameters. I would be much in favor of using the same validation everywhere.

@tomschr
Copy link
Collaborator Author

tomschr commented May 27, 2019

IMHO, the "default behavior" should be the validation with jing. I guess, not all projects need or want XPointers, so they can stick with the Jing validation. The "extended validation" is mainly needed for SLE.

Is there any real downside to just applying the XPointer fix in all cases? (Besides the extra second that it may take.)

Well, depends what you mean with real. Seeing not the filenames is a real problem for some, do you agree? It makes debugging harder.

I am absolutely not in favor of making life harder for everyone by adding extra parameters. I would be much in favor of using the same validation everywhere.

Sure, that would be my aim as well. Unfortunately, the tools are the bottleneck.

I think it boils down to this simple question: do we want to support XPointers? If no, we can close this issue. If yes, we need to find a solution. At the moment we have these options:

  1. Create an intermediate bigfile and validate that.
  2. Find another solution to keep the filename.
  3. Fix the tools.

@fsundermeyer Could we have number 3, please?

@ghost
Copy link

ghost commented May 27, 2019

Seeing not the filenames is a real problem for some, do you agree? It makes debugging harder.

That is a big downside. I missed that part, sorry.

Otoh--don't we have xml:base for that? Is there any way we could expose that information via jing?

I think it boils down to this simple question: do we want to support XPointers?

A definite "yes" from me.

@tomschr
Copy link
Collaborator Author

tomschr commented May 28, 2019

[...]
Otoh--don't we have xml:base for that? Is there any way we could expose that information via jing?

Yes, we have xml:base. That is automatically added during XInclude resolution. However, exposing it to jing doesn't have any advantages. It's just an attribute, nothing special.

Unfortunately, I fear, we have not many options:

  1. Validate with jing and bigfile and live with the drawback that we need to do some more research when having validation issues.
  2. Create some magic daps-validate command which does some "tricks" (whatever they will be).
  3. Fix the tools (which would my preferred option).

As number 3 is probably unlikely or takes too much time, we are left with 1 + 2. If we do not want to accept the drawback on 1, that's gives us only option 2.

If we theoretically implement a daps-validate command it could do the following steps:

  1. Check for well-formedness.
    If there are any errors, the writer needs to fix them first.
  2. Create a bigfile with xmllint.
    We need to resolve any XIncludes and XPointers.
  3. Validate the bigfile with jing.
    1. If the validation was successful: be happy and dance. 🕺
    2. If the validation failed:
      a. Split the bigfile according to the xml:base (could be done by some XSLT magic).
      b. Iterate over all files created by xml:base and validate it via jing -i.

I know, this sound terribly complicated and probably it is. 😢 It's likely there are many issues with this approach. Unfortunately, I can't think of a better way.

Do any of you have a better idea?

@fsundermeyer
Copy link
Member

Seeing not the filenames is a real problem for some, do you agree? It makes debugging harder.
That is a big downside. I missed that part, sorry.
Otoh--don't we have xml:base for that? Is there any way we could expose that information via jing?
I think it boils down to this simple question: do we want to support XPointers?
A definite "yes" from me.

It rather boils down to the question "at what price do we want to support xpointers"? And I am definitely not willing to pay the price of crippling validation to being almost unusable.
If we validate on a bigfile, we need to come up with some code that allows us to point to the file and the line where the error occurred.

Does the xmllint run you proposed change the general formatting?

@fsundermeyer
Copy link
Member

  1. If the validation failed:
    a. Split the bigfile according to the xml:base (could be done by some XSLT magic).
    b. Iterate over all files created by xml:base and validate it via jing -i.

Do any of you have a better idea?

Jing somehow returns the line number where the error occurs. We would need to find the corresponding xml:id plus the corresponding xml:base. Can that be done via xslt?

@ghost
Copy link

ghost commented May 28, 2019

It rather boils down to the question "at what price do we want to support xpointers"? And I am definitely not willing to pay the price of crippling validation to being almost unusable.

Our validation already spits out useless line numbers and file paths that don't lead to the real source file in many cases, e.g.:

[...]doc-sle/build/.profiled/x86_64_zseries_power_aarch64_sles/yast2_sw.xml:43:19: error: element "varlistentry" incomplete; missing required element "term"

In this sense, we're just gradually getting worse (which is not a good thing either of course).
You only ever get the actual line number/file if your file is not well-formed.

Jing somehow returns the line number where the error occurs. We would need to find the corresponding xml:id plus the corresponding xml:base. Can that be done via xslt?

  • Saxon has an extension function for this.
  • libxml + Python/lxml provides this as well but libxml is limited to 65k lines (I use this one in the style checker, but the line number is currently useless, so it's never displayed).

Hitches:

  • A lot of lines don't have elements with IDs on them to begin with
  • Some source IDs are removed when profiling, <remark/>, comments and DOCTYPE headers may also be profiled away [though we could fix that], all of this leads to a significant line diff between original file and output file

@ghost ghost changed the base branch from develop to master August 13, 2020 17:30
@fsundermeyer fsundermeyer changed the base branch from master to main November 25, 2020 16:44
@fsundermeyer fsundermeyer marked this pull request as draft March 11, 2021 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make XPointer work with GeekoDoc/DocBook 5
2 participants