Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for Broken Links on Build #82

Draft
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

AristurtleDev
Copy link
Contributor

@AristurtleDev AristurtleDev commented Jan 26, 2024

Description

This PR will add validation on build to check all links in the generated HTML files for the following

  • Valid internal link
  • Valid external link
  • Valid link to image

Status

This is currently a draft PR as work is being started. This description section will be updated once draft is completed and ready for review.

Current status is that all files are being scanned and links parsed to validate. It currently marks links that contain a # tag in the link as an error since it only verify path and not path + id in content.

Related Issues

This is for issue MonoGame/docs.monogame.github.io#11

@AristurtleDev
Copy link
Contributor Author

@mrhelmut @SimonDarksideJ @tomspilman
The link check, for the most part, is working, however, I'm showing 717 link errors

image

The Issue

After researching this to determine if it was a bug on my end or of this was an actual invalid link in the documentation I discovered the following. Here is just one example.

For the MonoGame.Framework.Utilities.ZlibStream.CompressString(string) method, it generates the following with a broken link for the GZipStream.CompressString(string) method

image

These links are generated from the source code comments here

image

So here comes the issue, that <seealso> tag is linking to the internal GZipStream.CompressString(string) method that's inside the ZibStream.cs file instead of linking to the correct GZipStream.CompressString(string) method that's a public method in a public class.

Additional Notes

When I run this from my Windows PC while developing, this is the result I get. However, when I run docfx from my Ubuntu setup, this issue is not present. It looks again like another DocFx bug when it is running on Linux vs Windows or Mac.
I'm going to file another bug report with them to determine why there is the difference in document generation on different operating systems.

Suggested Resolution

I currently have two suggested proposals

1. Update repository (highest time consumption)

To fix these broken links, the <seealso> tags like this are going to need to be updated to use the correct namespace as well to point to the correct class. For example

<seealso cref="GZipStream.CompressString(string)"/>

changed to

<seealso cref="MonoGame.Framework.Utilities.Deflate.GZipStream.CompressString(string)"/>

I can generate all of these on my end and do a PR to update the main repository to correct these issue for documentation generation if ya'll think that is the right direction to go in.

2. Do not validate links generated in DocFx

I can disable link validation for DocFx files since there is a discrepancy between how it generates the documents on Linux vs Windows/Mac. We would still generate the documentation, just not validate the links it auto generates. This would mean we are only validating links that are human written on the main pages and in the article documents.

@mrhelmut
Copy link
Contributor

mrhelmut commented Jan 30, 2024

I would be up to 1.

Great work, this is going to be useful.

@mrhelmut
Copy link
Contributor

mrhelmut commented Jan 30, 2024

I've looked a bit into those seealso references, and there's only a handful of them, and they are mostly related to GZipStream, which is an utility library that shouldn't be part of the public API and documentation. I think we should make them internal and not publicly referenced.

But that's just a handful of references, not 700+. 🤔

@SimonDarksideJ
Copy link
Collaborator

SimonDarksideJ commented Jan 30, 2024

I say we take this as is and pull this into the docsmigration repo, then we can evaluate and then suggest changes as necessary, pointing out positive and negative hits to better evaluate the link validation.

We SHOULD keep the API links, as they are heavily utilized in the ms docs migration, so the validation should only take place AFTER all the generation and DocFX API generation is complete (almost like a final check).

I agree with @mrhelmut that there should not be document links to internals, all documentation is only for public consumption.

Also @AristurtleDev , the seealso and cref links should be fixable with a global search replace for the text, so that shouldn't take too long, especially as we know what we are looking for, si I say we should do as many of those as we can.

@AristurtleDev
Copy link
Contributor Author

@mrhelmut The 700+ i agree is a large number. I'm looking more into why I'm showing that many errors, but I can say that it's showing 700+ against the DocFx documentation generated on Windows and only 52 when doing it from documentation generated on Linux. Most of these are from external link checking that failed because of my Linux box not reaching out properly so those are false negatives

image

But there is a discrepancy still with how DocFx generates documentation in a linux environment vs windows or mac environment This may have been something that's always existed and we're only seeing it now that we are validating links?

@AristurtleDev
Copy link
Contributor Author

@SimonDarksideJ I'll get a list generated of all the<seealso> tags that are causing link validation when running DocFx on windows. That should make it easier to do a find/replace.

@mrhelmut
Copy link
Contributor

mrhelmut commented Jan 30, 2024

Following a discussion with AristurtleDev, it appears that MonoGame has two different implementations of GZipStream that are actually both unused since we moved to StbImage to decode PNGs. We forgot to clean them and they should certainly not be part of the public API.

I'm removing both of them from MonoGame to avoid confusion.

@AristurtleDev
Copy link
Contributor Author

@SimonDarksideJ The original issue submitted MonoGame/docs.monogame.github.io#11 by you has validating external links as one of the bullet points.

As part of this PR I can and do have the code written to do this, but I want to warn that the results are inconsistent. If the goal is to prevent a build of the site on an invalid link, then when testing external links, what if the external website is down at that time, but not permanently. Or there is a timeout between the GitHub runner performing the request to check the link and it marks it invalid and fails the build?

Basically I'm asking if validating external links should cause explicit build failures, only provide a warning, or just don't perform them at all?

@SimonDarksideJ
Copy link
Collaborator

Tbf, if such issues exist, then maybe they should just be warnings, rather than build failures.
These can be checked in Pr's and ignored If there is sufficient reason.

@AristurtleDev
Copy link
Contributor Author

Just commenting to give an update on this. Link validation is working as intended for all articles except the DocFx generated articles. They are an edge case.

DocFx itself has an internal mechanism that converts the <xref> tags to a valid <a> tag in the generated HTML. I added the xref transformation that would convert them by replacing the xref with a to turn it into a valid link tag. The problem is that this doesn't work for ones that are like <xref href="System.IO.Stream" data-throw-if-not-resolved="false"></xref>. DocFx has an internal map that it uses to convert that System.IO.Stream href into the valid MSDN external link.

Because this isn't handled this way currently on our end, the xref transformation will need to be updated to recognize the external namespace paths like that and adjust the href appropriately.

@AristurtleDev
Copy link
Contributor Author

Will need to wait on #91 to be approved and merged before continuing work on this.

@SimonDarksideJ
Copy link
Collaborator

Isn't this already working now @AristurtleDev ? Plus #91 was merged a while back :D

@AristurtleDev
Copy link
Contributor Author

Isn't this already working now @AristurtleDev ? Plus #91 was merged a while back :D

@SimonDarksideJ Sorry, this was next up on my list after it got pushed down from waiting on the merge.

DocFx itself will output the invalid links when it builds the documentation for articles and apis, but there's no link checking for the website only side in 11ty.

If the concern is only about the articles/apis, then this can be closed. Otherwise, just need to finish implementing it for the 11ty side of things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants