Fix listblock preformatted overlap #4055

Open
wants to merge 4 commits into
base: master
Changes from 2 commits
7 changes: 7 additions & 0 deletions inc/Parsing/Lexer/Lexer.php
@@ -149,6 +149,13 @@ public function parse($raw)
         if ($currentLength === $length) {
             return false;
         }
+        // If we are closing a block and the last character consumed by our matched
+        // string is a newline, put it back. See the following for details:
+        // https://github.com/dokuwiki/dokuwiki/issues/4054
+        if ($mode == "__exit" && substr($matched, -1) == "\n") {
Collaborator

@fiwswe fiwswe Sep 11, 2023

I am by no means an expert on the DW Lexer — but should this not be:
if ($this->isModeEnd($mode) && substr($matched, -1) == "\n") {

Or better yet:
if ($this->isModeEnd($mode) && str_ends_with($matched, "\n")) {

Author

AIUI, str_ends_with is PHP8, and DokuWiki still supports PHP 7.

As for isModeEnd: you could be right; I too am not an expert on the lexer. A quick look at the source shows that the code for that function is: return ($mode === "__exit"); so it seems that they will be functionally equivalent. For a person unfamiliar with the lexer, $mode was right there... :)

Collaborator

Polyfills for str_ends_with() and similar functions were recently added to the development version, e.g.

if (!function_exists('str_ends_with')) {
    function str_ends_with(string $haystack, string $needle)
so it is fine to use them already.

Collaborator

No, a polyfill has been added recently, so the str_*_with functions can now be used.
See:

/**
 * polyfill for PHP < 8
 * @see https://www.php.net/manual/en/function.str-starts-with
 */
if (!function_exists('str_starts_with')) {
    function str_starts_with(string $haystack, string $needle)
    {
        return empty($needle) || strpos($haystack, $needle) === 0;
    }
}
/**
 * polyfill for PHP < 8
 * @see https://www.php.net/manual/en/function.str-contains
 */
if (!function_exists('str_contains')) {
    function str_contains(string $haystack, string $needle)
    {
        return empty($needle) || strpos($haystack, $needle) !== false;
    }
}
/**
 * polyfill for PHP < 8
 * @see https://www.php.net/manual/en/function.str-ends-with
 */
if (!function_exists('str_ends_with')) {
    function str_ends_with(string $haystack, string $needle)
    {
        return empty($needle) || substr($haystack, -strlen($needle)) === $needle;
    }
}

Author

Well, it doesn't work on my system... I'm running 2023-04-04a "Jack Jackrum", not a development snapshot. I'm assuming that functionality came post-Jack? (And I really don't want to set up a development snapshot at this time: in fact, I'm just manually patching this stuff in the GitHub web UI...) But if the polyfill capability is in the post-Jack snapshot feel free to make that change.

Collaborator

I feel out of my depth regarding the unit tests. I took a look at https://github.com/dokuwiki/dokuwiki/blob/master/_test/tests/inc/parser/lexer.test.php but saw no easy way to add tests for this PR. But maybe I'm looking in the wrong place?

Author

I'm not sure how to add a test for the lexer, but we can add tests for the parse handlers for listblock, preformatted, and tables. See the comment above from @Klap-in for the proper paths. In that case, we give it a chunk of markup source and compare our statically added state graph against the graph generated by the running code. I replied in this issue with proposed markup that covers all of the updated transitions. I also took that same markup, replaced the comments with heading tags, and put it on my production server with my PR applied. You can see the result here: https://www.fluidtechservices.com/dokuwiki/blocktest

It shows that the markup is being processed correctly. As opposed to what happens when you paste it into the DokuWiki sandbox, which I've also done: https://www.dokuwiki.org/sandbox:playground

You can see on my server each type of markup results in exactly the right type of HTML. You can see on the sandbox it does not.

Also, I searched for some docs on unit testing and found this: https://www.dokuwiki.org/devel:unittesting. I can tell you this: I do not have the environment for setting this up, and I'm way over my time budget for this. But if someone can give me a straightforward way to get the state graph, I can use the existing unit tests as a pattern and create new ones.

Collaborator

I'm not sure the Table to preformatted transition is correct on your wiki? I don't see any preformatted text.

Author

You are correct: I just noticed the same thing. It seems the lexer is going into the preformatted block and grabbing the text, but not outputting the text as HTML.

After spending the last half-hour looking into it, I checked the sandbox: this bug is in the current lexer as well. So I don't know where that text is going, or why, but it's not related to this patch. That problem already exists...

Collaborator

In general, testing at the markup level makes sense to me. However, since this feels like a rather general change, it probably also makes sense to add tests for the lexer itself, covering how it should consume or not consume newlines between different parser modes. Of course, these tests are less trivial.

I don't have much time available on short notice. I hope I can find some time in a week or two.

Everybody who can create unit tests is invited to add them. I think these are quite doable, especially for markup :-)

+            $raw = "\n" . $raw;
+            $currentLength++;
+        }
         $length = $currentLength;
         $pos = $initialLength - $currentLength;
     }
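
For readers following the hunk above, here is a minimal stand-alone sketch of the put-back idea. The function name `putBackNewline` and its signature are hypothetical and exist only for illustration; the actual change modifies `$raw` and `$currentLength` in place inside `parse()`. It sticks to `substr()` so that it also runs on PHP 7 without the polyfill.

```php
<?php
// Hypothetical helper for illustration only; not part of DokuWiki's Lexer.
// Given the unparsed remainder ($raw), the text just matched ($matched) and
// the mode reported for that match, return the remainder and its length,
// with a trailing newline restored when a block is being closed.
function putBackNewline(string $raw, string $matched, string $mode): array
{
    $currentLength = strlen($raw);
    // "__exit" is the mode value the diff checks for when a block closes.
    if ($mode === '__exit' && substr($matched, -1) === "\n") {
        $raw = "\n" . $raw;   // the next block's entry pattern can see the newline again
        $currentLength++;     // keep the length bookkeeping in sync with $raw
    }
    return [$raw, $currentLength];
}

// A list item that consumed its closing newline, followed by an indented line.
[$raw, $length] = putBackNewline("  code line", "  * item\n", '__exit');
var_dump($raw, $length);
```

The point of the sketch is only that the remaining input grows by exactly one character, so a following entry pattern that begins with `\n` still has something to match.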
4 changes: 2 additions & 2 deletions inc/Parsing/ParserMode/Preformatted.php
@@ -12,8 +12,8 @@ public function connectTo($mode)
         $this->Lexer->addEntryPattern('\n\t(?![\*\-])', $mode, 'preformatted');

         // How to effect a sub pattern with the Lexer!
-        $this->Lexer->addPattern('\n  ', 'preformatted');
-        $this->Lexer->addPattern('\n\t', 'preformatted');
+        $this->Lexer->addPattern('\n  (?![\*\-])', 'preformatted');
+        $this->Lexer->addPattern('\n\t(?![\*\-])', 'preformatted');
     }

     /** @inheritdoc */
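
The effect of the added negative lookahead can be checked in isolation with plain `preg_match()`. This is a rough sketch, not how the Lexer applies its patterns internally: the helper name `continuesPreformatted` is hypothetical, and it assumes the space-indent pattern uses two spaces (DokuWiki's preformatted indent), since whitespace may have been collapsed in the page text.

```php
<?php
// Hypothetical helper for illustration; it tests one pattern on its own,
// whereas the real Lexer runs its patterns through its own machinery.
function continuesPreformatted(string $text): bool
{
    // Newline, two-space indent, then anything except a list marker (* or -).
    return preg_match('/\n  (?![\*\-])/', $text) === 1;
}

var_dump(continuesPreformatted("\n  more code"));      // indented text
var_dump(continuesPreformatted("\n  * list item"));    // unordered list marker
var_dump(continuesPreformatted("\n  - ordered item")); // ordered list marker
```

With the lookahead in place, an indented line that starts with `*` or `-` no longer matches the continuation pattern, which is what lets a list begin directly after a preformatted block.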