Newlines stops consuming tag descriptions #241

jaapio · 2024-05-13T20:08:20Z

I received an issue report on my project, phpDocumentor/ReflectionDocBlock#365, in which we traced the problem to this parser. In short, users are running into trouble with tag descriptions that contain empty lines.

/**
 * @param string $myParam this is a description of the param
 *
 *                     And this line also belongs to $myParam.
 * @param array $other this is a new param.
 */

Newlines in tag descriptions are allowed, but as soon as the parser detects an empty line it stops consuming tokens, whereas it should only stop when a new tag starts. The other issues reported there are more or less out of scope here, and I can work around them. The newline behavior, however, cannot be patched from the outside. To prove the solution, I made an ugly hack in our code to see whether we could overcome this issue by adjusting a single method.
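
To make the expected behavior concrete, here is a minimal sketch in plain PHP (8.0+, for str_starts_with). It is not the parser's implementation: groupTagLines is a hypothetical helper, and it assumes the leading asterisks have already been stripped from each line. It only illustrates the rule that a description ends where the next tag begins, not at an empty line.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of any library): groups docblock body
 * lines into tags. A description runs until the next line starting with "@".
 *
 * @param list<string> $lines lines with the leading " * " already removed
 * @return list<string> one string per tag, continuation lines joined by "\n"
 */
function groupTagLines(array $lines): array
{
    $tags = [];
    foreach ($lines as $line) {
        if (str_starts_with(ltrim($line), '@')) {
            // A new tag starts here.
            $tags[] = $line;
        } elseif ($tags !== []) {
            // Continuation of the current tag; empty lines are kept.
            $tags[count($tags) - 1] .= "\n" . $line;
        }
        // Lines before the first tag (summary text) are ignored in this sketch.
    }

    return $tags;
}

$tags = groupTagLines([
    '@param string $myParam this is a description of the param',
    '',
    '                    And this line also belongs to $myParam.',
    '@param array $other this is a new param.',
]);

// Two tags; the first one keeps the empty line and the indented continuation.
var_dump(count($tags));
```

Note that this simplification treats any line whose first non-blank character is "@" as a new tag, which a real parser would have to handle more carefully.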

phpDocumentor/ReflectionDocBlock#369

The POC shows that the method getSkippedHorizontalWhiteSpaceIfAny does not work when the detected token is processed as a newline token. As a result, the leading whitespace between "param" and "And" in my example is removed by the parser.

I would be able to create a PR to solve this issue. But before I invest more time I would like to see if other solutions are already available.

ondrejmirtes · 2024-05-13T20:25:30Z

Have you tried setting textBetweenTagsBelongsToDescription in PhpDocParser to true?

These options are similar to PHPStan's bleeding edge. Right now they're set to false for BC reasons but they will be removed in 2.0 and code will behave the new modern way (as if these options were set to true).

Also check out options in TypeParser and ConstExprParser.

jaapio · 2024-05-13T21:09:32Z

Yes, I did check those settings; they resulted in the same behavior.
I didn't check the other parsers, but I assume they have nothing to do with the issue we are facing, as it is reproduced with just the description. In fact, the text processing stops after an empty line in all cases I checked.

jaapio · 2024-05-17T15:44:23Z

I checked again. The issues were caused by my using the parser for just a single tag. So I do have a workaround that works for us right now: we can simply consume all tokens from the lexer.

Which brings me to another issue, one you might also be facing when reconstructing docblocks. When I have a docblock like this:

/**
 * @param string $myParam this is a description of the param
 *
 *                     And this line also belongs to $myParam.
 * @param array $other this is a new param.
 */

The token on line 2 contains \n with type Lexer::TOKEN_PHPDOC_EOL, but because the parser treats it as an EOL, it removes the trailing whitespace when building the description.

We fixed this by adding some post processing on the tokens:

    private function tokenizeLine(string $tagLine): TokenIterator
    {
        $tokens = $this->lexer->tokenize($tagLine);
        $fixed = [];
        foreach ($tokens as $token) {
            $value = $token[Lexer::VALUE_OFFSET];
            $trimmed = rtrim($value, " \t");
            // Split an EOL token that carries trailing horizontal whitespace into
            // a bare EOL token plus a separate horizontal-whitespace token.
            if ($token[Lexer::TYPE_OFFSET] === Lexer::TOKEN_PHPDOC_EOL && $trimmed !== $value) {
                $fixed[] = [$trimmed, Lexer::TOKEN_PHPDOC_EOL, $token[2] ?? null];
                $fixed[] = [ltrim($value, "\n\r"), Lexer::TOKEN_HORIZONTAL_WS, ($token[2] ?? null) + 1];
                continue;
            }

            $fixed[] = $token;
        }

        return new TokenIterator($fixed);
    }
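
To show what this post-processing does to a token stream, here is a self-contained sketch that runs without the library. The offset and token-type constants below are hypothetical stand-ins for the Lexer constants, and the sample tokens are made up for illustration; only the splitting logic mirrors the method above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the Lexer constants, so the sketch runs on its own.
const VALUE_OFFSET = 0;
const TYPE_OFFSET = 1;
const LINE_OFFSET = 2;
const TOKEN_EOL = 'EOL';
const TOKEN_WS = 'WS';
const TOKEN_OTHER = 'OTHER';

/**
 * Splits every EOL token that ends in spaces or tabs into a bare EOL token
 * followed by a horizontal-whitespace token on the next line.
 */
function splitEolTokens(array $tokens): array
{
    $fixed = [];
    foreach ($tokens as $token) {
        $value = $token[VALUE_OFFSET];
        $trimmed = rtrim($value, " \t");
        if ($token[TYPE_OFFSET] === TOKEN_EOL && $trimmed !== $value) {
            $line = $token[LINE_OFFSET] ?? 0;
            $fixed[] = [$trimmed, TOKEN_EOL, $line];
            $fixed[] = [ltrim($value, "\n\r"), TOKEN_WS, $line + 1];
            continue;
        }
        $fixed[] = $token;
    }

    return $fixed;
}

// Made-up token stream: the EOL token has swallowed four spaces of indentation.
$result = splitEolTokens([
    ['param', TOKEN_OTHER, 1],
    ["\n    ", TOKEN_EOL, 1],
    ['And', TOKEN_OTHER, 2],
]);

// Three tokens go in, four come out: the indentation is now its own token.
var_dump(count($result));
```

With the indentation carried by its own whitespace token, it is no longer discarded together with the EOL token.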

But I think this is a bug in the way phpstan parses docblocks. This expression seems to be too greedy: self::TOKEN_PHPDOC_EOL => '\\r?+\\n[\\x09\\x20]*+(?:\\*(?!/)\\x20?+)?',

Do you think it's worth fixing this?
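
The greediness can be checked in isolation with plain preg_match. This is a small sketch, not the lexer itself: the pattern is copied from the expression quoted above and wrapped in ~ delimiters, firstEolMatch is a hypothetical helper, and both input strings are made up. The second input assumes the leading asterisk was stripped before the text reached the lexer.

```php
<?php
declare(strict_types=1);

// Pattern copied from the thread, wrapped in "~" delimiters for preg_match.
$eolPattern = '~\\r?+\\n[\\x09\\x20]*+(?:\\*(?!/)\\x20?+)?~';

/** Hypothetical helper: returns the first substring matched by the pattern. */
function firstEolMatch(string $pattern, string $input): string
{
    preg_match($pattern, $input, $matches);

    return $matches[0] ?? '';
}

// Raw docblock text: newline, indentation, "*", then five spaces.
$raw = "param\n *     And this";
// Same text with the leading " *" removed (assumed pre-processing).
$stripped = "param\n     And this";

$rawMatch = firstEolMatch($eolPattern, $raw);
$strippedMatch = firstEolMatch($eolPattern, $stripped);

// Raw input: the match stops one space after the asterisk,
// so the remaining four spaces stay available as whitespace.
var_dump($rawMatch === "\n * ");

// Stripped input: the match swallows the newline and all five spaces,
// leaving no indentation outside the EOL token.
var_dump($strippedMatch === "\n     ");
```

In both cases the EOL match ends in horizontal whitespace, which is the condition the post-processing above checks for with rtrim.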
