Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix orphans check #2507

Merged
merged 1 commit into from
Sep 18, 2021
Merged

Fix orphans check #2507

merged 1 commit into from
Sep 18, 2021

Conversation

Mellthas
Copy link
Member

@Mellthas Mellthas commented Aug 1, 2021

This also effectively forbids line breaks before the first line.

Based on work in #1356
Fixes #1617

@Mellthas
Copy link
Member Author

Mellthas commented Aug 2, 2021

See the example HTML for the effect of preventing line breaks before the first line box:

<!DOCTYPE html>
<html>

<head>
<meta charset="UTF-8">
<style type="text/css">
@page {
	size: 400pt 300pt;
	margin: 50pt;
}

body {
	font-family: sans-serif;
	letter-spacing: 0.8pt;
	border: 1pt solid black;
	line-height: 19pt;
}

p {
	margin: 0 0 30pt 0;
	border: 5pt solid black;
	orphans: 1;
	widows: 1;
}
</style>
</head>

<body>
	<p>Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor
	incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
	nostrud exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat.</p>

	<p>Quis aute iure reprehenderit in voluptate velit esse cillum dolore eu fugiat
	nulla pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa
	qui officia deserunt mollit anim id est laborum.</p>
</body>

</html>

orphans-before.pdf
orphans-after.pdf

@bsweeney bsweeney added this to the 1.0.3 milestone Aug 24, 2021
@bsweeney bsweeney linked an issue Sep 3, 2021 that may be closed by this pull request
@bsweeney
Copy link
Member

bsweeney commented Sep 4, 2021

I wonder if we should go ahead and implement the full update to the orphan logic from 1356 (lines 360-380). Just returning a count of the number of lines is inaccurate since it always returns number of rendered lines up to this point instead of the line we're currently checking. This is important considering that the page break logic walks back through previous frames when a page break is disallowed.

@Mellthas
Copy link
Member Author

That is true. But I think the thing is (and probably the reason why the original code didn't try to find the exact line number): As long as we are only checking for orphans, checking against the current number of line boxes is actually sufficient. Backtracking within an inline context can only happen due to:

  1. The orphans check failing for the current frame. Then it will also fail for every previous sibling, as count($block_parent->get_line_boxes()) hasn't changed.
  2. The page-break-inside: avoid check failing. Then it will also fail for any previous sibling, as they share the same block parent.

That said, I just realized that the orphans property is checked on the current frame, so it could change while backtracking. But the spec states that orphans and widows only apply to block container elements, so that potential point is moot.

If I have not missed anything, then I would hold off implementing the full logic for getting the exact line number until it is actually needed (as it would only incur a slight performance hit right now, without achieving anything). I have updated the PR with a short comment and changed the code to check the orphans property on the block parent. What do you think?

@Mellthas
Copy link
Member Author

Mellthas commented Sep 13, 2021

That said, there seem to be cases where a new line box has already been added without backtracking involved. I’ll have to investigate that further.
Edit: Nevermind, that had backtracking involved, a block element in an inline context (ul within an li). The problem here is that the page-break logic thinks that the ul is the first block-level frame, when the preceding inline content should be implicitly wrapped in an anonymous block box instead (see https://www.w3.org/TR/CSS22/visuren.html#anonymous-block-level).

@bsweeney
Copy link
Member

Edit: Nevermind, that had backtracking involved, a block element in an inline context (ul within an li).

Is that a correct example? An LI doesn't represent inline context, unless I'm misunderstanding the example. Via https://www.w3.org/TR/CSS22/generate.html#lists:

An element with 'display: list-item' generates a principal block box for the element's content

I haven't had a chance to think of an example where this could produce incorrect results. I suspect it would be a rare case, but basically the line after the acceptable orphan count would have to result in a non-allowed page break. When Dompdf rewinds to see if there is an acceptable breaking point earlier it would do so within the lines that should not be orphaned since we're only checking the line count and not the current line.

Truthfully it's probably a rare case, and anyway it's how things are handled at present. It will become more of an issue when widows are supported.

@Mellthas
Copy link
Member Author

Sorry, was a bit unclear there. What I mean is something like the following, when a page break is to occur before the inner ul:

<ul>
  <li>Inline content
    <ul></ul>
  </li>
</ul>

The current code will forbid a page break before the inner ul because it thinks that the ul has no preceding block-level sibling. The inner ul has resulted in a new line box, and so when backtracking for an allowed page break, the line number for the inline content will be ‘reported’ as 2 instead of 1, and so the orphans check will not disallow a page break before it (if orphans is 2 or less without this PR, if orpahns is 1 with this PR). See the following full sample:

HTML
<!DOCTYPE html>
<html>

<head>
<meta charset="UTF-8">
<style>
@page {
    size: 400pt 300pt;
    margin: 50pt;
}

ul {
    counter-reset: li;
}

li {
    counter-increment: li;
}

li::before {
    content: counter(li);
}
</style>
</head>

<body>
<ul>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
<li><ul><li></li></ul></li>
<li></li>
<li></li>
<li></li>
</ul>
</body>

</html>

Output: list-bullet.pdf

@Mellthas
Copy link
Member Author

Mellthas commented Sep 14, 2021

#2552 fixes the sample. It will still fail if the inner ul has page-break-before: avoid. But that would be the only remaining kind of case that I can think of were just counting the number of line boxes is not enough.

This effectively forbids line breaks before the first line. Also, check
the orphans property on the block parent, as it only applies to block container
elements per spec.

Based on work in dompdf#1356
Fixes dompdf#1617

Co-authored-by: David Sickmiller <david@sickmiller.com>
@bsweeney bsweeney merged commit d1d7669 into dompdf:develop Sep 18, 2021
@Mellthas Mellthas deleted the fix-orphans branch September 19, 2021 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Orphans calculation seems to be off by one
2 participants