dompdf Urdu Words Not Joining Issue #3091

Open
thewebdevelopment opened this issue Dec 25, 2022 · 7 comments

@thewebdevelopment

thewebdevelopment commented Dec 25, 2022

The font loads, but the characters are rendered separately instead of joined. I tried several other Urdu fonts, such as Zoya and noto_nastaliq_urdu, but nothing worked.

My complete code is:

<?php
require_once "vendor/autoload.php";

use Dompdf\Dompdf;
use Dompdf\CanvasFactory;
use Dompdf\Exception;
use Dompdf\FontMetrics;
use Dompdf\Options;

use FontLib\Font;
ini_set("display_errors", true);
ini_set("error_log", "phperr.log");
ini_set("log_errors", true);
error_reporting(E_ALL);

$pdf = new Dompdf(new Options([
    'defaultPaperSize'          => 'a4',
    'defaultPaperOrientation'   => 'portrait',
    'isRemoteEnabled'           => true,
    'allowUrlFopen'             => true,
    'isHtml5ParserEnabled'      => true,
    'isFontSubsettingEnabled' => false,
    'direction' => 'rtl',
]));

// define("DOMPDF_ENABLE_HTML5PARSER", true);

$pdf->loadHtml('
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta charset="utf-8">

<style>

@font-face {
  font-family: "Zoya";
  font-style: normal;
  font-weight: 400;
  src: url(http://localhost/dompdf/drive/zoya-regular-1.ttf);
}


* {
  font-family: "Zoya";
  font-style: normal;
  font-weight: 400;
  direction: rtl;

}
</style>
</head>

<body>    
  <span> اردو پوائنٹ    </span>
</body>

',"UTF-8");
$pdf->render();
$pdf->stream('test.pdf', array('Attachment'=> 0 ));

[screenshot of the rendered PDF]

thewebdevelopment changed the title from "dompdf is loading urdu font but letter's are seprated" to "dompdf Urdu Words Not Joining Issue" on Dec 26, 2022
@bsweeney
Member

bsweeney commented Dec 26, 2022

Dompdf does not currently perform any kind of complex text layout operations (such as shaping/joining). You would need to pre-process your text through some other means prior to rendering.

@thewebdevelopment
Author

These are not shapes or anything other than plain Urdu text! Is it possible in dompdf to load a font and display text in that font's language exactly as written?

Other PDF libraries work fine; only dompdf has this issue.

@bsweeney
Member

bsweeney commented Dec 27, 2022

You misunderstand, and may be surprised to know that the joining of the characters doesn't just happen automatically. Shaping is a process by which a program takes the input character string and performs sufficient operations so that the resulting visible text is rendered correctly. The program generating a document has to have embedded in it sufficient logic about how to format a language to render it correctly. Unfortunately, at this time, Dompdf does not have that information.

If you're interested in more information on the subject you can start with Wikipedia: Complex text layout.
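To make the point concrete, here is a minimal sketch (not Dompdf code) that dumps the code points stored in a string. The helper name `codePoints` is made up for illustration, and it assumes PHP 7.4+ with the mbstring extension. The string holds the same base letter three times; nothing in it says which joined form each occurrence should take.

```php
<?php
// Hypothetical helper: list the Unicode code points of a UTF-8 string as "U+XXXX" labels.
function codePoints(string $text): array
{
    return array_map(
        static fn (string $char): string => sprintf('U+%04X', mb_ord($char, 'UTF-8')),
        mb_str_split($text, 1, 'UTF-8')
    );
}

// Three copies of ARABIC LETTER BEH (U+0628), written as escapes so the input is unambiguous.
$word = "\u{0628}\u{0628}\u{0628}";

// Prints: U+0628 U+0628 U+0628
echo implode(' ', codePoints($word)), PHP_EOL;
```

When this word is displayed correctly, the three letters appear in initial, medial, and final form respectively (Unicode also lists these as separate compatibility characters, U+FE91, U+FE92, and U+FE90). Choosing those forms is the shaping step: the stored text only ever contains U+0628.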

@HariharanUmapathi

@bsweeney thanks for the reference.

I'm facing the same kind of issue with Tamil script: complex-layout characters do not render correctly.

Is there any way to register a callback that takes over the CTL (complex text layout) step and passes the corrected text on to the PDF renderer?

Example:

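$rawUtf8Text = "தமிழ் கணிணி பொறியாளன்";

// Proposed hook: receives the raw UTF-8 text and returns text ready to be written to the PDF.
function handleCTL(string $rawUtf8Text): string
{
    // The expected shaping/reordering would happen here.
    // Placeholder: return the input unchanged.
    $processedUtf8Text = $rawUtf8Text;

    return $processedUtf8Text;
}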
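Nothing in this thread indicates that Dompdf exposes such a callback, so the closest equivalent today would be to run the text through the hook before building the HTML. A rough sketch under that assumption follows; `shapeForHtml` and `$identityShaper` are hypothetical names, and the shaper is only a stand-in that returns its input unchanged.

```php
<?php
// Hypothetical helper (not part of Dompdf): run text through a shaping callback,
// then escape the result so it can be embedded in the HTML handed to Dompdf.
function shapeForHtml(string $rawUtf8Text, callable $shaper): string
{
    $shaped = $shaper($rawUtf8Text);

    return htmlspecialchars($shaped, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Placeholder shaper: returns its input unchanged. A real one would perform
// the script-specific shaping and reordering.
$identityShaper = static fn (string $text): string => $text;

$html = '<span>' . shapeForHtml('தமிழ் கணிணி பொறியாளன்', $identityShaper) . '</span>';

// $html can then be passed to $pdf->loadHtml($html, 'UTF-8') as in the original report.
```

Whether the shaped output survives Dompdf's own text handling and font embedding is untested here.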

@bsweeney
Member

I don't really have an answer at this time since I have not had an opportunity to research the issue more thoroughly.

@HariharanUmapathi

@bsweeney

If tc-lib-unicode is used to add support for Arabic bidirectional and RTL text, I'll try to implement Tamil character mapping with tc-lib-unicode as well.

@bsweeney
Member

bsweeney commented May 2, 2023

You can check out #2107 for a tweak to Dompdf that includes tc-lib-unicode. The branch had some issues rendering text that was multi-line and/or used inline elements so it hasn't moved forward.

I'm not sure what language support tc-lib-unicode has. I haven't fully reviewed the code but the shaping logic has a specific class for Arabic and nothing else.
