PDF Tags (accessibility) #1234

Open
christophersavory opened this issue Mar 16, 2023 · 9 comments

Comments

@christophersavory

I found this discussion regarding iText tags and accessibility in a forum from 2012.
https://forums.opentext.com/forums/developer/discussion/52337/how-do-you-create-an-accessible-pdf-report-with-birt

I couldn't find anything regarding PDF accessibility in the BIRT Docs.
https://eclipse.github.io/birt-website/docs/t_brief-editor-tour

Has PDF Emitter accessibility (tags) been implemented into BIRT? If not, is it on the roadmap?

@hvbtup
Contributor

hvbtup commented Mar 17, 2023

I asked the same question a few years ago.
It is not on the roadmap.
I think a great many people worldwide would benefit from PDF/UA support.

PDFs generated for authorities MUST support this in the EU.

But nobody is willing to pay for it.
And it will be a lot of work, so it cannot be done by an enthusiastic hobby programmer.

Where are the big companies, where are the states, where is the EU?
IMHO this is a topic where it is necessary to actually pay for open source development.

From a technical point of view:

BIRT uses OpenPDF for PDF creation.
AFAIK OpenPDF does not support creating Tagged PDF, which is a precondition for PDF/UA.
See LibrePDF/OpenPDF#181

The BIRT community can only start developing PDF/UA support for BIRT after OpenPDF has added support for it.

At least one tiny bit of preparation is included in BIRT:
There is a PDF tag type property.
[Screenshot: PDF tag type property]
This was certainly meant to assist creating tagged PDFs.
But AFAIK this property isn't actually used.

I don't know: Did the commercial BIRT product support creating tagged PDFs?

@hvbtup
Contributor

hvbtup commented Mar 23, 2023

It seems like JasperReports supports creating PDF/UA.
Internally, JasperReports uses OpenPDF just like BIRT (a patched version at the moment, see LibrePDF/OpenPDF#765)

So, in contrast to what I said in my previous comment:

It would certainly be possible to create PDF/UA with BIRT.

It's just that the OpenPDF community itself is not focused on this and doesn't provide examples.
But if JasperReports can do it, why shouldn't we?

Still, this is certainly a lot of work.

@christophersavory
Author

It looks like OpenPDF might already support PDF/A-1a and PDF/A-1b.

See lines 1738 and 1740 of https://github.com/LibrePDF/OpenPDF/blob/3b38ad8588669d24fd1f772ec10bb516e996e3c1/openpdf/src/main/java/com/lowagie/text/pdf/PdfWriter.java

Do we just need to create a new EMITTER_ID that sets the correct PDFXConformance on the PdfWriter?

@MayurDeore

Any updates on implementing tagged PDF functionality? Has anyone made progress on this?

@hvbargen
Contributor

hvbargen commented Oct 4, 2023

I'm on vacation this week and spent some time investigating yesterday. I was able to create a valid PDF/UA-1 document using the rather low-level functions of OpenPDF (validated with the PAC 2021 validator).
The idea is that you create a structure tree containing the logical structure of the content (similar to basic HTML) and you link every piece of content on the pages to one of these structure elements.
Basically, the structure elements correspond to the item instances.
So I really think it is possible to add this to BIRT.
But there is a lot to think about.
E.g. do we need to distinguish between locale and language? And while a BIRT item typically corresponds to a specific tag in the structure tree, this is not always the case, so the mapping must be configurable.
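
To make the structure-tree idea above concrete, here is a minimal sketch in plain Java. It does not use OpenPDF or BIRT at all; the class names `StructureElement`, `PageTagger` and `StructureTreeSketch` are made up for illustration. It only models the bookkeeping: a tree of structure elements (using standard PDF structure types such as `Document`, `H1`, `P`) and a per-page counter of marked-content IDs (MCIDs) that links each piece of page content to exactly one element. A real implementation would additionally have to write these objects into the PDF.

```java
import java.util.ArrayList;
import java.util.List;

/** One node of the logical structure tree (hypothetical model class). */
final class StructureElement {
    final String tag; // standard structure type, e.g. "Document", "H1", "P"
    final List<StructureElement> children = new ArrayList<>();
    final List<Integer> markedContentIds = new ArrayList<>(); // content linked to this element

    StructureElement(String tag) {
        this.tag = tag;
    }

    /** Creates a child element, appends it and returns it. */
    StructureElement addChild(String childTag) {
        StructureElement child = new StructureElement(childTag);
        children.add(child);
        return child;
    }
}

/** Hands out marked-content IDs for a single page (hypothetical model class). */
final class PageTagger {
    private int nextMcid = 0;

    /** Links one piece of page content to a structure element and returns its MCID. */
    int tag(StructureElement element) {
        int mcid = nextMcid++;
        element.markedContentIds.add(mcid);
        return mcid;
    }
}

final class StructureTreeSketch {
    /** Renders the tree as an indented outline: tag name followed by its MCIDs. */
    static String outline(StructureElement element, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(element.tag).append(' ').append(element.markedContentIds).append('\n');
        for (StructureElement child : element.children) {
            sb.append(outline(child, depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StructureElement doc = new StructureElement("Document");
        StructureElement heading = doc.addChild("H1");
        StructureElement paragraph = doc.addChild("P");

        PageTagger page = new PageTagger();
        page.tag(heading);   // heading text
        page.tag(paragraph); // first text chunk of the paragraph
        page.tag(paragraph); // second text chunk of the same paragraph

        System.out.print(outline(doc, 0));
    }
}
```

In this model, one report item would usually map to one `StructureElement`, but several chunks of page content can point to the same element, which is one reason the item-to-tag mapping needs to be configurable.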

@hvbargen
Contributor

hvbargen commented Oct 4, 2023

Reminder to myself: ATM the example is saved to a private GH repo.

@luzhanov
Contributor

luzhanov commented Oct 4, 2023

@hvbargen could you please share your experimental code, if possible? I'm making similar attempts to integrate PDF/A tags into a custom PDF emitter, and this would be really helpful.

@hvbargen
Contributor

hvbargen commented Oct 4, 2023

OK, here it is:

https://github.com/hvbargen/openpdf-ua

As I said, I'm convinced that it is possible to add PDF/UA (and PDF/A) support to BIRT.
But this cannot work as a one-man show. We need people who can help with specifying, programming, and testing.
Anyway, this can only start after 4.14.

@luzhanov
Contributor

luzhanov commented Oct 5, 2023

Thank you @hvbtup, this is really helpful! I can see that my approach is similar to yours. I'm trying to do this with the BIRT PDF emitter: first I add the tag structure in a similar way, but the trickiest part is tagging the content itself (every text entry, image, etc.).

Currently we're using BIRT 4.9.0, as it is the latest POM-based version we can integrate into our application. This issue is blocking us from moving to newer versions: #625
