Exifr fails to read PNG image parameters if the field is 1000+ characters #113

Open
JustMaier opened this issue Jun 9, 2023 · 1 comment

@JustMaier

First off, thank you for the excellent package. It's been great to use it to let users easily read metadata from AI-generated images.

We just ran into an issue where it seems that the package struggles to read parameters from PNGs if the field is 1000 or more characters. You can actually see this behavior in the playground itself by using these test images:

Is this a known issue? I browsed the codebase and didn't see anything tied to that character limit.

@00dani

00dani commented Oct 22, 2023

I did some quick experimentation. Exifr can successfully parse very long text fields from PNGs, but only if the text has been encoded into the PNG file as tEXt chunks. There are three kinds of text chunks the PNG specification allows - tEXt, zTXt, and iTXt - and Exifr currently only has code to process the first of these. The other two are simply ignored.
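To check which kind of text chunk a given PNG actually uses, a rough sketch like the following can walk the chunk list and report the text chunk types it finds. It is written in Python for brevity and is not Exifr's code; `iter_chunks`, `make_chunk`, and `text_chunk_types` are hypothetical names, and the demo bytes are a hand-built chunk sequence rather than a decodable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNK_TYPES = {b"tEXt", b"zTXt", b"iTXt"}


def iter_chunks(data: bytes):
    """Yield (chunk_type, payload) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        # Chunk layout: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        yield chunk_type, data[pos + 8:pos + 8 + length]
        pos += 12 + length
        if chunk_type == b"IEND":
            break


def text_chunk_types(data: bytes) -> list:
    """Return the types of all text chunks, in file order."""
    return [t.decode("ascii") for t, _ in iter_chunks(data) if t in TEXT_CHUNK_TYPES]


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Hypothetical helper that serialises one chunk; used only for the demo."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


# Hand-built chunk sequence (no image data), just enough for the walker.
demo = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"Title\x00short")
    + make_chunk(b"zTXt", b"parameters\x00\x00" + zlib.compress(b"x" * 1200))
    + make_chunk(b"iTXt", b"Comment\x00\x00\x00\x00\x00hi")
    + make_chunk(b"IEND", b"")
)
print(text_chunk_types(demo))  # ['tEXt', 'zTXt', 'iTXt']
```

Running something like this against one of the failing images would show whether the long field ended up in a zTXt or iTXt chunk instead of tEXt.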

tEXt is the simplest: it's just plain text, stored in the Latin-1 encoding. It's the easiest for Exifr to process as a result.
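As a minimal sketch of why tEXt is easy to handle: per the PNG specification, the chunk payload is a keyword, a single null byte, and then the text, all in Latin-1. The function name `parse_text_chunk` is hypothetical, and the example is in Python rather than Exifr's JavaScript.

```python
def parse_text_chunk(payload: bytes):
    """Decode a tEXt payload: Latin-1 keyword, null separator, Latin-1 text."""
    keyword, sep, text = payload.partition(b"\x00")
    if not sep:
        raise ValueError("tEXt chunk is missing the null separator")
    return keyword.decode("latin-1"), text.decode("latin-1")


# A text value well past 1000 characters decodes the same way as a short one.
long_value = "caf\u00e9 " + "a" * 1500
keyword, text = parse_text_chunk(b"parameters\x00" + long_value.encode("latin-1"))
print(keyword, len(text))  # parameters 1505
```

Nothing in this layout depends on the length of the text, which matches the observation that long tEXt fields parse fine.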

zTXt is semantically the same as tEXt, except that the text data is stored compressed, so it's recommended for longer strings. Depending on the software you're using to make your PNGs, it might decide to use a zTXt chunk rather than a tEXt chunk when the field's length hits 1000 characters, but there's no hard requirement that this happen. For example, ExifTool produces a tEXt chunk unless it's explicitly asked to use compression.
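A zTXt payload could be decoded roughly as follows. Per the PNG specification, the layout is a keyword, a null byte, a one-byte compression method (0 is the only defined value, meaning a zlib datastream), and then the compressed Latin-1 text. `parse_ztxt_chunk` is a hypothetical name, and this Python sketch leans on the standard `zlib` module, which is the same dependency that is awkward for Exifr in the browser.

```python
import zlib


def parse_ztxt_chunk(payload: bytes):
    """Decode a zTXt payload: keyword, null, method byte, zlib-compressed text."""
    keyword, sep, rest = payload.partition(b"\x00")
    if not sep or not rest:
        raise ValueError("malformed zTXt chunk")
    if rest[0] != 0:
        raise ValueError(f"unknown compression method {rest[0]}")
    text = zlib.decompress(rest[1:])
    return keyword.decode("latin-1"), text.decode("latin-1")


# Build a payload the way an encoder might for a long field, then read it back.
long_value = "steps: 20, " * 200
payload = b"parameters\x00\x00" + zlib.compress(long_value.encode("latin-1"))
keyword, text = parse_ztxt_chunk(payload)
print(keyword, text == long_value)  # parameters True
```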

Finally, iTXt refers to international text - the text is stored in UTF-8 encoding, and may optionally be compressed as well. ExifTool and most likely other PNG processing software will automatically switch to this encoding if the text to be stored contains any characters outside of the Latin-1 range.
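The iTXt layout has a few more fields. Per the PNG specification it is: keyword, null, compression flag, compression method, language tag, null, translated keyword, null, and then the UTF-8 text, which is zlib-compressed only when the flag is 1. The sketch below is in Python, and `parse_itxt_chunk` and the returned dictionary keys are hypothetical names chosen for illustration.

```python
import zlib


def parse_itxt_chunk(payload: bytes) -> dict:
    """Decode an iTXt payload into its fields; the text is UTF-8."""
    keyword, sep, rest = payload.partition(b"\x00")
    if not sep or len(rest) < 2:
        raise ValueError("malformed iTXt chunk")
    compression_flag, compression_method = rest[0], rest[1]
    language, sep1, rest = rest[2:].partition(b"\x00")
    translated, sep2, text = rest.partition(b"\x00")
    if not (sep1 and sep2):
        raise ValueError("malformed iTXt chunk")
    if compression_flag == 1:
        if compression_method != 0:
            raise ValueError(f"unknown compression method {compression_method}")
        text = zlib.decompress(text)
    elif compression_flag != 0:
        raise ValueError(f"invalid compression flag {compression_flag}")
    return {
        "keyword": keyword.decode("latin-1"),
        "language": language.decode("ascii"),
        "translated_keyword": translated.decode("utf-8"),
        "text": text.decode("utf-8"),
    }


# Uncompressed iTXt carrying characters outside the Latin-1 range.
value = "\u65e5\u672c\u8a9e prompt"
plain = b"parameters\x00\x00\x00\x00\x00" + value.encode("utf-8")
print(parse_itxt_chunk(plain)["text"] == value)  # True

# The same text with the compression flag set.
packed = b"parameters\x00\x01\x00\x00\x00" + zlib.compress(value.encode("utf-8"))
print(parse_itxt_chunk(packed)["text"] == value)  # True
```

Only the compressed branch needs an inflate implementation; uncompressed iTXt needs nothing beyond a UTF-8 decoder.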

The main problem with Exifr supporting these extra chunk types is that both of them can involve compression: zTXt always does, and iTXt does optionally. Currently Exifr can handle a few compressed fields, but only when it's running in Node, not in the browser, because Node's standard library provides zlib bindings and browsers do not. Requiring an alternative deflate implementation, such as fflate, to read compressed metadata in-browser would be entirely doable, however.
