Identify with format switch reports erroneous JPEG quality if actual value is undefined #260

Open
bitsgalore opened this issue Aug 15, 2023 · 4 comments

Comments

@bitsgalore

bitsgalore commented Aug 15, 2023

ImageMagick version

6.9.10-23

Operating system

Linux

Operating system, version and so on

Linux Mint 20.1 Ulyssa

Description

Identify with the -format switch reports a misleading fallback value for the JPEG quality if the actual value is undefined.

Steps to Reproduce

Attached JPEG file was supplied to us by a vendor, who claims it was compressed at 50% quality. In an attempt to verify this claim, I first ran identify with verbose output:

identify -verbose test-quality.jpg

The resulting output does not contain the "Quality" property (which is used for JPEG quality). So for some reason identify doesn't seem to be able to establish the quality level in this case (I'm not entirely sure why, but from what I read here estimating JPEG quality can be tricky at low quality levels). But if I run identify like this (which only reports the quality level):

identify -format '%Q\n' test-quality.jpg

Result:

92

Which is obviously wrong in this case! The file size is pretty much what I would expect for a 50% quality JPEG for this kind of material; a double-check with this tool by Neal Krawetz also confirmed that the actual quality must be quite low.

I had a quick look into the code, and found this:

https://github.com/ImageMagick/ImageMagick/blob/f5bdfdd62af7109ad105f8af4e28111e353edecd/MagickCore/property.c#L2725

I'm not a C programmer, but if I understand this correctly, this forces the reported image quality to a "92" fallback value if the actual quality is 0 (I assume this is used internally if the actual quality cannot be determined). If I'm correct, a better solution may be to use something that clearly indicates that the quality level is undefined here.
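To make that reading concrete, here is a minimal Python sketch of the behavior as described above, next to one possible alternative. This is not ImageMagick's actual C code: the function names and the empty-string marker are hypothetical, and the only facts taken from this report are that an internal quality of 0 appears to mean "undefined" and that 92 is printed in that case.

```python
# Illustrative only: models the reporting behavior described in this issue,
# not the real implementation in MagickCore/property.c.

FALLBACK_QUALITY = 92  # value printed by '%Q' for the test file in this issue


def format_quality_current(quality: int) -> str:
    """Behavior as read from the code: 0 (assumed 'undefined') becomes 92."""
    return str(FALLBACK_QUALITY if quality == 0 else quality)


def format_quality_proposed(quality: int, undefined_marker: str = "") -> str:
    """Hypothetical alternative: print an explicit marker when undefined."""
    return undefined_marker if quality == 0 else str(quality)


if __name__ == "__main__":
    for q in (0, 40, 92):
        print(q, repr(format_quality_current(q)), repr(format_quality_proposed(q)))
```

With the first function, an undefined quality and a real quality of 92 produce the same string; with the second they stay distinguishable.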

(Side note: I also did some additional tests with low-quality JPEGs I made within ImageMagick, but for all of these identify was able to report the correct quality value. It's not clear to me why this is the case.)

Images

test-quality

@fmw42

fmw42 commented Aug 15, 2023

Best that I can see from identify and from exiftool, no quality value was recorded in the file. It may have been compressed at quality 50% but it was not recorded. So IM will assume its default quality of 92.

Someone else can double check my assessment.

@dlemstra dlemstra transferred this issue from ImageMagick/ImageMagick Aug 15, 2023
@bitsgalore
Author

bitsgalore commented Aug 16, 2023

@fmw42 You're probably right, but IM's current behavior makes it impossible to distinguish between JPEGs that were actually compressed at 92% quality, and JPEGs for which the quality is unknown, which isn't very helpful. Reporting some NaN value or even an empty string (I don't know if there are any IM conventions for this?) would be much more helpful. Knowing that IM cannot establish the quality is actually useful information, whereas getting some arbitrary value that's indistinguishable from a meaningful estimate isn't!
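Until the reporting changes, one possible workaround is to rely on the verbose output instead, since (as noted in the original report) it omits the "Quality" property when the value could not be established. The sketch below assumes the verbose output contains a line of the form "Quality: <integer>" when the quality is known; the helper names are made up for illustration.

```python
import re
import subprocess
from typing import Optional

# Assumption: 'identify -verbose' prints a line such as "  Quality: 40"
# when the quality is known, and no such line otherwise.
_QUALITY_LINE = re.compile(r"^\s*Quality:\s*(\d+)\s*$", re.MULTILINE)


def quality_from_verbose(verbose_output: str) -> Optional[int]:
    """Return the reported quality, or None if no Quality line is present."""
    match = _QUALITY_LINE.search(verbose_output)
    return int(match.group(1)) if match else None


def identify_quality(path: str) -> Optional[int]:
    """Run 'identify -verbose' on a file and parse its output."""
    result = subprocess.run(
        ["identify", "-verbose", path],
        capture_output=True, text=True, check=True,
    )
    return quality_from_verbose(result.stdout)
```

Here a return value of None means "unknown", while 92 means that a quality of 92 was actually reported.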

BTW you mention the absence of quality level info in the metadata. I noticed that as well. From the Fotoforensics explainer (tab "Estimating Quality") I understand it's quite rare for JPEGs to have this info in the metadata, and even if it's there, it's often unreliable. It's not entirely clear to me how IM establishes the quality; I suspect somewhere under the hood IM or some delegate library uses either the "Approximate Ratios" or "Approximate Quantization Tables" methods that are mentioned in the Fotoforensics piece. But from the code I can't quite figure out if this is indeed the case.

However, JPEGs created inside IM don't appear to contain quality-related metadata either. Nevertheless, "identify" is able to establish the quality!

A quick example. First I create a new JPEG with 40% compression quality:

convert -quality 40 wizard: wizard-40.jpg

The output of the following ExifTool command doesn't contain anything related to the quality level:

exiftool -X wizard-40.jpg

Despite this, using "identify":

identify -format '%Q\n' wizard-40.jpg

Result:

40

This is all slightly straying from the issue, which is really about the reporting. But since it popped up in the conversation, and I'd been doing some tests on this already, I might as well mention it in case it's of any use.

@fmw42

fmw42 commented Aug 16, 2023

If you define a quality and add it to the file properly, then it will be in the metadata. Most of the JPGs that I have checked have a quality value. If no quality value is in the file, then you get 92. If the file does have a quality value, it could also be 92, but then it shows up as a recorded value in the file rather than as a missing one.

An IM developer might comment further on this.

@bitsgalore
Author

bitsgalore commented Aug 16, 2023

A bit of further digging seems to confirm that IM actually determines the JPEG compression quality from the quantization tables, and not from some pre-defined metadata field:

Determine the JPEG compression quality from the quantization tables.
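For illustration, here is a rough Python sketch of how a quality level can be recovered from a quantization table in general. It is not ImageMagick's implementation. It assumes the encoder derived its luminance table from the example table in Annex K of the JPEG standard using libjpeg-style quality scaling, and that the table is supplied in natural (row-major) order. Under that assumption, a file written with different tables would yield no match, which is one possible (unconfirmed) reason why the quality of the vendor file cannot be established.

```python
from typing import List, Optional

# Example luminance quantization table from Annex K of the JPEG standard,
# in natural (row-major) order. libjpeg scales this table by quality.
BASE_LUMINANCE = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scale_factor(quality: int) -> int:
    """libjpeg-style mapping from quality (1-100) to a percentage scale."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def scaled_table(quality: int) -> List[int]:
    """Baseline (8-bit) luminance table an encoder would write for 'quality'."""
    s = scale_factor(quality)
    return [max(1, min(255, (b * s + 50) // 100)) for b in BASE_LUMINANCE]


def estimate_quality(table: List[int], tolerance: int = 0) -> Optional[int]:
    """Return the best-matching quality, or None if nothing is close enough.

    'tolerance' is the maximum allowed sum of absolute differences between
    the given table and the closest scaled standard table.
    """
    best_q, best_err = None, None
    for q in range(1, 101):
        err = sum(abs(a - b) for a, b in zip(table, scaled_table(q)))
        if best_err is None or err < best_err:
            best_q, best_err = q, err
    if best_err is not None and best_err <= tolerance:
        return best_q
    return None


if __name__ == "__main__":
    print(estimate_quality(scaled_table(40)))  # table built for quality 40
    print(estimate_quality([37] * 64))         # non-standard table: no match
```

The important design point for this issue is the None return value: an estimator of this kind has a natural "no match" outcome, and that outcome is the information that gets lost when it is reported as 92.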
