Doesn't recognize bitmap files exported from GIMP #58

Open
Gurfuzle opened this issue Nov 14, 2018 · 5 comments

Comments

@Gurfuzle

When I export images from GIMP as bitmaps, the library does not recognize their magic number. Running the file through xxd gives:

00000000: 424d 7a75 0200 0000 0000 7a04 0000 6c00 BMzu......z...l.
00000010: 0000 9001 0000 9001 0000 0100 0800 0000 ................
00000020: 0000 0071 0200 232e 0000 232e 0000 0001 ...q..#...#.....
00000030: 0000 0001 0000 4247 5273 0000 0000 0000 ......BGRs......

The file does start with 424d, but it is not recognized as a bitmap.
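
To make the dump easier to read, here is a minimal sketch that decodes the first 18 bytes shown by xxd above as a BMP file header followed by the DIB header size field. The class and method names (BmpHeaderPeek, hasBmpMagic, readLeInt) are made up for illustration and are not part of the library under discussion; the field offsets follow the commonly documented BMP layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BmpHeaderPeek {

    /** First 18 bytes of the xxd dump: 14-byte file header plus the DIB header size field. */
    static final byte[] DUMP_PREFIX = {
        0x42, 0x4d,             // "BM" magic
        0x7a, 0x75, 0x02, 0x00, // file size (little-endian)
        0x00, 0x00, 0x00, 0x00, // reserved
        0x7a, 0x04, 0x00, 0x00, // offset of the pixel data (little-endian)
        0x6c, 0x00, 0x00, 0x00  // DIB header size (little-endian), starts at offset 14
    };

    /** True if the content starts with the two ASCII characters "BM". */
    static boolean hasBmpMagic(byte[] bytes) {
        return bytes.length >= 2 && bytes[0] == 'B' && bytes[1] == 'M';
    }

    /** Reads a little-endian 32-bit integer at the given absolute offset. */
    static int readLeInt(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt(offset);
    }

    public static void main(String[] args) {
        System.out.println("magic ok:        " + hasBmpMagic(DUMP_PREFIX));
        System.out.println("file size:       " + readLeInt(DUMP_PREFIX, 2));
        System.out.println("pixel offset:    " + readLeInt(DUMP_PREFIX, 10));
        System.out.println("DIB header size: " + readLeInt(DUMP_PREFIX, 14));
    }
}
```

For this particular dump, the size field at offset 14 decodes to 108 (0x6c), the file size to 161146, and the pixel data offset to 1146.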

@Gurfuzle
Author

Gurfuzle commented Nov 14, 2018

Here's an example file (zipped)
example.bmp.zip

@j256
Owner

j256 commented Nov 14, 2018

Great example Mike. Thanks much.

@CrushaKRool

CrushaKRool commented Oct 19, 2020

I stumbled upon this myself and investigated a bit. The problem lies in MagicEntries.optimizeFirstBytes(), which calls MagicEntry.getStartsWithByte() -> StringType.getStartingBytes() -> StringType$TestInfo.getStartingBytes(). This always returns null if the string is less than 4 characters long.
That means every file type whose magic string pattern is shorter than 4 characters is left out of the optimization index and is never considered during subsequent matching attempts. Since the Bitmap format starts with only two fixed characters, BM, it falls victim to this rule.
The calling code only ever uses the first byte anyway, so requiring more than that seems unnecessary.

@j256
Owner

j256 commented Oct 19, 2020

Appreciate the look, @CrushaKRool. The code is supposed to use the first-byte index and then fall through to findMatch(). See

Let me get this test in place and then debug it.
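
As a rough illustration of the "first-byte index, then fall through" idea described above, here is a small self-contained sketch. It is not the library's code: FirstByteIndexSketch, Rule, add, and findMatch here are hypothetical names, and the boolean flag simply simulates a rule that was left out of the index (for example because its string is too short).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FirstByteIndexSketch {

    /** Hypothetical rule: a name plus the bytes the content must start with. */
    static final class Rule {
        final String name;
        final byte[] prefix;

        Rule(String name, byte[] prefix) {
            this.name = name;
            this.prefix = prefix;
        }

        boolean matches(byte[] content) {
            if (content.length < prefix.length) {
                return false;
            }
            for (int i = 0; i < prefix.length; i++) {
                if (content[i] != prefix[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    private final Map<Integer, List<Rule>> byFirstByte = new HashMap<>();
    private final List<Rule> all = new ArrayList<>();

    /** Every rule goes into the full list; only some also go into the first-byte index. */
    void add(Rule rule, boolean indexed) {
        all.add(rule);
        if (indexed) {
            byFirstByte.computeIfAbsent(rule.prefix[0] & 0xFF, k -> new ArrayList<>()).add(rule);
        }
    }

    /** Tries the first-byte bucket, then falls through to a scan of all rules. */
    String findMatch(byte[] content) {
        if (content.length == 0) {
            return null;
        }
        List<Rule> bucket = byFirstByte.getOrDefault(content[0] & 0xFF, Collections.<Rule>emptyList());
        for (Rule rule : bucket) {
            if (rule.matches(content)) {
                return rule.name;
            }
        }
        for (Rule rule : all) {
            if (rule.matches(content)) {
                return rule.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        FirstByteIndexSketch index = new FirstByteIndexSketch();
        index.add(new Rule("png", new byte[] {(byte) 0x89, 'P', 'N', 'G'}), true);
        index.add(new Rule("bitmap", new byte[] {'B', 'M'}), false); // not in the index
        System.out.println(index.findMatch(new byte[] {'B', 'M', 0x7a, 0x75}));
    }
}
```

In this sketch a rule that is missing from the index is still found by the fallback scan, so being absent from the index costs speed rather than correctness.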

@CrushaKRool

CrushaKRool commented Oct 21, 2020

Ah, you are right. I overlooked that.

Debugging further, I see that it identifies the first magic bytes as Bitmap but fails to match any of the child formats, which require the byte at index 14 to be 12, 40, 64, or 128. In my case it is 124 (exported from GIMP).
Unfortunately, the name of the parent MagicEntry for bitmap is "unknown" and none of the children overwrite it, so the ContentData ends up with the name "unknown" and no mime type set. The method is coded to return null as the ContentInfo in that case:

    ContentInfo matchBytes(byte[] bytes) {
        ContentData data = matchBytes(bytes, 0, 0, null);
        if (data == null || data.name == MagicEntryParser.UNKNOWN_NAME) {
            return null;

So I guess it boils down to two things: the Magic file does not provide enough data to handle the base case without a proper child match, and GIMP produces a header format that the Magic file does not list. According to the documentation on Wikipedia, the DIB header starts at 0-based index 14, and its first field gives the size of that header in bytes. So perhaps GIMP is producing a header that is 124 bytes in size, rather than one of the four sizes of the PC bitmap formats defined in the Magic file.
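
The behavior described above can be sketched as follows. This is a simplified model, not the library's implementation: BitmapChildMatchSketch, describe, and header are hypothetical names, the child labels are placeholders, and only the single byte at index 14 is inspected, as in the comment above.

```java
import java.util.HashMap;
import java.util.Map;

final class BitmapChildMatchSketch {

    static final String UNKNOWN_NAME = "unknown";

    /** Child rules keyed by the value at index 14; the labels are placeholders. */
    static final Map<Integer, String> CHILDREN = new HashMap<>();
    static {
        CHILDREN.put(12, "bitmap (header size 12)");
        CHILDREN.put(40, "bitmap (header size 40)");
        CHILDREN.put(64, "bitmap (header size 64)");
        CHILDREN.put(128, "bitmap (header size 128)");
    }

    /** Returns a description, or null when only the unnamed parent rule matched. */
    static String describe(byte[] bytes) {
        if (bytes.length < 15 || bytes[0] != 'B' || bytes[1] != 'M') {
            return null; // parent rule "BM" did not match at all
        }
        String name = UNKNOWN_NAME; // the parent's own name
        String child = CHILDREN.get(bytes[14] & 0xFF);
        if (child != null) {
            name = child; // a matching child overwrites the parent's name
        }
        // Mirrors the quoted check: a result still named "unknown" is reported as no match.
        return UNKNOWN_NAME.equals(name) ? null : name;
    }

    /** Builds a minimal fake header with "BM" and the given value at index 14. */
    static byte[] header(int dibHeaderSize) {
        byte[] bytes = new byte[18];
        bytes[0] = 'B';
        bytes[1] = 'M';
        bytes[14] = (byte) dibHeaderSize;
        return bytes;
    }

    public static void main(String[] args) {
        for (int size : new int[] {12, 40, 64, 108, 124, 128}) {
            System.out.println(size + " -> " + describe(header(size)));
        }
    }
}
```

In this model, sizes 108 and 124 pass the "BM" check but produce null, because no child rule supplies a name to replace "unknown".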
