Doesn't recognize bitmap files exported from GIMP #58

Gurfuzle · 2018-11-14T16:52:00Z

When I export images from GIMP as bitmaps, the library does not recognize their magic number. Running the file through xxd gives:

00000000: 424d 7a75 0200 0000 0000 7a04 0000 6c00 BMzu......z...l.
00000010: 0000 9001 0000 9001 0000 0100 0800 0000 ................
00000020: 0000 0071 0200 232e 0000 232e 0000 0001 ...q..#...#.....
00000030: 0000 0001 0000 4247 5273 0000 0000 0000 ......BGRs......

The file does start with 424d ("BM"), but it is not recognized as a bitmap.
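For reference, the fixed-offset fields in the dump above can be decoded with a few lines of Java. This is only a sketch for reading the dump, not part of simplemagic: the class and helper names (BmpHeaderPeek, hex, u32, u16) are made up for illustration, and the offsets follow the usual BMP layout (file size at 2, pixel data offset at 10, DIB header size at 14, width at 18, height at 22, bits per pixel at 28, all little-endian).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BmpHeaderPeek {

    // First 32 bytes of the xxd dump quoted in this issue.
    static final byte[] HEADER = hex(
            "424d7a750200000000007a0400006c00"
          + "00009001000090010000010008000000");

    /** Converts a hex string such as "424d" into bytes. */
    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Reads the unsigned 32-bit little-endian value at the given offset. */
    static long u32(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt(offset) & 0xFFFFFFFFL;
    }

    /** Reads the unsigned 16-bit little-endian value at the given offset. */
    static int u16(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort(offset) & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println("magic:           " + (char) HEADER[0] + (char) HEADER[1]);
        System.out.println("file size:       " + u32(HEADER, 2));
        System.out.println("pixel offset:    " + u32(HEADER, 10));
        System.out.println("DIB header size: " + u32(HEADER, 14));
        System.out.println("width x height:  " + u32(HEADER, 18) + " x " + u32(HEADER, 22));
        System.out.println("bits per pixel:  " + u16(HEADER, 28));
    }
}
```

For this dump the DIB header size at offset 14 decodes to 108 (0x6c), and the image is 400 x 400 pixels at 8 bits per pixel.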

Gurfuzle · 2018-11-14T17:36:16Z

Here's an example file (zipped)
example.bmp.zip

j256 · 2018-11-14T20:25:26Z

Great example Mike. Thanks much.

CrushaKRool · 2020-10-19T14:24:28Z

I've actually stumbled upon this myself and investigated a bit. The problem lies in MagicEntries.optimizeFirstBytes(), which calls MagicEntry.getStartsWithByte() -> StringType.getStartingBytes() -> StringType$TestInfo.getStartingBytes(). This always returns null if the string is fewer than 4 characters long.
This means that any file type whose magic bytes start with a string pattern shorter than 4 characters does not end up in the optimization index and is never considered during subsequent matching attempts. Since the bitmap format has only the two fixed characters BM as its starting string, it falls victim to this rule.
The calling code only ever uses the first byte anyway, so requiring more than that seems unnecessary.

j256 · 2020-10-19T23:17:26Z

Appreciate the look, @CrushaKRool. The code is supposed to use the first-byte index and then fall through to findMatch(). See

simplemagic/src/main/java/com/j256/simplemagic/entries/MagicEntries.java

Line 122 in 211cf35

return findMatch(bytes, entryList);

Let me get this test in place and then debug it.
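The lookup order described here (try the first-byte index, then fall through to the full entry list) can be sketched roughly as follows. This is not the simplemagic code: FirstByteLookup, Entry, add and find are hypothetical names, and the "only index patterns of 4 or more characters" rule is taken from the description earlier in this thread as an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FirstByteLookup {

    /** Hypothetical stand-in for a magic entry: a name and the leading bytes it expects. */
    static final class Entry {
        final String name;
        final byte[] prefix;

        Entry(String name, String prefix) {
            this.name = name;
            this.prefix = prefix.getBytes(StandardCharsets.US_ASCII);
        }

        boolean matches(byte[] data) {
            if (data.length < prefix.length) {
                return false;
            }
            for (int i = 0; i < prefix.length; i++) {
                if (data[i] != prefix[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    private final List<Entry> allEntries = new ArrayList<>();
    private final Map<Byte, List<Entry>> byFirstByte = new HashMap<>();

    void add(Entry entry) {
        allEntries.add(entry);
        // Assumption from the thread: patterns shorter than 4 characters are not indexed.
        if (entry.prefix.length >= 4) {
            byFirstByte.computeIfAbsent(entry.prefix[0], k -> new ArrayList<>()).add(entry);
        }
    }

    /** Tries the first-byte index, then falls through to a scan of every entry. */
    String find(byte[] data) {
        if (data.length > 0) {
            Entry hit = firstMatch(data, byFirstByte.getOrDefault(data[0], List.of()));
            if (hit != null) {
                return hit.name;
            }
        }
        Entry hit = firstMatch(data, allEntries);
        return hit == null ? null : hit.name;
    }

    private static Entry firstMatch(byte[] data, List<Entry> candidates) {
        for (Entry entry : candidates) {
            if (entry.matches(data)) {
                return entry;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        FirstByteLookup lookup = new FirstByteLookup();
        lookup.add(new Entry("gif", "GIF8"));
        lookup.add(new Entry("bitmap", "BM"));
        byte[] bmp = {0x42, 0x4d, 0x7a, 0x75};
        System.out.println(lookup.find(bmp));
    }
}
```

In this sketch the two-character "BM" entry never enters the index, yet it is still found by the fall-through scan, so a missing index entry alone would not explain the failure.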

CrushaKRool · 2020-10-21T15:18:17Z

Ah, you are right. I overlooked that.

Debugging it further, it seems to identify the first magic bytes as Bitmap but fails to match any of the child formats, which require the byte at index 14 to be either 12, 40, 64 or 128. In my case it's 124, though (exported from GIMP).
Unfortunately, since the name of the parent MagicEntry for bitmap is "unknown" and none of the children overwrite this with something else, it will end up as "unknown" in the ContentData and also not set any mime types. And the method is coded to return null as ContentInfo in that case.

simplemagic/src/main/java/com/j256/simplemagic/entries/MagicEntry.java

Lines 64 to 67 in 074a1fd

    
    ContentInfo matchBytes(byte[] bytes) {
        ContentData data = matchBytes(bytes, 0, 0, null);
        if (data == null || data.name == MagicEntryParser.UNKNOWN_NAME) {
            return null;

So I guess it boils down to both the Magic file not providing enough data to handle the base case without a proper child match, as well as GIMP producing a header of an unknown format. According to the documentation on Wikipedia, the byte on the 0-based index 14 is the start of the DIB header and tells the size of that header in bytes. So perhaps GIMP is producing some kind of header that is only 124 bytes in size, rather than the four other sizes of the PC bitmap formats defined in the Magic file.
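To make the header-size point concrete, here is a rough sketch of classifying a bitmap by the 32-bit little-endian DIB header size at offset 14, with a generic fallback instead of returning nothing. It is not how simplemagic or the Magic file works: DibHeaderKind and describe are hypothetical names, and the size-to-name table uses the standard header names from the BMP format documentation (where 108 is BITMAPV4HEADER and 124 is BITMAPV5HEADER) rather than the entries in the Magic file.

```java
import java.util.Map;

class DibHeaderKind {

    // Standard DIB header sizes and their names; not the library's Magic file entries.
    static final Map<Integer, String> KNOWN = Map.of(
            12, "BITMAPCOREHEADER",
            40, "BITMAPINFOHEADER",
            64, "OS22XBITMAPHEADER",
            108, "BITMAPV4HEADER",
            124, "BITMAPV5HEADER");

    /** Returns a description for BMP data, or null if the bytes do not start with "BM". */
    static String describe(byte[] bytes) {
        if (bytes.length < 18 || bytes[0] != 'B' || bytes[1] != 'M') {
            return null;
        }
        // DIB header size: 32-bit little-endian value at offset 14.
        int size = (bytes[14] & 0xFF)
                | (bytes[15] & 0xFF) << 8
                | (bytes[16] & 0xFF) << 16
                | (bytes[17] & 0xFF) << 24;
        String kind = KNOWN.get(size);
        if (kind == null) {
            // Fallback: still report a bitmap even when the header size is unfamiliar.
            return "PC bitmap (unrecognized " + size + "-byte DIB header)";
        }
        return "PC bitmap, " + kind;
    }

    public static void main(String[] args) {
        byte[] header = new byte[18];
        header[0] = 'B';
        header[1] = 'M';
        header[14] = 124; // the value reported in this comment
        System.out.println(describe(header));
    }
}
```

The design choice illustrated is the fallback branch: a file that matches "BM" but carries an unlisted header size still gets a generic bitmap description rather than no result.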
