Suggestion: Faster Conversion of FLIR .seq Files to Numpy Arrays #90

marcus858 · 2023-09-15T17:33:09Z

I've discovered a method to convert FLIR .seq files into numpy arrays significantly faster than using exiftool. In my tests, this approach converted 900 frames to an array in approximately 0.5 seconds.

Here's the improved code:

import numpy as np

SEQ_CONVERSION_ERROR_CAP = 5  # Define a cap for the number of allowed conversion errors

def seq_to_arr(infilename, width=320, height=240):
    """
    Convert a FLIR .seq file to a 3D numpy array, with each 2D slice representing a frame.

    Parameters:
    - infilename: path to the .seq file
    - width, height: dimensions of the images inside the .seq file (default 320x240)

    Returns:
    - A 3D numpy array where each slice along the third dimension represents a frame.
    """
    
    pat = b"\x46\x46\x46\x00"
    images = []

    # Read the .seq file in binary mode
    with open(infilename, "rb") as f:
        file_content = f.read()

    # Split the file on the FFF magic bytes. The first chunk is whatever precedes
    # the first magic string (usually empty), so skip it.
    n = 0
    errCount = 0
    for content in file_content.split(pat)[1:]:
        content_with_pat = pat + content

        # Locate the start of the image data within this chunk (recomputed for every chunk)
        data_start_idx = _find_last_occurrence(content_with_pat)

        try:
            data = np.frombuffer(
                content_with_pat, dtype=np.uint16, count=width * height, offset=data_start_idx
            )
            integerImg = np.reshape(data, (height, width))
            images.append(integerImg)
        except ValueError:
            # np.frombuffer raises ValueError when the chunk is too short for a full frame
            errCount += 1
            if errCount > SEQ_CONVERSION_ERROR_CAP:
                print(f"Too many conversion errors, stopping at chunk {n}")
                break
        n += 1

    # Stack the 2D images along a new third dimension to create a 3D numpy array
    images_array = np.array(images)
    images_array = np.transpose(images_array, (1, 2, 0))

    return images_array


def _find_last_occurrence(array):
    """
    Find the last occurrence of a specific byte pattern in a byte array.

    Parameters:
    - array: the byte array to search within

    Returns:
    - The index immediately after the last occurrence of the pattern.
    """
    
    pattern = bytes.fromhex("00 00 00 00")
    return array.rfind(pattern) + 4

I hope this can be of help to those working with FLIR .seq files. Feedback and improvements are welcome!

jveitchmichaelis · 2023-09-16T21:46:12Z

Thanks! I'd suggest benchmarking this against the implementation that's currently in the code (which does not use Exiftool). Unfortunately, due to some variation in the SEQ header, you can't just search for the magic string. I did something very similar to your method for a while, but users repeatedly raised issues that their files didn't load correctly. In the end I had to reverse engineer the header from Exiftool's code, and that's what's currently in the library.
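A comparison like this could be set up with the standard library's timeit module. The sketch below is only a harness: `best_time` and `dummy_loader` are hypothetical names, and the dummy workload exists purely so the snippet runs on its own. The path "recording.seq" in the comment is likewise a placeholder.

```python
import timeit


def best_time(func, *args, repeats=5):
    """Return the fastest wall-clock time (seconds) over `repeats` single runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeats, number=1))


# Placeholder workload so the sketch is self-contained. Swap in the real
# loaders, e.g. best_time(seq_to_arr, "recording.seq"), and time each
# implementation on the same file.
def dummy_loader(n):
    return bytes(n).split(b"\x46\x46\x46\x00")


elapsed = best_time(dummy_loader, 1_000_000)
print(f"best of 5: {elapsed:.4f} s")
```

Taking the minimum over several runs reduces the influence of disk caching and other background noise on the comparison.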

Note that you also need to load the calibration coefficients (which flirpy now does correctly - I hope), which your code doesn't handle at the moment. In the first implementation I would parse the first FFF subframe with Exiftool to get the metadata, but now we just read it directly.

A small optimisation: file_content.split(pat) is not lazy, so it hides an O(N) scan of the file and builds a list with a copy of every chunk up front. To avoid this, flirpy uses a regex which yields an iterator, so it should be a bit kinder on memory.

flirpy/src/flirpy/io/seq.py

Lines 55 to 65 in bfa3da7

    def _get_fff_iterator(self, seq_blob):
        """
        Internal function which returns an iterator containing the
        indices of the files in the SEQ. Probably this should be
        converted to something a bit more intelligent which
        actually identifies the size of the records in the file.
        """
        magic_pattern_fff = "\x46\x46\x46\x00".encode()
        valid = re.compile(magic_pattern_fff)
        return valid.finditer(seq_blob)
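To make the difference concrete, here is a minimal sketch on a synthetic byte string (not a real SEQ file). The helper name `iter_magic_offsets` is made up for the example, and `re.escape` is used so the magic bytes are matched literally.

```python
import re

MAGIC = b"\x46\x46\x46\x00"  # "FFF\0", the same magic bytes as above


def iter_magic_offsets(blob):
    """Lazily yield the byte offset of each occurrence of MAGIC in blob."""
    # finditer returns an iterator, so matches are located on demand
    # and no per-chunk copies of the data are created.
    for match in re.finditer(re.escape(MAGIC), blob):
        yield match.start()


# Synthetic stand-in for a SEQ blob: three "records" with dummy payloads.
blob = MAGIC + b"aaaa" + MAGIC + b"bb" + MAGIC + b"cccccc"

offsets = list(iter_magic_offsets(blob))
chunks = blob.split(MAGIC)  # eager: a list holding a copy of every chunk

print(offsets)  # [0, 8, 14]
print(chunks)   # [b'', b'aaaa', b'bb', b'cccccc']
```

Note that split also produces a leading empty chunk when the data starts with the magic bytes, which is one more thing to handle on that path.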

But the general approach is the same - you find the byte offsets for each frame and the offset for the data within it. Once you've done that, you can read the data (with known size) in a big chunk.
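As a rough sketch of that last step, assuming the frame offsets and the data offset are already known: `read_frames` is a hypothetical helper, the blob is synthetic, and the data offset is treated as constant per record, which real files do not guarantee (it comes from the header).

```python
import numpy as np


def read_frames(blob, frame_offsets, data_offset, width, height):
    """Read one uint16 image per frame record from blob.

    frame_offsets: byte offset of each frame record within blob
    data_offset:   offset of the pixel data relative to the record start
                   (assumed constant here; in real files it comes from the header)
    """
    frames = np.empty((len(frame_offsets), height, width), dtype=np.uint16)
    for i, start in enumerate(frame_offsets):
        # Read width * height pixels in one go, starting at the data offset
        pixels = np.frombuffer(
            blob, dtype=np.uint16, count=width * height, offset=start + data_offset
        )
        frames[i] = pixels.reshape(height, width)
    return frames


# Synthetic example: two 2x3 frames, each preceded by an 8-byte dummy header.
width, height, header = 3, 2, b"HEADER00"
frame_a = np.arange(6, dtype=np.uint16)
frame_b = np.arange(6, 12, dtype=np.uint16)
blob = header + frame_a.tobytes() + header + frame_b.tobytes()

record_size = len(header) + width * height * 2  # 2 bytes per uint16 pixel
frames = read_frames(blob, [0, record_size], len(header), width, height)
print(frames.shape)  # (2, 2, 3)
```

Preallocating the output array avoids building a Python list of frames and stacking it afterwards.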

You can look at the current FFF decoder here: https://github.com/LJMUAstroecology/flirpy/blob/main/src/flirpy/io/fff.py
