
Generating Two Replicated Tracks in Predictions and ID Merging #1652

Open
neugun opened this issue Jan 10, 2024 · 2 comments
Labels
bug Something isn't working

Comments


neugun commented Jan 10, 2024

Bug description
The SLEAP tracking system generates multiple tracks for each animal, even though the animals are designated as IDs 1 and 2. This produces an excessive number of tracks and complicates downstream analysis, especially in software like SimBA, which misinterprets the extra tracks as additional body parts. The core problem is that tracking does not accurately discriminate between the animals' movements, resulting in multiple, redundant tracks instead of one track per ID.

Expected behaviour
Ideally, the system should generate two distinct, non-replicated tracks for the two animals, with each track accurately representing the movement and behavior of each individual animal without creating unnecessary duplicates.

Actual behaviour
Currently, the system generates multiple tracks for the two animals, regardless of setting the IDs as 1 and 2. This redundancy creates a confusing dataset, which is difficult to analyze and interpret correctly, particularly in subsequent software applications.

Your personal set up
OS: Windows 10
Version(s): SLEAP (specific version not provided)
SLEAP installation method: pip package (assumed based on usage of Windows 10 and GUI interface)
[Screenshot attached: IMG_3030]

I would appreciate any advice or solutions you can offer to resolve this issue and streamline the tracking process.

Best regards,
Zhenggang

@neugun neugun added the bug Something isn't working label Jan 10, 2024
Collaborator

roomrys commented Jan 11, 2024

Hi @neugun,

Thanks for creating the issue (from discussion #1650). I did a bit of tracing and have everything documented below.

Workaround

The workaround would be to set up the inference command to use a LabelsReader provider. In the GUI, this can be done by selecting a "Predict On" option that uses the LabelsReader, i.e., any option containing "user" or "suggested".

The suggested SLEAP annotating pipeline involves creating a set of Labeling Suggestions, annotating a few frames from the Labeling Suggestions (leaving a few unannotated), training, and predicting on the unlabeled "suggested" frames. Let me know if you're still having a bit of trouble.

Eventually, when you want to predict on the entire current video (or all videos), what I would expect is new Tracks to be created for each Video in the project (see the last paragraph of "When is a LabelsReader used instead of a VideoReader?"). I would then delete all unused Tracks (Delete Multiple Tracks > Unused) if needed (I don't believe Tracks are cleaned up after inference if no instances are assigned to them anymore).
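If you'd rather script that cleanup than click through the menu, the logic amounts to keeping only Tracks that some instance still references. A minimal sketch with stand-in classes (the `Track`/`Instance`/`Labels` shapes below are simplified stand-ins, not the real SLEAP API):

```python
# Sketch only: stand-in dataclasses mirroring the "delete unused Tracks"
# cleanup, NOT the real sleap.Labels / sleap.Track API.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(frozen=True)
class Track:
    name: str

@dataclass
class Instance:
    track: Optional[Track] = None

@dataclass
class Labels:
    tracks: List[Track] = field(default_factory=list)
    instances: List[Instance] = field(default_factory=list)

    def remove_unused_tracks(self) -> None:
        """Drop Tracks that no instance references (the 'Unused' menu action)."""
        used = {inst.track for inst in self.instances if inst.track is not None}
        self.tracks = [t for t in self.tracks if t in used]

t1, t2, stale = Track("track_1"), Track("track_2"), Track("old_track_0")
labels = Labels(tracks=[t1, t2, stale], instances=[Instance(t1), Instance(t2)])
labels.remove_unused_tracks()
print([t.name for t in labels.tracks])  # → ['track_1', 'track_2']
```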

Diagnosis

If a LabelsReader is not used as the data Provider during inference, then a list of new Tracks will be generated each time inference is run. THIS is the "bug := gaping hole". We should create a PR to remedy this by utilizing the MultiClassPredictor.tracks attribute, assuming it is possible to pass in the Tracks to use when creating the Predictor when we aren't using a LabelsReader data Provider.

Digging

For starters, I am wondering whether any of these Tracks are unused Tracks (Delete Multiple Tracks > Unused) left over from previous predictions, which later get updated to the new Track for that ID but still remain in the project. Not sure why we wouldn't just re-use the Tracks that already exist, though... so, let's do some digging.

How do we get the Tracks when creating LabeledFrames for MultiClassPredictors?

Well, we first try to grab the tracks attribute from the MultiClassPredictor itself, but if it is None, then we check if the Provider has a tracks attribute and use that. If neither has Tracks for us, then we create new Tracks using the names pulled in from the training config (under model/heads.multi_class/class_maps/classes):

sleap/sleap/nn/inference.py

Lines 3708 to 3719 in 14b5b78

```python
tracks = self.tracks
if tracks is None:
    if hasattr(data_provider, "tracks"):
        tracks = data_provider.tracks
    elif (
        self.config.model.heads.multi_class_bottomup.class_maps.classes
        is not None
    ):
        names = self.config.model.heads.multi_class_bottomup.class_maps.classes
        tracks = [sleap.Track(name=n, spawned_on=0) for n in names]
predicted_frames = []
```
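The fallback order can be reproduced with toy stand-ins (none of the classes below are real SLEAP types; they just mimic the branch logic):

```python
# Toy reproduction of the track-resolution fallback, with stand-in objects.
class VideoLikeProvider:
    """Stand-in for VideoReader: exposes no `tracks` attribute."""
    pass

class LabelsLikeProvider:
    """Stand-in for LabelsReader: carries the project's existing tracks."""
    tracks = ["track_1", "track_2"]

def resolve_tracks(predictor_tracks, data_provider, class_names):
    # Mirrors the fallback chain: predictor attr -> provider attr -> config.
    tracks = predictor_tracks
    if tracks is None:
        if hasattr(data_provider, "tracks"):
            tracks = data_provider.tracks
        elif class_names is not None:
            # Brand-new "Track" objects minted on every call -- the bug.
            tracks = [f"new:{n}" for n in class_names]
    return tracks

print(resolve_tracks(None, LabelsLikeProvider(), ["1", "2"]))  # → ['track_1', 'track_2']
print(resolve_tracks(None, VideoLikeProvider(), ["1", "2"]))   # → ['new:1', 'new:2']
```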

This last option seems likely to blame... so we ask

Why is MultiClassPredictor.tracks None?

A little review of both the BottomUpMultiClassPredictor and the TopDownMultiClassPredictor reveals that neither explicitly sets the tracks attribute from within its immediate class, nor from the parent Predictor class, nor in load_model. In fact, an entire repo search for \.tracks = reveals that nowhere is MultiClassPredictor.tracks being set to something (at least in this manner). I also searched for setattr. drumrollllllll. no dice.

Suspicious. Let's not point fingers... yet. Instead, let's checkout our next question.

Why is Provider.tracks None?

Alright, let's peek at all of our Providers, namely LabelsReader and VideoReader. It seems (and makes sense) that only LabelsReader has a tracks property. Feeling a bit like I should have guessed that.

The outcome: if a LabelsReader is not used as the data Provider during inference, then a list of new Tracks will be generated each time inference is run. THIS is the "bug := gaping hole". We should create a PR to remedy this by utilizing the MultiClassPredictor.tracks attribute.

When is a LabelsReader used instead of a VideoReader?

We see that in the Predictor.predict(..., data: Union[Provider, sleap.Labels, sleap.Video], ...) function, if the data is a Labels object (or already a LabelsReader object), then the LabelsReader is used; otherwise, the data is expected to be either a np.ndarray, Video, or VideoReader and the VideoReader is used.

sleap/sleap/nn/inference.py

Lines 512 to 518 in 14b5b78

```python
# Create provider if necessary.
if isinstance(data, np.ndarray):
    data = sleap.Video(backend=sleap.io.video.NumpyVideo(data))
if isinstance(data, sleap.Labels):
    data = LabelsReader(data)
elif isinstance(data, sleap.Video):
    data = VideoReader(data)
```

Cool, though not super useful, because pretty much any time inference is run (even through the GUI), the _make_provider_from_cli function is used to return a data provider (which is then passed into predict(data=provider)).

The type of provider to use is determined by the data_path. The data_path comes either from the legacy --labels optional argument to sleap-track or, much more likely, from the data_path positional argument.

sleap/sleap/nn/inference.py

Lines 5237 to 5247 in 14b5b78

```python
labels_path = getattr(args, "labels", None)
if labels_path is not None:
    data_path = labels_path
else:
    data_path = args.data_path
if data_path is None or data_path == "":
    raise ValueError(
        "You must specify a path to a video or a labels dataset. "
        "Run 'sleap-track -h' to see full command documentation."
    )
```

Basically, if the data_path ends in .slp, then a LabelsReader is used, otherwise a VideoReader is used.
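That dispatch is just an extension check. As a stand-in sketch (`provider_kind` is a hypothetical helper, not SLEAP code):

```python
def provider_kind(data_path: str) -> str:
    """Hypothetical helper mirroring the extension-based dispatch
    in _make_provider_from_cli."""
    return "LabelsReader" if data_path.endswith(".slp") else "VideoReader"

print(provider_kind("project.v001.slp"))  # → LabelsReader
print(provider_kind("session1.mp4"))      # → VideoReader
```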

sleap/sleap/nn/inference.py

Lines 5249 to 5273 in 14b5b78

```python
if data_path.endswith(".slp"):
    labels = sleap.load_file(data_path)

    if args.only_labeled_frames:
        provider = LabelsReader.from_user_labeled_frames(labels)
    elif args.only_suggested_frames:
        provider = LabelsReader.from_unlabeled_suggestions(labels)
    elif getattr(args, "video.index") != "":
        provider = VideoReader(
            video=labels.videos[int(getattr(args, "video.index"))],
            example_indices=frame_list(args.frames),
        )
    else:
        provider = LabelsReader(labels)
else:
    print(f"Video: {data_path}")
    # TODO: Clean this up.
    video_kwargs = dict(
        dataset=vars(args).get("video.dataset"),
        input_format=vars(args).get("video.input_format"),
    )
    provider = VideoReader.from_filepath(
        filename=data_path, example_indices=frame_list(args.frames), **video_kwargs
    )
```

Alright, let's wrap this up. When is the .slp used as the data_path?! Obviously, with direct construction of the sleap-track command, you can specify whatever you want for the data_path, but in the GUI, we use InferenceTask.make_predict_cli_call to construct the CLI command for us. If the item_for_inference is a DatasetItemForInference, then the .slp is used as the data_path.

Ok, so where do we determine what type of ItemForInference to use!? Well, a list of ItemForInference objects is stored in the ItemsForInference.items attribute. The items_for_inference are gathered from LearningDialog.get_items_for_inference. Ah, yes, the answer:

```python
if predict_frames_choice.startswith("user"):
    items_for_inference = runners.ItemsForInference(
        items=[
            runners.DatasetItemForInference(
                labels_path=self.labels_filename, frame_filter="user"
            )
        ],
        total_frame_count=frame_count,
    )
elif predict_frames_choice.startswith("suggested"):
    items_for_inference = runners.ItemsForInference(
        items=[
            runners.DatasetItemForInference(
                labels_path=self.labels_filename, frame_filter="suggested"
            )
        ],
        total_frame_count=frame_count,
    )
else:
    items_for_inference = runners.ItemsForInference.from_video_frames_dict(
        video_frames_dict=frame_selection,
        total_frame_count=frame_count,
        labels_path=self.labels_filename,
        labels=self.labels,
    )
```

So, basically, if you select a "Predict On" option containing either the word "user" or "suggested", then we use a LabelsReader and the Tracks are re-used, but otherwise, we use a VideoReader and new Tracks are created with each inference (note there is one item for inference per video).
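In other words, the GUI choice maps to a provider roughly like this (a hypothetical stand-in summary function, not actual SLEAP code):

```python
def provider_for_choice(predict_frames_choice: str) -> str:
    """Hypothetical summary of the GUI 'Predict On' dispatch traced above."""
    if predict_frames_choice.startswith(("user", "suggested")):
        return "LabelsReader"  # existing Tracks re-used
    return "VideoReader"       # new Tracks minted on each inference run

print(provider_for_choice("user labeled frames"))  # → LabelsReader
print(provider_for_choice("entire video"))         # → VideoReader
```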

@roomrys roomrys self-assigned this Jan 11, 2024
Collaborator

roomrys commented Jan 12, 2024

Hi @neugun,

Just pinging you to say that I've finished the diagnosis.

Thanks,
Liezl

@roomrys roomrys removed their assignment Jan 12, 2024