[Feature] Support calculating loss in the validation step #1486

Closed
zhouzaida opened this issue Feb 6, 2024 · 2 comments · Fixed by #1503

Comments

@zhouzaida
Member

What is the feature?

We often receive requests from users who want to print the loss during the validation phase, but MMEngine does not support this at the moment. If you also need this feature, feel free to join the discussion.

There are two possible solutions for MMEngine to support this feature.

One is to add a LossMetric to the downstream library: the model's forward method would still return a list of DataElement, and the LossMetric would compute the loss from the information stored in each DataElement.
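
A rough sketch of what such a LossMetric could look like, assuming mmengine's BaseMetric interface and a classification-style task where each data sample carries pred_score and gt_label (these field names are hypothetical and depend on the downstream library):

import torch
import torch.nn.functional as F
from mmengine.evaluator import BaseMetric


class LossMetric(BaseMetric):
    """Re-computes a loss from the information stored in each DataElement."""

    def process(self, data_batch, data_samples):
        for sample in data_samples:
            # `pred_score` is assumed to hold raw logits here; a library
            # that stores probabilities instead would need to adapt this.
            pred = torch.as_tensor(sample['pred_score'])
            gt = torch.as_tensor(sample['gt_label'])
            loss = F.cross_entropy(pred.unsqueeze(0), gt.view(1))
            self.results.append(loss.item())

    def compute_metrics(self, results):
        # Average the per-sample losses collected during `process`.
        return dict(loss=sum(results) / len(results))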

The other is for the downstream model's forward method to return a dictionary; the downstream library then finishes the loss calculation inside forward and returns it to MMEngine:

https://github.com/open-mmlab/mmpretrain/blob/17a886cb5825cd8c26df4e65f7112d404b99fe12/mmpretrain/models/classifiers/image.py#L87


@guyleaf

guyleaf commented May 2, 2024

Hi, I have checked the current work in #1503.
Apart from the ValLoop changes, I have some suggestions about how the losses should be returned.

MMEngine

To stay general and follow the current implementation style,
I think we should use a loss_and_predict function, analogous to the loss_and_predict method of the head modules, instead of implementing the loss calculation in predict mode directly.

In the val_step and test_step functions, we could add an optional argument, e.g. loss: bool = False, without introducing breaking changes.
(Note: loss calculation in the non-training steps should always be opt-in.)

    def val_step(self, data: Union[tuple, dict, list], loss: bool = False) -> Union[list, Tuple[list, Dict[str, torch.Tensor]]]:
        """Gets the predictions (and optionally the losses) of given data.

        Calls ``self.data_preprocessor(data, False)`` and then
        ``self._run_forward`` with ``mode='loss_and_predict'`` if ``loss``
        is True, otherwise with ``mode='predict'``. The predictions are
        passed to the evaluator.

        Args:
            data (dict or tuple or list): Data sampled from dataset.
            loss (bool): Whether to also calculate and return the losses.
                Defaults to False.

        Returns:
            list: The predictions of given data, or a tuple of the
            predictions and a dict of losses when ``loss`` is True.
        """
        data = self.data_preprocessor(data, False)
        if loss:
            return self._run_forward(data, mode='loss_and_predict')  # type: ignore
        else:
            return self._run_forward(data, mode='predict')  # type: ignore

    def test_step(self, data: Union[dict, tuple, list], loss: bool = False) -> Union[list, Tuple[list, Dict[str, torch.Tensor]]]:
        """``BaseModel`` implements ``test_step`` the same as ``val_step``.

        Args:
            data (dict or tuple or list): Data sampled from dataset.
            loss (bool): Whether to also calculate and return the losses.
                Defaults to False.

        Returns:
            list: The predictions of given data, or a tuple of the
            predictions and a dict of losses when ``loss`` is True.
        """
        data = self.data_preprocessor(data, False)
        if loss:
            return self._run_forward(data, mode='loss_and_predict')  # type: ignore
        else:
            return self._run_forward(data, mode='predict')  # type: ignore
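
For illustration, the validation loop could then opt in as follows. This is only a rough sketch: compute_loss is a hypothetical option on the loop, and logging through the message hub is just one possible way to surface the losses.

import torch

@torch.no_grad()
def run_iter(self, idx, data_batch):
    # Simplified version of a ValLoop iteration (hooks omitted).
    if self.compute_loss:  # hypothetical loop option
        outputs, losses = self.runner.model.val_step(data_batch, loss=True)
        for name, value in losses.items():
            self.runner.message_hub.update_scalar(f'val/{name}', value.item())
    else:
        outputs = self.runner.model.val_step(data_batch)
    self.evaluator.process(data_samples=outputs, data_batch=data_batch)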

Downstream libraries

As for the implementation of the loss_and_predict function: because the current model implementations cannot compute predictions and losses in a single pass, we could implement it by calling loss and predict separately and combining their results.

Although this means running inference twice, I think there is no way around it if we want to preserve backward compatibility.
Each model also has to add the loss_and_predict mode to its forward function itself (an optional feature); a sketch of that dispatch follows the method below.

def loss_and_predict(self,
                     batch_inputs: Tensor,
                     batch_data_samples: SampleList,
                     rescale: bool = True) -> Tuple[SampleList, dict]:
    """Predict results and calculate losses from a batch of inputs and
    data samples, with post-processing.

    Args:
        batch_inputs (Tensor): Inputs, has shape (bs, dim, H, W).
        batch_data_samples (List[:obj:`DetDataSample`]): The batch
            data samples. It usually includes information such
            as `gt_instance` or `gt_panoptic_seg` or `gt_sem_seg`.
        rescale (bool): Whether to rescale the results.
            Defaults to True.

    Returns:
        list[:obj:`DetDataSample`]: Detection results of the input
        images. Each DetDataSample usually contains 'pred_instances',
        which in turn usually contains the following keys:

        - scores (Tensor): Classification scores, has a shape
          (num_instances, ).
        - labels (Tensor): Labels of bboxes, has a shape
          (num_instances, ).
        - bboxes (Tensor): Has a shape (num_instances, 4), the last
          dimension 4 arranged as (x1, y1, x2, y2).

        dict: A dictionary of loss components.
    """
    preds = self.predict(batch_inputs, batch_data_samples, rescale=rescale)
    losses = self.loss(batch_inputs, batch_data_samples)
    return preds, losses
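
The corresponding forward dispatch then only needs one extra branch. This sketch mirrors the usual 'tensor'/'predict'/'loss' mode dispatch of MMEngine-style models; the exact signature varies per downstream library:

def forward(self, inputs, data_samples=None, mode='tensor'):
    # Standard mode dispatch, extended with the optional new mode.
    if mode == 'loss':
        return self.loss(inputs, data_samples)
    elif mode == 'predict':
        return self.predict(inputs, data_samples)
    elif mode == 'loss_and_predict':
        # New optional mode: returns (predictions, losses).
        return self.loss_and_predict(inputs, data_samples)
    elif mode == 'tensor':
        return self._forward(inputs, data_samples)
    raise RuntimeError(f'Invalid mode "{mode}".')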

I think this would be a more elegant solution.

@zhouzaida
Member Author

Hi @guyleaf, we discussed your proposal before. Your solution is elegant, but it requires running the model forward twice.
