
Support for 3D Conv-Net #466

Open · kevinkevin556 wants to merge 4 commits into master
Conversation

kevinkevin556

Hi all,

Thank you for developing such a nice repo. I've been using it in many of my projects for network explainability, and it has been incredibly convenient!

Recently, I've been working with medical datasets using 3D-UNet. However, I noticed that 3D convolution is not yet supported in this library, and there are issues like #351 requesting this feature. Therefore, I made several changes to GradCAM and BaseCAM to extend GradCAM to support 3D images.

Please let me know if you have any questions or suggestions regarding the changes I've implemented. I'm excited to contribute to this project and look forward to your feedback!

@jacobgil (Owner) commented Dec 9, 2023

Hey, sorry for the late reply.
Thanks a lot for this functionality, this will be great to merge.

Is there a way to share an example use case for this: maybe some model and an input image example,
or an image example for the readme?

weights = self.get_cam_weights(input_tensor,
                               target_layer,
                               targets,
                               activations,
                               grads)
weighted_activations = weights[:, :, None, None] * activations
w_shape = (slice(None), slice(None)) + (None,) * (len(activations.shape) - 2)
@jacobgil (Owner)
This line is a bit less straightforward to understand.
Can you please explain what's going on here?
Do you think there is a way to rewrite it to be clearer?

@kevinkevin556 (Author)

That line does exactly the same thing as

# 2D conv
if len(activations.shape) == 4:
  weighted_activations = weights[:, :, None, None] * activations

# 3D conv
elif len(activations.shape) == 5:   
  weighted_activations = weights[:, :, None, None, None] * activations

But I think you are right: it does lack some readability.
I will rewrite the code here.
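
For reference, a dimension-agnostic version could look something like this (just a sketch of one possible rewrite, not necessarily what ends up on the branch; it assumes weights has shape (batch, channels) and activations has 2 or 3 trailing spatial dimensions):

# Hypothetical rewrite for readability: reshape the weights to
# (batch, channels, 1, ..., 1) once, then let broadcasting handle
# both the 2D and the 3D case.
spatial_dims = len(activations.shape) - 2   # 2 for 2D conv, 3 for 3D conv
weights = weights.reshape(weights.shape + (1,) * spatial_dims)
weighted_activations = weights * activations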

@kevinkevin556 (Author)

@jacobgil Thanks for your reply!

Is there a way to share an example use case for this: maybe some model and an input image example, or an image example for the readme?

I added an animation of Grad-CAM-visualized CT scans to the readme.
Hope this makes it clearer.
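
For anyone who wants to try it, here is a minimal usage sketch. The model and target layer are placeholders, and it assumes the 3D branch keeps the same call signature as the existing 2D API:

import torch
from pytorch_grad_cam import GradCAM

model = My3DUNet()                                # placeholder: any 3D CNN
target_layers = [model.encoder[-1]]               # hypothetical target layer
input_tensor = torch.randn(1, 1, 24, 224, 224)    # (batch, channel, depth, H, W)

cam = GradCAM(model=model, target_layers=target_layers)
grayscale_cam = cam(input_tensor=input_tensor, targets=None)
print(grayscale_cam.shape)                        # expected: (1, 24, 224, 224)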

@Syax19 commented Jan 17, 2024

@kevinkevin556 Thanks for providing the code for applying Grad-CAM to 3D CNNs!

I used your code to get the Grad-CAM outputs. My input 3D image tensor size is (1, 1, 24, 224, 224), representing (batch, channel, depth, height, width), and the grayscale_cam output size is (1, 24, 224, 224).
I'm curious: if I take one of the outputs, for example depth=11, so the output is outputs[0, :][11, :, :] (depth, height, width), will it correspond to input image[:, 11, :, :] (channel, depth, height, width)?
I ask because every depth slice of the output heatmap looked the same.

Looking forward to your reply, thanks!

@kevinkevin556 (Author)

@Syax19 Sorry for the late reply. I'm glad to hear that someone is using it 😄

Although I followed MONAI's convention and ordered the dimensions as (height, width, depth), the output dimensions still correspond to your input tensor, as there is no dimension swap when calculating Grad-CAM.

Therefore, the grayscale_cam of size (1, 24, 224, 224) represents dimensions (batch, depth, height, width) in your case.
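
So, to compare a single depth slice of the heatmap against the matching input slice, you can do something like this (a short sketch under the assumption above; show_cam_on_image expects an (H, W, 3) float image in [0, 1]):

import numpy as np
from pytorch_grad_cam.utils.image import show_cam_on_image

d = 11
cam_slice = grayscale_cam[0, d, :, :]             # (H, W) heatmap at depth d
img_slice = input_tensor[0, 0, d, :, :].numpy()   # matching (H, W) input slice

# Fake an RGB image from the grayscale slice; assumes intensities in [0, 1].
rgb = np.repeat(img_slice[..., None], 3, axis=-1)
overlay = show_cam_on_image(rgb, cam_slice, use_rgb=True)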

@MoH-assan

@jacobgil
Any update on this feature?
