Fix torch.clamp issue #237 (#238)

Open

CryptoSalamander wants to merge 1 commit into main from fix/clamp-issue

Conversation

CryptoSalamander

This PR is related to #237!
There are two options to fix this problem:

  1. Just convert the max tensor to a scalar (this PR); see the sketch after this list.
  2. Load the max tensor onto the same device as self.logit_scale, like below:
device = self.logit_scale.device
max_tensor = torch.log(torch.tensor(1. / 0.01)).to(device)
logit_scale = torch.clamp(self.logit_scale, max=max_tensor).exp()
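
For reference, option 1 is roughly the following one-line change (a minimal sketch; the actual diff in this PR may differ slightly):

# pass the upper bound as a Python scalar, so no device mismatch can occur
logit_scale = torch.clamp(self.logit_scale, max=torch.log(torch.tensor(1. / 0.01)).item()).exp()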

I think the first option is better due to its simplicity.
I tested both options on my datasets; there seems to be no difference in CUDA memory allocation or inference speed.

@CryptoSalamander
Author

@ancientmooner Could you please check issue #237?

@juncgu

juncgu commented Apr 11, 2023

@CryptoSalamander, I would prefer the second option. I faced the same issue when using torch 2.0, and the .item() call in the first option causes torch.dynamo to break WindowAttention into two graphs when tracing the module.
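
For context, a minimal sketch of the difference under tracing (assuming torch >= 2.0; ToyAttention is a made-up stand-in for WindowAttention that reproduces only the logit_scale clamp, and backend="eager" skips inductor so only torch.dynamo tracing is exercised):

import torch
import torch.nn as nn

class ToyAttention(nn.Module):
    def __init__(self, use_item: bool = False):
        super().__init__()
        self.logit_scale = nn.Parameter(torch.log(10 * torch.ones(1)))
        self.use_item = use_item

    def forward(self, x):
        if self.use_item:
            # option 1: .item() turns the bound into a Python scalar; under
            # torch 2.0 this is reported to split the traced module into two graphs
            max_val = torch.log(torch.tensor(1. / 0.01)).item()
        else:
            # option 2: keep the bound as a tensor on the parameter's device,
            # so the forward pass can be traced as a single graph
            max_val = torch.log(torch.tensor(1. / 0.01)).to(self.logit_scale.device)
        return x * torch.clamp(self.logit_scale, max=max_val).exp()

x = torch.randn(2, 3)
for use_item in (True, False):
    compiled = torch.compile(ToyAttention(use_item), backend="eager")
    print(use_item, compiled(x).shape)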

CryptoSalamander force-pushed the fix/clamp-issue branch 3 times, most recently from ad49644 to 38eba56 on April 11, 2023 at 13:57
@CryptoSalamander
Author

CryptoSalamander commented Apr 11, 2023

@juncgu Thanks for your suggestion. I have modified the code to use the second option.
Could you please take a look at this PR? @ancientmooner
