
[WIP] ONNX conversion #6

Open
wants to merge 14 commits into master

Conversation

@ganik ganik commented Jul 20, 2020

Changes needed to convert DeBERTa to ONNX.

```python
with torch.no_grad():
    trainer.train_step(batch['input_ids'], batch['type_ids'], batch['position_ids'], batch['input_mask'], batch['labels'])
# conversion fails now with:
# site-packages/torch/onnx/utils.py:617: UserWarning: ONNX export failed on ATen operator broadcast_tensors
```
@ganik ganik commented Aug 2, 2020

broadcast_tensors and mse_loss are ops that are not currently implemented in the ONNX exporter. To get unblocked, we need to modify functional.py as described in the comment below.

DeBERTa/deberta/ops.py (outdated)
```python
with torch.no_grad():
    trainer.train_step(batch['input_ids'], batch['type_ids'], batch['position_ids'], batch['input_mask'], batch['labels'])
# conversion fails now with:
# site-packages/torch/onnx/utils.py:617: UserWarning: ONNX export failed on ATen operator broadcast_tensors
```
@ganik ganik commented Aug 3, 2020

The mse_loss implementation in https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py#L2682 calls two ops the ONNX exporter does not support: torch.broadcast_tensors() and torch._C._nn.mse_loss(). To get unblocked I worked around both with this patch (note it assumes the default 'mean' reduction):

```python
# expanded_input, expanded_target = torch.broadcast_tensors(input, target)
expanded_input = input + torch.zeros(target.size())
expanded_target = target + torch.zeros(input.size())
# ret = torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
t = expanded_input - expanded_target
t = t * t
ret = torch.mean(t)
```
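The arithmetic behind this workaround can be sanity-checked outside PyTorch. A minimal NumPy sketch (NumPy follows the same broadcasting rules, so adding a zero array of the other operand's shape broadcasts both to a common shape, just like the patch does with torch.zeros):

```python
import numpy as np

def mse_loss_patched(inp, target):
    """ONNX-export-friendly MSE: broadcast by adding zeros, then mean of squared diff."""
    # Adding an all-zero array of the other operand's shape forces both
    # arguments to broadcast to a common shape (mimics torch.broadcast_tensors).
    expanded_inp = inp + np.zeros(target.shape)
    expanded_target = target + np.zeros(inp.shape)
    t = expanded_inp - expanded_target
    return np.mean(t * t)

# Broadcasting a (2, 3) prediction against a (3,) target:
pred = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
tgt = np.array([1.0, 1.0, 1.0])
print(mse_loss_patched(pred, tgt))  # mean of [0,1,4,9,16,25] = 55/6
```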

@ganik ganik changed the title from "[WIP] Changes comparison for ONNX conversion" to "[WIP] ONNX conversion" Aug 4, 2020
```python
self.q_bias = torch.nn.Parameter(torch.zeros((self.all_head_size), dtype=torch.float))
self.v_bias = torch.nn.Parameter(torch.zeros((self.all_head_size), dtype=torch.float))
# Looks like params below are never updated and const, so removing them
#self.q_bias = torch.nn.Parameter(torch.zeros((self.all_head_size), dtype=torch.float))
```
@ganik ganik commented:

q_bias and v_bias are always constant (initialized to zeros and never updated), so commenting them out.
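Why dropping them is safe: if q_bias and v_bias stay all-zero, adding them to the query/value projections is a numerical no-op. A small NumPy sketch of that reasoning (shapes and names here are illustrative, not the actual DeBERTa code):

```python
import numpy as np

rng = np.random.default_rng(0)
all_head_size = 8

hidden = rng.normal(size=(4, all_head_size))        # (seq_len, hidden)
w_q = rng.normal(size=(all_head_size, all_head_size))

q_bias = np.zeros(all_head_size)  # the constant, never-updated parameter

with_bias = hidden @ w_q + q_bias   # projection with the zero bias added
without_bias = hidden @ w_q         # projection with the bias removed

# An all-zero bias changes nothing, so the Parameter can be dropped
# without affecting the exported model's outputs.
assert np.allclose(with_bias, without_bias)
```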

@ganik ganik commented Aug 14, 2020

Instead of changing the code everywhere, why not just change StableDropout?

In previous iterations I tried redefining StableDropout to inherit from nn.Dropout, but that led to a regression in model stats, and I could not figure out why. With the current approach there is no regression; something was evidently missing when just redefining StableDropout.

@BigBird01 BigBird01 force-pushed the master branch 5 times, most recently from 5315a01 to c81eb40 Compare February 7, 2021 01:44