[BUGFIX]fix bug of top-p sampling #1503

hutao965 · 2021-01-26T02:32:53Z

Description

fix bug of top-p sampling mentioned in issue

Checklist

Essentials

PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage
Code is well-documented

cc @dmlc/gluon-nlp-team

codecov · 2021-01-26T02:46:32Z

Codecov Report

Merging #1503 (f20c838) into master (16146c2) will decrease coverage by 2.55%.
The diff coverage is 93.36%.

@@            Coverage Diff             @@
##           master    #1503      +/-   ##
==========================================
- Coverage   85.87%   83.31%   -2.56%     
==========================================
  Files          52       55       +3     
  Lines        6909     7517     +608     
==========================================
+ Hits         5933     6263     +330     
- Misses        976     1254     +278

Impacted Files	Coverage Δ
setup.py	`0.00% <ø> (ø)`
src/gluonnlp/op.py	`95.16% <ø> (ø)`
src/gluonnlp/utils/testing.py	`58.33% <ø> (-35.84%)`	⬇️
src/gluonnlp/models/transformer.py	`98.52% <57.14%> (-0.41%)`	⬇️
src/gluonnlp/layers.py	`86.72% <83.87%> (-0.40%)`	⬇️
src/gluonnlp/utils/tvm_utils.py	`92.30% <92.30%> (ø)`
src/gluonnlp/models/t5.py	`93.67% <93.67%> (ø)`
src/gluonnlp/models/mt5.py	`95.49% <95.49%> (ø)`
src/gluonnlp/attention_cell.py	`88.39% <100.00%> (+1.47%)`	⬆️
src/gluonnlp/models/__init__.py	`100.00% <100.00%> (ø)`
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 16146c2...f766c01. Read the comment docs.

sxjscience · 2021-01-26T07:37:27Z

scripts/generation/README.md

@@ -43,7 +44,7 @@ Some metrics for the unconditional generated text
 | topk=40       | 0.4291         | 0.9666         | 0.0            |


I think previously we have the results of t=0.9, we should remove that row.

sxjscience · 2021-01-26T07:38:44Z

src/gluonnlp/sequence_sampler.py

+                 mx.np.zeros_like(probs)
+             )
+             # choose the borderline prob
+             p_prob = mx.np.min(masked_probs, axis=2, keepdims=True)


Is it possible to use exactly the same implementation as https://gist.github.com/thomwolf/1a5a29f6962089e871b94cbd09daf317?

I'm referring to the part in which they choose not to mask the top-1 probability:

sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].clone()

Sorry for the confusion. I see that both sort and argsort are implemented but I don't see a way to get both values and indices in one call. The usage of topk(k=-1) that assumes the return values to be sorted seems to be undocumented, which is a bit of a concern.

sxjscience · 2021-01-26T19:12:59Z

@szha Would you help review?

sxjscience · 2021-01-29T17:20:05Z

@szha Would you take a look?

sxjscience · 2021-02-09T21:07:33Z

src/gluonnlp/sequence_sampler.py

+                 probs >= p_prob,
+                 probs,
+                 mx.np.zeros_like(probs)
+             )


The major difference between the current implementation and the original pytorch-based implementation is that when sampling_topp < max(probs), it is not clear which probability will be picked.

The pytorch-based implementation will always choose the token that is most probable.

hutao added 2 commits January 26, 2021 10:14

update

8364d7e

update

b3b846b

hutao965 requested a review from a team as a code owner January 26, 2021 02:32

hutao965 requested a review from sxjscience January 26, 2021 02:33

sxjscience reviewed Jan 26, 2021

View reviewed changes

update

f766c01

sxjscience reviewed Feb 9, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUGFIX]fix bug of top-p sampling #1503

[BUGFIX]fix bug of top-p sampling #1503

hutao965 commented Jan 26, 2021

codecov bot commented Jan 26, 2021 •

edited

sxjscience Jan 26, 2021

sxjscience Jan 26, 2021

sxjscience Jan 30, 2021

szha Jan 30, 2021

sxjscience commented Jan 26, 2021

sxjscience commented Jan 29, 2021

sxjscience Feb 9, 2021 •

edited

		@@ -43,7 +44,7 @@ Some metrics for the unconditional generated text
		\| topk=40 \| 0.4291 \| 0.9666 \| 0.0 \|

[BUGFIX]fix bug of top-p sampling #1503

Are you sure you want to change the base?

[BUGFIX]fix bug of top-p sampling #1503

Conversation

hutao965 commented Jan 26, 2021

Description

Checklist

Essentials

codecov bot commented Jan 26, 2021 • edited

Codecov Report

sxjscience Jan 26, 2021

Choose a reason for hiding this comment

sxjscience Jan 26, 2021

Choose a reason for hiding this comment

sxjscience Jan 30, 2021

Choose a reason for hiding this comment

szha Jan 30, 2021

Choose a reason for hiding this comment

sxjscience commented Jan 26, 2021

sxjscience commented Jan 29, 2021

sxjscience Feb 9, 2021 • edited

Choose a reason for hiding this comment

codecov bot commented Jan 26, 2021 •

edited

sxjscience Feb 9, 2021 •

edited