Backtracking in beam search #151

Open
shubhamagarwal92 opened this issue Jun 30, 2018 · 6 comments


@shubhamagarwal92

Compared to OpenNMT, why do we need this block, which handles sequences that were dropped because they saw EOS earlier? (There is no such block in their beam search implementation.) They do something similar by not letting EOS have children here; however, they have an end condition for when EOS is at the top of the beam, and they reconstruct the hypothesis using the get_hyp function.
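
For context, that get_hyp-style reconstruction is essentially backpointer following; a rough paraphrase with generic names, not their exact code:

    # Rough paraphrase of get_hyp-style backtracking (generic names, not
    # OpenNMT's exact code): walk the backpointers from the final step back
    # to the start to recover one hypothesis for beam slot k.
    def get_hyp(prev_ks, next_ys, timestep, k):
        hyp = []
        for j in range(timestep - 1, -1, -1):
            hyp.append(next_ys[j + 1][k])  # token emitted at step j + 1
            k = prev_ks[j][k]              # beam slot it was extended from
        return hyp[::-1]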

More specifically, could you explain in detail what we are doing here:

            #       1. If there is survived sequences in the return variables, replace
            #       the one with the lowest survived sequence score with the new ended
            #       sequences
            #       2. Otherwise, replace the ended sequence with the lowest sequence
            #       score with the new ended sequence

I understand why we need to handle EOS sequences, since their information is kept in the backtracking variables. But why do we need to "replace the one with the lowest survived sequence score with the new ended sequences"? AFAIK, res_k_idx tracks which beam slot (counted from the end) we can write that information into (per the two conditions in the comment above). However, we are not replacing the contents of the beam that emitted EOS, i.e.:

t_predecessors[res_idx] = predecessors[t][idx[0]]
t_predecessors[idx] = ??

I understand that after this process all the beams remain static and we use index_select at each step to select the top beams.
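
My rough mental model of that replacement step (made-up names, not the repo's code) is something like the following; please correct me if this is off:

    # Hypothetical sketch of the replacement idea described in the quoted
    # comment (made-up names, not the repo's code): a newly ended sequence
    # overwrites whichever result slot currently holds the lowest score.
    def store_finished(result_hyps, result_scores, finished_hyp, finished_score):
        worst = min(range(len(result_scores)), key=lambda i: result_scores[i])
        result_hyps[worst] = finished_hyp
        result_scores[worst] = finished_score
        return result_hyps, result_scores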

Also, the unit test for top_k_decoder is not deterministic: it fails when batch_size > 2, and sometimes even when batch_size == 2.
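
A minimal sketch of the kind of seeding that would make such a test reproducible (assuming the test only touches Python's, NumPy's, and PyTorch's RNGs):

    import random
    import numpy as np
    import torch

    def seed_everything(seed=0):
        # Seed every RNG the decoder's sampling path might touch so the
        # test produces the same top-k beams on every run.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)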

@pskrunner14

@shubhamagarwal92 thanks for pointing this out. I'll check their implementation and see what's different. Working on the test to make it more deterministic. Will test more beam sizes too.

@Mehrad0711

Hi,
Are there any updates on this issue?
I've implemented a similar beam search strategy that uses the _backtrack(...) function from this repo, but even with a beam_size of 1, I get worse results than greedy decoding.
It would be really helpful if you could double-check the implementation. Thanks.
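
A sanity check that might help localize the problem (function names below are placeholders, not this repo's API): with beam_size=1, beam search should reproduce greedy decoding exactly.

    # Placeholder names, not this repo's API: with a beam of 1, the beam
    # search output should match greedy decoding token for token.
    greedy_tokens = greedy_decode(model, src_batch)
    beam_tokens = beam_search_decode(model, src_batch, beam_size=1)
    assert greedy_tokens == beam_tokens, "beam_size=1 should reduce to greedy"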

@GZJAS

GZJAS commented Jun 11, 2019

I studied the code these days, and I think you can use torch.repeat_interleave, for example:
hidden = tuple([torch.repeat_interleave(h, self.k, dim=1) for h in encoder_hidden])
inflated_encoder_outputs = torch.repeat_interleave(encoder_outputs, self.k, dim=0)
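
A minimal, self-contained illustration of what this inflation does (the shapes and k here are just for demonstration):

    import torch

    k = 3                                   # beam size (illustrative)
    encoder_outputs = torch.randn(2, 5, 8)  # (batch, seq_len, hidden)
    inflated = torch.repeat_interleave(encoder_outputs, k, dim=0)
    # Each batch element is repeated k times along dim 0: (2, 5, 8) -> (6, 5, 8)
    assert inflated.shape == (2 * k, 5, 8)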

@shubhamagarwal92
Author

@Mehrad0711 maybe you can try integrating beam search from allennlp; their implementation is here.
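
A rough sketch of what wiring in allennlp's BeamSearch tends to look like (the decoder call, ids, and state dict below are placeholders, and exact signatures may differ across allennlp versions):

    import torch
    from allennlp.nn.beam_search import BeamSearch

    beam_search = BeamSearch(end_index=eos_id, max_steps=50, beam_size=5)

    def step(last_predictions, state):
        # last_predictions: (group_size,) token ids from the previous step;
        # state: dict of tensors with a leading group_size dimension.
        logits, state = decoder_step(last_predictions, state)  # placeholder decoder call
        return torch.log_softmax(logits, dim=-1), state

    # start_predictions: (batch_size,) filled with the start-of-sequence id;
    # start_state: dict of per-example tensors (encoder outputs, hidden state, ...).
    top_k_predictions, log_probabilities = beam_search.search(
        start_predictions, start_state, step
    )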

@pskrunner14

Hey @Mehrad0711 @shubhamagarwal92, sorry, I haven't gotten the time to work on this yet. You're welcome to submit a PR :)

@ghost

ghost commented May 7, 2021

Any update on this issue?
