This repository has been archived by the owner on Dec 29, 2022. It is now read-only.

Added Tensor parameters to prevent redundant nodes. Also enabled support for beam sizes larger than the vocabulary size #323

Open
wants to merge 1 commit into
base: master

Conversation

ralphabb

My changes to the original code are as follows:

  1. This implementation accepts the state parameters (lengths, finished, and log_probs) and time_ as tensors. time_ is a float64 scalar (which can be a placeholder); float64 was used to allow high-precision computation. All other state values (lengths, log_probs, finished) can likewise be replaced by placeholders (i.e. define a state with placeholder parameters and use it with this code), as I have done in my own use of this implementation.
    The reason for this change is that, while using this code, I discovered that repeated calls to the beam_search_step function added redundant nodes to the computational graph: each additional step introduced more nodes, so the total work grew at least quadratically in the number of steps. I therefore "tensorized" the parameters so that the nodes need only be defined once. To use this code, you
    a) Define your state using placeholders of dynamic shape for every state parameter (an example can be provided on request, as I have used this in my own project).
    b) Define time as a float64 scalar tensor.
    c) Compute the next_state values using sess.run() and save them, along with time + 1. Then feed these as the new feed_dict to the same function.
    NOTE: The function beam_search_step and its output are set up outside any loop over time/steps. Within the loop computing the beam programs, only feed_dicts are constructed for the next step, using the outputs of sess.run (an example is available on request).
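The driver-loop pattern in steps a)–c) can be mocked in plain NumPy (the state fields and the name beam_search_step come from this PR; the arithmetic inside the step is a dummy stand-in, not the real beam-search update, and the real code would run the step via sess.run with a feed_dict instead of a direct call):

```python
import numpy as np

def beam_search_step(time_, lengths, finished, log_probs):
    # Dummy stand-in update: unfinished beams grow by one token and
    # accumulate a fixed log-probability; beams finish at length 3.
    next_lengths = lengths + np.where(finished, 0, 1)
    next_log_probs = log_probs + np.where(finished, 0.0, -0.5)
    next_finished = finished | (next_lengths >= 3)
    return next_lengths, next_finished, next_log_probs

beam = 4
lengths = np.zeros(beam, dtype=np.int64)
finished = np.zeros(beam, dtype=bool)
log_probs = np.zeros(beam, dtype=np.float64)
time_ = np.float64(0.0)  # float64 scalar, as in the PR

# The step is defined ONCE, outside the loop; the loop only feeds the
# previous outputs back in (mirroring feed_dict reuse, so no new graph
# nodes are created per step).
for _ in range(5):
    lengths, finished, log_probs = beam_search_step(
        time_, lengths, finished, log_probs)
    time_ += 1.0
```

The key point is that each iteration reuses the same step definition and only swaps in new input values, which is exactly what keeps the TF graph from growing.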

  2. The previous implementation did not support a beam size larger than the vocabulary size and would crash in that case. Using conditionals, namely tf.cond (possible because I "tensorized" the parameters of this code in change 1), I added functionality that detects, using logical operations, when the beam size is larger than the number of candidates, and sets the sizes of the relevant arrays accordingly.
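The guard in change 2 can be sketched in NumPy (the function name safe_top_k and the example scores are hypothetical; the PR's actual code expresses the branch with tf.cond over tensor shapes):

```python
import numpy as np

def safe_top_k(scores, beam_size):
    # When the beam size exceeds the number of candidates, clamp k so
    # top-k selection cannot request more items than exist. This min()
    # plays the role of the tf.cond branch described in the PR.
    num_candidates = scores.shape[-1]
    k = min(beam_size, num_candidates)
    idx = np.argsort(scores)[::-1][:k]  # indices of the k best scores
    return scores[idx], idx

# Vocabulary of 3 candidates, but a requested beam size of 8.
scores = np.log(np.array([0.6, 0.3, 0.1]))
values, indices = safe_top_k(scores, beam_size=8)
```

Without the clamp, asking for the top 8 of 3 candidates is exactly the crash the previous implementation hit.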

Feel free to contact me with any questions or for further explanation.
Best,
Ralph Abboud
