Why do we have self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_), axis=1)? #61

Meur-sault · 2019-06-16T13:06:15Z

Hi Thomas,

(Since this issue was resolved without a proper answer, I'm submitting it again.)
I don't understand why we multiply the network output by the actions and then apply tf.reduce_sum.

self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_), axis=1)

Why aren't we treating self.output itself as the predicted Q-value?
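
To make the question concrete, here is a minimal sketch of what that line appears to compute. It assumes self.actions_ holds one-hot encoded actions, which the issue text does not state but which is the usual setup in DQN code. The values and the helper name selected_q are hypothetical, and plain Python lists stand in for tensors so the snippet runs without TensorFlow.

```python
# Hypothetical batch of two states with three possible actions each.
# output[i][j] plays the role of self.output: the estimated Q-value of action j in state i.
output = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]

# ASSUMPTION: actions_ is one-hot, marking the action actually taken in each state.
actions_ = [[0.0, 1.0, 0.0],   # action 1 was taken in state 0
            [1.0, 0.0, 0.0]]   # action 0 was taken in state 1


def selected_q(output, actions_):
    """Plain-Python stand-in for tf.reduce_sum(tf.multiply(output, actions_), axis=1)."""
    # The element-wise product zeroes every entry except the taken action;
    # summing along each row then leaves one scalar per state.
    return [sum(q * a for q, a in zip(q_row, a_row))
            for q_row, a_row in zip(output, actions_)]


Q = selected_q(output, actions_)
print(Q)  # [2.0, 4.0]
```

Under that assumption, self.output holds one estimate per action, while self.Q keeps only the estimate for the action that was actually taken in each sample, so it has one value per row of the batch.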
