About the DDPG algorithm #208

Open
zhenbin-li opened this issue Oct 26, 2022 · 1 comment
Comments

@zhenbin-li

def choose_action(self, s):
    s = s[np.newaxis, :]    # single state
    return self.sess.run(self.a, feed_dict={S: s})[0]    # single action

Hello, I'd like to ask: what does the trailing [0] on this function's return value actually return?

@yin1999
Contributor

yin1999 commented Nov 20, 2022

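A neural network reads one batch of data at a time, but here the input is only a single state sample. Before feeding the state s to the model, the code adds an extra leading dimension (s = s[np.newaxis, :]), which turns the input into a batch of size 1, i.e. an array of shape (1, state_dim). In the same way, the prediction the model returns is also a batch. To get the prediction for the single input state, [0] takes the first element of the returned array, which is the predicted action for that one state.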
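
To make the shapes concrete, here is a minimal NumPy-only sketch. It does not use the repository's TensorFlow graph: `fake_actor`, `W`, `state_dim` and `action_dim` are hypothetical stand-ins chosen for illustration, and the only thing the sketch assumes about the real actor is that it maps a batch of states to a batch of actions.

```python
import numpy as np

# Hypothetical sizes, for illustration only (not taken from the repository).
state_dim, action_dim = 3, 1

rng = np.random.default_rng(0)
W = rng.standard_normal((state_dim, action_dim))  # stand-in for the actor's weights


def fake_actor(batch_of_states):
    """Stand-in for sess.run(self.a, ...): (batch, state_dim) -> (batch, action_dim)."""
    return np.tanh(batch_of_states @ W)


s = np.array([0.1, -0.2, 0.3])   # a single state, shape (3,)
s_batch = s[np.newaxis, :]       # shape (1, 3): a batch holding one state
a_batch = fake_actor(s_batch)    # shape (1, 1): a batch holding one action
a = a_batch[0]                   # shape (1,): the action for that single state

print(s.shape, s_batch.shape, a_batch.shape, a.shape)
# (3,) (1, 3) (1, 1) (1,)
```

So [0] only removes the batch dimension that np.newaxis added. The result is still an array of length action_dim, not a Python scalar.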
