bug in reinforce with baseline #37

hlhang9527 · 2022-03-21T23:02:25Z

The value network update should build its optimizer from `s_value_func.parameters()`:

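    alpha_w = 1e-3  # initialize the value-network learning rate

    # The optimizer must hold the value network's parameters.
    optimizer_w = optim.Adam(s_value_func.parameters(), lr=alpha_w)
    optimizer_w.zero_grad()
    policy_loss_w = -delta
    policy_loss_w.backward(retain_graph=True)
    # Clip the gradients of the parameters, not the loss tensor.
    clip_grad_norm_(s_value_func.parameters(), 0.1)
    optimizer_w.step()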
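
To check that the optimizer is actually wired to the value network, a minimal self-contained sketch may help. The two `nn.Linear` networks, the `state`, the return `G`, and the helper `update_value_network` are all hypothetical stand-ins, not the repository's code. The sketch also swaps in a squared-error loss on `delta` (an assumption; the snippet above uses `-delta`), whose gradient is `-delta * grad v(s)`.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.nn.utils import clip_grad_norm_

torch.manual_seed(0)

# Hypothetical stand-ins for the policy and state-value networks.
policy_net = nn.Linear(4, 2)
s_value_func = nn.Linear(4, 1)

alpha_w = 1e-3
# The value optimizer is built from the value network's parameters.
optimizer_w = optim.Adam(s_value_func.parameters(), lr=alpha_w)


def update_value_network(value_net, optimizer, state, G, max_norm=0.1):
    """One value-network step on a squared-error loss (assumed loss)."""
    delta = G - value_net(state)          # G - v(s)
    loss = 0.5 * delta.pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    # Clip gradients of the parameters that the optimizer will update.
    clip_grad_norm_(value_net.parameters(), max_norm)
    optimizer.step()
    return delta.detach()


state = torch.randn(1, 4)       # hypothetical state features
G = torch.tensor([[1.0]])       # hypothetical return

# Snapshot parameters to see which network the step changes.
value_before = [p.detach().clone() for p in s_value_func.parameters()]
policy_before = [p.detach().clone() for p in policy_net.parameters()]

update_value_network(s_value_func, optimizer_w, state, G)

value_after = [p.detach().clone() for p in s_value_func.parameters()]
policy_after = [p.detach().clone() for p in policy_net.parameters()]

value_changed = any(
    not torch.equal(b, a) for b, a in zip(value_before, value_after)
)
policy_changed = any(
    not torch.equal(b, a) for b, a in zip(policy_before, policy_after)
)
print("value network changed:", value_changed)
print("policy network changed:", policy_changed)
```

If the optimizer were built from the policy network's parameters instead, the value network would stay untouched by `optimizer_w.step()`, which is the bug this issue describes.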

stvsd1314 · 2022-04-12T12:57:00Z

There is an error in this code. When I run it, it reports an error about the computation graph. Did you run into the same problem?

hlhang9527 · 2022-04-12T14:49:13Z

Same problem here. You can debug it step by step to see the errors.
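
The thread does not say which graph error appeared, so the following is only a guess at one common cause: calling `backward()` a second time through a graph whose buffers were already freed. The sketch reproduces that error on a hypothetical value network, then shows one way to avoid it by detaching `delta` before it enters the policy loss. All names except `delta` are assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the networks in the repository.
value_net = nn.Linear(4, 1)
policy_net = nn.Linear(4, 2)
state = torch.randn(1, 4)
G = torch.tensor([[1.0]])  # hypothetical return

# 1) Reproduce the error: two backward passes through one graph.
delta = G - value_net(state)
loss_w = (-delta).sum()
loss_w.backward()  # frees the graph's saved tensors
second_backward_error = None
try:
    loss_w.backward()  # same graph again, without retain_graph=True
except RuntimeError as err:
    second_backward_error = err
print("second backward failed:", second_backward_error is not None)

# 2) Avoid sharing the graph: detach delta in the policy loss.
value_net.zero_grad()
delta = G - value_net(state)
log_prob = torch.log_softmax(policy_net(state), dim=-1)[0, 0]
value_loss = (-delta).sum()
# delta.detach() is treated as a constant, so the policy backward
# pass never walks into the value network's graph.
policy_loss = -(delta.detach().sum() * log_prob)
value_loss.backward()
policy_loss.backward()
print("policy grad computed:", policy_net.weight.grad is not None)
```

With `delta` detached, neither backward call needs `retain_graph=True`, because the two losses no longer share any part of the graph.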
