PPO implementation: why take the log of the probabilities? #147

Open
chzhan opened this issue Nov 15, 2023 · 2 comments

Comments

chzhan commented Nov 15, 2023

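As the title says: nothing in the formula calls for a logarithm, and the loss does not use logs either (apart from the KL-divergence term), so I don't understand why the code takes the roundabout route of taking the log and then exponentiating to get the probability ratio. Could someone explain?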

yl-jiang commented Dec 6, 2023

My understanding is that it is done to turn the division into a subtraction.

johnjim0816 (Contributor) commented

> My understanding is that it is done to turn the division into a subtraction.

Yes.
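
To make the "division becomes subtraction" point concrete, here is a minimal sketch of the identity ratio = p_new / p_old = exp(log p_new - log p_old). It is not taken from this repository's code. The 200-dimensional action and the per-dimension probabilities are made-up values, chosen so that the joint probability underflows in double precision, and `ratio_direct` and `ratio_via_logs` are hypothetical helper names. A likely additional benefit of working in log space, beyond convenience, is numerical stability: the product of many small probabilities can round to 0.0, while the sum of their logs stays well within range.

```python
import math

def ratio_direct(new_probs, old_probs):
    """Ratio computed by multiplying per-dimension probabilities, then dividing."""
    p_new = math.prod(new_probs)
    p_old = math.prod(old_probs)
    return p_new / p_old  # raises ZeroDivisionError if p_old underflowed to 0.0

def ratio_via_logs(new_probs, old_probs):
    """Same ratio computed as exp(log p_new - log p_old)."""
    log_new = sum(math.log(p) for p in new_probs)
    log_old = sum(math.log(p) for p in old_probs)
    return math.exp(log_new - log_old)

# Assumed toy data: 200 independent action dimensions, each with a small probability.
old_probs = [0.010] * 200
new_probs = [0.0101] * 200

# The joint probability 0.01**200 = 1e-400 is below the smallest positive double.
print("joint old probability:", math.prod(old_probs))

try:
    print("direct ratio:", ratio_direct(new_probs, old_probs))
except ZeroDivisionError:
    print("direct ratio: failed, both joint probabilities underflowed to 0.0")

# In log space the same ratio is an ordinary finite number, about 1.01**200.
print("log-space ratio:", ratio_via_logs(new_probs, old_probs))
```

In PyTorch-based implementations, the distribution classes in `torch.distributions` expose a `log_prob` method, so the log-probability is usually the quantity already at hand, and `exp(new_log_prob - old_log_prob)` is the natural way to form the PPO ratio.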
