What is the role of the actor network in the training of a PPO agent? #918

Closed Answered by SMH17
101AlexMartin asked this question in Q&A

Basically, the actor network determines which actions the agent takes within its environment, and training pushes it toward actions with a higher advantage.

The actor network represents the policy, which is updated iteratively during training. Once the actor network selects actions based on the current policy, the value network evaluates them by estimating the expected cumulative reward (the value prediction; see the comment at the top of ppo_agent.py). Those estimates are then used to adjust the network parameters so that actions likely to yield a higher cumulative reward become more probable.
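
To make the interplay concrete, here is a minimal NumPy sketch of how value predictions can turn into advantages and feed a PPO-style clipped actor loss. It is not the code from ppo_agent.py: the helper names (`discounted_returns`, `ppo_actor_loss`), the toy numbers, and the simple "return minus value" advantage are all assumptions made for illustration.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Cumulative discounted reward from each step to the end of the episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def ppo_actor_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate loss (negated objective, so lower is better)."""
    # Probability ratio between the updated policy and the one that collected the data.
    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))

# Toy episode (all values are made up for illustration).
rewards = np.array([1.0, 0.0, 2.0])
value_preds = np.array([2.5, 1.5, 1.0])            # stand-in for value network outputs
returns = discounted_returns(rewards, gamma=1.0)   # observed cumulative reward
advantages = returns - value_preds                 # how much better than predicted

old_log_probs = np.log(np.array([0.5, 0.5, 0.5]))  # policy that collected the data
new_log_probs = np.log(np.array([0.6, 0.5, 0.9]))  # policy after a candidate update
loss = ppo_actor_loss(new_log_probs, old_log_probs, advantages)
```

A positive advantage means the action did better than the value network predicted, so minimizing this loss raises that action's probability. The clipping keeps any single update from moving the policy too far.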

In summary

  • Actor Network (actor_net) selects actions within the environment taking the current state as input.

  • Value network estimates the expected future reward f…
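
The two roles in this summary can be sketched with toy stand-ins. This assumes a discrete action space and single-layer linear "networks"; `toy_actor_net`, `toy_value_net`, and the sizes are hypothetical and are not the actor_net or value network objects used by the library.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, NUM_ACTIONS = 4, 2  # hypothetical sizes

# Hypothetical single-layer "networks": plain weight arrays for illustration.
actor_weights = rng.normal(size=(STATE_DIM, NUM_ACTIONS))
value_weights = rng.normal(size=(STATE_DIM,))

def toy_actor_net(state):
    """Map a state to a probability distribution over actions (softmax)."""
    logits = state @ actor_weights
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

def toy_value_net(state):
    """Map a state to a single scalar: the estimated future reward."""
    return float(state @ value_weights)

state = rng.normal(size=STATE_DIM)
action_probs = toy_actor_net(state)               # actor: state -> action distribution
action = rng.choice(NUM_ACTIONS, p=action_probs)  # sample the action to take
state_value = toy_value_net(state)                # critic: state -> value estimate
```

Only the actor's output decides what the agent does. The value estimate is used during training to judge how good the outcome was.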

Answer selected by 101AlexMartin