
Distributed Framework #327

Draft
wants to merge 29 commits into master
Conversation

threewisemonkeys-as (Member)

This is a very rough draft of a trainer for distributed off-policy agents.

Currently working on getting DDPG trained in a distributed manner using this.


codecov bot commented Sep 3, 2020

Codecov Report

Merging #327 into master will decrease coverage by 4.04%.
The diff coverage is 7.07%.

@@            Coverage Diff             @@
##           master     #327      +/-   ##
==========================================
- Coverage   90.87%   86.83%   -4.05%     
==========================================
  Files          88       97       +9     
  Lines        3705     4017     +312     
==========================================
+ Hits         3367     3488     +121     
- Misses        338      529     +191     
Impacted Files Coverage Δ
genrl/distributed/__init__.py 0.00% <0.00%> (ø)
genrl/distributed/actor.py 0.00% <0.00%> (ø)
genrl/distributed/core.py 0.00% <0.00%> (ø)
genrl/distributed/experience_server.py 0.00% <0.00%> (ø)
genrl/distributed/learner.py 0.00% <0.00%> (ø)
genrl/distributed/parameter_server.py 0.00% <0.00%> (ø)
genrl/trainers/distributed.py 29.62% <29.62%> (ø)
genrl/core/buffers.py 93.84% <33.33%> (-1.40%) ⬇️
genrl/agents/deep/base/offpolicy.py 96.20% <66.66%> (-1.21%) ⬇️
genrl/agents/deep/ddpg/ddpg.py 86.20% <100.00%> (-6.11%) ⬇️
... and 33 more


lgtm-com bot commented Sep 3, 2020

This pull request introduces 3 alerts when merging 4cba727 into 0fe4180 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Unused import

@Sharad24 (Member) left a comment

Have you tried it on any agent yet?

@threewisemonkeys-as threewisemonkeys-as moved this from To do to In progress in Distributed RL Sep 6, 2020

lgtm-com bot commented Sep 6, 2020

This pull request introduces 10 alerts when merging 3d233c4 into 0fe4180 - view on LGTM.com

new alerts:

  • 9 for Unused import
  • 1 for Unused local variable


lgtm-com bot commented Sep 20, 2020

This pull request introduces 15 alerts when merging 1c504cc into bb85ea1 - view on LGTM.com

new alerts:

  • 14 for Unused import
  • 1 for Unused local variable


lgtm-com bot commented Oct 1, 2020

This pull request introduces 19 alerts when merging 2c3298a into 608fc03 - view on LGTM.com

new alerts:

  • 14 for Unused import
  • 2 for Unused local variable
  • 2 for Wrong number of arguments in a call
  • 1 for Missing call to __init__ during object initialization


lgtm-com bot commented Oct 7, 2020

This pull request introduces 14 alerts when merging 73586d5 into 52b0b4c - view on LGTM.com

new alerts:

  • 13 for Unused import
  • 1 for Unused local variable


lgtm-com bot commented Oct 7, 2020

This pull request introduces 2 alerts when merging 072d545 into 52b0b4c - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused import

Comment on lines +27 to +29
class WeightHolder:
    def __init__(self, init_weights):
        self._weights = init_weights
Member

nit
You could add the @dataclass decorator to avoid writing __init__ by hand.
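As a sketch of that suggestion (assuming WeightHolder really only stores weights; the get_weights accessor is inferred from the RPC calls elsewhere in this PR, not taken from the diff):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class WeightHolder:
    # @dataclass generates __init__ (plus __repr__ and __eq__) for us.
    weights: Any

    def get_weights(self):
        return self.weights
```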

Member

Same goes for the other classes whose constructors only assign attributes.

learner_rref = get_rref(learner_name)
print(f"{name}: Begining experience collection")
while not learner_rref.rpc_sync().is_done():
    agent.load_weights(parameter_server_rref.rpc_sync().get_weights())
@Sharad24 (Member) commented Oct 8, 2020

Would it be better to assign the agent in the constructor itself?

Member Author

Assign the agent's weights? They will need to be updated in the loop, right?
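A toy illustration of why the weight pull has to live inside the loop (the ParameterStore here is a hypothetical stand-in for the RPC parameter server, not the genrl one): the learner keeps updating the weights, so each rollout iteration must fetch a fresh copy.

```python
class ParameterStore:
    # Stand-in for the RPC parameter server; "weights" are just a version counter.
    def __init__(self):
        self.version = 0

    def update(self):
        self.version += 1

    def get_weights(self):
        return self.version

store = ParameterStore()
seen = []
for step in range(3):
    store.update()                    # learner pushes new weights
    seen.append(store.get_weights())  # actor pulls inside its loop
# A one-time pull in the constructor would have seen only version 0.
```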

Comment on lines 27 to 36
def collect_experience(agent, experience_server_rref):
    obs = agent.env.reset()
    done = False
    for i in range(MAX_ENV_STEPS):
        action = agent.select_action(obs)
        next_obs, reward, done, info = agent.env.step(action)
        experience_server_rref.rpc_sync().push((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            break
Member

Is this experience collection working on only a single agent / single thread?

Member Author

This is run in multiple separate processes. It's passed to the ActorNode, which runs it in its own infinite loop.
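A minimal sketch of that pattern, with a multiprocessing.Queue standing in for the RPC experience server (the names and structure here are placeholders for illustration, not the genrl API):

```python
import multiprocessing as mp

NUM_ACTORS = 3
STEPS_PER_ACTOR = 5

def collect_experience(actor_id, experience_queue):
    # Each actor process runs its own rollout loop and pushes
    # transitions to the shared experience store. A real actor
    # would step an env and push (obs, action, reward, next_obs, done).
    for step in range(STEPS_PER_ACTOR):
        experience_queue.put((actor_id, step))

if __name__ == "__main__":
    queue = mp.Queue()
    actors = [
        mp.Process(target=collect_experience, args=(i, queue))
        for i in range(NUM_ACTORS)
    ]
    for p in actors:
        p.start()
    for p in actors:
        p.join()
    # The queue now holds NUM_ACTORS * STEPS_PER_ACTOR transitions,
    # collected concurrently by the separate actor processes.
    transitions = [queue.get() for _ in range(NUM_ACTORS * STEPS_PER_ACTOR)]
```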

@Sharad24 (Member) left a comment

How is the Actor definition going to work? Can I define any architecture for the actor? (This would be the ideal behavior.)

(I don't see any neural network definitions as of now.)

Another thought I had: do you think we could somehow use decorators here? We could then get rid of a bunch of the core details.


lgtm-com bot commented Oct 23, 2020

This pull request introduces 2 alerts when merging e2eef66 into 25eb018 - view on LGTM.com

new alerts:

  • 2 for Unused import


lgtm-com bot commented Oct 23, 2020

This pull request introduces 2 alerts when merging 837eb18 into 25eb018 - view on LGTM.com

new alerts:

  • 2 for Unused import

@threewisemonkeys-as threewisemonkeys-as changed the title Adding distributed trainer Distributed Framework Oct 25, 2020
@threewisemonkeys-as (Member Author)

> How is the Actor definition going to work? Can I define any architecture for the actor? (This would be the ideal behavior.)

Yeah. The ActorNode class only contains multiprocessing details, so it can accommodate basically any agent. You can see this in the onpolicy example. The user chooses which agent to run through the agent argument to ActorNode, and how to run it through the collect_experience function, which is also passed as an argument.
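To illustrate the separation being described, here is a toy ActorNode-shaped class (a sketch, not the actual genrl signature): the node owns the process plumbing, while both the agent and the rollout logic are injected by the user.

```python
class ActorNode:
    # Sketch: the node itself knows nothing about the agent's
    # architecture; it just runs whatever rollout function it is given.
    def __init__(self, name, agent, collect_experience):
        self.name = name
        self.agent = agent
        self.collect_experience = collect_experience

    def run(self):
        return self.collect_experience(self.agent)

# Any agent exposing the expected interface can be plugged in.
class RandomAgent:
    def act(self):
        return 0

node = ActorNode(
    "actor-0",
    RandomAgent(),
    lambda agent: [agent.act() for _ in range(3)],  # user-supplied rollout
)
result = node.run()
```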

@threewisemonkeys-as (Member Author)

> (I don't see any neural network definitions as of now.)

This is happening internally in the DDPG agent I'm passing to the ActorNode. For on-policy, the current genrl framework was too inflexible, so I created my own agent in the onpolicy example. I think we should move towards this kind of more modular structure in general, instead of the current, more hierarchical one.


lgtm-com bot commented Oct 29, 2020

This pull request introduces 5 alerts when merging 8030b2a into 25eb018 - view on LGTM.com

new alerts:

  • 5 for Unused import

@threewisemonkeys-as (Member Author)

> Another thought that I had was: do you think we could somehow use decorators here? There's a bunch of core details we can get rid of then.

I haven't used decorators too extensively before, but I'll look into it. Did you have any specific ideas in mind?

@threewisemonkeys-as threewisemonkeys-as linked an issue Nov 4, 2020 that may be closed by this pull request
Labels: None yet
Projects: Distributed RL (In progress)

Successfully merging this pull request may close these issues:

RPC Communication in Distributed RL Training

2 participants