Issues using the custom neural network models in AI-Economist #79

aslansd opened this issue Oct 4, 2022


aslansd commented Oct 4, 2022


I am working on a modified version of AI-Economist and am trying to run it with the custom neural network models you provide: KerasConvLSTM and RandomAction.

Here are my system properties:

Model Name: MacBook Pro
Model Identifier: MacBookPro17,1
Chip: Apple M1
Total Number of Cores: 8 (4 performance and 4 efficiency)
Memory: 16 GB

Here is the list of modules in my modified ai-economist environment:

As the list above shows, the environment contains some modules beyond those required by the original AI-Economist, and some modules have different versions from the ones it specifies. For example, the ray version here is 2.0.0. I think that with this version of ray, the parent class of any custom TensorFlow recurrent model should be RecurrentNetwork instead of RecurrentTFModelV2, so I modified tf_models accordingly.

Then I ran the following commands, similar to those in your notebook multi_agent_training_with_rllib:
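
One way to keep tf_models importable across Ray versions is to resolve the recurrent base class at import time instead of hard-coding it. The sketch below is only an illustration and not part of AI-Economist: `resolve_first` is a hypothetical helper, and the two module paths are assumptions about where each Ray release keeps the class, so they should be checked against the installed version. It returns None when neither class can be imported.

```python
import importlib

# (module path, class name) pairs, newest first. Both paths are assumptions
# about Ray's package layout and should be verified for the installed version.
CANDIDATES = [
    ("ray.rllib.models.tf.recurrent_net", "RecurrentNetwork"),
    ("ray.rllib.models.tf.recurrent_tf_modelv2", "RecurrentTFModelV2"),
]


def resolve_first(candidates):
    """Return the first attribute that can be imported, or None."""
    for module_path, attr_name in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # module missing in this environment, try the next one
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return None


RecurrentBase = resolve_first(CANDIDATES)
print("recurrent base class:", RecurrentBase)
```

A custom model could then subclass `RecurrentBase`, provided it is not None and the method signatures of the two classes match for the methods the model overrides.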

from rllib.env_wrapper import RLlibEnvWrapper
env_obj = RLlibEnvWrapper({"env_config_dict": env_config_dict}, verbose=True)

import ray
from ray.rllib.agents.ppo import PPOTrainer

from ray.rllib.models.catalog import ModelCatalog
from rllib.tf_models import KerasConvLSTM, RandomAction

ModelCatalog.register_custom_model(KerasConvLSTM.custom_name, KerasConvLSTM)
ModelCatalog.register_custom_model(RandomAction.custom_name, RandomAction)

policies = {
    "a": (
        None,  # uses default policy
        {
            'clip_param': 0.3,
            'entropy_coeff': 0.025,
            'entropy_coeff_schedule': None,
            'gamma': 0.998,
            'grad_clip': 10.0,
            'kl_coeff': 0.0,
            'kl_target': 0.01,
            'lambda': 0.98,
            'lr': 0.0003,
            'lr_schedule': None,
            'model': {
                'custom_model': 'keras_conv_lstm',
                'custom_model_config': {
                    'fc_dim': 128,
                    'idx_emb_dim': 4,
                    'input_emb_vocab': 100,
                    'lstm_cell_size': 128,
                    'num_conv': 2,
                    'num_fc': 2,
                },
                'max_seq_len': 25,
            },
            'use_gae': True,
            'vf_clip_param': 50.0,
            'vf_loss_coeff': 0.05,
            'vf_share_layers': False,
        },  # define a custom agent policy configuration.
    ),
    "p": (
        None,  # uses default policy
        {
            'clip_param': 0.3,
            'entropy_coeff': 0.125,
            'entropy_coeff_schedule': [[0, 2.0], [50000000, 0.125]],
            'gamma': 0.998,
            'grad_clip': 10.0,
            'kl_coeff': 0.0,
            'kl_target': 0.01,
            'lambda': 0.98,
            'lr': 0.0001,
            'lr_schedule': None,
            'model': {
                'custom_model': 'keras_conv_lstm',
                'custom_model_config': {
                    'fc_dim': 256,
                    'idx_emb_dim': 4,
                    'input_emb_vocab': 100,
                    'lstm_cell_size': 256,
                    'num_conv': 2,
                    'num_fc': 2,
                },
                'max_seq_len': 25,
            },
            'use_gae': True,
            'vf_clip_param': 50.0,
            'vf_loss_coeff': 0.05,
            'vf_share_layers': False,
        },  # define a custom planner policy configuration.
    ),
}

policy_mapping_fun = lambda i: "a" if str(i).isdigit() else "p"

policies_to_train = ["a", "p"]

trainer_config = {
    "multiagent": {
        "policies": policies,
        "policies_to_train": policies_to_train,
        "policy_mapping_fn": policy_mapping_fun,
    },
    "num_workers": 5,
    "num_envs_per_worker": 1,
    # Other training parameters
    "train_batch_size": 4000,
    "sgd_minibatch_size": 4000,
    "num_sgd_iter": 1,
}

env_config = {
    "env_config_dict": env_config_dict,
    "num_envs_per_worker": trainer_config.get('num_envs_per_worker'),
}

trainer_config.update({"env_config": env_config})


trainer = PPOTrainer(env=RLlibEnvWrapper, config=trainer_config)
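
Two pieces of the configuration above can be checked without starting Ray. The sketch below shows how the mapping function sorts agent ids into the two policies, and evaluates the planner's entropy_coeff_schedule under the assumption that the schedule is interpolated linearly between the listed [timestep, value] points and holds the last value afterwards. The agent ids and the `entropy_coeff_at` helper are hypothetical and only serve as an illustration.

```python
policy_mapping_fun = lambda i: "a" if str(i).isdigit() else "p"

# Hypothetical ids: numeric ids stand for mobile agents, "p" for the planner.
assignments = {
    agent_id: policy_mapping_fun(agent_id) for agent_id in ["0", "1", 2, "p"]
}
print(assignments)


def entropy_coeff_at(timestep, schedule):
    """Piecewise-linear interpolation over [timestep, value] pairs.

    Assumption: values are interpolated linearly between points and the
    final value is held once the last timestep has passed.
    """
    if timestep <= schedule[0][0]:
        return schedule[0][1]
    for (t0, v0), (t1, v1) in zip(schedule, schedule[1:]):
        if t0 <= timestep <= t1:
            fraction = (timestep - t0) / (t1 - t0)
            return v0 + fraction * (v1 - v0)
    return schedule[-1][1]  # past the last point: hold the final value


planner_schedule = [[0, 2.0], [50000000, 0.125]]
for step in (0, 25000000, 50000000, 60000000):
    print(step, entropy_coeff_at(step, planner_schedule))
```

Under that assumption the planner's entropy coefficient starts at 2.0 and reaches 0.125 after 50 million timesteps, which matches the fixed 'entropy_coeff' value in the same configuration.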

However, the following error is generated:

RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=1353, ip=, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x16a373790>)
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/evaluation/", line 613, in __init__
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/evaluation/", line 1784, in _build_policy_map
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/policy/", line 123, in create_policy
self[policy_id] = create_policy_for_framework(
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/utils/", line 71, in create_policy_for_framework
return policy_class(
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/", line 83, in __init__
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/policy/", line 81, in __init__
self.model = self.make_model()
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/policy/", line 221, in make_model
return ModelCatalog.get_model_v2(
File "/Users/asataryd/miniforge3/envs/modified-ai-economist/lib/python3.10/site-packages/ray/rllib/models/", line 587, in get_model_v2
raise ValueError(
ValueError: It looks like you are still using <rllib.tf_models.KerasConvLSTM object at 0x16a6f2ec0>.register_variables() to register your model's weights. This is no longer required, but if you are still calling this method at least once, you must make sure to register all created variables properly. The missing variables are {<Reference wrapping <tf.Variable 'a_wk4/conv2D_2_pol/kernel:0' shape=(3, 3, 16, 32) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/value/bias:0' shape=(1,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/conv2D_1_pol/bias:0' shape=(16,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/layer_norm_pol/beta:0' shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/conv2D_1_pol/kernel:0' shape=(3, 3, 26, 16) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/lstm_pol/lstm_cell/bias:0' shape=(512,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/lstm_val/lstm_cell_1/bias:0' shape=(512,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/value/kernel:0' shape=(128, 1) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense2_pol/bias:0' shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/logits/bias:0' shape=(80,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/conv2D_2_val/kernel:0' shape=(3, 3, 16, 32) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/layer_norm_val/beta:0' shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/lstm_pol/lstm_cell/recurrent_kernel:0' shape=(128, 512) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/logits/kernel:0' shape=(128, 80) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense1_val/kernel:0' shape=(234, 128) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense1_val/bias:0' shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/embedding_pol/embeddings:0' shape=(100, 4) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/layer_norm_pol/gamma:0' 
shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/conv2D_2_val/bias:0' shape=(32,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/conv2D_2_pol/bias:0' shape=(32,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/conv2D_1_val/bias:0' shape=(16,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/lstm_val/lstm_cell_1/recurrent_kernel:0' shape=(128, 512) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense2_pol/kernel:0' shape=(128, 128) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/lstm_pol/lstm_cell/kernel:0' shape=(128, 512) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/conv2D_1_val/kernel:0' shape=(3, 3, 26, 16) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense2_val/kernel:0' shape=(128, 128) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense2_val/bias:0' shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense1_pol/kernel:0' shape=(234, 128) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/dense1_pol/bias:0' shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/embedding_val/embeddings:0' shape=(100, 4) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/layer_norm_val/gamma:0' shape=(128,) dtype=float32>>, <Reference wrapping <tf.Variable 'a_wk4/lstm_val/lstm_cell_1/kernel:0' shape=(128, 512) dtype=float32>>}, and you only registered {<tf.Variable 'a_wk4/conv2D_2_pol/kernel:0' shape=(3, 3, 16, 32) dtype=float32>, <tf.Variable 'a_wk4/conv2D_1_pol/bias:0' shape=(16,) dtype=float32>, <tf.Variable 'a_wk4/value/bias:0' shape=(1,) dtype=float32>, <tf.Variable 'a_wk4/conv2D_1_pol/kernel:0' shape=(3, 3, 26, 16) dtype=float32>, <tf.Variable 'a_wk4/layer_norm_pol/beta:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/lstm_pol/lstm_cell/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'a_wk4/lstm_val/lstm_cell_1/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'a_wk4/value/kernel:0' shape=(128, 1) dtype=float32>, 
<tf.Variable 'a_wk4/dense2_pol/bias:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/conv2D_2_val/kernel:0' shape=(3, 3, 16, 32) dtype=float32>, <tf.Variable 'a_wk4/logits/bias:0' shape=(80,) dtype=float32>, <tf.Variable 'a_wk4/layer_norm_val/beta:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/lstm_pol/lstm_cell/recurrent_kernel:0' shape=(128, 512) dtype=float32>, <tf.Variable 'a_wk4/logits/kernel:0' shape=(128, 80) dtype=float32>, <tf.Variable 'a_wk4/dense1_val/kernel:0' shape=(234, 128) dtype=float32>, <tf.Variable 'a_wk4/dense1_val/bias:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/embedding_pol/embeddings:0' shape=(100, 4) dtype=float32>, <tf.Variable 'a_wk4/layer_norm_pol/gamma:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/conv2D_2_val/bias:0' shape=(32,) dtype=float32>, <tf.Variable 'a_wk4/conv2D_2_pol/bias:0' shape=(32,) dtype=float32>, <tf.Variable 'a_wk4/conv2D_1_val/bias:0' shape=(16,) dtype=float32>, <tf.Variable 'a_wk4/lstm_val/lstm_cell_1/recurrent_kernel:0' shape=(128, 512) dtype=float32>, <tf.Variable 'a_wk4/dense2_pol/kernel:0' shape=(128, 128) dtype=float32>, <tf.Variable 'a_wk4/lstm_pol/lstm_cell/kernel:0' shape=(128, 512) dtype=float32>, <tf.Variable 'a_wk4/conv2D_1_val/kernel:0' shape=(3, 3, 26, 16) dtype=float32>, <tf.Variable 'a_wk4/dense2_val/kernel:0' shape=(128, 128) dtype=float32>, <tf.Variable 'a_wk4/dense2_val/bias:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/dense1_pol/kernel:0' shape=(234, 128) dtype=float32>, <tf.Variable 'a_wk4/dense1_pol/bias:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/embedding_val/embeddings:0' shape=(100, 4) dtype=float32>, <tf.Variable 'a_wk4/layer_norm_val/gamma:0' shape=(128,) dtype=float32>, <tf.Variable 'a_wk4/lstm_val/lstm_cell_1/kernel:0' shape=(128, 512) dtype=float32>}. Did you forget to call register_variables() on some of the variables in question?

This happens even if I use the RandomAction network. Also, your notebook does not use these custom models. Moreover, I think this is most likely not related to the changes I made to AI-Economist, but rather to compatibility issues between the versions of the modules in my environment. I would greatly appreciate any comments.

By the way, please let me know if you need further information. If required, I would be happy to share the GitHub repository of my own version of AI-Economist with you. Many thanks in advance!
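
A quick way to read the long error message above is to compare its two variable sets by name. The sketch below is a minimal illustration that uses a shortened sample of that message (two variables instead of the full list); `variable_names` and `unregistered_names` are hypothetical helpers. If the difference is empty, every variable reported as missing also appears in the registered set, and the two sets differ only in how the variables are wrapped, which would point to a version mismatch rather than a forgotten register_variables() call.

```python
import re


def variable_names(text):
    """Collect the names of all tf.Variable entries mentioned in the text."""
    return set(re.findall(r"<tf\.Variable '([^']+)'", text))


def unregistered_names(message):
    """Names listed as missing that do not appear in the registered set."""
    missing_part, _, registered_part = message.partition("and you only registered")
    return variable_names(missing_part) - variable_names(registered_part)


# Shortened sample of the error message (two variables only).
sample = (
    "The missing variables are {<Reference wrapping <tf.Variable "
    "'a_wk4/value/bias:0' shape=(1,) dtype=float32>>, <Reference wrapping "
    "<tf.Variable 'a_wk4/logits/bias:0' shape=(80,) dtype=float32>>}, and you "
    "only registered {<tf.Variable 'a_wk4/value/bias:0' shape=(1,) "
    "dtype=float32>, <tf.Variable 'a_wk4/logits/bias:0' shape=(80,) "
    "dtype=float32>}."
)

print(unregistered_names(sample))  # set()
```

The same function can be given the full message text to check all variables at once.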

sa1g commented Nov 4, 2022

Hi there, as you noticed, there are many compatibility issues in AI-Economist...
To fix your problem, you just have to comment out the register_variables() calls. Note that many other issues lie ahead of this one. I think there are compatibility issues between rllib and tensorflow, plus incomplete backward compatibility once you start upgrading things.
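
The change can be made by hand, or sketched as a small script. The example below is only an illustration: `comment_out_register_variables` is a hypothetical helper, the sample source is made up rather than copied from tf_models, and it assumes each register_variables() call fits on a single line.

```python
import re

# Matches a whole line whose only statement is a register_variables(...) call.
CALL_PATTERN = re.compile(r"^(\s*)(self\.register_variables\(.*\))\s*$")


def comment_out_register_variables(source):
    """Prefix single-line self.register_variables(...) calls with '# '."""
    fixed_lines = []
    for line in source.splitlines():
        match = CALL_PATTERN.match(line)
        if match:
            indent, call = match.groups()
            line = f"{indent}# {call}"
        fixed_lines.append(line)
    return "\n".join(fixed_lines)


# Made-up sample; the real model classes contain more code.
sample_source = (
    "class ExampleModel:\n"
    "    def __init__(self):\n"
    "        self.rnn_model = build_model()\n"
    "        self.register_variables(self.rnn_model.variables)\n"
)

fixed_source = comment_out_register_variables(sample_source)
print(fixed_source)
```

Calls that span several lines, or that are the only statement in a block, would need to be edited manually.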

aslansd commented Nov 4, 2022

Hi @sa1g, thanks, I have already solved this problem. However, I now have another issue, which I posted here:

I was wondering if you could help with that one too! Many thanks in advance!

