Q-Network wrong output spec #896

Open
rissois opened this issue Nov 13, 2023 · 1 comment

Comments

@rissois

rissois commented Nov 13, 2023

I am receiving the following error:

Expected q_network to emit a floating point tensor with inner dims (464,); but saw network output spec: TensorSpec(shape=(6, 4, 464), dtype=tf.float32, name=None)

I am building a custom environment for DqnAgent with an observation shape of (6, 4, 4). The action is a scalar (I would have liked a shape of (2,), but apparently that is not possible at the moment). I am following this tutorial as closely as I can for my use case.
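One common workaround for the scalar-action restriction is to pack a two-component action into a single index and unpack it again inside the environment. The following is a minimal sketch, not code from this issue. The component sizes 29 and 16 are hypothetical, picked only because their product matches the 464 in the error message. The actual decomposition of the action is not stated here, and `encode_action` / `decode_action` are illustrative names.

```python
from typing import Tuple

# Hypothetical component sizes, chosen only because 29 * 16 == 464.
NUM_FIRST = 29
NUM_SECOND = 16


def encode_action(first: int, second: int) -> int:
    """Pack a (first, second) pair into one scalar index in [0, 463]."""
    if not (0 <= first < NUM_FIRST and 0 <= second < NUM_SECOND):
        raise ValueError("action component out of range")
    return first * NUM_SECOND + second


def decode_action(index: int) -> Tuple[int, int]:
    """Unpack a scalar index back into its (first, second) pair."""
    if not 0 <= index < NUM_FIRST * NUM_SECOND:
        raise ValueError("action index out of range")
    return divmod(index, NUM_SECOND)


# The largest pair maps to the largest scalar action and back again.
largest = encode_action(NUM_FIRST - 1, NUM_SECOND - 1)
print(largest, decode_action(largest))
```

With this kind of packing the agent only ever sees a scalar action spec, while the environment can still reason about two components internally.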

The environment class is initialized with:

self._action_spec = array_spec.BoundedArraySpec(
    shape=(), dtype=np.int32, minimum=0, maximum=463, name='action'
)

# Six 4x4 boards
self._observation_spec = array_spec.BoundedArraySpec(
    (6, 4, 4), np.int32,
    minimum=self.createMinMaxBoards([0, 0, 0, 0, 0, -1]),
    maximum=self.createMinMaxBoards([1, 1, 1, 1, 3, 2]),
)

I was able to validate the environment and run it with a fixed policy, as per the tutorial, so the environment itself appears to be in good shape. I then moved on to this tutorial to add the agent, and copied and pasted these two blocks of code directly:

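# Imports as used in the TF-Agents DQN tutorial.
import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.networks import sequential
from tf_agents.specs import tensor_spec
from tf_agents.utils import common

fc_layer_params = (100, 50)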
action_tensor_spec = tensor_spec.from_spec(env.action_spec())
num_actions = action_tensor_spec.maximum - action_tensor_spec.minimum + 1

# Define a helper function to create Dense layers configured with the right
# activation and kernel initializer.
def dense_layer(num_units):
  return tf.keras.layers.Dense(
      num_units,
      activation=tf.keras.activations.relu,
      kernel_initializer=tf.keras.initializers.VarianceScaling(
          scale=2.0, mode='fan_in', distribution='truncated_normal'))

# QNetwork consists of a sequence of Dense layers followed by a dense layer
# with `num_actions` units to generate one q_value per available action as
# its output.
dense_layers = [dense_layer(num_units) for num_units in fc_layer_params]
q_values_layer = tf.keras.layers.Dense(
    num_actions,
    activation=None,
    kernel_initializer=tf.keras.initializers.RandomUniform(
        minval=-0.03, maxval=0.03),
    bias_initializer=tf.keras.initializers.Constant(-0.2))
q_net = sequential.Sequential(dense_layers + [q_values_layer])
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

train_step_counter = tf.Variable(0)

agent = dqn_agent.DqnAgent(
    train_env.time_step_spec(),
    train_env.action_spec(),
    q_network=q_net,
    optimizer=optimizer,
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter)

agent.initialize()

The error is thrown at agent = dqn_agent.DqnAgent(...). There is a line in dqn_agent.py, q_network.create_variables(net_observation_spec), which seems to produce the (6, 4, 464) shape. I had expected the network's output shape to be taken from the num_actions units of q_values_layer. More than likely this is a mistake on my end, but I have seen unresolved posts about it on StackOverflow. Can anyone please help correct my understanding or my code here?
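A likely explanation for the (6, 4, 464) shape is that a Keras Dense layer transforms only the last axis of its input and leaves all leading axes untouched. The sketch below traces shapes in plain Python (no TensorFlow needed), assuming the layer sizes from the code above and ignoring the batch dimension, as specs do. The helper names `dense_shape`, `flatten_shape`, and `trace` are illustrative and not part of TF-Agents.

```python
def dense_shape(input_shape, units):
    """A Dense layer maps only the last axis: (..., d) -> (..., units)."""
    return tuple(input_shape[:-1]) + (units,)


def flatten_shape(input_shape):
    """Flatten collapses all non-batch axes into a single axis."""
    size = 1
    for dim in input_shape:
        size *= dim
    return (size,)


def trace(input_shape, layer_units, flatten_first=False):
    """Propagate a (batch-free) shape through a stack of Dense layers."""
    shape = tuple(input_shape)
    if flatten_first:
        shape = flatten_shape(shape)
    for units in layer_units:
        shape = dense_shape(shape, units)
    return shape


observation_shape = (6, 4, 4)        # from the observation spec
num_actions = 463 - 0 + 1            # maximum - minimum + 1 from the action spec
layer_units = (100, 50, num_actions)  # fc_layer_params plus q_values_layer

as_posted = trace(observation_shape, layer_units)
with_flatten = trace(observation_shape, layer_units, flatten_first=True)
print(as_posted)     # (6, 4, 464), the shape in the error message
print(with_flatten)  # (464,), the shape DqnAgent expects
```

In Keras terms, the second trace corresponds to putting a tf.keras.layers.Flatten() layer ahead of the Dense layers in the list passed to sequential.Sequential. This is a suggested direction based on the shapes alone, not a fix confirmed by the maintainers.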

@LokeshNEU747

I'm facing the same issue too. Have you resolved it?
