Different kinds of outputs #247

erupturatis · 2022-08-04T18:08:48Z

Hey,

The action space of a game or environment can be represented in many ways (for example one-hot encoded values, probabilities, values between 0 and 1, etc.).
This is also somewhat related to #184.
I think we should be able to specify a more detailed setup in the config files for different types of outputs, such as:

[LanderGenome]

...
num_outputs = 20
Custom_outputs = True
Softmaxed_outputs = [3, 2]
Clamped_outputs = [(3,1,2),(2,0,1)]
one_hot_encoded = [5]
normal_outputs = [5]

Custom_outputs would signal NEAT to return outputs in a different form than usual (I was thinking of a dictionary of outputs, keyed by the kind of customization).

Softmaxed_outputs = [3, 2] # one group of 3 softmaxed outputs and another group of 2 softmaxed outputs
Clamped_outputs = [(3,1,2),(2,0,1)] # a group of 3 outputs clamped between 1 and 2 and another group of 2 outputs clamped between 0 and 1
one_hot_encoded = [5] # one group of 5 one-hot encoded values
normal_outputs = [5] # the last 5 outputs will contain raw values
The output matrix for what I am describing should look like this:

outputs = {
"softmax": [[0.2, 0.4, 0.4], [0.45, 0.55]], # the 2 groups of softmaxed values
"clamped": [[1, 2, 1], [1, 0]],
"encoded": [[0, 0, 0, 1, 0]],
"normal": [12.1, 32.1, 43.1, 1.1, 2.3]
}

The sizes of all custom output groups should add up to num_outputs (here 3 + 2 + 3 + 2 + 5 + 5 = 20).
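To make the idea concrete, here is a rough sketch of how a flat list of raw network outputs could be split into that dictionary. Nothing here is part of neat-python today: `group_outputs` and its parameters are hypothetical names that simply mirror the config keys proposed above, and picking the largest value as the "hot" entry is my assumption about what one-hot encoding should mean here.

```python
import math


def softmax(values):
    """Turn raw values into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(values):
    """Set the largest value to 1 and all others to 0 (assumed meaning)."""
    best = max(range(len(values)), key=values.__getitem__)
    return [1 if i == best else 0 for i in range(len(values))]


def group_outputs(raw, softmaxed, clamped, one_hot_encoded, normal):
    """Hypothetical helper: split a flat output list into named groups.

    softmaxed, one_hot_encoded, normal: lists of group sizes, e.g. [3, 2]
    clamped: list of (size, low, high) tuples, e.g. [(3, 1, 2), (2, 0, 1)]
    """
    expected = (sum(softmaxed) + sum(size for size, _, _ in clamped)
                + sum(one_hot_encoded) + sum(normal))
    if expected != len(raw):
        raise ValueError(
            f"group sizes add up to {expected}, but got {len(raw)} outputs")

    position = 0

    def take(size):
        # Consume the next `size` raw outputs, in config order.
        nonlocal position
        chunk = raw[position:position + size]
        position += size
        return chunk

    return {
        "softmax": [softmax(take(n)) for n in softmaxed],
        "clamped": [[min(max(v, low), high) for v in take(n)]
                    for n, low, high in clamped],
        "encoded": [one_hot(take(n)) for n in one_hot_encoded],
        "normal": [v for n in normal for v in take(n)],
    }
```

With the config from this post the call would be `group_outputs(raw, [3, 2], [(3, 1, 2), (2, 0, 1)], [5], [5])`, where `raw` is the 20-element list returned by the network.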

Also, I am not sure how this would affect the performance of the networks. For example, if the outputs of a network are mostly above 10, you can't directly clamp them between 0 and 1, since they would all become 1. We would need to normalize some of the special outputs first, such as the clamped ones.
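One possible way around this, sketched below, is to squash each raw value with a sigmoid and then rescale it into the target range instead of hard clamping. This is only an assumption about how normalization could work, not something neat-python does for you; `squash_to_range` is a hypothetical name.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function, returns a value in [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def squash_to_range(value, low, high):
    """Hypothetical helper: map any raw output smoothly into [low, high]."""
    return low + (high - low) * sigmoid(value)


def hard_clamp(value, low, high):
    """Plain clamping, shown for comparison."""
    return min(max(value, low), high)


# Two different raw outputs above 10 collapse to the same value when
# clamped, but stay distinguishable after squashing.
clamped = [hard_clamp(v, 0, 1) for v in (10.0, 12.0)]
squashed = [squash_to_range(v, 0, 1) for v in (10.0, 12.0)]
```

Very large values still saturate close to the upper bound, so the raw outputs may also need scaling (for example dividing by a typical magnitude) before squashing.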

Please let me know what you think.
