High-frequency RPM oscillations when the action buffer is included in the observation space can cause problems on real hardware #212

piratax007 · 2024-05-10T10:20:09Z

First of all, congratulations on this wonderful repo.

I'm training a policy to control a real Crazyflie, using RPM as the action space instead of ONE_DIM_RPM. The policy succeeds in simulation, except for what the RPM plot shows.

As you can see here:

the RPM signal contains high-frequency oscillations that make the policy unusable on a real drone.

I understand what you said in #180: "The main thing to note is that the observation contains the actions of the last .5 seconds, so increasing the ctrl freq will increase the obs space." and "The idea of the action buffer is that the policy might be better guided by knowing what the controller had done just before, the proportionality to the control frequency makes it dependent on the wall-clock only, and not the type of controller (but it might be appropriate to change that, depending on application).". Nevertheless, including the action buffer in the observation space results in the high-frequency RPM oscillations shown above. If I remove the buffer and use only the state (12 inputs) as the observation, the drone reaches the target position and orientation (I'm also controlling yaw), and the RPMs do not show the high-frequency oscillations reported here.
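To make the quoted relation concrete, here is a minimal sketch of how I read it: the buffer holds 0.5 seconds of past actions, so its length grows with the control frequency. The function name, the 4-value action (one RPM per motor), and the example frequencies are my own assumptions for illustration, not taken from the repo's code.

```python
def observation_size(ctrl_freq_hz, state_dim=12, action_dim=4, buffer_seconds=0.5):
    """Return (buffer_len, obs_dim) for a state vector plus an action buffer.

    Hypothetical helper: assumes the buffer stores `buffer_seconds` worth of
    past actions, one action per control step, appended to the state vector.
    """
    buffer_len = int(ctrl_freq_hz * buffer_seconds)  # control steps in 0.5 s
    obs_dim = state_dim + buffer_len * action_dim    # state + flattened buffer
    return buffer_len, obs_dim


# Example control frequencies (hypothetical values, in Hz)
for freq in (30, 240):
    print(freq, observation_size(freq))
```

Under these assumptions the observation grows linearly with the control frequency, while the state-only observation stays at 12 inputs.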

Questions:

When transferring the trained policy to a real drone, what difference does it make whether the action buffer is included in the observation space or not?
I'm trying to add a low-pass filter to reduce the high-frequency content in the RPMs. Can you help me work out the best cut-off and sampling frequencies for the filter?
In the SB3 documentation that you refer to, I cannot find anything about using this action buffer in the observation space, and I have some questions about it, such as how to determine the size of the buffer. As you said, the buffer's size is related to CTRL_FREQUENCY, but why? What does CTRL_FREQUENCY mean, and what is the relation between CTRL_FREQUENCY, PYB_FREQUENCY, and the time step? (I know that in BaseAviary.py line 481 you define the time step using PYB_FREQUENCY.)
At what frequency does the policy interact with the drone (sending actions and receiving observations and rewards)?
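
For the low-pass filter question, this is a minimal sketch of the kind of filter I have in mind: a first-order (exponential smoothing) filter applied to each motor's RPM command. The class name and the example frequencies are hypothetical, and I am assuming the filter is sampled at the rate at which the policy outputs actions. It does not answer which cut-off is best.

```python
import math


class LowPassFilter:
    """First-order low-pass filter applied element-wise to an RPM command.

    Hypothetical sketch: `sample_hz` is assumed to be the rate at which
    new actions arrive; `cutoff_hz` must stay below the Nyquist frequency.
    """

    def __init__(self, cutoff_hz, sample_hz):
        if not 0 < cutoff_hz < sample_hz / 2:
            raise ValueError("cutoff_hz must be in (0, sample_hz / 2)")
        dt = 1.0 / sample_hz                    # time between samples
        rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # filter time constant
        self.alpha = dt / (rc + dt)             # smoothing factor in (0, 1)
        self.state = None

    def __call__(self, rpm):
        if self.state is None:
            # Initialise on the first sample to avoid a start-up transient
            self.state = list(rpm)
        else:
            self.state = [s + self.alpha * (x - s)
                          for s, x in zip(self.state, rpm)]
        return list(self.state)


# Example with hypothetical values: 5 Hz cut-off, 30 Hz action rate
lpf = LowPassFilter(cutoff_hz=5.0, sample_hz=30.0)
print(lpf([10000, 10000, 10000, 10000]))
print(lpf([14000, 6000, 14000, 6000]))
```

A lower cut-off smooths the RPMs more but adds lag between the policy's command and the motors, which is why I am unsure how to choose it.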

Thanks for your time.
