High-frequency RPM oscillations when the action buffer is included in the observation space can cause problems on real hardware #212

Open
piratax007 opened this issue May 10, 2024 · 0 comments


piratax007 commented May 10, 2024

Hello @JacopoPan,

First of all, congratulations on this wonderful repo.

I'm training a policy to control a real Crazyflie, using RPM as the action space instead of ONE_DIM_RPM. The policy succeeds in simulation, except for what I see in the RPM plot.

As you can see here:

[Figure: rpms]

the RPM signal has a high-frequency component that makes the policy unusable on a real drone.

I understand what you said in #180: "The main thing to note is that the observation contains the actions of the last .5 seconds, so increasing the ctrl freq will increase the obs space." and "The idea of the action buffer is that the policy might be better guided by knowing what the controller had done just before, the proportionality to the control frequency makes it dependent on the wall-clock only, and not the type of controller (but it might be appropriate to change that, depending on application).". Nevertheless, including the action buffer in the observation space results in the high-frequency behavior shown above. If I remove the buffer and use only the states (12 inputs) as the observation space, the drone reaches the target position and orientation (I'm also controlling yaw), and the RPMs no longer show the high-frequency content reported above:

[Figures: xyz, rpy, rpms]
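
To make the two observation layouts concrete, here is a minimal sketch of how the observation size grows with the action buffer. It relies only on the quoted description (the buffer holds the actions of the last 0.5 seconds) and on the 12 state inputs mentioned above. The function name `observation_size` and its parameters are hypothetical, and the control frequency of 30 Hz is only an illustrative value, not necessarily what the repo uses.

```python
STATE_DIM = 12  # state-only observation size mentioned in this issue


def observation_size(ctrl_freq_hz, action_dim, buffer_seconds=0.5):
    """Return the observation length for a state vector plus an action buffer.

    Assumption: the buffer stores every action issued during the last
    `buffer_seconds` of wall-clock time, one action per control step.
    """
    buffer_len = int(ctrl_freq_hz * buffer_seconds)  # number of past actions kept
    return STATE_DIM + buffer_len * action_dim


# Four motor RPM commands per action, illustrative 30 Hz control loop.
with_buffer = observation_size(ctrl_freq_hz=30, action_dim=4)
without_buffer = observation_size(ctrl_freq_hz=30, action_dim=4, buffer_seconds=0.0)
print(with_buffer, without_buffer)
```

Under these assumptions, doubling the control frequency doubles the number of buffered actions, which is why the observation space depends on the control frequency.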

Questions:

  1. When transferring the trained policy to a real drone, what difference does it make whether the action buffer is included in the observation space or not?
  2. I'm trying to add a low-pass filter to reduce the high-frequency content in the RPMs. Can you help me work out the best cut-off and sampling frequencies for the filter?
  3. In the SB3 documentation that you refer to, I cannot find anything about using this action buffer in the observation space, and I have some questions about it, such as how to determine the size of the buffer. As you said, the buffer's size is related to CTRL_FREQUENCY, but why? What does CTRL_FREQUENCY mean, and how is it related to PYB_FREQUENCY and the time step? (I know that in BaseAviary.py, line 481, you define the time step using PYB_FREQUENCY.)
  4. At what frequency does the policy interact with the drone (sending actions and receiving observations and rewards)?
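
For context on questions 2 to 4, the following is a minimal sketch of a first-order low-pass filter applied to the four RPM commands, together with the timing arithmetic the questions ask about. It is not the repo's implementation. The class `LowPassFilter`, the helper `physics_steps_per_control_step`, and all numeric values (5 Hz cut-off, 30 Hz control rate, 240 Hz physics rate) are hypothetical. It also assumes that the filter is sampled once per control step and that the physics frequency is an integer multiple of the control frequency.

```python
import math


class LowPassFilter:
    """First-order (RC) low-pass filter applied element-wise to a command vector."""

    def __init__(self, cutoff_hz, sample_hz, initial):
        dt = 1.0 / sample_hz                     # sampling period in seconds
        rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # time constant for the cut-off
        self.alpha = dt / (rc + dt)              # smoothing factor in (0, 1)
        self.state = list(initial)

    def update(self, command):
        # Move each filtered value a fraction alpha toward the new command.
        self.state = [s + self.alpha * (c - s) for s, c in zip(self.state, command)]
        return self.state


def physics_steps_per_control_step(pyb_freq_hz, ctrl_freq_hz):
    # Assumption: the simulator advances this many physics steps per policy action.
    return pyb_freq_hz // ctrl_freq_hz


# Illustrative values only: 30 Hz control loop, 5 Hz cut-off.
lpf = LowPassFilter(cutoff_hz=5.0, sample_hz=30.0, initial=[15000.0] * 4)
outputs = []
for k in range(200):
    # Worst-case chatter: the command flips between two levels every step.
    raw = [16000.0 if k % 2 == 0 else 14000.0] * 4
    outputs.append(lpf.update(raw)[0])

peak_to_peak = max(outputs[-20:]) - min(outputs[-20:])
print(peak_to_peak, physics_steps_per_control_step(240, 30))
```

In this sketch the sampling frequency is tied to the rate at which actions are produced, so only the cut-off is left to tune. A lower cut-off attenuates the chatter more but adds lag between the policy's command and the motors.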

Thanks for your time.
