Detailed PX4 CI documentation #2284

Open: wants to merge 11 commits into base `main`

junwoo091400
Contributor

About

This PR adds more rigorous documentation detailing what goes on behind the curtain in the PX4 CI system.

Motivation

This was one of the most magical things I learned when I was working on Frames metadata support in QGC: mavlink/qgroundcontrol#10472

In the hope of getting this mysterious CI system better documented, I created this PR!

- `mavros_mission_tests.yml`: Executes mission plans using MAVROS. It can check for functional compatibility between MAVROS and PX4.
- `checks.yml`: Runs various make targets (using the `make` command). The commands for each target are documented in the [Makefile](https://github.com/PX4/PX4-Autopilot/blob/main/Makefile).
- `check_format`: Checks the coding style format using Clang Tidy ?

Contributor Author

@dagar Is this different from the clang-tidy.yml part below?

Member

This one uses astyle

Collaborator

@MaEtUgR This section might be affected by linter changes

Member

@hamishwillee Certainly, thanks for the heads-up. We won't change it overnight. Style is currently checked exclusively by astyle, see https://github.com/PX4/PX4-Autopilot/blob/3b2d76657312d47e6495736dfd4d694b546497d5/Tools/astyle/fix_code_style.sh#L8-L11

Collaborator

OK, so this is good for now. Thanks!

@junwoo091400 changed the title from "Add PX4 CI build documentation" to "Detailed PX4 CI documentation" on Feb 17, 2023
## Other

Jenkins also updates the PX4 ROS messages repository [PX4/px4_msgs](https://github.com/PX4/px4_msgs) and uploads the metadata to the [Amazon S3 bucket server](https://px4-travis.s3.amazonaws.com/).

Collaborator

"uploads the metadata to the Amazon S3 bucket server." - what metadata? You already covered a lot here; is this metadata something associated with px4_msgs?

Contributor Author

Ooh I think so, I didn't cover it specifically but can add that info as well

Collaborator

If it is metadata associated with ROS 2 messages then "probably not". The only dependency on PX4 in ROS 2 now is PX4/px4_msgs. There used to be more stuff, but XRCE-DDS means it isn't needed any more.

@hamishwillee
Collaborator

@junwoo091400 I've restructured this a bit. All the stuff specific to QGC and PX4 docs should go under those sections. So you can talk about what PX4 does with actions and Jenkins in those folders, and there might end up being some duplication in the QGC and Docs sections.

It is not clear to me how Jenkins is kicked off? Is this magic or is it done by deploy_all action? In other words what is the relationship/handover, if any between the systems?

I can't review this for technical accuracy. Who can?

@hamishwillee
Collaborator

PS The Jenkins stuff doesn't happen by default - it has been broken for months and months. I kick it off manually every week or so. No one seems to care.

@junwoo091400
Contributor Author

> It is not clear to me how Jenkins is kicked off? Is this magic or is it done by deploy_all action?

This I have no clue for now. I can ask @dagar

> Jenkins stuff doesn't happen by default - it has been broken for months and months

I'm sorry that this has been happening. I was aware & wanted to at least figure out for myself what was going wrong, which was partly the motivation for this PR.

I will ask about Jenkins situation in the maintainers chat 🙏. I will help getting this resolved!

@hamishwillee
Collaborator

> I will ask about Jenkins situation in the maintainers chat 🙏. I will help getting this resolved!

Thanks. @dagar is probably the only one who can fix it.

@hamishwillee
Collaborator

@junwoo091400 Is this waiting on review? In that case, perhaps we could ask @bkueng for a sanity check.

@junwoo091400
Contributor Author

Yes that makes sense!

en/test_and_ci/continous_integration.md (outdated)

- `ekf_update_change_indicator.yml`: Check if there's any functionality change in EKF ?
- `failsafe_sim.yml`: Builds [failsafe simulator](../config/safety_simulation.md)
- `clang-tidy.yml`: Runs Clang Tidy to check if coding style is consistent

Member

This also runs static analysis to detect problematic code patterns (e.g. dead code).

- Events metadata (`events/*.xz`)
- Actuators metadata (`actuators.json*`)
- Target binary (`*.px4`)

Member

It's only metadata. compile_nuttx uploads the binaries, but I'm not sure how they get to s3.

Suggested change
- Target binary (`*.px4`)

Contributor Author

Turns out, they are uploaded via the "PX4_misc/Firmware_s3_deploy_*" pipelines, which seem to be triggered manually! (right? @dagar)

https://github.com/PX4/PX4-user_guide/pull/2284/files#diff-7a54d71845d3b204f4ac49e9f2f65213d70376270e2856b33257405ca349cd7bR110-R112
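
To make the metadata/firmware distinction in this thread concrete, here is a minimal sketch that sorts file paths using the three glob patterns quoted above. The `classify` helper and the sample file names are made up for illustration; this is not how the CI scripts themselves select files.

```python
from fnmatch import fnmatchcase

# Patterns quoted in the reviewed list above.
METADATA_PATTERNS = ["events/*.xz", "actuators.json*"]
FIRMWARE_PATTERNS = ["*.px4"]


def classify(path: str) -> str:
    """Return 'metadata', 'firmware', or 'other' for a build output path."""
    if any(fnmatchcase(path, p) for p in METADATA_PATTERNS):
        return "metadata"
    if any(fnmatchcase(path, p) for p in FIRMWARE_PATTERNS):
        return "firmware"
    return "other"


# Hypothetical file names, chosen only to exercise each pattern.
for name in ["events/all_events.json.xz", "actuators.json.xz",
             "px4_fmu-v5_default.px4", "README.md"]:
    print(name, "->", classify(name))
```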

- [.github/workflows/](https://github.com/PX4/PX4-Autopilot/tree/main/.github/workflows).

The Jenkins build status can be viewed in [ci.px4.io:8080/job/PX4/](http://ci.px4.io:8080/job/PX4/)

Contributor Author

I just discovered that there are two Jenkins servers:

@dagar What is the difference between them? And why do we need two separate Jenkins servers?

Member

It's just an optimization, because Jenkins transfers quite a lot of data between the server and runners, and it's also a bit more reliable overall. For the hardware test rack each board is a Jenkins slave, but each one is just a container on the same physical system as the Jenkins server.

Contributor Author

Ah so you mean your Jenkins is the hardware rack testing & px4 Jenkins (CI.px4.io) is for everything else? (Metadata deployment, world magnetic model update, etc)

Contributor

I think more information about test rack and how it runs can be added here. If it is mentioned anywhere in docs a link might suffice

- Target binary (`*.px4`)

These files are uploaded to the [Amazon S3 bucket server](https://px4-travis.s3.amazonaws.com/).

Contributor Author

Basically through this link: https://api.github.com/repos/PX4/Firmware/releases

It returns a JSON file that includes the versioning info
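
A minimal sketch of pulling version info out of that response. It assumes the endpoint returns the usual GitHub REST API array of release objects with `tag_name` and `prerelease` fields; the function name and the sample payload are illustrative only, and no network request is made.

```python
import json


def stable_release_tags(releases_json: str) -> list[str]:
    """Return tag names of non-prerelease entries, in the order given."""
    releases = json.loads(releases_json)
    return [r["tag_name"] for r in releases if not r.get("prerelease", False)]


# Illustrative sample shaped like the API response (not fetched data).
SAMPLE = json.dumps([
    {"tag_name": "v1.14.0", "prerelease": False},
    {"tag_name": "v1.15.0-beta1", "prerelease": True},
])

print(stable_release_tags(SAMPLE))
```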

@hamishwillee
Collaborator

Note, I am not tracking this. @junwoo091400 when you have resolved issues with @bkueng et al, come back to me and I will subedit.

@junwoo091400
Contributor Author

Updated the PR. Two remaining notes:

  1. @dagar It seems the test rack CI hasn't been running recently (the jenkins/continuous-integration one that used to fail a lot). Was it disabled?
  2. As I reported in "PX4 Metadata upload to S3 isn't being separated to appropriate releases" (PX4-Autopilot#22383), metadata currently isn't being uploaded into the appropriate directories. Since that isn't fixed yet, I didn't go into detail about the mechanism in the docs (the docs still reflect the code's current status).

Comment on lines 12 to 66

Although referred to simply as "Jenkins", PX4 has two Jenkins servers, with multiple pipelines running in parallel. Here is a summary of the main pipelines:

### Non-hardware Jenkins

A webhook is configured to notify [ci.px4.io/](http://ci.px4.io/) of every push (new commits, branches) and pull request event in the PX4-Autopilot repository.

| Pipeline Name | Description | Trigger | Script |
|---|---|---|---|
| [PX4_misc/*](http://ci.px4.io/job/PX4_misc/) | Helper pipelines for occasional, miscellaneous tasks needed for the PX4 ecosystem. Their scripts are not documented within the PX4-Autopilot repository. | NA | NA |
| [PX4_misc/Firmware-compile](http://ci.px4.io/job/PX4_misc/job/Firmware-compile/) (shown on GitHub as **Jenkins Compile All Boards**) | Builds a specific list of board targets to check that they all compile. | Pushes & Pull Requests (all Webhook events) | [.ci/Jenkinsfile-compile](https://github.com/PX4/PX4-Autopilot/blob/main/.ci/Jenkinsfile-compile) |
| [PX4_misc/Firmware_update-world_magnetic_model](http://ci.px4.io/job/PX4_misc/job/Firmware_update-world_magnetic_model/) | Creates a PR updating the world magnetic model data | Manual | In Jenkins |
| [PX4_misc/Firmware_update_nuttx_kconfigs](http://ci.px4.io/job/PX4_misc/job/Firmware_update_nuttx_kconfigs/) | Creates a PR updating inconsistent NuttX KConfig files | Manual | In Jenkins |
| [PX4_misc/Firmware_update_submodules](http://ci.px4.io/job/PX4_misc/job/Firmware_update_submodules/) | Creates PRs updating the submodules to their latest versions | Manual | In Jenkins |
| [PX4/*](http://ci.px4.io/job/PX4/) | Main CI pipelines for the PX4 ecosystem | NA | NA |
| [PX4/PX4-Autopilot](http://ci.px4.io/job/PX4/job/PX4-Autopilot/) (shown on GitHub as **continuous-integration/jenkins/branch or pr-head**) | Updates the `master` metadata files in the S3 bucket | Push to `main` | [Jenkinsfile](https://github.com/PX4/PX4-Autopilot/blob/main/Jenkinsfile) |
| [PX4/PX4-user_guide](http://ci.px4.io/job/PX4/job/PX4-user_guide/) | Builds the HTML version of the user guide from the Markdown sources and deploys it to the web | Updates in `main` (latest) or `v1.*` (past releases) branches | [PX4-user_guide/Jenkinsfile](https://github.com/PX4/PX4-user_guide/blob/main/Jenkinsfile) |

For the full view of all different pipelines, visit [ci.px4.io/](http://ci.px4.io/).

### Hardware testing Jenkins

A webhook is configured to notify [px4-jenkins.dagar.ca](http://px4-jenkins.dagar.ca:8080/) of every push (new commits, branches) and pull request event in the PX4-Autopilot repository.

| Pipeline Name | Description | Trigger | Script |
|---|---|---|---|
| [PX4-Autopilot](http://px4-jenkins.dagar.ca:8080/job/PX4-Autopilot/) (shown on GitHub as **continuous-integration/jenkins/branch**) | Runs the hardware test script | Pushes & Pull Requests (all Webhook events) | [.ci/Jenkinsfile-hardware](https://github.com/PX4/PX4-Autopilot/blob/main/.ci/Jenkinsfile-hardware) |

This CI is crucial for detecting failures on real hardware that software-only CI tools can't catch, such as hard faults and NuttShell or NuttX bugs.

To view all the past runs of the hardware rack CI, visit [px4-jenkins.dagar.ca](http://px4-jenkins.dagar.ca:8080/job/PX4-Autopilot/).

Contributor Author

I've added this table because I thought the list view (still kept, but commented out below) was confusing, would like to know your thoughts on this! @hamishwillee

An example commit of parameter metadata can be found [here](https://github.com/mavlink/qgroundcontrol/commit/7f4f3b6253fe80a881e9a91a1f5b6d960ad11834).

Note that QGroundControl normally gets metadata for [Events](../concept/events_interface.md#implementation) and [Parameters](../advanced/parameters_and_configurations.md#publishing-parameter-metadata-to-a-gcs) from the vehicle (it is built into the firmware, or a URL is provided in the firmware from which it can be downloaded).

Contributor Author

Not sure if it is true to say that 'URL is provided in the firmware from which metadata can be downloaded'? @bkueng

Comment on lines +41 to +64

- `cubepilot_cubeorange_test`: Cubepilot CubeOrange
- `cuav_x7pro_test`: CUAV X7 Pro
- `px4_fmu-v4_test`: FMU v4
- `px4_fmu-v4pro_test`: FMU v4 Pro
- `px4_fmu-v5_debug`: FMU v5 with debug flag, prints out debug information
- `px4_fmu-v5_stackcheck`: FMU v5 compiled with stackcheck, detects stack overflow, etc.
- `px4_fmu-v5_test`: FMU v5
- `nxp_fmuk66-v3_test`: FMU k66 v3

Note that `_test` label for the target means ... (TODO)

It performs the following steps on each build node target:

- Build: Build bootloader and firmware binary files
- Flash: Flash the hardware via USB, JLink, and other methods (specific to hardware)
- Tests: Various tests regarding sensors, commander module, uorb topics, etc.
- Status: Reboot and check filesystem (/proc, /dev, etc), module, system commands, and do quick IMU calibration
- Print topics: Print out selected set of uORB topic data (for debugging purposes)

This CI is crucial for detecting failures on real hardware that software-only CI tools can't catch, such as hard faults and NuttShell or NuttX bugs.

For the full test script, visit [.ci/Jenkinsfile-hardware](https://github.com/PX4/PX4-Autopilot/blob/main/.ci/Jenkinsfile-hardware).

Contributor Author

Added general information about the hardware rack here. Anything else you want me to add? @mnumanuyar

- `nxp_fmuk66-v3_test`: FMU k66 v3

Note that `_test` label for the target means ... (TODO)

Contributor Author

@bkueng I actually don't know how "stackcheck", "test", etc labels get incorporated into the build process. What does '_test' postfix do regarding the built target binary?

Member

They're just different build variants that select certain sets of modules: the label defines which .px4board file is used under the board directory.
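
As a rough illustration of that mapping, the sketch below turns a target name into the `.px4board` file its label would select. It assumes targets have the form `<vendor>_<model>_<label>` and that board files live at `boards/<vendor>/<model>/<label>.px4board`; the `px4board_path` helper is hypothetical, only handles targets with an explicit label, and is not how the build system itself resolves targets.

```python
from pathlib import PurePosixPath


def px4board_path(target: str) -> PurePosixPath:
    """Map 'vendor_model_label' to 'boards/vendor/model/label.px4board'."""
    # Assumption: vendor precedes the first underscore, label follows the last.
    vendor, rest = target.split("_", 1)
    model, label = rest.rsplit("_", 1)
    return PurePosixPath("boards") / vendor / model / f"{label}.px4board"


for target in ["px4_fmu-v5_test", "px4_fmu-v5_stackcheck", "nxp_fmuk66-v3_test"]:
    print(target, "->", px4board_path(target))
```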

Contributor Author

Ahaa true, I forgot about that! Thanks for the pointer 👍

junwoo091400 and others added 11 commits November 23, 2023 16:39
More rigorous documentation actually detailing what goes behind the
curtain on PX4 CI system.

Hope this helps someone!
Co-authored-by: Junwoo Hwang <junwoo091400@gmail.com>
Co-authored-by: Beat Küng <beat-kueng@gmx.net>
Co-authored-by: Beat Küng <beat-kueng@gmx.net>
Clarify that Firmware and Metadata deployments are handled separately,
from different entities (for now): Github Actions and Jenkins

Also, add further feedback from the comments from the PR
@hamishwillee
Collaborator

@junwoo091400 Just a reminder to me when you're ready for final review. That should be after Beat and Daniel say they are happy enough for this to be published.

@junwoo091400
Contributor Author

> @junwoo091400 Just a reminder to me when you're ready for final review. That should be after Beat and Daniel say they are happy enough for this to be published.

I would hold it a bit for now, as we just had discussion on the overall CI infrastructure and there are some improvements we can have, and change the Docs appropriately.
