Building ubuntu 24.04 image is slow #6531

Closed
1 of 2 tasks
xlazom00 opened this issue Apr 27, 2024 · 18 comments · Fixed by #6644
Labels
Help needed We need your involvement Work in progress Unfinished / work in progress

Comments

@xlazom00
Contributor

What happened?

I am building an image for wdk2023 on Ubuntu 22.04 and 24.04 (fresh installs of Ubuntu),
but the whole process is much slower than I remember.
When I check top, in most cases I see
'/usr/libexec/qemu-binfmt/aarch64-binfmt-P /usr/bin/apt-get apt-get upgrade -s -q'
or
'/usr/libexec/qemu-binfmt/aarch64-binfmt-P /usr/bin/python3 /usr/bin/python3 /usr/lib/cnf-update-db'
with CPU load at 100%.
It took me more than 2 hours to build the image.
Any idea?

How to reproduce?

nothing special

Branch

main (main development branch)

On which host OS are you running the build script and observing this problem?

Ubuntu 24.04 Noble

Are you building on Windows WSL2?

  • Yes, my Ubuntu/Debian/OtherOS is running on WSL2

Relevant log URL

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Jira ticket: AR-2128

@EvilOlaf
Member

Noble is not a supported build environment (yet). So Jammy is still the way to go.
So basically it is slow on both Jammy and noble?

@xlazom00
Contributor Author

xlazom00 commented Apr 28, 2024

It took 2 hours to build the image, and I didn't build the kernel. This is on an 8-core Intel i7 machine.
Btw, how long does it normally take to build an image?

@EvilOlaf
Member

Depending on hardware resources, a pre-filled cache, and the use/availability of artifacts, it usually takes from a few seconds to minutes.

@xlazom00
Contributor Author

@EvilOlaf OK, I will try to run it twice. So how can I set up the package cache?

@xlazom00
Contributor Author

./compile.sh build BOARD=uefi-arm64 BRANCH=edge BUILD_DESKTOP=no BUILD_MINIMAL=no EXPERT=yes KERNEL_CONFIGURE=no RELEASE=noble
on ubuntu 24.04
Runtime [ 40:40 min ]

@SteeManMI
Contributor

I can confirm that building Noble images takes my build environment about 45 minutes. The same build of Jammy takes 17 minutes.

@xlazom00 xlazom00 changed the title Building image is slow Building ubuntu 24.04 image is slow Apr 29, 2024
@EvilOlaf
Member

Can confirm as well

@igorpecovnik igorpecovnik added the Help needed We need your involvement label May 1, 2024
@alexl83
Contributor

alexl83 commented May 12, 2024

command-not-found seems to be excruciatingly slow, and getting worse by the day; I still need to test on Noble.
Perhaps removing it during build time can at least help mitigate this.

That's it: command-not-found --> /usr/libexec/qemu-binfmt/aarch64-binfmt-P /usr/bin/python3 /usr/bin/python3 /usr/lib/cnf-update-db

apt hook: /etc/apt/apt.conf.d/50command-not-found

@alexl83
Contributor

alexl83 commented May 12, 2024

Came up with a quick & dirty extension. If anyone wants to test it and it works, this workaround could be integrated into the framework.

extensions/disable-cnf_chroot.sh

# Move the command-not-found apt hook aside so apt does not run cnf-update-db
# (under qemu emulation) during image creation.
function pre_install_distribution_specific__1_disable_cnf_apt_hook() {
        local hook="${SDCARD}/etc/apt/apt.conf.d/50command-not-found"
        [[ -f "${hook}" ]] || return 0 # nothing to disable if the hook is not installed
        display_alert "Disabling command-not-found during build-time to speed up image creation" "${BOARD}:${RELEASE}-${BRANCH} :: ${EXTENSION}" "info"
        run_host_command_logged mv "${hook}" "${hook}.disabled"
}

# Put the hook back so command-not-found works normally on the finished image.
function post_post_debootstrap_tweaks__2_restore_cnf_apt_hook() {
        local hook="${SDCARD}/etc/apt/apt.conf.d/50command-not-found"
        [[ -f "${hook}.disabled" ]] || return 0 # only restore what was disabled above
        display_alert "Enabling command-not-found after build-time" "${BOARD}:${RELEASE}-${BRANCH} :: ${EXTENSION}" "info"
        run_host_command_logged mv "${hook}.disabled" "${hook}"
}
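
A rough sketch of how an extension like the one above might be enabled for a test build. It assumes the file is saved as extensions/disable-cnf_chroot.sh inside the build framework checkout and that the framework resolves extensions by file name without the .sh suffix; both are assumptions, not confirmed in this thread. The board, branch and release values are copied from the build command posted earlier.

```shell
#!/bin/sh
# Assumption: run from the root of the build framework checkout, with the
# extension saved as extensions/disable-cnf_chroot.sh.
set -- build \
    BOARD=uefi-arm64 \
    BRANCH=edge \
    RELEASE=noble \
    BUILD_DESKTOP=no \
    BUILD_MINIMAL=no \
    KERNEL_CONFIGURE=no \
    ENABLE_EXTENSIONS=disable-cnf_chroot

# Print the command for review; only run it if compile.sh is actually present.
echo "./compile.sh $*"
if [ -x ./compile.sh ]; then
    ./compile.sh "$@"
fi
```

The Runtime line printed at the end of the build can then be compared against a run without ENABLE_EXTENSIONS.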

Depending on CCACHE and ORAS cache availability, my times went down to 15-24 minutes.

Noble test

@viraniac
Collaborator

viraniac commented May 20, 2024

Has anyone tried exporting QEMU_CPU, setting it manually to different values supported by qemu-aarch64-static, and checking whether that makes a difference?

I am seeing similar slowness on fenix (the build system used by Khadas) as well. For fenix, I observed that exporting QEMU_CPU as cortex-a53 gives the best performance: build time drops significantly for arm64 builds performed on amd64 hosts.
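
A minimal sketch of the experiment suggested above: export QEMU_CPU before starting a build and compare the reported run times across a few CPU models. QEMU_CPU is the environment variable that QEMU user-mode emulation reads to pick a CPU model, and `-cpu help` lists the models a given binary supports. Whether the variable reaches the emulated processes inside the chroot depends on how the build framework sets up its environment, so treat that part as an assumption. cortex-a53 is the value reported for fenix; cortex-a72 is only an illustrative second candidate.

```shell
#!/bin/sh
# List the CPU models the emulator supports, if it is installed on this host.
if command -v qemu-aarch64-static >/dev/null 2>&1; then
    qemu-aarch64-static -cpu help
fi

# Candidate models to compare. cortex-a53 comes from the comment above;
# cortex-a72 is only an illustrative second value.
for cpu in cortex-a53 cortex-a72; do
    export QEMU_CPU="$cpu"
    echo "QEMU_CPU=$QEMU_CPU"
    # Assumption: run from the build framework checkout; skipped otherwise.
    # The framework prints its own "Runtime" line at the end of each build.
    if [ -x ./compile.sh ]; then
        ./compile.sh build BOARD=uefi-arm64 BRANCH=edge RELEASE=noble
    fi
done
```

For a fair comparison, each run should start from the same cache state, since a cached rootfs can hide most of the emulation cost.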

@viraniac viraniac mentioned this issue May 20, 2024
6 tasks
@alexl83
Contributor

alexl83 commented May 20, 2024

Could be interesting, but why should we invest effort in a functionality (command-not-found) that is simply not useful during debootstrapping? We could end up overengineering a solution :)

c-n-f is an interactive helper that suggests packages to install via apt when the end user calls a non-existent binary. If we disable it at build time we lose nothing and save a massive amount of compute time: on my Ryzen 7700X I get trixie in 14 minutes instead of 40, so it doesn't impact just Noble (unfortunately).

The first apt update on my SBCs adds no appreciable overhead and c-n-f works flawlessly.
Maybe it can be useful on GitHub Actions, but on bare metal I think it is safe to disable the c-n-f hook so apt doesn't trigger it.
After debootstrap we re-enable it and it's done :)

@viraniac
Collaborator

viraniac commented May 20, 2024

> I get trixie in 14 minutes instead of 40 - so it doesn't impact just Noble (unfortunately)

Is that with doing the debootstrap or with a cached rootfs? Have you tried doing a build with ARTIFACT_IGNORE_CACHE=yes and then measuring how much of a difference it makes? I know it doesn't just impact Noble, because the issue comes from the newer version of qemu-user-static.

Cnf is one thing affected by the slowness of qemu-user-static, but it's not the only thing. While you call my solution overengineering, I am trying to solve the root cause of the problem and not a single symptom of it.

@alexl83
Contributor

alexl83 commented May 20, 2024

> > I get trixie in 14 minutes instead of 40 - so it doesn't impact just Noble (unfortunately)
>
> Is that with doing the debootstrap or with a cached rootfs? Have you tried doing a build with ARTIFACT_IGNORE_CACHE=yes and then measuring how much of a difference it makes? I know it doesn't just impact Noble, because the issue comes from the newer version of qemu-user-static.
>
> Cnf is one thing affected by the slowness of qemu-user-static, but it's not the only thing. While you call my solution overengineering, I am trying to solve the root cause of the problem and not a single symptom of it.

Apologies @viraniac, I didn't mean to undervalue your solution nor insult you; perhaps your findings are going to be an integral part of the solution, or even the solution itself.
I have no doubt that the root cause you are targeting is more complex than just c-n-f, so sorry for my choice of wording.

The point of view I'm offering is purely that of a "consumer"; I can't really compare myself to the Armbian team and seasoned supporters/engineers. On a "philosophy" level, I ask myself why I should keep c-n-f alive if it serves no purpose during build time.
Of course, qemu slowness impacting other aspects won't be solved by just that. Sorry, I was missing this angle.

My test case:
orangepi5-plus
Trixie CLI non minimal
edge kernel +3 userpatches and a custom extension
cached armbian packages
cached rootfs
around 14 minutes without c-n-f / around 40 with c-n-f

Same variables with no cached rootfs and no cached armbian-packages + ARTIFACT_IGNORE_CACHE=yes - so full debootstrap process
around 24 minutes without c-n-f / short of 55 with c-n-f enabled
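
The relative savings implied by these figures can be worked out with a little shell arithmetic. This is only a sketch using the rounded minute values quoted above (with 55 standing in for "short of 55"), so the percentages are approximate. The function name percent_saved is made up for this illustration.

```shell
#!/bin/sh
# percent_saved WITH WITHOUT
# Prints the integer percentage of build time saved when going from
# WITH minutes (c-n-f enabled) to WITHOUT minutes (c-n-f disabled).
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

echo "cached rootfs:    $(percent_saved 40 14)% less time without c-n-f"
echo "full debootstrap: $(percent_saved 55 24)% less time without c-n-f"
```

With these inputs the cached-rootfs case comes out at 65% and the full debootstrap case at 56% (integer division rounds down).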

I will pull your PR and will happily enjoy a faster qemu: I was concerned that playing with it could break cross-compilation or other aspects of the build process.

My system is a Ryzen 7700X, slightly overclocked/undervolted, with 64 GB RAM and a PCIe 4 NVMe SSD.

@viraniac
Collaborator

> I didn't mean to undervalue your solution neither insult you,

No problem mate. Text is hard because we can't hear the tone or see the faces, which makes it hard to guess what the other person was implying. It's especially hard for me as my social interaction is almost zero. Let me assure you I am not offended; I just wasn't sure about the comment and how I should respond to it. So if I gave any indication of being offended, that's a limitation of my texting skills and of overthinking when writing a response.

> I will pull your PR and will happily enjoy a faster qemu: I was concerned that playing with it could break cross-compilation or other aspects of the build process.

Please do test the PR, nothing will make me happier. I also assure you it should not break cross compilation. So do play with it and let me know if it helps bring build times even lower for your use case.

@alexl83
Contributor

alexl83 commented May 20, 2024

I can confirm this QEMU solution is extraordinary:
I pulled @viraniac's PR and the test case above took 6 min 45 s with c-n-f enabled.

@alexl83
Contributor

alexl83 commented May 20, 2024

From very non-rigorous testing (watching cnf-update-db in top), we can still save a couple of minutes of execution time by disabling it, but the bigger saving comes from the QEMU tweaking :)

@alexl83
Contributor

alexl83 commented May 20, 2024

BOARD=orangepi5-plus
BRANCH=edge
RELEASE=trixie
DEST_LANG="en_US.UTF-8"
COMPRESS_OUTPUTIMAGE=sha,xz
KERNEL_CONFIGURE=no
BSPFREEZE=yes
CLEAN_LEVEL=debs
BUILD_DESKTOP=no
BUILD_MINIMAL=no
BUILD_EXTERNAL=yes
EXTERNAL_NEW=compile
INSTALL_HEADERS=yes
NO_APT_CACHER=yes
#ARTIFACT_IGNORE_CACHE=yes
KERNEL_GIT=shallow
ENABLE_EXTENSIONS="kali-ale systemd-resolved_resolvconf zerotier install-packages"
HOST=kalian
KEEP_ORIGINAL_OS_RELEASE=yes
FORCE_BOOTSCRIPT_UPDATE=yes
FORCE_UBOOT_UPDATE=yes
DOCKER_FLAGS+=(--privileged)

LOG PR6644 Runtime [ 6:52 min ]

LOG PR6644 + PR6616 Runtime [ 5:19 min ]

6 participants