Switch to 64bit kernel+OS environment (x86_64, aarch64) #903

Open
19 of 35 tasks
jens-maus opened this issue Sep 12, 2020 · 6 comments
Labels
💡 enhancement-ideas New feature or change request 💻 hardware support This issue refs tickets/issue introducing/fixing some hardware support 🏷️ HmIPServer This refs the HmIPServer component 🏷️ ReGaHss This refs the ReGaHss component 🏷️ WebUI This refs the WebUI component 🙏 help wanted Extra attention is needed

Comments

@jens-maus
Owner

jens-maus commented Sep 12, 2020

Is your feature request related to a problem? Please describe.
The following kernel output (captured on a RaspberryMatic OVA running under VMware ESXi) should explain it all:

[    0.051050] ************************************************************
[    0.051050] ** WARNING! WARNING! WARNING! WARNING! WARNING! WARNING!  **
[    0.051051] **                                                        **
[    0.051051] ** You are using 32-bit PTI on a 64-bit PCID-capable CPU. **
[    0.051051] ** Your performance will increase dramatically if you     **
[    0.051051] ** switch to a 64-bit kernel!                             **
[    0.051052] **                                                        **
[    0.051052] ** WARNING! WARNING! WARNING! WARNING! WARNING! WARNING!  **
[    0.051052] ************************************************************

Thus, aside from the not-so-important point that a native 64bit kernel+OS would let a single application address more than 4GB of RAM (barely required for RaspberryMatic, of course), we would perhaps benefit a lot from a native 64bit kernel+OS. In addition, some CCU Addons like RedMatic (https://github.com/rdmtc/RedMatic) are partly (on x86) only possible if a 64bit OS is used (see rdmtc/RedMatic#374). So there are some good reasons why a switch to a 64bit kernel+OS would make perfect sense (if this is easily possible, of course).
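
As a rough illustration of the condition the kernel warning above describes, the following Python sketch compares the machine type reported by the running kernel with the CPU flags in /proc/cpuinfo. It assumes an x86 Linux system where `lm` (long mode) marks a 64bit capable CPU and `pcid` marks PCID support; the helper names are made up for this example and are not part of RaspberryMatic.

```python
import platform


def cpu_flags(cpuinfo_text):
    """Return the CPU flags from /proc/cpuinfo-style text (first x86 'flags' line)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def is_32bit_kernel_on_64bit_cpu(machine, flags):
    """True if the kernel reports a 32bit x86 machine type although the CPU has long mode."""
    kernel_is_32bit = machine in {"i386", "i486", "i586", "i686"}
    return kernel_is_32bit and "lm" in flags


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            flags = cpu_flags(fh.read())
    except OSError:
        flags = set()  # not a Linux system or /proc not mounted
    machine = platform.machine()
    print("kernel machine type:", machine)
    print("CPU long mode (lm):", "lm" in flags, "| PCID:", "pcid" in flags)
    if is_32bit_kernel_on_64bit_cpu(machine, flags):
        print("32bit kernel running on a 64bit capable CPU")
```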

Describe the solution you'd like
Because the eQ3 binaries in the OCCU environment (https://github.com/eq-3/occu) are distributed as 32bit applications, and because some CCU addons also still require a 32bit OS, the following 3-step transition procedure is suggested:

  • (1) Develop a multilib buildroot environment so that we will have a 64bit kernel + OS but still provide the possibility to use 32bit applications. While this is unfortunately not natively possible with buildroot, some work has already been done in the multilib branch of RaspberryMatic (see https://github.com/jens-maus/RaspberryMatic/tree/multilib). This branch should first be finalized for the x86 target (already in alpha state!) and then ported over to ARM64 at a later stage so that even a RaspberryPi3 or RaspberryPi4 could benefit from running a 64bit kernel+OS.
  • (2) After having released such a multilib version of RaspberryMatic to the public (potentially still in 2020), we try to get eQ3 and others to move forward and also port/distribute 64bit versions of all major CCU/OCCU components.
  • (3) After most/all applications/CCU addons are available as native 64bit apps, we could consider dropping 32bit support altogether at some stage (e.g. >2022), or consider keeping it for a while longer or even forever?!? But keeping it would mean additional overhead to maintain the 32bit multilib compatibility in the future.
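
To see which binaries would rely on the 32bit side of such a multilib environment, one can inspect the ELF header of each file. The following is a minimal Python sketch, not part of the RaspberryMatic build: it reads the `EI_CLASS` byte and the `e_machine` field as defined by the ELF format; the function name and the small machine table are chosen for illustration only.

```python
import struct
import sys

# Values defined by the ELF specification.
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}                          # EI_CLASS
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}  # e_machine


def elf_info(header):
    """Return (class, machine) for the first 20 bytes of an ELF file, or None."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # EI_DATA (byte 5): 1 = little endian, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return (ELF_CLASSES.get(header[4], "unknown"),
            ELF_MACHINES.get(machine, "machine %d" % machine))


if __name__ == "__main__":
    # Usage: python elfclass.py /path/to/binary [...]
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, elf_info(fh.read(20)))
```

Run against the binaries of an image, this would report which of them are 32bit and which are 64bit; the paths to pass are up to the user.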

Caveats
Because they use 32bit-only CPUs, the tinkerboard and rpi0 targets won't benefit from this. In addition, we will have to add a new "rpi2" target because the RaspberryPi2 (except for the latest 1.2 revision) also ships with a 32bit-only CPU and thus won't be able to benefit either. Nevertheless, there should not be any problem in still maintaining the rpi0, rpi2 and tinkerboard targets as 32bit-only versions and only shipping the OVA (x86) and rpi3, rpi4 target versions as full-fledged 64bit OS versions.

Additional context
The following is meant as a working list of applications and addons/components which are currently limited to a 32bit OS environment and which might have to be ported to run more smoothly in a 64bit OS, so that point 3 can be fulfilled at some point in the future (checked elements are already distributed as 64bit versions or have been tested for 64bit compatibility):

OCCU:

  • rfd
  • hs485d
  • hs485dLoader
  • multimacd
  • eq3configcmd
  • SetInterfaceClock
  • crypttool
  • libhsscomm.so
  • libeq3config.so
  • libUnifiedLanComm.so
  • libLanDeviceUtils.so
  • tclsh
  • libtcl8.2.so
  • tclrega.so
  • tclrpc.so
  • tclticks.so (obsolete)
  • hss_led
  • eq3configd
  • libelvutils.so
  • ReGaHss
  • libXmlRpc.so
  • libxmlparser.so
  • ssdpd
  • eq3_char_loop kernel driver
  • HMIPServer.jar (only aarch64 libNRJavaSerialv8.so missing, see #903 (comment))
  • hmip-copro-update.jar

Third-Party OS components:

  • generic_raw_uart kernel driver
  • detect_radio_module

CCU-Addons (with 32bit/64bit binary dependencies):

@jens-maus jens-maus added 💡 enhancement-ideas New feature or change request 🙏 help wanted Extra attention is needed 🏷️ ReGaHss This refs the ReGaHss component 🏷️ WebUI This refs the WebUI component 🏷️ HmIPServer This refs the HmIPServer component labels Sep 12, 2020
@jens-maus jens-maus added this to the future release milestone Sep 12, 2020
jens-maus added a commit that referenced this issue Sep 15, 2020
cleaned up all installation routines to only copy relevant, non-obsolete
binaries/libraries and to keep hands off some non-required ones. This
refs #903.
@jens-maus jens-maus pinned this issue Oct 6, 2020
@jens-maus jens-maus changed the title Switch to 64bit kernel+OS environment (x86_64, arm64, etc.) Switch to 64bit kernel+OS environment (x86_64, aarch64) Oct 11, 2020
@jens-maus
Owner Author

jens-maus commented Oct 11, 2020

Here are some updates on the current state of development of the multilib 64bit OS:

  1. The multilib OS environments for the x86_64 targets (ova, intelnuc) seem to work flawlessly now. That means the bootloader and OS are ported to 64bit and the 32bit multilib environment (/lib32 vs /lib64) seems to work, so that 64bit apps and 32bit apps can coexist and run within the 64bit RaspberryMatic. Only tests with an x86_64 version of RedMatic are missing, and then these targets should be ready for a wider test audience.
  2. The first ARM-based aarch64 target (rpi4) is ported to 64bit as well. However, there are currently still some issues that need to be resolved separately:
    1. The HMIPServer.jar Java app only comes with an armhf (32bit) binary of the libNRJavaSerial.so library, which is used to communicate with the rf module. A workaround has been integrated (b1133ad#diff-f63cfb2bb642b7b0ce5840def4fba072), but for a permanent solution eQ3 has to update their libNRJavaSerial dependency to the latest version, which comes with an ARM64 (aarch64) version of this library.
    2. For some unknown reason, some armhf 32bit binaries shipped with OCCU for ARM do not seem to work properly in the new multilib environment: multimacd, rfd and ReGaHss, for example, exit with an 'Illegal instruction' error or do not work properly at all. Interestingly, when using the native armhf binaries from the CCU3 instead of the ones supplied with OCCU, these binaries work and the whole multilib system boots up fine and works. Thus, more investigation is required to identify the root cause and to potentially use a different cross compiler or setting to get these binaries running fine in the multilib environment. Or we directly switch to aarch64 compilation as explained in step 2 above. (fixed by c42bcb2)
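
When 32bit armhf binaries fail with 'Illegal instruction' on an aarch64 kernel, one thing worth checking is the kernel configuration. The sketch below parses a kernel config and reports which of a few options are not enabled. Note the assumptions: the option list (`CONFIG_COMPAT` and the arm64 emulation options for deprecated ARMv7 instructions) is only a guess at plausible candidates; which settings the referenced fix actually changed is not stated in this thread. Reading /proc/config.gz additionally assumes the kernel was built with `CONFIG_IKCONFIG_PROC`.

```python
import gzip
import re

# Assumption: candidate arm64 options relevant to 32bit (armhf) userspace.
# This is NOT taken from the fix referenced above.
CANDIDATE_OPTIONS = [
    "CONFIG_COMPAT",
    "CONFIG_ARMV8_DEPRECATED",
    "CONFIG_SWP_EMULATION",
    "CONFIG_CP15_BARRIER_EMULATION",
    "CONFIG_SETEND_EMULATION",
]


def parse_kernel_config(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if match:
            options[match.group(1)] = "n"
    return options


def missing_options(text, wanted=CANDIDATE_OPTIONS):
    """Return the wanted options that are not built in ('=y') in the given config."""
    options = parse_kernel_config(text)
    return [name for name in wanted if options.get(name) != "y"]


if __name__ == "__main__":
    try:
        with gzip.open("/proc/config.gz", "rt") as fh:
            print("not enabled:", missing_options(fh.read()))
    except OSError as err:
        print("could not read kernel config:", err)
```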

jens-maus added a commit that referenced this issue Oct 11, 2020
kernel_defconfig so that 32bit armhf binaries can be used again.
This refs #903.
@jens-maus
Owner Author

jens-maus commented Oct 30, 2020

Some more status updates on the 64bit versions of RaspberryMatic: We are already in the beta test phase, now that the aarch64 (Pi3+Pi4) as well as the x86_64 (ova+intelnuc) platform support seems to have stabilized somewhat faster than expected. Thanks to the first beta testers we could also fix+verify some issues with CCU addon compatibility.

However, if someone wants to run their own beta tests, feel free to check the following two URLs for discussion and downloads of the 64bit beta versions:

https://homematic-forum.de/forum/viewtopic.php?f=65&t=62074
https://cloud.light-speed.de/s/m72rpTkHxoBwNst

Looking forward to receiving more reports on the usability of the new 64bit versions before I merge them into the master branch for integration into the next RaspberryMatic release.

@jens-maus jens-maus added the 💻 hardware support This issue refs tickets/issue introducing/fixing some hardware support label Oct 30, 2020
Repository owner deleted a comment from GeorgWolter Nov 22, 2020
Repository owner deleted a comment from GeorgWolter Nov 22, 2020
@jens-maus
Owner Author

Here is a further status update on the 64bit kernel+OS support. With the final release of 3.53.34.20201121 the first 64bit OS version has been shipped to end users, and so far only minor reports have been received about certain incompatibilities or non-working CCU Addons (e.g. Mosquitto CCU Addon), for which we can develop easy workarounds. Thus, point (1) of the above list is fulfilled now and we have a working 64bit RaspberryMatic with a 32bit compatibility layer.

Now we should concentrate on getting the basic OCCU/CCU binaries like rfd, multimacd, etc. ported so that they are also shipped as 64bit binaries. In addition, CCU Addon authors who ship their addons with 32bit binaries should be motivated to port their Addons to also ship 64bit binaries, if possible. In the long run this should allow us to have everything completely ported over to 64bit, so that we could then (in some years) remove the 32bit compatibility layer again to get the image size and build times down.

For reference, we will keep updating the above list of still missing / non-64bit binaries and Addons so that we can monitor potential updates here and get things done at some point in the future.

@angelnu
Contributor

angelnu commented Jan 18, 2021

I spent some days debugging a problem between RaspberryMatic and Glusterfs that resulted in corrupted Homematic devices within rega (while HomeMatic IP worked). While this might be a quite specialized problem, it might affect others running on 64-bit systems.

The summary is that the rfd daemon is a 32-bit binary AND it expects 32-bit inodes, while on 64-bit kernels inodes are 64 bits wide. In most cases this goes unnoticed, since inodes on local filesystems tend to use numbers that fit in 32 bits, but there is no guarantee. So we are being lucky most of the time (until you use remote filesystems for high availability...)
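
To check whether a given data directory is affected, one can look for inode numbers that do not fit into 32 bits. The following Python sketch walks a directory tree and lists such entries. It rests on a general assumption rather than on anything confirmed for rfd: a 32-bit program built without large-file support typically gets `EOVERFLOW` from `stat()` when the inode number does not fit its 32-bit field. The function names are illustrative.

```python
import os
import sys

MAX_INO32 = 0xFFFFFFFF  # largest inode number representable in 32 bits


def fits_in_32_bits(inode):
    """True if the inode number can be stored in an unsigned 32-bit field."""
    return 0 <= inode <= MAX_INO32


def find_large_inodes(root):
    """Yield (path, inode) for entries under root whose inode needs more than 32 bits."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                inode = os.lstat(path).st_ino
            except OSError:
                continue  # entry vanished or is not accessible
            if not fits_in_32_bits(inode):
                yield path, inode


if __name__ == "__main__":
    # Usage: python inodecheck.py /path/to/mounted/volume
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, inode in find_large_inodes(root):
        print("%s: inode %d (0x%x)" % (path, inode, inode))
```

On a volume where every inode fits in 32 bits, the script prints nothing.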

More info here: https://joejulian.name/post/broken-32bit-apps-on-glusterfs/ and heketi/heketi#1525

After I applied the above mount option to use 32-bit inodes it worked like a charm, and I was finally able to install RaspberryMatic in Kubernetes in combination with my redundant Glusterfs.

@jens-maus - would you recommend raising an issue with OCCU? According to the docs, even 32-bit kernels introduced changes 10 years ago to also use 64-bit inodes, but the applications still need to be adjusted. They will have to do this anyway if they want to eventually produce 64-bit binaries. The other OCCU software did not show any similar corruption, so it looks like an rfd-only problem.

@jens-maus
Owner Author

After I applied the above mount option to use 32-bit inodes it worked like a charm, and I was finally able to install RaspberryMatic in Kubernetes in combination with my redundant Glusterfs.

Can you please explicitly state here the mount option you are using that solved the issue for you?

@jens-maus - would you recommend raising an issue with OCCU? According to the docs, even 32-bit kernels introduced changes 10 years ago to also use 64-bit inodes, but the applications still need to be adjusted. They will have to do this anyway if they want to eventually produce 64-bit binaries. The other OCCU software did not show any similar corruption, so it looks like an rfd-only problem.

Well, this ticket is the right and currently only place for that type of issue (32bit vs. 64bit). I am currently (together with eQ3) in the process of porting all OCCU applications (rfd, multimacd, ReGaHss, etc.) to 64bit, so this issue should hopefully vanish as soon as all these applications are finally ported. What makes me wonder, however, is that you mention rfd here, since rfd alone should not really perform any filesystem-based operations. The only application that regularly performs disk operations is ReGaHss. So are you sure that rfd is the app that was causing the issues you were seeing? And what about debug information? Did you manage to catch any debug info regarding that issue?

@angelnu
Contributor

angelnu commented Jan 19, 2021

The option is --enable-ino32, which is then passed to the fuse program that the gluster client uses natively:

/usr/sbin/glusterfs --enable-ino32 --log-level=ERROR --log-file=/var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/glusterfs/raspberrymatic-clusterfs/raspberrymatic-0-glusterfs.log --fuse-mountopts=auto_unmount --process-name fuse --volfile-server=192.168.2.86 --volfile-server=192.168.2.84 --volfile-server=192.168.2.86 --volfile-server=192.168.2.87 --volfile-id=ccu --fuse-mountopts=auto_unmount /var/snap/microk8s/common/var/lib/kubelet/pods/fd7f1a57-2be0-46da-bc5e-26d450b3861b/volumes/kubernetes.io~glusterfs/raspberrymatic-clusterfs

Unfortunately the logs did not show anything, but I have to admit I focused on the rfd daemon. The error manifested itself as the Homematic devices disappearing from the UI a few minutes after restoring a backup. After this the HMIP devices still worked. My test might be biased since I have way more Homematic devices than Homematic IP devices.

I will try to increase the log level of Rega, disable my workaround and see if I can get any additional logs showing the error. I could also run with strace and see if I detect any syscall querying the inode. If you have any other suggestion on what debug data to collect, please let me know. It will likely have to wait until I get time over the weekend.
