Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PowerPC {32, 64}-bit Block Trampolines #272

Merged
merged 41 commits into from Feb 12, 2024
Merged

PowerPC {32, 64}-bit Block Trampolines #272

merged 41 commits into from Feb 12, 2024

Conversation

hmelder
Copy link
Collaborator

@hmelder hmelder commented Jan 20, 2024

Status

Besides UnexpectedException failing (just like on AArch64), all unit tests are passing. However, I would like to incorporate a proper fix for the hard-coded page sizes (See #271) as marking the whole page RWX is not really great... (edit: This was due to mprotect failing to set PROT_EXEC on an unaligned page as ppc64el has a page size of 64k on Debian)
This is why this PR is still marked as a draft.

It would also be good to have a CI workflow for PowerPC.

PowerPC 32-bit

System (QEMU):

  • OS: Debian GNU/Linux trixie/sid ppc
  • Host: PowerMac3,1
  • CPU: 7400 (1) @ 900MHz

Clang:

root@debian:~# clang --version
Debian clang version 16.0.6 (19)
Target: powerpc-unknown-linux-gnu

I needed to link objc to libatomic. Ideas on how to check this in CMake are welcome!

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 4e080d8..d2a3d3a 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -245,6 +245,7 @@ if (WIN32 AND NOT MINGW)
 endif()

 target_link_libraries(objc tsl::robin_map)
+target_link_libraries(objc atomic)

 set_target_properties(objc PROPERTIES
        LINKER_LANGUAGE C

Results:

100% tests passed, 0 tests failed out of 190

Total Test time (real) = 213.12 sec

The following tests did not run:
	115 - objc_msgSend (Skipped)
	116 - objc_msgSend_optimised (Skipped)
	117 - objc_msgSend_legacy (Skipped)
	118 - objc_msgSend_legacy_optimised (Skipped)
	151 - UnexpectedException (Skipped)
	152 - UnexpectedException_optimised (Skipped)
	153 - UnexpectedException_legacy (Skipped)
	154 - UnexpectedException_legacy_optimised (Skipped)
	171 - FastPathAlloc (Skipped)
	172 - FastPathAlloc_optimised (Skipped)

POWER9 (PowerPC 64-bit Little-Endian)

System (QEMU):

  • OS: Debian GNU/Linux trixie/sid ppc64le
  • Host: IBM pSeries (emulated by qemu)
  • CPU: POWER9 (architected) (4) @ 1.000GHz

Clang:

root@debian:~/libobjc2# clang --version
Debian clang version 16.0.6 (19)
Target: powerpc64le-unknown-linux-gnu

Results:

100% tests passed, 0 tests failed out of 190

Total Test time (real) = 116.38 sec

The following tests did not run:
	115 - objc_msgSend (Skipped)
	116 - objc_msgSend_optimised (Skipped)
	117 - objc_msgSend_legacy (Skipped)
	118 - objc_msgSend_legacy_optimised (Skipped)
	151 - UnexpectedException (Skipped)
	152 - UnexpectedException_optimised (Skipped)
	153 - UnexpectedException_legacy (Skipped)
	154 - UnexpectedException_legacy_optimised (Skipped)
	171 - FastPathAlloc (Skipped)
	172 - FastPathAlloc_optimised (Skipped)

block_to_imp.c Outdated Show resolved Hide resolved
block_trampolines.S Outdated Show resolved Hide resolved
@hmelder
Copy link
Collaborator Author

hmelder commented Jan 21, 2024

I cannot reproduce the build error from the powerpc-linux-gnu cross builder. It seems like the GitHub Ubuntu image has some additional packages in /usr/lib32 which are incorrectly used instead.

The powerpc64le-linux-gnu cross builder works just fine.

@hmelder hmelder marked this pull request as ready for review January 21, 2024 20:08
@hmelder
Copy link
Collaborator Author

hmelder commented Jan 21, 2024

@rmottola do you want to try the changes out on real hardware, or perhaps have a look at the ASM?

@davidchisnall
Copy link
Member

Take a look at what we do in snmalloc for the hacks required to make PowerPC run correctly. The Debian / Ubuntu binfmt packages set it up with the wrong page size. It's very annoying.

asmconstants.h Outdated Show resolved Hide resolved
@hmelder
Copy link
Collaborator Author

hmelder commented Jan 22, 2024

Take a look at what we do in snmalloc for the hacks required to make PowerPC run correctly.

Alright. Thank you :)

The Debian / Ubuntu binfmt packages set it up with the wrong page size. It's very annoying.

The weird thing is that it works on a fresh 22.04 aarch64 rootfs with powerpc cross.

Co-authored-by: David Chisnall <davidchisnall@users.noreply.github.com>
@hmelder
Copy link
Collaborator Author

hmelder commented Jan 22, 2024

Take a look at what we do in snmalloc for the hacks required to make PowerPC run correctly.

It seems that snalloc only has a powerpc64le configuration in the workflow, but the problem lies with PowerPC (32-bit, Big-Endian). We can skip the integration for now, as it is a legacy architecture.

@rmottola
Copy link
Member

rmottola commented Jan 24, 2024

@rmottola do you want to try the changes out on real hardware, or perhaps have a look at the ASM?

Yes, I can test it on real hardware. Setting up thinks on my iBook G4 as a first test. Will report.

I can use:

Debian clang version 16.0.6 (19)
Target: powerpc-unknown-linux-gnu
Thread model: posix

processor       : 0
cpu             : 7447A, altivec supported

@rmottola
Copy link
Member

Built completes of the branch ppc-block-trampoline.

I get a couple of warnings
/home/multix/code/libobjc2/sarray2.h:55:8: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]

/home/multix/code/libobjc2/objc_msgSend.S:16:2: warning: objc_msgSend() not implemented for your architecture [-W#warnings] #warning objc_msgSend() not implemented for your architecture

@rmottola
Copy link
Member

How should I run tests?

multix@iBookG4-14:~/code/libobjc2/Build$ make test
make: *** No rule to make target 'test'.  Stop

@hmelder
Copy link
Collaborator Author

hmelder commented Jan 28, 2024

objc_method_cache_version is defined as an _Atomic(uint64_t). Can we make this architecture dependent?

@davidchisnall
Copy link
Member

Hmm, that’s needed for the safe caching, but it isn’t useful on platforms where 64-bit atomic reads are expensive. It needs to be 64-bit to avoid overflows. We can probably just not enable it on PowerPC32.

@hmelder
Copy link
Collaborator Author

hmelder commented Jan 28, 2024

Sorry my question was malformed.

It needs to be 64-bit to avoid overflows.

This answers it.

We can probably just not enable it on PowerPC32.

I am not familiar with the performance differences of non-native atomic reads, but is it that big of a deal?
What would you use instead?

@davidchisnall
Copy link
Member

I am not familiar with the performance differences of non-native atomic reads, but is it that big of a deal?

Non-native ones require acquiring a lock and then doing the read. This basically serialises them all and so will offset any potential benefit from caching.

What would you use instead?

Don't do lookup caching on PowerPC32. The compiler doesn't (yet?) do this automatically anyway.

@hmelder
Copy link
Collaborator Author

hmelder commented Feb 4, 2024

Don't do lookup caching on PowerPC32. The compiler doesn't (yet?) do this automatically anyway.

If I understand the lookup caching implementation correctly, the caller can opt in and cache a pointer to the current slot, checking whether the slot was invalidated before using it.

When you call objc_slot_lookup_version, the final
parameter is used to return either the current value of
objc_method_cache_version or 0 if the slot is uncacheable.

What about wrapping the atomic objc_method_cache_version operations in ifdefs (not including them on ppc32), and always write 0 to *version in objc_method_cache_version?

@davidchisnall
Copy link
Member

What about wrapping the atomic objc_method_cache_version operations in ifdefs (not including them on ppc32), and always write 0 to *version in objc_method_cache_version?

Yup, I think the thing to do on PowerPC32 is to always write 0 to *version and to hide the declaration of objc_method_cache_version.

@hmelder
Copy link
Collaborator Author

hmelder commented Feb 4, 2024

And to hide the declaration of objc_method_cache_version.

But does the declaration of an _Atomic type pull in the libatomic library?

@hmelder
Copy link
Collaborator Author

hmelder commented Feb 4, 2024

I do not like conditionally hiding parts of the public API.

@davidchisnall
Copy link
Member

Updates to it will pull in atomic things, we should simply remove it if we’re on a platform without 64-bit atomics.

@hmelder
Copy link
Collaborator Author

hmelder commented Feb 12, 2024

Caching is now disabled on ppc32. All tests are passing.

100% tests passed, 0 tests failed out of 190

Total Test time (real) = 201.64 sec

The following tests did not run:
	115 - objc_msgSend (Skipped)
	116 - objc_msgSend_optimised (Skipped)
	117 - objc_msgSend_legacy (Skipped)
	118 - objc_msgSend_legacy_optimised (Skipped)
	151 - UnexpectedException (Skipped)
	152 - UnexpectedException_optimised (Skipped)
	153 - UnexpectedException_legacy (Skipped)
	154 - UnexpectedException_legacy_optimised (Skipped)
	171 - FastPathAlloc (Skipped)
	172 - FastPathAlloc_optimised (Skipped)
root@debian:~/libobjc2/build-ppc# uname -a
Linux debian 6.6.8-powerpc-smp #1 SMP Debian 6.6.8-1 (2023-12-22) ppc GNU/Linux

@davidchisnall
Copy link
Member

LGTM, please do some squashing before you merge!

@hmelder hmelder merged commit e882423 into master Feb 12, 2024
59 checks passed
@hmelder hmelder deleted the ppc-block-trampoline branch February 12, 2024 11:09
@qmfrederik qmfrederik mentioned this pull request Feb 27, 2024
qmfrederik pushed a commit that referenced this pull request Mar 21, 2024
* Implement PowerPC block trampoline

* Adjust pagesize on ppc64

* Skip UnexpectedException test for PowerPC

* Move PAGE_SIZE to asmconstants.h

* Use PAGE_SIZE and PAGE_SHIFT macros for PowerPC

* Add ppc64el and powerpc qemu-crossbuild targets

* Add NO_SAFE_CACHING definition and guards

* Do not export objc_method_cache_version on ppc32

---------

Co-authored-by: David Chisnall <davidchisnall@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants