Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8331934: [s390x] Add support for primitive array C1 clone intrinsic #19220

Closed
wants to merge 26 commits into from

Conversation

offamitkumar
Copy link
Member

@offamitkumar offamitkumar commented May 13, 2024

Adds JDK-8302850 Port for s390x.

Testing:

make test TEST="hotspot_compiler" JTREG="JAVA_OPTIONS=-XX:TieredStopAtLevel=1"

==============================                                                                                                          
Test summary                                                                                                                            
==============================                                                                                                          
   TEST                                              TOTAL  PASS  FAIL ERROR                                                            
   jtreg:test/hotspot/jtreg:hotspot_compiler          1166  1166     0     0                                                            
==============================                                                                                                          
TEST SUCCESS

* Tier1 Test with Fast debug build. 

BenchMarking:

Without Patch: 

Benchmark                 (size)  Mode  Cnt     Score    Error  Units
ArrayClone.byteArraycopy       0  avgt   15    10.838 ±  0.461  ns/op
ArrayClone.byteArraycopy      10  avgt   15    28.919 ±  1.695  ns/op
ArrayClone.byteArraycopy     100  avgt   15    48.815 ±  0.901  ns/op
ArrayClone.byteArraycopy    1000  avgt   15   256.357 ±  7.901  ns/op
ArrayClone.byteClone           0  avgt   15    90.398 ±  3.119  ns/op
ArrayClone.byteClone          10  avgt   15   103.774 ±  4.468  ns/op
ArrayClone.byteClone         100  avgt   15   126.628 ±  6.952  ns/op
ArrayClone.byteClone        1000  avgt   15   326.409 ± 31.635  ns/op
ArrayClone.intArraycopy        0  avgt   15    10.450 ±  0.509  ns/op
ArrayClone.intArraycopy       10  avgt   15    36.903 ±  0.753  ns/op
ArrayClone.intArraycopy      100  avgt   15    85.964 ±  1.806  ns/op
ArrayClone.intArraycopy     1000  avgt   15   841.512 ± 40.335  ns/op
ArrayClone.intClone            0  avgt   15    89.332 ±  3.695  ns/op
ArrayClone.intClone           10  avgt   15   110.639 ±  2.476  ns/op
ArrayClone.intClone          100  avgt   15   195.781 ±  8.622  ns/op
ArrayClone.intClone         1000  avgt   15  1058.479 ± 92.468  ns/op
Finished running test 'micro:java.lang.ArrayClone'


with patch:

Benchmark                 (size)  Mode  Cnt    Score    Error  Units
ArrayClone.byteArraycopy       0  avgt   15   10.526 ±  0.289  ns/op
ArrayClone.byteArraycopy      10  avgt   15   27.110 ±  0.656  ns/op
ArrayClone.byteArraycopy     100  avgt   15   49.872 ±  1.562  ns/op
ArrayClone.byteArraycopy    1000  avgt   15  269.518 ±  4.567  ns/op
ArrayClone.byteClone           0  avgt   15   10.766 ±  0.899  ns/op
ArrayClone.byteClone          10  avgt   15   18.341 ±  0.394  ns/op
ArrayClone.byteClone         100  avgt   15   40.986 ±  0.674  ns/op
ArrayClone.byteClone        1000  avgt   15  227.512 ±  7.643  ns/op
ArrayClone.intArraycopy        0  avgt   15   10.320 ±  0.294  ns/op
ArrayClone.intArraycopy       10  avgt   15   36.557 ±  0.860  ns/op
ArrayClone.intArraycopy      100  avgt   15   89.837 ±  2.364  ns/op
ArrayClone.intArraycopy     1000  avgt   15  836.678 ± 27.920  ns/op
ArrayClone.intClone            0  avgt   15   10.043 ±  0.216  ns/op
ArrayClone.intClone           10  avgt   15   29.149 ±  0.723  ns/op
ArrayClone.intClone          100  avgt   15   88.046 ±  2.211  ns/op
ArrayClone.intClone         1000  avgt   15  840.163 ± 58.748  ns/op
Finished running test 'micro:java.lang.ArrayClone'

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8331934: [s390x] Add support for primitive array C1 clone intrinsic (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19220/head:pull/19220
$ git checkout pull/19220

Update a local copy of the PR:
$ git checkout pull/19220
$ git pull https://git.openjdk.org/jdk.git pull/19220/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19220

View PR using the GUI difftool:
$ git pr show -t 19220

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19220.diff

Webrev

Link to Webrev Comment

galderz and others added 25 commits February 1, 2024 06:24
* Clone calls that involve Phi nodes are not supported.
* Add unimplemented stubs for other platforms.
* Copy state including locals for clone
so that reexecution works as expected.
* Added a jtreg test to verify the null check works.
Without the fix this test fails with a SEGV crash.
* Combine array length, new type array and arraycopy for clone in c1 graph.
* Add OmitCheckFlags to skip arraycopy checks.
* Instantiate ArrayCopyStub only if necessary.
* Avoid zeroing newly created arrays for clone.
* Add array null after c1 clone compilation test.
* Pass force reexecute to intrinsic via value stack.
This is needed to be able to deoptimize correctly this intrinsic.
* When new type array or array copy are used for the clone intrinsic,
their state needs to be based on the state before for deoptimization
to work as expected.
Co-authored-by: Boris <42576543+bulasevich@users.noreply.github.com>
* Added byte[] and long[] tests.
* Verified that the cloned array has the same contents.
* Increase number of iterations reach tier 3 threshold.
Co-authored-by: Dean Long <17332032+dean-long@users.noreply.github.com>
@bridgekeeper
Copy link

bridgekeeper bot commented May 13, 2024

👋 Welcome back amitkumar! A progress list of the required criteria for merging this PR into pr/17667 will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 13, 2024

@offamitkumar This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8331934: [s390x] Add support for primitive array C1 clone intrinsic

Reviewed-by: mdoerr, sjayagond

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 67 new commits pushed to the master branch:

  • 9bfae88: 8332297: annotation processor that generates records sometimes fails due to NPE in javac
  • 4e169d1: 8332401: G1: TestFromCardCacheIndex.java with -XX:GCCardSizeInBytes=128 triggers underflow assertion
  • 7ffc999: 8332498: [aarch64, x86] improving OpToAssembly output for partialSubtypeCheckConstSuper Instruct
  • e529101: 8332473: ubsan: growableArray.hpp:290:10: runtime error: null pointer passed as argument 1, which is declared to never be null
  • 414a7fd: 8311175: Move BufWriter::asByteBuffer to BufWriterImpl
  • 451cc23: 8332486: ClassFile API ArrayIndexOutOfBoundsException with label metadata
  • 5f2b8d0: 8332448: Make SpaceMangler inherit AllStatic
  • 8a49d47: 8332462: ubsan: c1_ValueStack.hpp:229:49: runtime error: load of value 171, which is not a valid value for type 'bool'
  • ce99198: 8332181: Deprecate for removal the MulticastSocket.send(DatagramPacket, byte) and setTTL/getTTL methods on DatagramSocketImpl and MulticastSocket
  • f5ab7df: 8332494: java/util/zip/EntryCount64k.java failing with java.lang.RuntimeException: '\A\Z' missing from stderr
  • ... and 57 more: https://git.openjdk.org/jdk/compare/2f10a316ff0c5a4c124b94f6fabb38fb119d2c82...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title JDK-8331934 8331934: [s390x] Add support for primitive array C1 clone intrinsic May 13, 2024
@openjdk
Copy link

openjdk bot commented May 13, 2024

@offamitkumar The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label May 13, 2024
@offamitkumar offamitkumar marked this pull request as ready for review May 14, 2024 08:27
@openjdk openjdk bot added the rfr Pull request is ready for review label May 14, 2024
@mlbridge
Copy link

mlbridge bot commented May 14, 2024

Webrevs

@offamitkumar
Copy link
Member Author

@RealLucy @TheRealMDoerr Would you please review this one. :-)

Testing seems clear on s390x. I have posted Benchmark result as well.

Please let me know if any further testing is required.

@openjdk-notifier openjdk-notifier bot changed the base branch from pr/17667 to master May 15, 2024 07:51
@openjdk-notifier
Copy link

The parent pull request that this pull request depends on has now been integrated and the target branch of this pull request has been updated. This means that changes from the dependent pull request can start to show up as belonging to this pull request, which may be confusing for reviewers. To remedy this situation, simply merge the latest changes from the new target branch into this pull request by running commands similar to these in the local repository for your personal fork:

git checkout c1_clone_intr
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# if there are conflicts, follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk
Copy link

openjdk bot commented May 15, 2024

@offamitkumar this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout c1_clone_intr
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label May 15, 2024
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label May 15, 2024
@offamitkumar
Copy link
Member Author

Result from the LPAR:

without patch:
Benchmark                 (size)  Mode  Cnt    Score   Error  Units
ArrayClone.byteArraycopy       0  avgt   15    4.021 ± 0.032  ns/op
ArrayClone.byteArraycopy      10  avgt   15   15.931 ± 0.171  ns/op
ArrayClone.byteArraycopy     100  avgt   15   16.201 ± 0.076  ns/op
ArrayClone.byteArraycopy    1000  avgt   15   45.268 ± 0.213  ns/op
ArrayClone.byteClone           0  avgt   15   70.300 ± 0.244  ns/op
ArrayClone.byteClone          10  avgt   15   77.112 ± 0.558  ns/op
ArrayClone.byteClone         100  avgt   15   79.860 ± 0.606  ns/op
ArrayClone.byteClone        1000  avgt   15  112.834 ± 0.526  ns/op
ArrayClone.intArraycopy        0  avgt   15    4.007 ± 0.012  ns/op
ArrayClone.intArraycopy       10  avgt   15   15.378 ± 0.055  ns/op
ArrayClone.intArraycopy      100  avgt   15   25.387 ± 0.102  ns/op
ArrayClone.intArraycopy     1000  avgt   15  161.278 ± 0.719  ns/op
ArrayClone.intClone            0  avgt   15   70.341 ± 0.265  ns/op
ArrayClone.intClone           10  avgt   15   78.209 ± 0.514  ns/op
ArrayClone.intClone          100  avgt   15   89.845 ± 0.571  ns/op
ArrayClone.intClone         1000  avgt   15  257.037 ± 2.809  ns/op
Finished running test 'micro:java.lang.ArrayClone'
with patch: 
Benchmark                 (size)  Mode  Cnt    Score   Error  Units
ArrayClone.byteArraycopy       0  avgt   15    4.021 ± 0.027  ns/op
ArrayClone.byteArraycopy      10  avgt   15   16.106 ± 0.859  ns/op
ArrayClone.byteArraycopy     100  avgt   15   16.212 ± 0.045  ns/op
ArrayClone.byteArraycopy    1000  avgt   15   45.147 ± 0.137  ns/op
ArrayClone.byteClone           0  avgt   15    3.570 ± 0.010  ns/op
ArrayClone.byteClone          10  avgt   15    6.033 ± 0.018  ns/op
ArrayClone.byteClone         100  avgt   15    6.868 ± 0.020  ns/op
ArrayClone.byteClone        1000  avgt   15   33.437 ± 0.114  ns/op
ArrayClone.intArraycopy        0  avgt   15    4.008 ± 0.010  ns/op
ArrayClone.intArraycopy       10  avgt   15   15.373 ± 0.044  ns/op
ArrayClone.intArraycopy      100  avgt   15   29.543 ± 3.687  ns/op
ArrayClone.intArraycopy     1000  avgt   15  161.554 ± 0.414  ns/op
ArrayClone.intClone            0  avgt   15    3.571 ± 0.010  ns/op
ArrayClone.intClone           10  avgt   15    6.184 ± 0.016  ns/op
ArrayClone.intClone          100  avgt   15   13.304 ± 0.043  ns/op
ArrayClone.intClone         1000  avgt   15  133.755 ± 0.362  ns/op
Finished running test 'micro:java.lang.ArrayClone'

@offamitkumar
Copy link
Member Author

@RealLucy @TheRealMDoerr Would you please review this one. :-)

pinging you again, if you got bandwidth then please review it.

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 17, 2024
@offamitkumar
Copy link
Member Author

@galderz if possible can you review this ?

Maybe this could ease you a bit while review: Testing for {tier1} X {fastdebug, slowdebug, release} and {tier1 -XX:TieredStopAtLevel=1} X {fastdebug, slowdebug, release} was clean. Benchmarking also shows that we are good on this. You can check the result here :-)

@galderz
Copy link
Contributor

galderz commented May 20, 2024

@offamitkumar I'm no s390 expert, so I can only do a light review on the code. The changes look good to me and the benchmark results show improvements. One thing I would suggest is maybe expanding the testing a bit, e.g. hotspot_compiler, hotspot_gc, hotspot_serviceability, hotspot_runtime, and tier1-3 see #17667 (comment) for further details.

Copy link
Member

@sid8606 sid8606 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@offamitkumar
Copy link
Member Author

Ran this command: make test TEST="hotspot_compiler hotspot_gc hotspot_serviceability hotspot_runtime tier1 tier2 tier3" JTREG="JAVA_OPTIONS=-XX:TieredStopAtLevel=1 -Xcomp" and didn't see any new failure appearing due to my changes.

@offamitkumar
Copy link
Member Author

Thanks Martin & Sid for approval ;-)

/integrate

@openjdk
Copy link

openjdk bot commented May 21, 2024

Going to push as commit ae9ad86.
Since your change was applied there have been 68 commits pushed to the master branch:

  • 3479b46: 8332595: Serial: Remove unused TenuredGeneration::should_collect
  • 9bfae88: 8332297: annotation processor that generates records sometimes fails due to NPE in javac
  • 4e169d1: 8332401: G1: TestFromCardCacheIndex.java with -XX:GCCardSizeInBytes=128 triggers underflow assertion
  • 7ffc999: 8332498: [aarch64, x86] improving OpToAssembly output for partialSubtypeCheckConstSuper Instruct
  • e529101: 8332473: ubsan: growableArray.hpp:290:10: runtime error: null pointer passed as argument 1, which is declared to never be null
  • 414a7fd: 8311175: Move BufWriter::asByteBuffer to BufWriterImpl
  • 451cc23: 8332486: ClassFile API ArrayIndexOutOfBoundsException with label metadata
  • 5f2b8d0: 8332448: Make SpaceMangler inherit AllStatic
  • 8a49d47: 8332462: ubsan: c1_ValueStack.hpp:229:49: runtime error: load of value 171, which is not a valid value for type 'bool'
  • ce99198: 8332181: Deprecate for removal the MulticastSocket.send(DatagramPacket, byte) and setTTL/getTTL methods on DatagramSocketImpl and MulticastSocket
  • ... and 58 more: https://git.openjdk.org/jdk/compare/2f10a316ff0c5a4c124b94f6fabb38fb119d2c82...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 21, 2024
@openjdk openjdk bot closed this May 21, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 21, 2024
@openjdk
Copy link

openjdk bot commented May 21, 2024

@offamitkumar Pushed as commit ae9ad86.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@offamitkumar offamitkumar deleted the c1_clone_intr branch May 25, 2024 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
5 participants