Problem machines for release #2662

Open
Haroon-Khel opened this issue Jul 7, 2022 · 32 comments

Comments

@Haroon-Khel
Contributor

Haroon-Khel commented Jul 7, 2022

test-docker-fedora34-x64-1 and (newly created) test-docker-fedora34-x64-2 ref #2631 JDK8

The following tests fail on both -1 and -2; the links are for -2:
java/nio/file/Files/probeContentType/Basic.java
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5133/console
java/net/Inet6Address/B6206527.java.B6206527
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5135/console
java/net/ipv6tests/B6521014.java
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5136/console

test-osuosl-centos74-ppc64le-1/ and test-osuosl-centos74-ppc64le-2/ ref #2625 JDK8

On test-osuosl-centos74-ppc64le-1

sun/security/pkcs11/fips/TestTLS12.java
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5092/console

sun/tools/jinfo/Basic.sh
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5094/console

On test-osuosl-centos74-ppc64le-2

sun/security/pkcs11/fips/TestTLS12.java
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5095/console

sun/tools/jinfo/Basic.sh resolved
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5103/console

test-azure-win2012r2-x64-3 and test-azure-win2019-x64-1 ref #2645 JDK11

@sophia-guo

ERROR: Cannot delete workspace :Malformed input or input contains unmappable characters #2630

@sophia-guo

test-azure-win2012r2-x64-1

ERROR: Cannot delete workspace :Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.

Two most recent runs:
https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/647/console
https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/647/console

@Haroon-Khel
Contributor Author

Haroon-Khel commented Jul 14, 2022

That directory is being used by leftover jcmd.exe processes
image

#2635 is related. It is surprising to find that this is occurring on a different machine this time

@Haroon-Khel
Contributor Author

Haroon-Khel commented Jul 14, 2022

sun/tools/jinfo/Basic.sh on the 2 linux ppc64le machines has been resolved, #2625 (comment)

@Haroon-Khel
Contributor Author

Stewart has added jcmd to the list of processes to kill in https://ci.adoptopenjdk.net/view/Tooling/job/SXA-processCheck/, see #2635 (comment)
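
For illustration, here is a minimal sketch of how leftover processes could be picked out of a process listing by name. It is not the contents of the SXA-processCheck job, which are not shown in this thread. The function name, the default name list, and the "PID COMMAND" input format (as printed by `ps -eo pid,comm` on Linux) are all assumptions.

```python
def find_leftover_pids(ps_output, names=("jcmd", "jcmd.exe")):
    """Return PIDs whose command name matches one of `names`.

    `ps_output` is assumed to be text in "PID COMMAND" form.
    """
    wanted = {name.lower() for name in names}
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header and malformed lines
        pid, command = parts
        if command.strip().lower() in wanted:
            pids.append(int(pid))
    return pids


# Made-up listing for illustration only.
sample = """  PID COMMAND
 1234 java
 2345 jcmd
 3456 jcmd.exe
"""
print(find_leftover_pids(sample))  # prints [2345, 3456]
```

The function only reports PIDs; whether and how to kill them is left to the caller, so a dry run is always possible.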

@sophia-guo

sophia-guo commented Jul 14, 2022

test-azure-win2012r2-x64-1

https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/648/console
https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/647/console
https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/646/console

[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Deferred wipeout is disabled by the job configuration...
ERROR: Cannot delete workspace :Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
[Pipeline] }
[Pipeline] // timeout
[Pipeline] echo
Exception: hudson.AbortException: Cannot delete workspace: Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.

All three recent jobs were assigned to this machine and all three failed; the failure only occurs when running on this specific machine.
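
The log above shows three delete attempts spaced 0.1 sec apart. As a rough sketch of that retry pattern (not the Jenkins workspace cleanup implementation, and with a hypothetical function name), the following shows why short retries presumably cannot help while another process still holds the file open: every attempt fails the same way until the holder exits.

```python
import os
import tempfile
import time


def delete_with_retries(path, attempts=3, delay=0.1):
    """Try to delete `path`; return True on success, False if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            return True  # already gone, nothing to do
        except OSError:  # includes PermissionError for a file that is in use
            if attempt < attempts:
                time.sleep(delay)
    return False


# Demonstrate on a throwaway temporary file.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
print(delete_with_retries(tmp_path))  # prints True
```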

@sxa
Member

sxa commented Jul 15, 2022

test-azure-win2012r2-x64-1

https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/648/console https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/647/console https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-64_windows/646/console

[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Deferred wipeout is disabled by the job configuration...
ERROR: Cannot delete workspace :Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
[Pipeline] }
[Pipeline] // timeout
[Pipeline] echo
Exception: hudson.AbortException: Cannot delete workspace: Unable to delete 'D:\jenkins\workspace\Test_openjdk11_hs_sanity.openjdk_x86-64_windows\openjdkbinary\j2sdk-image\lib\modules'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.

All three recent jobs were assigned to this machine and all three failed; the failure only occurs when running on this specific machine.

Fixed as per #2209 (comment)

@Haroon-Khel
Contributor Author

Haroon-Khel commented Jul 15, 2022

Machines that are still problematic:

Any fedora dockerstatic container, ref #2631. Fedora containers on https://ci.adoptopenjdk.net/computer/docker-packet-ubuntu2004-intel-1/ pass the ipv6 tests, while those on https://ci.adoptopenjdk.net/computer/docker-packet-ubuntu2004-amd-1/ fail them. The difference needs to be investigated.
I can't get java/nio/file/Files/probeContentType/Basic.java to pass on any Fedora container, see #2631 (comment)

test-osuosl-centos74-ppc64le-1 and -2
sun/tools/jinfo/Basic.sh now passes, but sun/security/pkcs11/fips/TestTLS12.java still fails. See #2625 (comment)

test-azure-win2012r2-x64-3 and test-azure-win2019-x64-1
see #2645 (comment)
Failures are intermittent, but more failures than passes.

If these issues are not resolved by Monday, I'll take the Jenkins nodes offline for the release.
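
One quick thing that could be compared between containers on the intel and amd hosts is whether IPv6 loopback is usable at all. This is a minimal sketch with a hypothetical function name. It only checks that a socket can be bound to `::1`; the failing tests may well depend on something else, so a passing check does not rule out an IPv6 configuration problem.

```python
import socket


def ipv6_loopback_usable():
    """Return True if a TCP socket can be bound to the IPv6 loopback address."""
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.bind(("::1", 0))  # port 0 lets the OS pick a free port
        return True
    except OSError:
        return False


print("IPv6 loopback usable:", ipv6_loopback_usable())
```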

@Haroon-Khel
Contributor Author

I was able to get java/nio/file/Files/probeContentType/Basic.java to pass on our fedora boxes, see #2631 (comment), however I have not solved the failing ipv6 tests on fedora containers hosted on https://ci.adoptopenjdk.net/computer/docker-packet-ubuntu2004-amd-1/.

And sun/security/pkcs11/fips/TestTLS12.java continues to fail on test-osuosl-centos74-ppc64le-1 and -2, see #2625

I have temporarily taken the following nodes offline for this release:

https://ci.adoptopenjdk.net/computer/test-docker-fedora34-x64-1/
https://ci.adoptopenjdk.net/computer/test-docker-fedora34-x64-2/
https://ci.adoptopenjdk.net/computer/test-docker-fedora36-x64-1/
https://ci.adoptopenjdk.net/computer/test-osuosl-centos74-ppc64le-1/
https://ci.adoptopenjdk.net/computer/test-osuosl-centos74-ppc64le-2/

@sxa
Member

sxa commented Oct 3, 2022

@Haroon-Khel Can you give a status update on the systems that were problematic? Have they all now been resolved, or is there still work to do here? I need to know whether this can be closed or whether it needs to move to October.

@Haroon-Khel
Contributor Author

Since sun/security/pkcs11/fips/TestTLS12.java continues to fail on test-osuosl-centos74-ppc64le-1 and -2, this issue should be kept open.

@sxa
Member

sxa commented Nov 23, 2022

Related: #2815

@sxa sxa added this to the 2022-12 (December) milestone Nov 23, 2022
@Haroon-Khel Haroon-Khel changed the title from "Problem machines for the upcoming July release" to "Problem machines for release" Jan 17, 2023
@Haroon-Khel
Contributor Author

Haroon-Khel commented Jan 17, 2023

Ipv6 failures on new ppc64le machine #2883
test-docker-ubuntu2204-ppc64le-1
test-docker-debian11-ppc64le-1
Could also affect:
test-docker-ubuntu2204-ppc64le-2
test-docker-debian11-ppc64le-2
test-docker-debian11-ppc64le-3

has run on the machines during setup, annoyingly it isn't fixing the problem

#2884 affects the same machines

@Haroon-Khel
Contributor Author

Haroon-Khel commented Jan 17, 2023

ref #2886

Taking test-docker-centos8-x64-2

@Haroon-Khel
Contributor Author

Taking offline the following machines due to #2884

test-docker-ubuntu2204-ppc64le-1
test-docker-debian11-ppc64le-1
test-docker-ubuntu2204-ppc64le-2
test-docker-debian11-ppc64le-2

@Haroon-Khel
Contributor Author

Haroon-Khel commented Jan 17, 2023

test-docker-ubi8-x64-2 and test-docker-fedora35-x64-1 both offline ref #2882

@Haroon-Khel
Contributor Author

ref #2885 test-ibmcloud-win2012r2-x64-1 offline

@Haroon-Khel Haroon-Khel removed this from the 2023-01 (January) milestone Jan 31, 2023
@sxa sxa modified the milestones: 2023-06 (June), 2023-07 (July) Jul 7, 2023
@sxa
Member

sxa commented Aug 30, 2023

Also we've been having some inconsistencies on test issues in #2536 across different mac machines.

@sxa
Member

sxa commented Nov 2, 2023

extended.perf dacapo-xalan-0 success varies depending on machine: adoptium/aqa-tests#3122 (comment)

@Haroon-Khel
Contributor Author

Haroon-Khel commented Nov 20, 2023

Summary of AQA triage on s390x jdk-21.0.1+12.1 https://github.com/temurin-compliance/temurin-compliance/issues/431#issuecomment-1810092968 (ongoing)

MiniMix_aot_5m_0, DBBLoadTest_5m_0, DBBLoadTest_5m_1 intermittently pass on all machines, but fail consistently on test-marist-sles12-s390x-2 and test-marist-sles15-s390x-2

java/foreign/TestLargeSegmentCopy.java from jdk_foreign fails on test-marist-rhel8-s390x-2, test-marist-rhel7-s390x-2 and test-marist-sles15-s390x-2

The following sanity system tests fail intermittently on all machines, but seem to fail consistently on test-marist-sles15-s390x-2

TestJlmRemoteClassAuth_1
TestJlmRemoteClassAuth_0
TestJlmRemoteClassNoAuth_0
TestJlmRemoteClassNoAuth_1
TestJlmRemoteMemoryAuth_0
TestJlmRemoteMemoryAuth_1
TestJlmRemoteMemoryNoAuth_0
TestJlmRemoteMemoryNoAuth_1
TestJlmRemoteNotifierProxyAuth_0
TestJlmRemoteNotifierProxyAuth_1
TestJlmRemoteThreadAuth_0
TestJlmRemoteThreadAuth_1
TestJlmRemoteThreadNoAuth_0
TestJlmRemoteThreadNoAuth_1
NioLoadTest_5m_0 
NioLoadTest_5m_1
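
To separate "intermittent everywhere" from "consistently failing on one machine", per-machine outcomes can be grouped and labelled. This is a minimal sketch with hypothetical names; the run data is made up for illustration and is not taken from the Grinder jobs.

```python
from collections import defaultdict


def classify_machines(results):
    """Group (machine, passed) records and label each machine's behaviour."""
    by_machine = defaultdict(list)
    for machine, passed in results:
        by_machine[machine].append(passed)

    labels = {}
    for machine, outcomes in by_machine.items():
        if all(outcomes):
            labels[machine] = "passing"
        elif not any(outcomes):
            labels[machine] = "consistently failing"
        else:
            labels[machine] = "intermittent"
    return labels


# Made-up outcomes for illustration only; not real Grinder data.
runs = [
    ("machine-a", True), ("machine-a", False), ("machine-a", True),
    ("machine-b", False), ("machine-b", False), ("machine-b", False),
]
print(classify_machines(runs))
```

With only a handful of runs per machine the "consistently failing" label is weak evidence, so it is best treated as a hint about where to rerun first.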

The remaining failures below, from extended openjdk, are being run on all machines (grinders 8060 to 8067)

jdk_other_0
jdk_net_0
jdk_net_1
jdk_nio_0
jdk_nio_1
jdk_security3_0
jdk_security3_1
jdk_management_0
jdk_jmx_1
jdk_tools_0
jdk_tools_1
jdk_jfr_0
jdk_rmi_0
jdk_jdi_0

| Grinder | Machine | Time | Status |
|---|---|---|---|
| 8060 | test-ubuntu2004-1 | 18h32 | |
| 8061 | test-sles15-2 | ABORTED after 40h (jdk_security_x = 7h each) | Rerun 8077 (Next line!) |
| 8077 | test-sles15-2 | | No jdk_security_x, 345 failed [*] |
| 8062 | test-rhel7-2 | ABORTED after 40h (jdk_security_x = 7h each) | Rerun 8078 (Next line!) |
| 8078 | test-rhel7-2 | | No jdk_security_x, 340 failures [*] |
| 8063 | test-ubuntu2204-1 | 28 hours | 14 failures (mostly timeouts). Re-run of failed targets: 13 failures inc. multicast |
| 8064 | docker-sles12-1 | 17h11 | 1 fail: com.sun.jdi.FinalizerTest (re-run jdk_jdi_0 - same) |
| 8065 | test-rhel8-2 | 15h25 | 2 failures, both in java.net.HttpClient (re-run jdk_net_0/1 - 1 fail UdpSocket) |
| 8066 | test-sles12-2 | ABORTED after 40h (jdk_security_x = 7h each) | Rerun 8079 (Next line!) |
| 8079 | test-sles12-2 | | No jdk_security_x, 345 failures [*] |
| 8067 | docker-sles15-1 | 17h09 | 1 fail: sun.security.ssl.SSLSocketImpl (re-run jdk_security3_0: PASS) |

[*] The 340/345 failing tests include many that fail with something similar to this: Exception creating connection to: 148.100.74.92; nested exception is: java.net.NoRouteToHostException: No route to host
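
Since many of these failures share the same NoRouteToHostException message, it may help to split them out before looking at the rest. A minimal sketch, assuming the failure messages have already been collected as one string per failure (the function name and the second sample line are hypothetical):

```python
def split_network_failures(lines, marker="java.net.NoRouteToHostException"):
    """Split failure messages into (network-related, everything else)."""
    network = [line for line in lines if marker in line]
    other = [line for line in lines if marker not in line]
    return network, other


failures = [
    # Message quoted from the note above.
    "Exception creating connection to: 148.100.74.92; nested exception is: "
    "java.net.NoRouteToHostException: No route to host",
    # Hypothetical non-network failure, for contrast.
    "java.lang.AssertionError: hypothetical non-network failure",
]
network, other = split_network_failures(failures)
print(len(network), "network-related,", len(other), "other")  # prints 1 network-related, 1 other
```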

@jiekang

jiekang commented Dec 1, 2023

Data from the October CPU AQA triage can be found here:
https://docs.google.com/spreadsheets/d/16vAQvYzL_-azDoD5OhQ6lObD3-suJwqKfjtABuWoIkc/edit#gid=1601438678

This has a summary sheet, and a sheet for each JDK version with a list of: suite failures, action taken, and, if applicable, the problematic machine and failure type.

This list should be used to help drive individual actions to improve test infrastructure and reduce the number of re-runs due to machine configuration related issues. The rows that have a 'Bad Machine' and 'Failure Type' listed should be investigated first.
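
As a rough sketch of that prioritisation, assuming a sheet has been exported to CSV with 'Bad Machine' and 'Failure Type' columns as described above (the 'Suite' column name, the function name and the sample rows are hypothetical):

```python
import csv
import io


def rows_to_investigate_first(csv_text):
    """Return rows where both 'Bad Machine' and 'Failure Type' are filled in."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if (row.get("Bad Machine") or "").strip()
        and (row.get("Failure Type") or "").strip()
    ]


# Made-up rows for illustration only; not taken from the spreadsheet.
sample = """Suite,Bad Machine,Failure Type
suite-a,machine-1,timeout
suite-b,,
suite-c,machine-2,
"""
for row in rows_to_investigate_first(sample):
    print(row["Suite"], row["Bad Machine"], row["Failure Type"])  # prints suite-a machine-1 timeout
```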

There is also a list of 'To Investigate' topics in each JDK Version sheet that may not necessarily be machine configuration issues, but look promising to me to understand and resolve. When I get more cycles, I intend to open separate, individual issues for these in the appropriate repos.

@Haroon-Khel
Contributor Author

Haroon-Khel commented Dec 20, 2023

JDK17

test-docker-ubuntu2004-armv8l-3

TEST: java/beans/PropertyChangeSupport/Test4682386.java
TEST: java/beans/PropertyEditor/TestFontClassJava.java
TEST: java/beans/PropertyEditor/TestFontClassValue.java
TEST: java/beans/XMLEncoder/javax_swing_DefaultCellEditor.java
TEST: java/beans/XMLEncoder/javax_swing_JTree.java
TEST: java/beans/XMLEncoder/Test4631471.java
TEST: java/beans/XMLEncoder/Test4903007.java
  • jdk_imageio
TEST: javax/imageio/plugins/shared/ImageWriterCompressionTest.java

Installed fontconfig, rerunning https://ci.adoptium.net/view/Test_grinder/job/Grinder/8281/console. Passes ✅
Need to install fontconfig everywhere
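
A quick way to spot machines that still lack fontconfig could be to look for one of its command-line tools. This is a minimal sketch that assumes `fc-list` being on PATH is a reasonable proxy for the package being installed; the function name is hypothetical.

```python
import shutil


def missing_tools(tools=("fc-list",)):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print("Missing:", missing_tools())
```

An empty list means every listed tool was found; it does not prove the fonts the tests need are installed.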

test-docker-ubuntu2010-armv8l-2

TEST: javax/imageio/plugins/shared/ImageWriterCompressionTest.java

Unable to install fontconfig on Ubuntu 20.10:

Err:1 http://ports.ubuntu.com/ubuntu-ports groovy/main arm64 fonts-dejavu-core all 2.37-2
  404  Not Found [IP: 185.125.190.39 80]
E: Failed to fetch http://ports.ubuntu.com/ubuntu-ports/pool/main/f/fonts-dejavu/fonts-dejavu-core_2.37-2_all.deb  404  Not Found [IP: 185.125.190.39 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
...
root@93d2b4e13a22:~# apt-get update
Ign:1 http://ports.ubuntu.com/ubuntu-ports groovy InRelease
Ign:2 http://ports.ubuntu.com/ubuntu-ports groovy-updates InRelease
Ign:3 http://ports.ubuntu.com/ubuntu-ports groovy-backports InRelease
Ign:4 http://ports.ubuntu.com/ubuntu-ports groovy-security InRelease
Err:5 http://ports.ubuntu.com/ubuntu-ports groovy Release
  404  Not Found [IP: 185.125.190.39 80]
Err:6 http://ports.ubuntu.com/ubuntu-ports groovy-updates Release
  404  Not Found [IP: 185.125.190.39 80]
Err:7 http://ports.ubuntu.com/ubuntu-ports groovy-backports Release
  404  Not Found [IP: 185.125.190.39 80]
Err:8 http://ports.ubuntu.com/ubuntu-ports groovy-security Release
  404  Not Found [IP: 185.125.190.39 80]
Reading package lists... Done
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy-updates Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy-backports Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy-security Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

It looks like the repository is no longer there, likely because Ubuntu 20.10 is EOL.
Update: This machine has been replaced with https://ci.adoptium.net/computer/test-docker-ubuntu2310-armv8l-1/
AQA test pipeline running on this machine https://ci.adoptium.net/job/AQA_Test_Pipeline/202/console
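
The same symptom could be detected automatically on other containers by scanning `apt-get update` output for the "no longer has a Release file" error seen above. A minimal sketch with a hypothetical function name; the sample lines are copied from the log in this comment.

```python
import re

# Matches the apt error line and captures the repository description.
NO_RELEASE_RE = re.compile(
    r"^E: The repository '(.+?)' no longer has a Release file\.",
    re.MULTILINE,
)


def repos_without_release(apt_output):
    """Return the repositories that apt reports as lacking a Release file."""
    return NO_RELEASE_RE.findall(apt_output)


sample = """\
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports groovy-updates Release' no longer has a Release file.
"""
print(repos_without_release(sample))
```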

test-docker-sles12-s390x-1

TEST: java/beans/PropertyChangeSupport/Test4682386.java
TEST: java/beans/PropertyEditor/TestFontClassJava.java
TEST: java/beans/PropertyEditor/TestFontClassValue.java
TEST: java/beans/XMLEncoder/javax_swing_DefaultCellEditor.java
TEST: java/beans/XMLEncoder/javax_swing_JTree.java
TEST: java/beans/XMLEncoder/Test4631471.java
TEST: java/beans/XMLEncoder/Test4903007.java

Installed fontconfig-devel, rerunning https://ci.adoptium.net/view/Test_grinder/job/Grinder/8293/

test-marist-ubuntu2204-s390x-1

TEST: sun/management/jdp/JdpDefaultsTest.java
TEST: sun/management/jdp/JdpJmxRemoteDynamicPortTest.java
TEST: sun/management/jdp/JdpSpecificAddressTest.java

test-docker-fedora33-ppc64le-1

test-skytap-ubuntu2004-ppc64le-1

TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Actual abort ratio (1002) should lower or equal to specified (0).: expected that 1002 <= 0

Passed 2 out of 5 times.

@Haroon-Khel
Contributor Author

Haroon-Khel commented Dec 21, 2023

JDK21

test-docker-centos8-x64-1

TEST: java/lang/ProcessHandle/InfoTest.java
TEST: java/lang/reflect/Proxy/ClassRestrictions.java
TEST: java/lang/runtime/SwitchBootstrapsTest.java
TEST: java/lang/ScopedValue/UnboundValueAfterOOME.java
TEST: java/lang/String/RegionMatches.java
TEST: java/lang/System/LoggerFinder/RecursiveLoading/PlatformRecursiveLoadingTest.java
TEST: java/lang/System/LoggerFinder/RecursiveLoading/RecursiveLoadingTest.java
TEST: java/lang/System/LoggerFinder/SignedLoggerFinderTest/SignedLoggerFinderTest.java

test-docker-debian11-ppc64le-2

TEST: sun/management/jmxremote/bootstrap/CustomLauncherTest.java
TEST: sun/management/jmxremote/bootstrap/LocalManagementTest.java
  • jdk_tools_1
TEST: com/sun/tools/attach/BasicTests.java
TEST: com/sun/tools/attach/TempDirTest.java
TEST: sun/jvmstat/monitor/MonitoredVm/TestPollingInterval.java
TEST: sun/tools/jcmd/TestJcmdDefaults.java
TEST: sun/tools/jcmd/TestJcmdSanity.java
TEST: sun/tools/jinfo/JInfoTest.java
TEST: sun/tools/jps/TestJps.java
TEST: sun/tools/jps/TestJpsSanity.java
TEST: sun/tools/jstat/JStatInterval.java
TEST: tools/jlink/JLinkDedupTestBatchSizeOne.java
TEST: sun/jvmstat/monitor/MonitoredVm/MonitorVmStartTerminate.java
  • jdk_jfr_0
TEST: jdk/jfr/api/consumer/streaming/TestBaseRepositoryAfterStart.java
TEST: jdk/jfr/api/consumer/streaming/TestBaseRepositoryLastModified.java

test-docker-debian11-ppc64le-1

TEST: java/util/concurrent/LinkedTransferQueue/WhiteBox.java
TEST: jdk/internal/util/ArchTest.java

@sxa
Member

sxa commented Jan 24, 2024

List of test failures on JDK8/arm32 (at a minimum) including the perf test suites which are failing in the containerised environments on the arm64 hosts, but are ok on the two physical ODROID machines:
#3043

Projects
Status: In Progress

4 participants