System unavailable: Jenkins failing to initiate new jobs correctly. #3552

Closed
sxa opened this issue May 5, 2024 · 3 comments

Comments

@sxa
Member

sxa commented May 5, 2024

  • Link to any log file showing the problem: Any recent job e.g.

  • Please describe the issue: Pipeline jobs, seemingly including all build and test jobs, are failing to start properly. Git operations in particular fail in this situation, and it is unclear why. The same commands that Jenkins is trying to execute run OK when run manually on the machine as the jenkins user (even in the same directory). For example:
    Caused by: java.io.IOException: Cannot run program "git"

Here is an example from an attempt to run a Grinder job (#9858, although the specific job does not seem to matter):

  • 2024-05-04 13:34:05.581+0000 [id=4373048] WARNING o.j.p.w.flow.FlowExecutionList#unregister: Owner[Grinder/9858:Grinder #9858] was not in the list to begin with: [Owner[build-scripts/utils/betaTrigger_21ea/49:build-scripts/utils/betaTrigger_21ea #49], Owner[build-scripts/openjdk21-pipeline/272:build-scripts/openjdk21-pipeline #272], Owner (truncated: there is a large list of entries in that array).
    Something similar happens on build pipelines, which leads me to believe it's a general Jenkins issue rather than something specific to a particular pipeline, e.g.:
  • 2024-05-04 13:36:05.422+0000 [id=4373094] WARNING o.j.p.w.flow.FlowExecutionList#unregister: Owner[build-scripts/jobs/jdk17u/jdk17u-linux-aarch64-temurin/502:build-scripts/jobs/jdk17u/jdk17u-linux-aarch64-temurin #502] was not in the list to begin with: [Owner[build-scripts/utils/betaTrigger_21ea/49:build-scripts/utils/betaTrigger_21ea #49], Owner[build-scripts/openjdk21-pipeline/272:build-scripts/openjdk21-pipeline #272], Owner

There have been no recent plugin updates.

Full log details are in the next two collapsed sections for reference:

Here is the full log from that Grinder job showing the full exception trace:
Started by user [Stewart X Addison](https://ci.adoptium.net/user/sxa)
Checking out git ${ADOPTOPENJDK_REPO} into /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79 to read openjdk-tests/buildenv/jenkins/openjdk_tests
The recommended git tool is: git
No credentials specified
Cloning the remote Git repository
Using shallow clone with depth 1
Cloning repository https://github.com/adoptium/aqa-tests.git
 > git init /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests # timeout=10
ERROR: Error cloning remote repo 'origin'
hudson.plugins.git.GitException: Could not init /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$5.execute(CliGitAPIImpl.java:1073)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:819)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1222)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1305)
	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:129)
	at org.jenkinsci.plugins.workflow.cps.CpsScmFlowDefinition.create(CpsScmFlowDefinition.java:165)
	at org.jenkinsci.plugins.workflow.cps.CpsScmFlowDefinition.create(CpsScmFlowDefinition.java:71)
	at org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:311)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Error performing git command: git init /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2858)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2762)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2757)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:2051)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$5.execute(CliGitAPIImpl.java:1071)
	... 9 more
Caused by: java.io.IOException: Cannot run program "git" (in directory "/home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests"): error=0, Failed to exec spawn helper: pid: 2568427, exit value: 1
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1143)
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1073)
	at hudson.Proc$LocalProc.<init>(Proc.java:252)
	at hudson.Proc$LocalProc.<init>(Proc.java:221)
	at hudson.Launcher$LocalLauncher.launch(Launcher.java:994)
	at hudson.Launcher$ProcStarter.start(Launcher.java:506)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2835)
	... 13 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 2568427, exit value: 1
	at java.base/java.lang.ProcessImpl.forkAndExec(Native Method)
	at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:314)
	at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:244)
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1110)
	... 19 more
ERROR: Error cloning remote repo 'origin'
ERROR: Maximum checkout retry attempts reached, aborting
Finished: FAILURE
Entry in jenkins server log from the above job with the full `WARNING` line
  • 2024-05-04 13:34:05.581+0000 [id=4373048] WARNING o.j.p.w.flow.FlowExecutionList#unregister: Owner[Grinder/9858:Grinder #9858] was not in the list to begin with: [Owner[build-scripts/utils/betaTrigger_21ea/49:build-scripts/utils/betaTrigger_21ea #49], Owner[build-scripts/openjdk21-pipeline/272:build-scripts/openjdk21-pipeline #272], Owner[build-scripts/jobs/jdk21u/jdk21u-linux-riscv64-temurin/33:build-scripts/jobs/jdk21u/jdk21u-linux-riscv64-temurin #33], Owner[build-scripts/utils/pipeline_jobs_generator_jdk21u/190:build-scripts/utils/pipeline_jobs_generator_jdk21u #190], Owner[AQA_Test_Pipeline/243:AQA_Test_Pipeline #243], Owner[Test_openjdk21_hs_extended.perf_x86-64_linux/51:Test_openjdk21_hs_extended.perf_x86-64_linux #51], Owner[Test_openjdk22_hs_extended.system_x86-64_linux/46:Test_openjdk22_hs_extended.system_x86-64_linux #46], Owner[Test_openjdk21_hs_extended.openjdk_x86-64_linux/52:Test_openjdk21_hs_extended.openjdk_x86-64_linux #52], Owner[Test_openjdk21_hs_extended.system_x86-64_linux/171:Test_openjdk21_hs_extended.system_x86-64_linux #171], Owner[Test_openjdk21_hs_sanity.system_x86-64_linux/173:Test_openjdk21_hs_sanity.system_x86-64_linux #173], Owner[Test_openjdk21_hs_sanity.perf_x86-64_linux/169:Test_openjdk21_hs_sanity.perf_x86-64_linux #169], Owner[Test_openjdk11_hs_sanity.functional_x86-64_linux/530:Test_openjdk11_hs_sanity.functional_x86-64_linux #530], Owner[Test_openjdk11_hs_special.functional_x86-64_linux/191:Test_openjdk11_hs_special.functional_x86-64_linux #191], Owner[Test_openjdk8_hs_extended.system_x86-64_linux/1176:Test_openjdk8_hs_extended.system_x86-64_linux #1176], Owner[Test_openjdk8_hs_sanity.system_x86-64_linux/1179:Test_openjdk8_hs_sanity.system_x86-64_linux #1179], Owner[Test_openjdk21_hs_extended.functional_x86-64_linux/161:Test_openjdk21_hs_extended.functional_x86-64_linux #161], Owner[Test_openjdk8_hs_extended.functional_x86-64_linux/613:Test_openjdk8_hs_extended.functional_x86-64_linux #613], 
Owner[Test_openjdk21_hs_special.functional_x86-64_linux/49:Test_openjdk21_hs_special.functional_x86-64_linux #49], Owner[Test_openjdk21_hs_sanity.functional_x86-64_linux/163:Test_openjdk21_hs_sanity.functional_x86-64_linux #163], Owner[Test_openjdk21_hs_sanity.openjdk_x86-64_linux/189:Test_openjdk21_hs_sanity.openjdk_x86-64_linux #189], Owner[Test_openjdk8_hs_sanity.openjdk_x86-64_linux/1199:Test_openjdk8_hs_sanity.openjdk_x86-64_linux #1199], Owner[Test_openjdk11_hs_sanity.system_x86-64_linux/917:Test_openjdk11_hs_sanity.system_x86-64_linux #917], Owner[Test_openjdk8_hs_extended.perf_x86-64_linux/177:Test_openjdk8_hs_extended.perf_x86-64_linux #177], Owner[Test_openjdk11_hs_extended.system_x86-64_linux/897:Test_openjdk11_hs_extended.system_x86-64_linux #897], Owner[Test_openjdk8_hs_sanity.functional_x86-64_linux/614:Test_openjdk8_hs_sanity.functional_x86-64_linux #614], Owner[Test_openjdk8_hs_sanity.perf_x86-64_linux/1179:Test_openjdk8_hs_sanity.perf_x86-64_linux #1179], Owner[Test_openjdk11_hs_extended.openjdk_x86-64_linux/186:Test_openjdk11_hs_extended.openjdk_x86-64_linux #186], Owner[Test_openjdk11_hs_extended.functional_x86-64_linux/493:Test_openjdk11_hs_extended.functional_x86-64_linux #493], Owner[Test_openjdk11_hs_sanity.openjdk_x86-64_linux/967:Test_openjdk11_hs_sanity.openjdk_x86-64_linux #967], Owner[Test_openjdk11_hs_sanity.perf_x86-64_linux/916:Test_openjdk11_hs_sanity.perf_x86-64_linux #916], Owner[Test_openjdk8_hs_special.functional_x86-64_linux/709:Test_openjdk8_hs_special.functional_x86-64_linux #709], Owner[Test_openjdk8_hs_extended.openjdk_x86-64_linux/182:Test_openjdk8_hs_extended.openjdk_x86-64_linux #182], Owner[Test_openjdk11_hs_extended.perf_x86-64_linux/184:Test_openjdk11_hs_extended.perf_x86-64_linux #184], Owner[build-scripts/release-openjdk17-pipeline/65:build-scripts/release-openjdk17-pipeline #65], 
Owner[build-scripts/jobs/release/jobs/jdk17u/jdk17u-release-linux-riscv64-temurin/1:build-scripts/jobs/release/jobs/jdk17u/jdk17u-release-linux-riscv64-temurin #1], Owner[Test_openjdk17_hs_extended.openjdk_riscv64_linux/17:Test_openjdk17_hs_extended.openjdk_riscv64_linux #17], Owner[Test_openjdk17_hs_extended.openjdk_riscv64_linux_testList_2/4:Test_openjdk17_hs_extended.openjdk_riscv64_linux_testList_2 #4]]
Running manually gives no problems:
```
$ id
uid=1000(jenkins) gid=1000(jenkins) groups=1000(jenkins)
$ ls -ld /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests
drwxr-xr-x 2 jenkins jenkins 4096 May 4 15:23 /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests
$ ls -al /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests
total 8
drwxr-xr-x 2 jenkins jenkins 4096 May 5 11:57 .
drwxr-xr-x 3 jenkins jenkins 4096 May 5 11:57 ..
$
```
Running `git init /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests` does not show a problem.
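The manual checks above run from a shell, whereas the failing call in the stack trace is `ProcessBuilder.start` inside the Jenkins JVM. As a rough sketch of the same check done from Java rather than from the shell, something like the following could be used. The class name `SpawnCheck` and the method `spawnError` are made up for illustration and are not part of Jenkins or the git plugin. Note the caveat: run standalone, this starts a fresh JVM, so it would only be a meaningful reproduction if equivalent code were executed inside the already-running Jenkins process.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper (not part of Jenkins): checks whether this JVM can
// spawn a child process at all, which is the step that failed in the trace.
public final class SpawnCheck {

    /**
     * Tries to start the command from inside this JVM.
     * Returns null if the process was started, otherwise the IOException message
     * (for example: Cannot run program "git" ...).
     */
    public static String spawnError(List<String> command) {
        try {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD) // output is not needed here
                    .start();
            p.waitFor(); // exit code is ignored; only the spawn itself matters
            return null;
        } catch (IOException e) {
            return e.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null; // the process did start; only the wait was interrupted
        }
    }

    public static void main(String[] args) {
        // Defaults to the command from the failing log if no arguments are given.
        List<String> cmd = args.length > 0 ? List.of(args) : List.of("git", "--version");
        String err = spawnError(cmd);
        System.out.println(err == null ? "spawn OK: " + cmd : "spawn FAILED: " + err);
    }
}
```

If the spawn fails for every command and not just `git`, that points at the JVM's process-launching machinery rather than at git, the workspace directory, or file permissions.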

There is no obvious performance problem based on the logs from the last week:
[Image: performance logs from the last week]

@sxa
Member Author

sxa commented May 5, 2024

The system is relatively idle other than jobs to support #3501 (comment) and one Playbook check job, which was run in part to verify whether non-pipeline jobs were affected (they are not). I'm therefore going to trigger a Jenkins restart (time: 1022 UTC).

@sxa
Member Author

sxa commented May 5, 2024

Looks to be happier after the restart: Grinder 9861 kicked off without issues.
Started by user [Stewart X Addison](https://ci.adoptium.net/user/sxa)
Checking out git ${ADOPTOPENJDK_REPO} into /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79 to read openjdk-tests/buildenv/jenkins/openjdk_tests
The recommended git tool is: git
No credentials specified
 > git rev-parse --resolve-git-dir /home/jenkins/.jenkins/workspace/Grinder@script/7d272c0688f17ab4e5b2f6ce77a7dc9cf4df33ff05c3a95eddd38682ef795b79/openjdk-tests/.git # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/adoptium/aqa-tests.git # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
No valid HEAD. Skipping the resetting
 > git clean -fdx # timeout=10
Pruning obsolete local branches
Using shallow fetch with depth 1
Fetching upstream changes from https://github.com/adoptium/aqa-tests.git
 > git --version # timeout=10
 > git --version # 'git version 2.35.1'
 > git fetch --tags --force --progress --prune --depth=1 -- https://github.com/adoptium/aqa-tests.git +refs/heads/*:refs/remotes/origin/* # timeout=60
 > git rev-parse origin/master^{commit} # timeout=10
JENKINS-19022: warning: possible memory leak due to Git plugin usage; see: https://plugins.jenkins.io/git/#remove-git-plugin-buildsbybranch-builddata-script
Checking out Revision f0319c150c6ec8d6b92659321370dc7f0ccb762f (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f f0319c150c6ec8d6b92659321370dc7f0ccb762f # timeout=10
Commit message: "Exclude TestHandshake in JDK17 and JDK21 (#5279)"
 > git rev-list --no-walk f0319c150c6ec8d6b92659321370dc7f0ccb762f # timeout=10
[Pipeline] Start of Pipeline
[Pipeline] timestamps
[Pipeline] {
[Pipeline] echo
 SPEC: linux_x86-64
[Pipeline] echo
 LABEL: ci.role.test&&hw.arch.x86&&sw.os.linux
[Pipeline] stage
[Pipeline] { (Queue)
[Pipeline] nodesByLabel
 Found a total of 12 nodes with the 'ci.role.test&&hw.arch.x86&&sw.os.linux' label
[Pipeline] echo
 dynamicAgents: [azure, fyre]
[Pipeline] node
 Running on [test-docker-debian12-x64-3](https://ci.adoptium.net/computer/test%2Ddocker%2Ddebian12%2Dx64%2D3/) in /home/jenkins/workspace/Grinder
[...]

On the basis of this I'm going to close this issue. Note that we have an update cycle planned for this Thursday, so hopefully it will behave until then.

@sxa
Member Author

sxa commented May 7, 2024

Note: this may have been due to an update to the Temurin JDK that happened a few days ago (May 4th at 0501). See https://issues.jenkins.io/browse/JENKINS-72665?focusedId=445724&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-445724
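Assuming the failure mode is the on-disk JDK changing underneath the already-running Jenkins JVM (an assumption here, not something confirmed in this issue; the linked ticket has the details), one way to spot that condition is to compare the version the running JVM reports with the `JAVA_VERSION` entry in the `release` file under `java.home`. The sketch below does that. The class `JdkDriftCheck` and its methods are invented for illustration. It will not detect a rebuild that keeps the same version string, and pre-release builds may report a suffix in `java.version` that the `release` file lacks.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper (not part of Jenkins or the JDK): detects whether the
// JDK installed on disk differs from the one this JVM was started from.
public final class JdkDriftCheck {

    /** Removes one pair of surrounding double quotes, if present. */
    public static String stripQuotes(String s) {
        String t = s.trim();
        if (t.length() >= 2 && t.startsWith("\"") && t.endsWith("\"")) {
            return t.substring(1, t.length() - 1);
        }
        return t;
    }

    /** Reads JAVA_VERSION from the "release" file under javaHome, or returns null if unavailable. */
    public static String onDiskVersion(Path javaHome) throws IOException {
        Path release = javaHome.resolve("release");
        if (!Files.isRegularFile(release)) {
            return null;
        }
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(release)) {
            props.load(reader); // lines look like JAVA_VERSION="21.0.3"
        }
        String version = props.getProperty("JAVA_VERSION");
        return version == null ? null : stripQuotes(version);
    }

    /** True when an on-disk version is known and differs from the running one. */
    public static boolean hasDrifted(String runningVersion, String onDiskVersion) {
        return onDiskVersion != null && !onDiskVersion.equals(runningVersion);
    }

    public static void main(String[] args) throws IOException {
        String running = System.getProperty("java.version");
        String onDisk = onDiskVersion(Paths.get(System.getProperty("java.home")));
        System.out.println("running JVM: " + running + ", on disk: " + onDisk);
        if (hasDrifted(running, onDisk)) {
            System.out.println("JDK on disk differs from the running JVM; a restart may be needed");
        }
    }
}
```

As with any such check, it is only useful when evaluated inside the long-running process itself, since a freshly started JVM will always match the files it was launched from.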
