Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate / clear up exceptions in jenkins system log #3514

Open
Tracked by #3380
sxa opened this issue Apr 10, 2024 · 4 comments
Open
Tracked by #3380

Investigate / clear up exceptions in jenkins system log #3514

sxa opened this issue Apr 10, 2024 · 4 comments

Comments

@sxa
Copy link
Member

sxa commented Apr 10, 2024

A quick search on the recent ones (today and yesterday) can be done with grep -h '^[^\ ]*Exception' /var/log/jenkins/jenkins.log /var/log/jenkins/jenkins.log.1 | sort | uniq -c

The two largest culprits at the moment are

  • 31 java.io.EOFException: unexpected stream termination Looks to be consistently following SEVERE c.m.a.v.r.AzureVMAgentSSHLauncher#launch: Got exception on agent azurebuildagentxxxxxx
  • 62 java.io.IOException: Agent failed to connect, even though the launcher didn't report it. See the log output for details. Looks consistently following WARNING hudson.slaves.NodeProvisioner#update: Unexpected exception encountered while provisioning agent azurebuildagent so both are with the deprecated Azure plugin
  • 657 java.lang.IllegalStateException: completed or broken execution These seem to be exclusively on a particular job from Feb 20 which we can probably delete WARNING o.j.p.w.cps.CpsStepContext$2#onFailure: Failed to proceed after CpsStepContext[287:build]:Owner[build-scripts/jobs/jdk8u/jdk8u-windows-x64-temurin/493:build-scripts/jobs/jdk8u/jdk8u-windows-x64-temurin #493]
  • 1329 java.lang.reflect.InaccessibleObjectException: Unable to make field private static final long java.nio.channels.ClosedChannelException.serialVersionUID accessible: module java.base does not "opens java.nio.channels" to unnamed module @6be968ce Seems Orka related as it's consistently preceded by INFO o.j.p.cloudstats.CloudStatistics#getIdFor: No support for cloud-stats-plugin by class io.jenkins.plugins.orka.OrkaProvisionedAgent
  • 63 java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms No constant cause on first look but many are No node in READY state is available to deploy to with a mention of orka. The others generally have WARNING h.i.i.InstallUncaughtExceptionHandler#handleException as the preceding line

Based on this clearing up things from the Azure and Orka provisioning would help with bullets 1, 2, 4 with 3 being one specific job that I can delete and the final one is partially with orka (Noting that as of 1142 today 100% of those were orka)

@sxa sxa mentioned this issue Apr 10, 2024
18 tasks
@sxa
Copy link
Member Author

sxa commented Apr 12, 2024

I've deleted job 493 from bullet 3 - if anyone wants to look into it, here is a tarball with the job directory from it, plus the two at either side:
windows-build-493.tar.gz

@sxa
Copy link
Member Author

sxa commented Apr 25, 2024

Azure cloud definition has been removed

@sxa
Copy link
Member Author

sxa commented Apr 30, 2024

Adding these which are also occurring:

Caused: java.io.IOException
java.lang.reflect.InaccessibleObjectException: Unable to make field private static final long java.nio.channels.ClosedChannelException.serialVersionUID accessible: module java.base does not "opens java.nio.channels" to unnamed module @6be968ce
Caused: java.lang.RuntimeException: Failed to serialize hudson.slaves.OfflineCause$ChannelTermination#cause for class hudson.slaves.OfflineCause$ChannelTermination
Caused: java.lang.RuntimeException: Failed to serialize hudson.model.Node#temporaryOfflineCause for class hudson.slaves.DumbSlave
javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher
Caused: java.io.IOException
Caused: java.lang.reflect.InvocationTargetException

(Final three are fairly common but potentially related to the new Debian container that was done for #3043 which seems to have been producing some exceptions - the others less common)
Input length one had 1354 instances yesterday,

@sxa
Copy link
Member Author

sxa commented May 3, 2024

Running this again today, we have 18 occurences of lines like this, all on different build jobs:
java.io.IOException: Cannot save actions for StepAtomNode[id=622, exec=CpsFlowExecution[Owner[build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin/517:build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin #517]]] for completed execution CpsFlowExecution[Owner[build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin/517:build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin #517]]: [org.jenkinsci.plugins.workflow.cps.actions.ArgumentsActionImpl@2e4d12f0, org.jenkinsci.plugins.workflow.actions.TimingAction@4a163c4a, org.jenkinsci.plugins.workflow.support.actions.LogStorajava.io.IOException: Cannot save actions for StepAtomNode[id=622, exec=CpsFlowExecution[Owner[build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin/517:build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin #517]]] for completed execution CpsFlowExecution[Owner[build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin/517:build-scripts/jobs/jdk17u/jdk17u-mac-x64-temurin #517]]: [org.jenkinsci.plugins.workflow.cps.actions.ArgumentsActionImpl@2e4d12f0, org.jenkinsci.plugins.workflow.actions.TimingAction@4a163c4a, org.jenkinsci.plugins.workflow.support.actions.LogStorageAction@c0a7d6a, org.jenkinsci.plugins.workflow.actions.LabelAction@1640e7d3, io.jenkins.blueocean.listeners.NodeDownstreamBuildAction@7d535a15] geAction@c0a7d6a, org.jenkinsci.plugins.workflow.actions.LabelAction@1640e7d3, io.jenkins.blueocean.listeners.NodeDownstreamBuildAction@7d535a15]

Other than that we have the following that are in double figures from yesterday's log:

  • 21 java.lang.IllegalStateException: completed or broken execution All seem to be following these warnings (Differing jobs): 2024-05-02 19:54:46.843+0000 [id=64] WARNING o.j.p.w.cps.CpsStepContext$2#onFailure: Failed to proceed after CpsStepContext[69:build]:Owner[build-scripts/evaluation-openjdk22-pipeline/26:build-scripts/evaluation-openjdk22-pipeline #26]
  • 26 java.lang.NullPointerException No consistent cause immediately obvious
  • 1803 java.lang.reflect.InaccessibleObjectException: Unable to make field private static final long java.nio.channels.ClosedChannelException.serialVersionUID accessible: module java.base does not "opens java.nio.channels" to unnamed module @6be968ce Likely orka related - covered in Look at moving macstadium machines to orka #2536 (comment)
  • 38 java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms 24 of these are from orka
  • 17 org.kohsuke.stapler.framework.adjunct.NoSuchAdjunctException: Neither lib.form.hetero-list.hetero-list.css, .js, .html, nor .jelly were found Possibly access/crumb issue

Significantly fewer than before, and most of the remaining ones seem to be relating to the orka plugin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant