
Performance: Cross-test caching of loaded scripts #606

Open · reitzig opened this issue Jul 11, 2023 · 3 comments

Comments

reitzig commented Jul 11, 2023

What feature do you want to see added?

Currently, PipelineTestHelper#loadScript (or, more specifically, GroovyScriptEngine#loadScriptByName) accounts for roughly 95% of the running time of our test suite (>200 tests).
[Profiler screenshot: loadScriptByName dominating the test suite's running time]

Now, I'm sure our setup works against anything that even remotely resembles best practice, but still.

In a @BeforeEach method (see the sketch after this list), we

  • run DeclarativePipelineTestHelper#setUp,
  • set up some mocks,
  • register a shared library, and
  • load the pipeline script under test.
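
Concretely, that setup looks roughly like the following minimal sketch (the step, library, script, and path names are invented for illustration):

import com.lesfurets.jenkins.unit.declarative.DeclarativePipelineTest
import org.junit.jupiter.api.BeforeEach

import static com.lesfurets.jenkins.unit.global.lib.LibraryConfiguration.library
import static com.lesfurets.jenkins.unit.global.lib.LocalSource.localSource

class MyPipelineTest extends DeclarativePipelineTest {
    Script pipeline

    @BeforeEach
    void setUp() {
        super.setUp()
        // Mock a step that the pipeline calls (example name).
        helper.registerAllowedMethod('httpRequest', [Map]) { [status: 200] }
        // Register the shared library from a local checkout (hypothetical path).
        helper.registerSharedLibrary(library().name('acme-lib')
                .defaultVersion('master')
                .allowOverride(true)
                .implicit(false)
                .targetPath('/var/tmp/libs')
                .retriever(localSource('/var/tmp/libs'))
                .build())
        // The expensive part: re-loads and re-compiles the script for every test.
        pipeline = loadScript('pipelines/build.groovy')
    }
}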

In the test class at hand, we have 27 tests, each of which performs a run of the pipeline (with different parameters, alas).

I see that GroovyScriptEngine has a cache, but that one is reset for every test.
I tried adding a cache for the class objects myself:

import org.codehaus.groovy.runtime.InvokerHelper

import com.lesfurets.jenkins.unit.InterceptingGCL
import com.lesfurets.jenkins.unit.PipelineTestHelper

class CachingPipelineTestHelper extends PipelineTestHelper {
    // Cache compiled script classes across tests, so each pipeline
    // script is only compiled once per JVM.
    private static final Map<String, Class> scriptClassCache = new HashMap<>()

    @Override
    Script loadScript(String scriptName, Binding binding) {
        Objects.requireNonNull(binding, 'Binding cannot be null.')

        Class scriptClass = scriptClassCache[scriptName]
        if (scriptClass == null) {
            Objects.requireNonNull(gse,
                    'GroovyScriptEngine is not initialized: Initialize the helper by calling init().')
            scriptClass = gse.loadScriptByName(scriptName)
            scriptClassCache[scriptName] = scriptClass
        }

        setGlobalVars(binding)
        Script script = InvokerHelper.createScript(scriptClass, binding)
        InterceptingGCL.interceptClassMethods(script.getMetaClass(), this, binding)
        return script
    }
}

When I use that, I get java.lang.IllegalStateException: No agent description found in all but the first test; basically, they fail on the first expression inside pipeline { ... }.

I don't really understand what's going on, so I'll go ahead and phrase it as a feature request: Please provide facilities that allow us to load pipeline scripts only once per test run (or at least test class).

Upstream changes

No response

nre-ableton (Contributor) commented

(>200 tests)

Just out of curiosity, how long (in absolute time) does your test suite take to run? 200 tests isn't very much. We have a library with 218 tests and it takes 16 seconds to run.

reitzig (Author) commented Jul 13, 2023

Huh. 😲 🤔

On my laptop, the whole suite runs for 3-4min; on the build server, 6-7min.

Most of that time (2-3min locally) is taken up by that one test (class) I profiled above with 20-ish tests, each of which ends up re-loading a ~500 line pipeline script (which liberally uses classes from the library, in case that matters).

I don't think I can share any code, unfortunately. 😕 Anything I should be on the lookout for?

nre-ableton (Contributor) commented Jul 13, 2023

Yeah, your analysis is on point. The difference in performance is definitely due to loading the pipeline script for every test. And although I wouldn't consider a ~500-line pipeline script to be large, it's not small either. I'm not sure how complex the pipeline code is, but that might play a role here as well.

Regardless, the reason our tests pass so quickly is because of the way that we test. We don't test any of our pipelines, but instead we build libraries and move any significant logic from the pipelines to them (see https://github.com/jenkinsci/JenkinsPipelineUnit/#writing-testable-libraries). This means that for our test cases we are literally loading an empty library and just testing class functionality, which explains the performance difference. Unfortunately, that also means we are comparing apples to oranges here...
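
For illustration, this is roughly what such a library-focused test looks like (a minimal sketch; the class and method names are made up):

import org.junit.jupiter.api.Test

// Hypothetical library class; in a shared library it would live under
// src/, e.g. src/com/example/VersionUtil.groovy, and be tested straight
// off the classpath.
class VersionUtil {
    static String bumpPatch(String version) {
        def parts = version.tokenize('.')
        parts[-1] = (parts[-1].toInteger() + 1).toString()
        return parts.join('.')
    }
}

class VersionUtilTest {
    @Test
    void bumpsThePatchComponent() {
        // No pipeline script and no GroovyScriptEngine involved, which is
        // why suites structured this way run fast.
        assert VersionUtil.bumpPatch('1.2.3') == '1.2.4'
    }
}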

I'm open to the idea of caching scripts, but this would need to be done very carefully and should probably be disabled by default. The obvious concern is that pipeline scripts could contain static state; in the worst case, a test suite would then only run properly when the tests are executed in a certain order, for example. Generally speaking, the desired behavior for any unit testing framework is to start fresh with each test and not retain any state. If you can get your above code working and make a PR, I'd be happy to review it.
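
For the record, a sketch of what an opt-in variant of the cache from above could look like; the cacheScriptClasses flag and the clearScriptClassCache() hook are invented for illustration, not existing API:

import org.codehaus.groovy.runtime.InvokerHelper

import com.lesfurets.jenkins.unit.InterceptingGCL
import com.lesfurets.jenkins.unit.PipelineTestHelper

class OptInCachingPipelineTestHelper extends PipelineTestHelper {
    // Caching is off by default; a suite has to opt in explicitly.
    boolean cacheScriptClasses = false

    private static final Map<String, Class> scriptClassCache = [:]

    // Lets a suite reset the cache (e.g. from an @AfterAll method) so that
    // stale classes cannot leak into other test classes.
    static void clearScriptClassCache() {
        scriptClassCache.clear()
    }

    @Override
    Script loadScript(String scriptName, Binding binding) {
        if (!cacheScriptClasses) {
            // Fall back to the stock per-test behavior.
            return super.loadScript(scriptName, binding)
        }
        Objects.requireNonNull(binding, 'Binding cannot be null.')
        Class scriptClass = scriptClassCache.computeIfAbsent(scriptName) { name ->
            Objects.requireNonNull(gse,
                    'GroovyScriptEngine is not initialized: Initialize the helper by calling init().')
            gse.loadScriptByName(name)
        }
        setGlobalVars(binding)
        Script script = InvokerHelper.createScript(scriptClass, binding)
        InterceptingGCL.interceptClassMethods(script.getMetaClass(), this, binding)
        return script
    }
}

Note that this sketch would still hit the static-state problem described above: the IllegalStateException from the original report suggests that some per-run state lives on the cached class, and a real implementation would have to reset that state between tests.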

However, I think that you might have better luck refactoring the pipeline (i.e., breaking some of the logic up into individually tested libraries or into shell/Python scripts, etc.). I realize that this may be a case of "easier said than done" given the various constraints of your codebase.
