TeamCity server cpu and memory spike after enabling tcwebhooks #212

Open
dom747 opened this issue Mar 22, 2023 · 8 comments

Comments

@dom747

dom747 commented Mar 22, 2023

Expected Behavior

We have had the tcWebHooks plugin enabled for a long time, with 79 webhooks configured, and it was working.

Current Behavior

Last week, after changing our agent AMI and applying the cloud profiles, the agents were not able to upgrade because their plugins were out of date. The CPU went to 100% usage and stayed there, and agents were not able to start builds. We opened a support ticket with JetBrains, and their response was that the tcWebHooks plugin was causing the issue. We disabled it and the server returned to normal.
Then I upgraded the plugin to the latest version (1.2.1), enabled it again, and restarted the server. Again the memory and CPU spiked way up. I've had to disable it again now.

This was the response from JetBrains:

The main cause of the issue, it seems, is related to the https://plugins.jetbrains.com/plugin/8948-web-hooks-tcwebhooks- plugin - the executors it spawns seem to be CPU-intensive and are running for prolonged periods of time:

28m:07s Task: 'webhook.teamcity.executor.BuildEventWebHookRunner@78777d59'
at webhook.teamcity.payload.content.ExtraParameters.getActual(ExtraParameters.java:130)
at webhook.teamcity.payload.content.ExtraParameters.put(ExtraParameters.java:79)
at webhook.teamcity.payload.content.ExtraParameters.putAll(ExtraParameters.java:98)
at webhook.teamcity.WebHookContentBuilder.mergeParameters(WebHookContentBuilder.java:431)
at webhook.teamcity.WebHookContentBuilder.buildWebHookContent(WebHookContentBuilder.java:185)
at webhook.teamcity.executor.BuildEventWebHookRunner.getWebHookContent(BuildEventWebHookRunner.java:53)
at webhook.teamcity.executor.AbstractWebHookExecutor.run(AbstractWebHookExecutor.java:65)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Can you please check the version of plugin in use and try to upgrade it if a newer release is available? If that does not help, can you please try to disable the plugin and let me know if it helps with the server performance?

The support ticket with JetBrains: Request #4905074
The logs are attached to that ticket.
I can provide the logs here if there is a way to do so.

Your Environment

  • tcWebHooks Version: 1.2.1
  • TeamCity Version: [TeamCity Enterprise] 2022.04.4 (build 108763)
  • TeamCity server Operating System: CentOS 7
  • Are you using a WebHook Template?: Yes

Example Configuration (xml)

Can you let me know where the XML file is? Unfortunately, I did not configure any of these webhooks and I don't have experience with the plugin. I might be able to get more info on it.

@arthursmel

Since the plugin was disabled, we can no longer access the webhook configurations for each project.
We were using the following template:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<webhook-templates>
    <webhook-template id="tc-pr-verify-webhook" enabled="true" rank="100" format="jsonTemplate">
        <template-description>PR Verify Webhook</template-description>
        <template-tool-tip></template-tool-tip>
        <preferred-date-format></preferred-date-format>
        <templates max-id="0">
            <template id="0">
                <template-text use-for-branch-template="true">{
    'event_key': '${notifyType}',
    'build_id': '${buildId}',
    'build_tags': '${buildTags}',
    'build_text': '${text}',
    'build_url': '${buildStatusUrl}'
}</template-text>
                <branch-template-text></branch-template-text>
                <states>
                    <state type="buildInterrupted" enabled="true"/>
                    <state type="buildSuccessful" enabled="true"/>
                    <state type="buildFailed" enabled="true"/>
                    <state type="buildFixed" enabled="true"/>
                    <state type="buildBroken" enabled="true"/>
                </states>
            </template>
        </templates>
    </webhook-template>
</webhook-templates>

@netwolfuk
Member

netwolfuk commented Mar 22, 2023

Hi @dom747. Thank you for the detailed bug report.

Have you previously been running any of the tcWebHooks 1.2.0 pre-release versions (e.g. Alpha or Release Candidate), or were you previously running 1.1.x?

I am trying to determine whether the change is because of the AMIs or because of a recent tcWebHooks upgrade. The ExtraParameters logic changed in 1.2.0 and above, but it has been in alpha for at least a year. If you happened to try any of those versions, that would help me pinpoint the issue.

If you're running CentOS, have you changed any of the SELinux configurations? By default SELinux is enabled. I'm not sure that will matter in this case. It's just interesting to know.

I can't seem to find ticket 4905074 on the JetBrains YouTrack instance. I will email their support team and ask to get access to your support ticket. Is that OK?

The webhook XML file is located on the server in BuildServer/config/projects/yourProject/pluginData/plugin-settings.xml
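
With the plugin disabled and the UI unavailable, those per-project files can still be located on disk. The following is only a rough sketch, not part of tcWebHooks: the class name `FindWebHookSettings` is made up for illustration, and it assumes the `config/projects/<project>/pluginData/plugin-settings.xml` layout described above. Pass it the path to your own `BuildServer/config` directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of tcWebHooks or TeamCity).
public class FindWebHookSettings {

    /** Lists every pluginData/plugin-settings.xml found under <configDir>/projects. */
    public static List<Path> findSettingsFiles(Path configDir) throws IOException {
        Path projects = configDir.resolve("projects");
        try (Stream<Path> paths = Files.walk(projects)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals("plugin-settings.xml"))
                .filter(p -> {
                    // Only keep files that sit directly inside a pluginData directory
                    Path parent = p.getParent();
                    return parent != null
                        && parent.getFileName() != null
                        && parent.getFileName().toString().equals("pluginData");
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; replace with the real location of your TeamCity config directory
        Path configDir = Paths.get(args.length > 0 ? args[0] : "BuildServer/config");
        for (Path file : findSettingsFiles(configDir)) {
            System.out.println(file);
        }
    }
}
```

Each printed file can then be opened in a text editor to read the webhook definitions for that project.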

@netwolfuk
Member

netwolfuk commented Mar 22, 2023

Executing webhooks in threads was also added in 1.2.0.

You could try disabling threading in tcWebHooks by creating a <webhooks> section in your BuildServer/config/main-config.xml

It looks like this...

<?xml version="1.0" encoding="UTF-8"?>
<server>
  <webhooks useThreadedExecutor="false">
  </webhooks>
</server>

@dom747
Author

dom747 commented Mar 22, 2023

Thanks for the reply. We have been using 1.2.0 for a while, I believe. We have not changed any SELinux configurations. Yes, it's OK for you to get information from them.
Thanks.
I could try your suggestion. The challenge is that it took 2 hours to get the server back to a working state after my last attempt to enable the plugin. We have 1000 AWS agents and they all need an update, and the server stayed stuck until I had restarted it 3 times.
My JetBrains ticket was a support ticket, not a YouTrack issue.

@netwolfuk
Member

Wow, that's awesome! I have not tested it on 1000 agents. At that scale, it could be a concurrency issue with the ExtraParameters object. I am trying to figure out how the number of agents could make this an issue, but I suspect that my code is consuming all the threads on the ThreadPool Executor that TeamCity allows plugins to access.

If it's the same thread pool, maybe my code is starving it for all the other threads, including the threads that communicate with the agents.
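
To make the starvation idea concrete, here is a minimal Java sketch. It is not tcWebHooks or TeamCity code, and whether TeamCity actually shares one pool this way is still only a suspicion. The class name `PoolStarvationSketch`, the pool size, and the "agent heartbeat" task are all made up for the example. It fills a fixed-size pool with tasks that do not return, then shows that a short task submitted afterwards cannot start.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: a stand-in for a bounded pool shared by several kinds of work.
public class PoolStarvationSketch {

    /** Returns true if a short task cannot finish within waitMillis on a saturated pool. */
    public static boolean shortTaskIsStarved(int poolSize, long waitMillis) throws Exception {
        ExecutorService sharedPool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch allStarted = new CountDownLatch(poolSize);
        try {
            // Occupy every worker with a task that runs until released,
            // standing in for long-running webhook executors.
            for (int i = 0; i < poolSize; i++) {
                sharedPool.submit(() -> {
                    allStarted.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allStarted.await();

            // A short, unrelated task now has to wait in the queue.
            Future<String> shortTask = sharedPool.submit(() -> "agent heartbeat handled");
            try {
                shortTask.get(waitMillis, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException starved) {
                return true;
            }
        } finally {
            // Let the blocked workers finish so the pool can shut down cleanly.
            release.countDown();
            sharedPool.shutdown();
            sharedPool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("short task starved: " + shortTaskIsStarved(4, 200));
    }
}
```

If the real cause is similar, the short task would correspond to agent communication that never gets a worker while webhook tasks hold all of them.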

I feel like the root cause could be something related to the change to the latest CentOS. I will create some VMs and try to replicate the issue. My personal budget won't scale to 1000 agents though :-).

Are you using a public AMI? If so, I could use the same ones as you for testing.

@dom747
Author

dom747 commented Mar 22, 2023

We have our own AMI, built using Packer.

@netwolfuk
Member

I've reached out to a couple of mates at TeamCity, but I don't have a contact for "support".
Could you please email me (the same username as here, but at gmail) with a TeamCity support email address?

Also, could you share your Packer script (with anything private removed), or at least the base AMI that Packer is using? Perhaps via email too, rather than on this public forum. Thanks!

@netwolfuk
Member

Release 2.0.0 Release Candidate 1 has been released to try to resolve the problems reported in this issue.
