Intermittent Cmdlet failures inside a container with error "An error occurred when creating the WebSocket with the factory of type 'CoreClrClientWebSocketFactory'" #3437
Comments
Are you using BcContainerHelper functions or are you using Invoke-ScriptInBcContainer with your own script? |
In issue #3435 he is also using the latest build of Windows 10 (22H2) - 10.0.19045.4170 |
Neither. We are actually running our own service (.NET8) inside the container that executes the cmdlets. It creates a PS session and loads the module, then executes scripts.
Will try and let you know.
Yes. At the moment we're having somewhat similar issues on 3 machines, all running the same version. We've seen this also on Windows Server 2016 (10.0.14393) and Windows Server 2019 (10.0.17763.5576). I currently don't have access to any Server 2022 or Windows 11 machines myself, but many of my colleagues use Windows 11. I don't have anything reported from them - yet. |
Some more details: I've received some instances of On a different machine I've received reports of operations timing out because the service crashes. I have looked for details in the container's event log but found none (no usual .NET stacktrace), just the kernel fault.
Afterwards the event log is full of messages that the health check failed. |
Yeah - this looks like the Service Tier crashes and then subsequent admin cmdlets fail with coreclrclient... - that makes sense. |
Just an idea, because I have dealt with a similar issue in the past: Too many sockets being consumed by too many HttpClients being created? If I'm not mistaken you have replaced WCF (.NET Fx) with a REST API (.NET8). I don't know the code, but could it be that each time a cmdlet is invoked an HttpClient is spun up and disposed? Something that WCF used to take care of but now obviously can't anymore? We frequently (very, very frequently) query the apps on the service using Maybe it's not even the service that explodes, but having the client (Cmdlet) and server (BC) on the same machine (docker) and the client consuming all the sockets makes the server go boom? |
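A minimal sketch of the effect being hypothesized here, not the actual cmdlet code (which is not shown in this thread). It uses plain Python sockets with a local listener standing in for the service; the helper name `local_ports_used` and the call count of 20 are made up for illustration. Opening and closing a connection per call takes a fresh local port each time, while a shared connection keeps using a single one.

```python
import socket


def local_ports_used(address, calls, reuse):
    """Return the set of local (ephemeral) ports consumed by `calls` calls."""
    ports = set()
    shared = socket.create_connection(address) if reuse else None
    try:
        for _ in range(calls):
            conn = shared if reuse else socket.create_connection(address)
            ports.add(conn.getsockname()[1])
            if not reuse:
                # Closed here, but the OS typically does not hand the port
                # back immediately (it lingers, e.g. in TIME_WAIT).
                conn.close()
    finally:
        if shared is not None:
            shared.close()
    return ports


# Hypothetical stand-in for the service: a local listener. Nothing calls
# accept(), so the backlog must be larger than the number of connects.
listener = socket.create_server(("127.0.0.1", 0), backlog=64)
address = listener.getsockname()

fresh_ports = local_ports_used(address, calls=20, reuse=False)
shared_ports = local_ports_used(address, calls=20, reuse=True)
listener.close()

print("ports used with one connection per call:", len(fresh_ports))
print("ports used with a shared connection:", len(shared_ports))
```

If something similar happens per cmdlet invocation, a tight loop of calls would burn through the ephemeral port range much faster than the OS releases closed sockets.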
Tried this:
Doesn't give any problems here. |
I just reproduced this on my machine. Here's the output of the event log:
|
What did you do? |
No. I have set This time however, the service did not crash. It's still up and running, but I can no longer issue any management cmdlets as they all fail with the same message. The strange thing is: we're using the same identical setup with the same identical scripts for 3ish years now, never had an issue with them. Only with BC24 it started misbehaving. I'll try to replicate and log the actual cmdlet calls that are being made and update you. |
A lot of things changed in BC24 in this layer, but this obviously shouldn't fail. netstat -ab inside the container - what does that return? |
How are you creating sessions, running commands and removing sessions again? |
Now I did. You were almost there. Just iterating up to 100 is not enough, it needs to go up a little more. See below. Mine broke at 145.
It fills my screen with tons and tons of lines like the following
A clean container will have about 12 entries. One that runs the below script will easily reach 32k. Here's a clean repro: Enter a container using
Once the socket reaches 65534 the OS breaks. Mine gave in at iteration 145:
It will eventually release some sockets back to the OS, but way too slow. A development server or container that is heavily deployed against will fatigue fast. |
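To put a number on this instead of scrolling through the raw `netstat` listing, the output can be tallied by connection state. The following is a rough Python sketch; the function name `count_tcp_states` and the sample text are invented for illustration (the sample is not output from this thread, and the ports in it are arbitrary). It assumes the usual Windows `netstat -an` column layout of protocol, local address, foreign address, state.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP rows per state in netstat-style text output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have four columns: proto, local, foreign, state.
        # Header rows, UDP rows and "[process.exe]" rows are skipped.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            counts[fields[3]] += 1
    return counts


# Made-up sample in the shape of `netstat -an` output (illustrative only).
sample = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:8080           0.0.0.0:0              LISTENING
  TCP    127.0.0.1:50001        127.0.0.1:8080         TIME_WAIT
  TCP    127.0.0.1:50002        127.0.0.1:8080         TIME_WAIT
  TCP    127.0.0.1:50003        127.0.0.1:8080         ESTABLISHED
 [example.exe]
  UDP    0.0.0.0:5353           *:*
"""

for state, count in count_tcp_states(sample).most_common():
    print(state, count)
```

Running this against captured output before and after a batch of cmdlet calls would show whether the growth is in lingering closed sockets or in connections that are still open.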
The error is in the CmdLets of BC24 - will check whether a fix will make it into BC24 |
Describe the issue
We're getting intermittent failures during Cmdlet execution with the new 'Microsoft.BusinessCentral.xxx.dll' modules for PS7. There does not seem to be a pattern; it looks like a race condition, something that only happens under load. It is not really reproducible. The same execution might fail and then work a few minutes later. The error we get is this:
An error occurred when creating the WebSocket with the factory of type 'CoreClrClientWebSocketFactory'. See the inner exception for details.
Might be related to #3435.
As I said, the container spins up fine. Everything works as expected. After a while the cmdlets executed from inside the container start failing with the message above. Once the error has occurred, it keeps repeating, no matter what cmdlet we invoke. After a while the errors stop and things work again.
We are using the new cmdlets from
Admin\Microsoft.BusinessCentral.xxx.dll
using PWSH 7.4.1 inside the container. This only seems to happen on our pipelines when building apps with a rather large dependency chain. We therefore cannot even use the PS5 version of the cmdlets because things time out and the build fails for other reasons.
Scripts used to create container and cause the issue
Full output of scripts