Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edgeAgent automatically stopping all modules #7227

Open
JelleCaptic opened this issue Mar 4, 2024 · 1 comment
Open

edgeAgent automatically stopping all modules #7227

JelleCaptic opened this issue Mar 4, 2024 · 1 comment
Assignees

Comments

@JelleCaptic
Copy link

Expected Behavior

I would assume modules can stay up without a certain time limit or auto-shutdown mechanism. On top of that, I would expect certain reasoning for automatically stopping modules.

Current Behavior

At a random timestamp the edgeAgent stopped all modules and executed a stop command for all my running modules

Steps to Reproduce

Provide a detailed set of steps to reproduce the bug.

  • Up until now I was unable to re-produce the issue. I have other edge devices that have modules running for over 1 month without any issues.

Context (Environment)

Output of iotedge check

All iotedge checks are ok.

Device Information

  • Host OS [Ubuntu 20.04.5 LTS]:
  • Architecture [aarch64]:
  • Container OS [Linux containers]:

Runtime Versions

  • aziot-edged [1.4.20]:
  • Edge Agent [1.4.29]:
  • Edge Hub [1.4.29]:
  • Docker/Moby :
Client:
 Version:           23.0.7+azure-1
 API version:       1.42
 Go version:        go1.19.12
 Git commit:        92955ed461e91154ed722ed831e9f22ea93fd7e9
 Built:             Wed Aug 23 12:04:52 UTC 2023
 OS/Arch:           linux/arm64
 Context:           default

Server:
 Engine:
  Version:          23.0.7+azure-1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.12
  Git commit:       606a7a76949ec08cb26940037b1c71c0024c6b44
  Built:            Wed Sep  6 06:38:37 2023
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.24-2
  GitCommit:        61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
 runc:
  Version:          1.1.9-1
  GitCommit:        ccaecfcbc907d70a7aa870a6650887b901b25b82
 docker-init:
  Version:          0.19.0
  GitCommit:

Logs

edge-agent logs

{"log":"\u003c6\u003e 2024-03-04 07:09:38.280 +00:00 [INF] - Starting periodic operation refresh twin config...\n","stream":"stdout","time":"2024-03-04T07:09:38.280722368Z"}
{"log":"\u003c6\u003e 2024-03-04 07:09:38.303 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 108 and reported properties version 871.\n","stream":"stdout","time":"2024-03-04T07:09:38.303908896Z"}
{"log":"\u003c6\u003e 2024-03-04 07:09:38.314 +00:00 [INF] - Successfully completed periodic operation refresh twin config\n","stream":"stdout","time":"2024-03-04T07:09:38.31436288Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.122 +00:00 [INF] - Termination requested, initiating shutdown.\n","stream":"stdout","time":"2024-03-04T07:13:25.123151872Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.123 +00:00 [INF] - Main thread terminated\n","stream":"stdout","time":"2024-03-04T07:13:25.123461504Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.131 +00:00 [INF] - Initiating shutdown cleanup.\n","stream":"stdout","time":"2024-03-04T07:13:25.141976576Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.133 +00:00 [INF] - Stopping all modules...\n","stream":"stdout","time":"2024-03-04T07:13:25.142001824Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.168 +00:00 [INF] - Waiting for cleanup to finish\n","stream":"stdout","time":"2024-03-04T07:13:25.1692384Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.270 +00:00 [INF] - Plan execution started for deployment -1\n","stream":"stdout","time":"2024-03-04T07:13:25.273090144Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.290 +00:00 [INF] - Executing command: \"Stop module db\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291056896Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module edge-data-manager\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291107968Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module api\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291170176Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module edge-api-frontend\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291252768Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module edge-frontend\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291321536Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.302 +00:00 [INF] - Updated reported properties\n","stream":"stdout","time":"2024-03-04T07:13:25.302620832Z"}
{"log":"2024-03-04 07:13:40  Starting Edge Agent\n","stream":"stdout","time":"2024-03-04T07:13:40.783627552Z"}
{"log":"2024-03-04 07:13:40  Changing ownership of storage folder: /tmp/edgeAgent to 13622\n","stream":"stdout","time":"2024-03-04T07:13:40.811136352Z"}
{"log":"2024-03-04 07:13:40  Changing ownership of backup folder: /tmp/edgeAgent_backup to 13622\n","stream":"stdout","time":"2024-03-04T07:13:40.817397472Z"}
{"log":"2024-03-04 07:13:40  Changing ownership of management socket: /var/run/iotedge/mgmt.sock\n","stream":"stdout","time":"2024-03-04T07:13:40.831135328Z"}
{"log":"2024-03-04 07:13:40  Completed necessary setup. Starting Edge Agent.\n","stream":"stdout","time":"2024-03-04T07:13:40.851940512Z"}
{"log":"2024-03-04 07:13:40.896 +00:00 Edge Agent Main()\n","stream":"stdout","time":"2024-03-04T07:13:40.907348672Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:41.206 +00:00 [INF] - Initializing Edge Agent.\n","stream":"stdout","time":"2024-03-04T07:13:41.232074208Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:41.417 +00:00 [INF] - Version - 1.4.29.85541525 (f049017a5072f85aec250e945ebb2ca52e4af3f4)\n","stream":"stdout","time":"2024-03-04T07:13:41.417553152Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:41.418 +00:00 [INF] - \n","stream":"stdout","time":"2024-03-04T07:13:41.418499744Z"}

edge-hub logs

{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" received FIN.\n","stream":"stdout","time":"2024-03-04T07:13:23.775449856Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\" disconnecting.\n","stream":"stdout","time":"2024-03-04T07:13:23.775573856Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\" stopped.\n","stream":"stdout","time":"2024-03-04T07:13:23.775600288Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" sending FIN because: \"\"The Socket transport's send loop completed gracefully.\"\"\n","stream":"stdout","time":"2024-03-04T07:13:23.775730112Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.127 +00:00 [INF] [EdgeHub] - Termination requested, initiating shutdown.\n","stream":"stdout","time":"2024-03-04T07:13:25.129527008Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.128 +00:00 [INF] [EdgeHub] - Stopping the protocol heads...\n","stream":"stdout","time":"2024-03-04T07:13:25.12964Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.130 +00:00 [INF] [EdgeHub] - Closing protocol heads - (MQTT, HTTP)\n","stream":"stdout","time":"2024-03-04T07:13:25.132696384Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.136 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.MqttProtocolHead] - Stopping MQTT protocol head\n","stream":"stdout","time":"2024-03-04T07:13:25.137041024Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.141 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Http.HttpProtocolHead] - Closing HTTP head\n","stream":"stdout","time":"2024-03-04T07:13:25.14174832Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.143 +00:00 [DBG] [Microsoft.AspNetCore.Hosting.Diagnostics] - Hosting shutdown\n","stream":"stdout","time":"2024-03-04T07:13:25.1432352Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.177 +00:00 [INF] [EdgeHub] - Waiting for cleanup to finish\n","stream":"stdout","time":"2024-03-04T07:13:25.178299328Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.256 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" received FIN.\n","stream":"stdout","time":"2024-03-04T07:13:25.257041696Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.259 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" sending FIN because: \"\"The Socket transport's send loop completed gracefully.\"\"\n","stream":"stdout","time":"2024-03-04T07:13:25.259576512Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.259 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\" disconnecting.\n","stream":"stdout","time":"2024-03-04T07:13:25.260128576Z"}
{"log":"\u003c4\u003e 2024-03-04 07:13:25.260 +00:00 [WRN] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel] - Error reading transport. CorrelationId ID\n","stream":"stdout","time":"2024-03-04T07:13:25.264976352Z"}
{"log":"System.Net.WebSockets.WebSocketException (0x80004005): The remote party closed the WebSocket connection without completing the close handshake.\n","stream":"stdout","time":"2024-03-04T07:13:25.265037504Z"}
{"log":"   at System.Net.WebSockets.ManagedWebSocket.ThrowIfEOFUnexpected(Boolean throwOnPrematureClosure)\n","stream":"stdout","time":"2024-03-04T07:13:25.265044064Z"}
{"log":"   at System.Net.WebSockets.ManagedWebSocket.EnsureBufferContainsAsync(Int32 minimumRequiredBytes, CancellationToken cancellationToken, Boolean throwOnPrematureClosure)\n","stream":"stdout","time":"2024-03-04T07:13:25.265048352Z"}
{"log":"   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)\n","stream":"stdout","time":"2024-03-04T07:13:25.265078624Z"}
{"log":"   at System.Net.WebSockets.ManagedWebSocket.ReceiveAsyncPrivate[TResult](Memory`1 payloadBuffer, CancellationToken cancellationToken)\n","stream":"stdout","time":"2024-03-04T07:13:25.265084608Z"}
{"log":"   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource\u003cTResult\u003e.GetResult(Int16 token)\n","stream":"stdout","time":"2024-03-04T07:13:25.265088352Z"}
{"log":"   at System.Threading.Tasks.ValueTask`1.ValueTaskSourceAsTask.\u003c\u003ec.\u003c.cctor\u003eb__4_0(Object state)\n","stream":"stdout","time":"2024-03-04T07:13:25.265092256Z"}
{"log":"--- End of stack trace from previous location ---\n","stream":"stdout","time":"2024-03-04T07:13:25.265095712Z"}
{"log":"   at Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel.DoReadBytes(IByteBuffer byteBuffer) in /mnt/vss/_work/1/s/edge-hub/core/src/Microsoft.Azure.Devices.Edge.Hub.Mqtt/ServerWebSocketChannel.cs:line 239\n","stream":"stdout","time":"2024-03-04T07:13:25.26509904Z"}
{"log":"   at Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel.DoBeginRead() in /mnt/vss/_work/1/s/edge-hub/core/src/Microsoft.Azure.Devices.Edge.Hub.Mqtt/ServerWebSocketChannel.cs:line 133\n","stream":"stdout","time":"2024-03-04T07:13:25.265104608Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.265 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel] - Transport aborted. CorrelationId ID\n","stream":"stdout","time":"2024-03-04T07:13:25.26565424Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.266 +00:00 [INF] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\", Request id \"\"ID:00000002\"\": the application aborted the connection.\n","stream":"stdout","time":"2024-03-04T07:13:25.266200416Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.266 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Http.Middleware.WebSocketHandlingMiddleware] - Request ID:00000002 completed. CorrelationId ID\n","stream":"stdout","time":"2024-03-04T07:13:25.269377952Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.266 +00:00 [INF] [Microsoft.AspNetCore.Hosting.Diagnostics] - \"Request finished HTTP/1.1 GET https://gsd7449186/$iothub/websocket - - - 101 - - 956745.0866ms\"\n","stream":"stdout","time":"2024-03-04T07:13:25.269454272Z"}

Additional Information

All the modules came back up after a couple of minutes automatically.

What I would like to get more info about:

  • What triggers the automatic shutdown of all modules because this needs to be tackled.
  • Is there a way to prevent this?
@david-emakenemi
Copy link

@bilalsellak can you take a look here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants