Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sporadic read ETIMEDOUT error #2651

Closed
invaderb opened this issue May 3, 2024 · 9 comments
Closed

Sporadic read ETIMEDOUT error #2651

invaderb opened this issue May 3, 2024 · 9 comments

Comments

@invaderb
Copy link

invaderb commented May 3, 2024

Hey all,

Hoping to get some help with this sporadic error I'm seeing with mysql2

Same background info on what I'm using
node 20
nestjs 10
typeorm 0.3.20
mysql2 3.9.4

I have 2 nodes running behind a load balancer that do connect over a virtual network tunnel to an on prem server that hosts the Mariadb on version 8.1

Here's my config

{
	name: 'default',
	type: process.env.DB_TYPE as 'mysql' | 'sqlite' || 'mysql',
	host: process.env.DB_HOST || 'localhost',
	port: parseInt(process.env.DB_PORT) || 3306,
	username: process.env.DB_USERNAME,
	password: process.env.DB_PASSWORD,
	database: process.env.DB_NAME,
	entities: [__dirname + '/**/*.entity{.ts,.js}'],
	synchronize: false,
	connectionTimeout: 30000,
	extra: {
		idleTimeout: 30000,
		maxIdle: 10,
		enableKeepAlive: true,
		keepAliveInitialDelay: 300,
	}
}

I'm getting these random read ETIMEDOUT errors through out the day but can't pin point any specific pattern or reason why it's timing out

QueryFailedError: read ETIMEDOUT
    at Query.onResult (/var/apps/lcs-services/src/driver/mysql/MysqlQueryRunner.ts:246:33)
    at PoolConnection._notifyError (/var/apps/lcs-services/node_modules/mysql2/lib/connection.js:228:21)
    at PoolConnection._handleFatalError (/var/apps/lcs-services/node_modules/mysql2/lib/connection.js:183:10)
    at PoolConnection._handleNetworkError (/var/apps/lcs-services/node_modules/mysql2/lib/connection.js:196:10)
    at Socket.emit (node:events:514:28)
    at runInContextCb (/var/apps/lcs-services/node_modules/newrelic/lib/shim/shim.js:1159:22)
    at AsyncLocalStorage.run (node:async_hooks:346:14)
    at AsyncLocalContextManager.runInContext (/var/apps/lcs-services/node_modules/newrelic/lib/context-manager/async-local-context-manager.js:65:36)
    at Shim.applySegment (/var/apps/lcs-services/node_modules/newrelic/lib/shim/shim.js:1149:25)
    at Socket.wrapper (/var/apps/lcs-services/node_modules/newrelic/lib/shim/shim.js:1754:17) {
  query: 'START TRANSACTION',
  parameters: undefined,
  driverError: Error: read ETIMEDOUT
      at TCP.onStreamRead (node:internal/stream_base_commons:217:20)
      at runInContextCb (/var/apps/lcs-services/node_modules/newrelic/lib/shim/shim.js:1159:22)
      at AsyncLocalStorage.run (node:async_hooks:335:14)
      at AsyncLocalContextManager.runInContext (/var/apps/lcs-services/node_modules/newrelic/lib/context-manager/async-local-context-manager.js:65:36)
      at Shim.applySegment (/var/apps/lcs-services/node_modules/newrelic/lib/shim/shim.js:1149:25)
      at TCP.wrapper (/var/apps/lcs-services/node_modules/newrelic/lib/shim/shim.js:1754:17)
      at TCP.callbackTrampoline (node:internal/async_hooks:130:17) {
    errno: -110,
    code: 'ETIMEDOUT',
    syscall: 'read',
    fatal: true
  },
  errno: -110,
  code: 'ETIMEDOUT',
  syscall: 'read',
  fatal: true
}

I've ran pings from the nodes to the on prem server and didn't see any dropped packets on the on prem network side there was a tiny bit of packet loss on the node side but should be within an ok percentage
1318 packets transmitted, 1317 received, 0.0758725% packet loss, time 1318903ms
1360 packets transmitted, 1354 received, 0.441176% packet loss, time 1361060ms
node a had a bit more packet loss

within the time of running the pings there were 2 more ETIMEDOUT errors

I also tried looking at the mysql logs and it's completely empty

Any other troubleshooting steps I could try to help resolve this issue would be greatly appreciated!

@benyaminl
Copy link

Is it using Docker Swarm?

@invaderb
Copy link
Author

invaderb commented May 7, 2024

no I'm just running the node processes with pm2 on a linux VM and the mariadb service is also just on a linux VM

@benyaminl
Copy link

no I'm just running the node processes with pm2 on a linux VM and the mariadb service is also just on a linux VM

But they connect Via VPN, correct?

Could you reduce the maxIdle to zero? I seen a lot of problem on open connection using mysql2 lib with opened connection especially if the config between Maria DB and mysql2 config does not match. Especially for idle timeout.

@invaderb
Copy link
Author

invaderb commented May 8, 2024

ok let me try that and report back

@invaderb
Copy link
Author

ok after a few days of that change being deployed the ETIMEDOUT errors have stopped, what is the config mismatch thats causing this?

@benyaminl
Copy link

benyaminl commented May 17, 2024

ok after a few days of that change being deployed the ETIMEDOUT errors have stopped, what is the config mismatch thats causing this?

The idle time out on the mysql/mariadb, and the other problem is the network (your VPN).

I seen many problem with MySQL2 driver with kubernetes, docker swarm, and other that's virtual network. I have one myself last time and after rechecking because of IPVS on kubernetes.

MySQL2 keep alive keep getting feedback even mysql/mariadb server closed, causing when reusing the connection pool you got bad connection, and ends up with timeout connection, which is sucks.

The first way to tackle that is reducing the idle opened connection to zero, and 2nd thing is to reduce the max Idle time to below 5 minutes if you connection reliable never have interuption (IPVS only open connection for 15 minutes then forget about it), or don't allow any idle connection at all

Anyway if your problem is solved with that, then please do close the issue ticket, because it's not mysql2 driver fault at the first place.

@invaderb
Copy link
Author

@benyaminl Thank you for the explanation and help on this really appreciate it.

@ruanitto
Copy link

e first way to tackle that is reducing the idle opened connection to zero, and 2nd thing is to reduce the max Idle time to below 5 minutes if you connection reliable never have interuption (IPVS only open connection for 15 minutes then forget about it), or don't allow any idle connectio

How this fix could it be implemented using knex + mysql2 lib?

@benyaminl
Copy link

@ruanitto I already state that you can lower the maxIdle to zero, on the mysql2 config.

You need to do it to make the connection pool doesn't have any idle connection

There are open ticket #2599 that still in motion.

Also, as I state before, your env/network driver could blame for it...

So, for now, the solution via config is to set no idle connection at all. After using the connection, disposed and closed. As Simple as that.

you can add extra config on the connection pool to maxIdle: 0, if it still not clear for you

Refer to https://sidorares.github.io/node-mysql2/docs#using-connection-pools

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants