How do I use Knex with AWS Lambda? #1875

austingayler · 2017-01-20T18:35:46Z

I'm running into issues with connection pooling while testing some code. I'm expecting my lambda function to be called maybe several thousand times over a couple of seconds, and having trouble figuring out the best way to connect to my db. Here's a very similar issue for node/postgres: essentially the problem is that I need to be able to get a connection from the pool if one is available, however I can't depend on the pool existing because of how AWS (unreliably) reuses lambda containers.

Basically what I'm looking for is a way to reliably get or create a connection to my db. I have not been able to find any examples (like while(!availableConnections) { tryToGetConnection() }. Do I need to interact with node-pool? How can I do this with Knex?

The text was updated successfully, but these errors were encountered:

elhigu · 2017-01-22T15:56:03Z

Knex pooling works only if connections are made from the same node process.

If AWS lambda instances are not running shared node process you just have to create new pool with min / max connections 1 in each lambda instance and hope that your database has good enough settings to allow hundreds of simultaneous connections (in RDS it depends on instance size).

After reading this https://forums.aws.amazon.com/thread.jspa?threadID=216000

Looks like lambda really is sharing some processes, so if you can figure out which is maximum count of lambda containers, that are created you should be able to calculate optimal pool size (each container has their separate pool so total connections to DB is pool.max * max container count).

Anyways there is no need to do manual connection requesting from pool, knex waits for connection automatically and if everything is over in couple of seconds, none of the timeouts will not trigger in that time.

If you are getting error from DB which says that maximum connection count has been reached, then you need to make max pool size smaller.

elhigu · 2017-01-22T15:57:16Z

Sorry I committed in a middle of sentence, my kid just threw like 3 liters of water to floor 🥇 I'll update the comment above in a moment...

austingayler · 2017-01-23T20:04:49Z

Thanks for the reply. It sounds like estimating the maximum pool size will be my best bet, even though the number of connections will vary drastically over time.

To be clear, there is no way to close a connection in Knex, correct? Only the ability to destroy a connection pool? The fact that Lambda will sometimes reuse containers kind of throws everything off.

elhigu · 2017-01-24T08:31:54Z

Destroying connection pool destroys also all the connections (gracefully waiting that they complete first) and I suppose that when lambda destroys container, all its open TCP sockets will close implicitly when process dies.

I don't see why one should try to close connections explicitly after each request since it would destroy the benefit of pooling. You would get the same effect by creating pool with size 1 and destroying it afterwards.

You can also configure idle timeout for pool which will automatically close connection if it is not used and is just waiting for action in pool.

ghost · 2017-02-17T21:25:24Z

Can I use Knex to send a COPY query to the RedShift cluster, and not wait for the results?

Doing this with pg Pool terminates the query as soon as the end of the Lambda function is reached.

elhigu · 2017-02-20T08:12:25Z

@BardiaAfshin If lambda container is destroyed and all its sockets are freed when lambda function end is reached, in that case also db query will die and might not be finished.

I'm not sure either how postgresql react to client-side connection ending if COPY query will be rolled back due to not finished implicit transaction before reading the result values...

Anyways if the query can be sent or not that way is not up to knex, but it depends on how aws lambda and postgresql work.

ghost · 2017-02-20T23:38:47Z

My observation is that the query is killed on RedShift and it is rolled back.

austingayler · 2017-02-22T15:42:43Z

Anyways there is no need to do manual connection requesting from pool, knex waits for connection automatically and if everything is over in couple of seconds, none of the timeouts will not trigger in that time.

I'm running a db load testing script and this doesn't seem to be true. Anything higher than like 30 simultaneous connections immediately times out, rather than waiting for an open connection.

Whoaa512 · 2017-03-15T18:49:49Z

Is there a way to manually release the currently used connection once a query is done?

kurtzilla · 2017-06-03T17:43:55Z

Does anyone have example code they could share for those of us just getting into AWS Lambda? I am hoping that someone could share patterns and/or anti-patterns of knex/postgres/lambda.

Here is what I am using now - I am certain it can be improved, but hoping for some little bit of vindication as to whether or not I am on the correct path...

'use strict';
var pg = require('pg');

function initKnex(){
  return require('knex')({
      client: 'pg',
      connection: { ...details... }
  });
}

module.exports.hello = (event, context) =>
{
  var knex = initKnex();
  
  // Should I be returning knex here or in the final catch?
  knex
  .select('*')
  .from('my_table')
  .then(function (rows) {
    context.succeed('Succeeded: ' + JSON.stringify(rows || []));
  })
  .catch(function (error) {
    context.fail(error);
  })
  .then(function(){
    // is destroy overkill? - is there an option for knex.client.release, etc?
    knex.destroy();
  })
}

austingayler · 2017-06-05T14:11:25Z

I'm in the same boat--still have not figured out what the best way is despite lots of googling around and testing. What I'm doing right now is:

const dbConfig = require('./db');
const knex = require('knex')(dbConfig);

exports.handler = function (event, context, callback) {
...
connection = {..., pool: { min: 1, max: 1 },

This way the connection (max 1 per container) will stay alive so the container can be reused easily. I don't destroy my connection at the end.

http://blog.rowanudell.com/database-connections-in-lambda/

Not sure if this is the best way but it's worked for me so far.

/shrug

kurtzilla · 2017-06-13T15:59:01Z

@austingayler

I am not exactly sure what happens when const knex is declared - is this where the initial connection is setup? Can someone clarify? (I am assuming you have connection info in your dbConfig)

In your code, isn't the connection itself being overwritten and re-created everytime the handler is called?

hassankhan · 2017-06-20T18:23:05Z

Just chiming in a someone who's in the same boat, I've not had much luck trying to figure out containers vs pool size (as per @elhigu's suggestion). My solution has been to destroy the pool after every connection (I know it's not optimal 😒):

const knex = require('knex');

const client = knex(dbConfig);

client(tableName).select('*')
  .then((result) => { 
    
    return Promise.all([
      result,
      client.destroy(),
    ])  
  })
  .then(([ result ]) => {
    
    return result;
  });

mooyoul · 2017-08-08T17:58:01Z

TL;DR: Simply set context.callbackWaitsForEmptyEventLoop = false before calling callback.

AWS Lambda waits for empty event loop (by default). so function could throw Timeout error even callback executed.

Please see below links for details:
apex/apex@1fe6e91

To @elhigu @tgriesser : This is not an knex issue. This is Definitely issue of Lambda environment. I think tag this issue to question and should be good to close :)

elhigu · 2017-08-15T09:02:16Z

@mooyoul yep, definately not knex issue, but maybe documentation issue... though I think I dont want any aws lambda specific stuff to knex docs, so closing.

jlancelot2007 · 2018-06-16T00:46:18Z

Please look at this link
https://stackoverflow.com/questions/49347210/why-aws-lambda-keeps-timing-out-when-using-knex-js
You need to close the db connection otherwise Lambda runs until it times out.

jamesdixon · 2018-06-19T20:54:23Z

Has anyone found a pattern that works flawlessly for using Knex with Lambda?

kibertoad · 2018-06-19T20:55:27Z

Do you have any other issues when you close connection properly?

jamesdixon · 2018-06-19T20:56:43Z

The thing I'm trying to avoid is closing the connection and making it available across Lambda calls.

kibertoad · 2018-06-19T21:02:48Z

Are you sure Lambda allows maintaining state between calls?

kibertoad · 2018-06-19T21:05:42Z

Check out the node.js part of https://scalegrid.io/blog/how-to-use-mongodb-connection-pooling-on-aws-lambda/ it suggests a solution to hanging up on non-empty event loop.

mmarvick · 2019-03-19T15:06:53Z

I don't like bumping the thread like this, but since I think it's getting lost in the comments, @mooyoul's answer above worked great for us.

If you don't turn that flag to false, Lambda waits for the event loop to be empty. If you do, then the function finishes execution as soon as you call the callback.

hongbo-miao · 2019-05-29T19:54:17Z

For me, it worked on my local machine but not after deploying. I was kind of be mislead.

It turns out the RDS inbound source is not open to my Lambda function. Found solution at Stack Overflow: either changing RDS inbound source to 0.0.0.0/0 or use VPC.

After updating RDS inbound source, I can Lambda with Knex successfully.

The Lambda runtime I am using is Node.js 8.10 with packages:

knex: 0.17.0
pg: 7.11.0

The code below using async also just works

const Knex = require('knex');

const pg = Knex({ ... });

module.exports. submitForm = async (event) => {
  const {
    fields,
  } = event['body-json'] || {};

  return pg('surveys')
    .insert(fields)
    .then(() => {
      return {
        status: 200
      };
    })
    .catch(err => {
      return {
        status: 500
      };
    });
};

Hopefully it will help people who might meet same issue in future.

disbelief · 2019-05-31T07:35:56Z

Something I'd like to draw people's attention to is the serverless-mysql package, which wraps the standard mysql driver but handles a lot of the lambda-specific pain points around connection pool management that are described in this thread.

However I don't think Knex will work with serverless-mysql at this time (according to this issue) since there's no way to swap in a different driver. There could also be incompatibilities since serverless-mysql uses promises instead of callbacks.

The best approach is probably to add a new client implementation in Knex. I'd be happy to give this a shot, but would love someone more familiar with Knex to tell me if they think it's reasonable/doable?

disbelief · 2019-05-31T07:40:20Z

Also, was thinking in the meantime, I could use Knex to build but not execute MySQL queries. So just call toSQL() and pass the output to serverless-mysql to execute.

What I'm wondering though is if Knex can be configured without any db connections? There's no sense opening a connection that's never used.

Would the following work?

connection = {..., pool: { min: 0, max: 0 ) },

elhigu · 2019-05-31T08:57:28Z

@disbelief You can initialize knex without connection. There is an example how to do it in the end of this secion in docs https://knexjs.org/#Installation-client

const knex = require('knex')({client: 'mysql'});

const generatedQuery = knex('table').where('id',1).toSQL().toNative();

disbelief · 2019-05-31T09:29:46Z

@elhigu ah cool, thanks. Will give that a shot in the interim.

disbelief · 2019-05-31T14:44:53Z

Update: the above doesn't appear to work. Knex throws an error if it can't establish a db connection, even if there's never any call to execute a query.

fdecampredon · 2019-07-08T14:38:48Z

@disbelief did you find a solution for that ?

disbelief · 2019-07-08T15:41:37Z

@fdecampredon nope. At the moment I'm simply building queries with squel, calling toString() on them, and passing them to the serverless-mysql client.

Whoaa512 · 2019-07-09T17:04:39Z

@disbelief It's surprising to me that building the queries failed without a db connection.

Would you mind posting the code & config you're using to build the knex client? Also the knex version.

The following works fine for me with
Node v8.11.1
mysql: 2.13.0
knex: 0.18.3

const k = require('knex')

const client = k({ client: 'mysql' })

console.log('Knex version:', require('knex/package.json').version)
// => 0.18.3
console.log('sql:', client('table').where('id',1).toSQL().toNative())
// => { sql: 'select * from `table` where `id` = ?', bindings: [ 1 ] }

39otrebla · 2019-09-04T12:42:32Z

As per how Lambda's resources recycling works, Knex instance should always be kept outside any function or class. Also, it's important to have max 1 in connection in the pool.

Something like:

const Knex = require('knex');
let instance = null;

module.exports = class DatabaseManager {
  constructor({ host, user, password, database, port = 3306, client = 'mysql', pool = { min: 1, max: 1 }}) {
    this._client = client;
    this._poolOptions = pool;
    this._connectionOptions = {
      host: DB_HOST || host,
      port: DB_PORT || port,
      user: DB_USER || user,
      password: DB_PASSWORD || password,
      database: DB_NAME || database,
    };
  }

  init() {
    if (instance !== null) {
      return;
    }

    instance = Knex({
      client: this._client,
      pool: this._poolOptions,
      connection: this._connectionOptions,
      debug: process.env.DEBUG_DB == true,
      asyncStackTraces: process.env.DEBUG_DB == true,
    });
  }

  get instance() {
    return instance;
  }
}

In this way, you'll take the most out of Lambda's recycling, i.e. each activated (freezed) container will hold only the same connection.

As a side note, keep in mind that Lambda scales out by default if you don't limit the number of concurrent containers.

jamesdixon · 2019-10-01T17:01:26Z

I've had Knex running in Lambda for almost a year now with no problems. I'm declaring my Knex instance outside of my Lambda function and utilizing context.callbackWaitsForEmptyEventLoop = false as mentioned in other posts.

That said, over the past day, something seems to have changed on the Lambda side as I'm now seeing a huge connection spike in Postgres; connections don't seem to be closed.

Has anyone else using the aforementioned approach seen any chances over the past day or so?

ImreC · 2021-01-06T10:07:32Z

@jamesdixon just reading this now while refactoring some of our knex implementations on lambda. Any updates on this? Has context.callbackWaitsForEmptyEventLoop = false stopped working?

MatissJanis · 2021-04-26T19:05:22Z

serverless-mysql is a great package for managing MySQL connections in the AWS Lambda context. However, it was not playing nicely together with knex, hence I built a simple dialect to solve the problem: https://github.com/MatissJanis/knex-serverless-mysql

This is another way to solve the same problem.

const Knex = require('knex');
const knexServerlessMysql = require('knex-serverless-mysql');

const mysql = require('serverless-mysql')({
  config: {
    host     : process.env.DB_HOST,
    database : process.env.DB_DATABASE,
    user     : process.env.DB_USERNAME,
    password : process.env.DB_PASSWORD,
  },
});

const knex = Knex({
  client: knexServerlessMysql,
  mysql,
});

exports.run = function () {
  return knex('table_name').where('id', 1);
}

kibertoad · 2021-04-26T19:20:44Z

@MatissJanis Would you be open to submitting a PR adding your dialect to https://github.com/knex/knex/blob/master/ECOSYSTEM.md?

MatissJanis · 2021-04-26T20:09:20Z

@kibertoad sure, here you go: #4454

awm086 · 2022-10-06T18:11:17Z

does anyone know how to run knex migration on serverless setup? In my case I am using aurora postgres but the database tables don't exsist because I am not sure how to run knex migrate. Can the script be called from the lambda handler?

nimish-kumar · 2023-05-18T09:04:07Z

@awm086 yes

elhigu closed this as completed Jan 22, 2017

elhigu reopened this Jan 22, 2017

elhigu closed this as completed Aug 15, 2017

elhigu mentioned this issue Nov 30, 2017

Knex:Error Pool2 - error: too many connections for role #1027

Closed

elhigu mentioned this issue Jan 31, 2018

Transactions hanging in AWS Lambda #2445

Open

ricardograca added the question label Jun 16, 2018

This was referenced Apr 10, 2019

TimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? #2820

Closed

Disable propagateCreateError by default #3132

Closed

kibertoad added the insightful label Jan 6, 2021

How do I use Knex with AWS Lambda? #1875

How do I use Knex with AWS Lambda? #1875

Comments

austingayler commented Jan 20, 2017

elhigu commented Jan 22, 2017 • edited

elhigu commented Jan 22, 2017

austingayler commented Jan 23, 2017

elhigu commented Jan 24, 2017

ghost commented Feb 17, 2017

elhigu commented Feb 20, 2017

ghost commented Feb 20, 2017

austingayler commented Feb 22, 2017

Whoaa512 commented Mar 15, 2017

kurtzilla commented Jun 3, 2017 • edited

austingayler commented Jun 5, 2017

kurtzilla commented Jun 13, 2017

hassankhan commented Jun 20, 2017 • edited

mooyoul commented Aug 8, 2017

elhigu commented Aug 15, 2017

jlancelot2007 commented Jun 16, 2018

jamesdixon commented Jun 19, 2018

kibertoad commented Jun 19, 2018

jamesdixon commented Jun 19, 2018

kibertoad commented Jun 19, 2018

kibertoad commented Jun 19, 2018

mmarvick commented Mar 19, 2019

hongbo-miao commented May 29, 2019 • edited

disbelief commented May 31, 2019

disbelief commented May 31, 2019 • edited

elhigu commented May 31, 2019

disbelief commented May 31, 2019

disbelief commented May 31, 2019

fdecampredon commented Jul 8, 2019

disbelief commented Jul 8, 2019

Whoaa512 commented Jul 9, 2019

39otrebla commented Sep 4, 2019 • edited

jamesdixon commented Oct 1, 2019

ImreC commented Jan 6, 2021

MatissJanis commented Apr 26, 2021

kibertoad commented Apr 26, 2021

MatissJanis commented Apr 26, 2021

awm086 commented Oct 6, 2022

nimish-kumar commented May 18, 2023

elhigu commented Jan 22, 2017 •

edited

kurtzilla commented Jun 3, 2017 •

edited

hassankhan commented Jun 20, 2017 •

edited

hongbo-miao commented May 29, 2019 •

edited

disbelief commented May 31, 2019 •

edited

39otrebla commented Sep 4, 2019 •

edited