
GRPC related crash when committing batch updates #829

Closed
jakeleventhal opened this issue Dec 19, 2019 · 17 comments
Assignees
Labels
api: firestore Issues related to the googleapis/nodejs-firestore API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@jakeleventhal

Environment details

  • OS: Mac OS 10.15 Beta (19A573a)
  • Node.js version: v12.10.0
  • npm version: 6.10.3
  • @google-cloud/firestore version: 3.0.0

Steps to reproduce

  1. Create several batch updates and commit them all at once:
const batches = [b1, b2, b3, b4, b5];
await Promise.all(batches.map(async (b) => b.commit()));

I then get these errors (sometimes, but not always):
[Screenshot: Screen Shot 2019-12-18 at 8 52 10 PM]
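(Editor's note: one way to see which of the parallel commits actually fails is `Promise.allSettled`, available since Node 12.9, which reports every outcome instead of rejecting on the first error. A minimal sketch, using stub objects in place of real Firestore `WriteBatch` instances:)

```javascript
// Commit every batch and log each rejection individually, so the batch
// index and error (e.g. "Channel has been shut down") are visible even
// when several commits run concurrently.
async function commitAll(batches) {
  const results = await Promise.allSettled(batches.map((b) => b.commit()));
  results.forEach((r, i) => {
    if (r.status === 'rejected') {
      console.error(`batch ${i} failed:`, r.reason.message);
    }
  });
  return results;
}

// Stubs standing in for Firestore WriteBatch objects:
const ok = { commit: () => Promise.resolve('ok') };
const bad = { commit: () => Promise.reject(new Error('Channel has been shut down')) };

commitAll([ok, bad]).then((results) => {
  console.log(results.map((r) => r.status).join(','));
});
```

Unlike `Promise.all`, this never short-circuits, so a single failed commit does not hide the results of the others.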

@jakeleventhal
Author

[Screenshot: Screen Shot 2019-12-18 at 9 22 55 PM]

Interestingly, it starts with this

@schmidt-sebastian
Contributor

@jakeleventhal Thanks for reporting this. Are you able to share client logs with us? I have been unable to reproduce this locally for now. The logs should offer some insight into the order of operations that causes this.

@murgatroid99

That metadata error is definitely unrelated. That is just a warning that is logged to the console.

The "Channel has been shut down" error comes from grpc-js, and it is only thrown if Channel#close or Client#close (which calls Channel#close) is called by the user (either google-gax or firestore in this case).

@jakeleventhal
Author

@murgatroid99 is the warning due to something on my end?

@schmidt-sebastian the only logs i have are screenshotted in this issue

@jakeleventhal
Author

As this is a fatal error that crashes my server, I have to revert to firestore 2.3.0 (which still has the memory leak that was fixed in 3.0.0 via #768).

@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Dec 19, 2019
@murgatroid99

I think the issue you posted in should explain it. But in short, no, we don't know what causes the header corruption, and at this time we have no reason to believe it is related to anything the user is doing.

@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Dec 24, 2019
@bcoe bcoe added the type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. label Dec 26, 2019
@schmidt-sebastian
Contributor

I spent another couple of hours debugging this. I believe this is an issue in our library, but I am not able to reproduce it. We can re-open this once we get a reproduction with Firestore SDK logs (via setLogFunction()).

@jakeleventhal
Author

[Screenshot: Screen Shot 2020-01-08 at 11 15 04 PM]

The logs don't appear very useful here. I used setLogFunction() and console.logged the log messages.
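(Editor's note: when logs scroll past faster than they can be read, one option is to hand `Firestore.setLogFunction` a ring buffer instead of `console.log`, then dump the last N lines from a crash handler. The buffer logic below is plain JavaScript and independent of Firestore; the commented usage is a sketch, not the library's prescribed setup:)

```javascript
// Keep only the most recent `capacity` log lines in memory, each
// prefixed with a timestamp, so the messages leading up to a crash
// can be dumped afterwards.
function createLogBuffer(capacity = 1000) {
  const lines = [];
  return {
    log(msg) {
      lines.push(`${new Date().toISOString()} ${msg}`);
      if (lines.length > capacity) lines.shift(); // drop the oldest line
    },
    dump() {
      return lines.slice();
    },
  };
}

// Hypothetical wiring (Firestore.setLogFunction is the real API;
// the crash handler is an illustrative sketch):
// const buffer = createLogBuffer();
// Firestore.setLogFunction((msg) => buffer.log(msg));
// process.on('uncaughtException', (err) => {
//   console.error(buffer.dump().join('\n'));
//   console.error(err);
//   process.exit(1);
// });
```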

@jakeleventhal
Author

Can we please reopen this? This is extremely urgent, as I am now blocked no matter which way I go:

  • If I use firestore 2, I don't have this error, but the memory leaks are too big
  • If I use firestore 3, this error occurs and my application crashes

@urirahimi

@schmidt-sebastian this is an existential crisis for my business, please provide an update

@schmidt-sebastian
Contributor

I am ok with re-opening this, but I haven't been able to reproduce it and am blocked on getting logs that show me what is going on. @jakeleventhal Do you happen to have slightly earlier logs? I am especially interested in the following log lines:

  • Re-using existing client with %s remaining operations
  • Creating a new client
  • Garbage collected 1 client

Thank you!

@bcoe bcoe added the priority: p2 Moderately-important priority. Fix may not be included in next release. label Jan 9, 2020
@yoshi-automation yoshi-automation removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Jan 9, 2020
@schmidt-sebastian
Contributor

FWIW, this is the code snippet I am running to try to reproduce:

const Firestore = require('@google-cloud/firestore');

Firestore.setLogFunction(console.log);

const firestore = new Firestore();
function run() {
	const batches = [];
	for (let i = 0; i < 250; ++i) {
		const batch = firestore.batch();
		batch.set(firestore.collection('foo').doc(), { test: 'foobar' });
		batches.push(batch);
	}

	batches.forEach(b => b.commit());
}
run();

setInterval(run, 1000);

@jakeleventhal
Author

jakeleventhal commented Jan 10, 2020

Perhaps you could try creating larger batches? The code I use has a function that takes in an array of batches, checks whether the batch at the last index has >= 500 entries, and creates a new one if so. So in my case, each batch is full.
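(Editor's note: the grouping logic described here can be shown without Firestore at all. A minimal sketch, where each group of at most 500 writes would then be applied to one `firestore.batch()`; the function name and the stand-alone grouping are illustrative, not the author's actual code:)

```javascript
// Firestore limits a WriteBatch to 500 writes, so split the pending
// writes into groups of at most that size. Each group corresponds to
// one batch to be committed.
const BATCH_LIMIT = 500;

function groupWrites(writes, limit = BATCH_LIMIT) {
  const groups = [];
  for (let i = 0; i < writes.length; i += limit) {
    groups.push(writes.slice(i, i + limit));
  }
  return groups;
}

// 1200 pending writes become 3 groups: 500, 500, and 200.
console.log(groupWrites(new Array(1200).fill(null)).map((g) => g.length));
```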

As for the logs, here are several screenshots that follow one another. It should be noted that these are all from a production environment with many users, so not every log is from this specific test. Many operations could be occurring at once; however, the bulk operations should make up the vast majority of what is going on. I don't know exactly which operations are causing the errors, but hopefully this is useful.
Logs.zip

I'm open to an arrangement for temporary access to our logs so that the issue can be more easily resolved.

@schmidt-sebastian schmidt-sebastian self-assigned this Jan 10, 2020
@schmidt-sebastian
Contributor

@jakeleventhal Thank you for providing the logs. From the logs, it looks like the issue could also happen in a different call (Query.get() or DocumentReference.get()). While I still cannot reproduce it, I have another theory and am looking at a potential issue in our retry mechanism.

@jakeleventhal
Author

[Screenshot: Screen Shot 2020-01-12 at 8 29 37 PM]

Encountered this failure again in a more contained environment with one user. The orange logs at the top repeat hundreds of times. Here is the function where the error occurred (I simplified the code drastically to remove anything specific to my application that is definitely unrelated):
export const saveCSV = async (id, docId, rawReportData) => {
  const reportData = rawReportData.slice(1);

  const allThings = {};
  const monthTotals = {};
  const formattedDays = {};
  await Promise.all(reportData.map(async ({
    sku,
    'amount-description': name,
    'posted-date': date
  }) => {
    // If the date has not been encountered before
    if (!(date in formattedDays)) {
      formattedDays[date] = { SKUs: {} };
    }

    // If the SKU has not been encountered before
    if (sku && !(sku in formattedDays[date].SKUs)) {
      formattedDays[date].SKUs[sku] = {};
    }

    // Set the data pointer to either the root or the sku map
    const dataPointer = sku ? formattedDays[date].SKUs[sku] : formattedDays[date];

    // If the thing hasn't been encountered yet
    const thingId = `${id}_${name}`;
    if (!(thingId in allThings)) {
      // If the thing does not exist in the database
      const thingDoc = firestore.doc(`Things/${thingId}`);
      const existingThing = await thingDoc.get();
      if (!existingThing.exists) {
        await thingDoc.set({id, name});
        allThings[name] = false;
      } else {
        allThings[name] = Boolean(existingThing.data().AccountingAccount);
      }
    }

    // Add the thing to the data map
    if (name in dataPointer) {
      dataPointer[name] += numberAmount;
    } else {
      dataPointer[name] = numberAmount;
    }

    const month = moment(date, 'YYYY-MM-DD').format('MMMM');
    if (!(month in monthTotals)) {
      monthTotals[month] = {
        Orders: 0,
        Total: 0
      };
    }
  }));

  const newDoc = firestore.doc(`MyCollection/${docId}`);

  // Save each of the formatted days
  await Promise.all(Object.entries(formattedDays).map(([date, data]) => newDoc
    .collection('SubCollection')
    .doc(`${docId}_${date}`)
    .set({
      Date: date,
      ...data
    })));

  // Save the new MyCollection
  await newDoc.set({
    allThings,
    name
  });
};
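(Editor's note: in the snippet above, parallel `map` callbacks can race on the `thingId in allThings` check, since `allThings` is only populated after an awaited `get()`, so the same document may be fetched several times concurrently. A common pattern is to cache the in-flight Promise itself, keyed by id; the sketch below uses a counting stub in place of the `thingDoc.get()`/`set()` logic, and `createPromiseCache`/`fetchThing` are hypothetical names, not part of the original code:)

```javascript
// Cache the Promise (not the resolved value) per key, so concurrent
// callers asking for the same id share a single lookup instead of
// each issuing their own.
function createPromiseCache(fetchThing) {
  const cache = new Map();
  return (id) => {
    if (!cache.has(id)) {
      cache.set(id, fetchThing(id));
    }
    return cache.get(id);
  };
}

// Counting stub standing in for the Firestore lookup:
let calls = 0;
const getThing = createPromiseCache(async (id) => {
  calls += 1;
  return `thing:${id}`;
});

Promise.all([getThing('a'), getThing('a'), getThing('b')]).then((values) => {
  console.log(values.join(','), 'calls:', calls); // calls is 2, not 3
});
```

Because the Promise is stored synchronously before any `await`, two callbacks requesting `'a'` in the same tick still hit the cache.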

@schmidt-sebastian
Contributor

I hope that this is addressed by #870, but I am not confident enough to close this issue. Your code snippet and the new log don't provide any counter evidence, which I see as a good sign.

We will release 3.3.4 today. If you can give it a spin, we can find out whether this issue has been resolved. Sorry for the trouble!

@jakeleventhal
Author

Looks like it fixed the issue! Will reopen if I encounter something similar.

@google-cloud-label-sync google-cloud-label-sync bot added the api: firestore Issues related to the googleapis/nodejs-firestore API. label Jan 31, 2020

6 participants