Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop copying dependencies and add them directly into zip file instead #260

Open
antoinegomez opened this issue Dec 14, 2021 · 7 comments
Open

Comments

@antoinegomez
Copy link

antoinegomez commented Dec 14, 2021

After seeing the performance of this plugin degrading I investigated and found out that dependencies where copied during packaging.

This takes ages to complete. Normal packaging of project takes a bit less than 1 minute. Then with more dependencies added it goes up to 6 minutes.

Went to see how serverless-esbuild was managing it and they just don't copy and add files directly to the zip. Which makes sense. Why copying around node_modules?

See: https://github.com/floydspace/serverless-esbuild/blob/8a1dcc02c49574c0b04df084a3b79049b4e8e25f/src/pack.ts#L197

If I have the green light from authors I can work on this and create PR to address this problem.

Originally posted by @antoinegomez in #175 (comment)

@medikoo
Copy link
Contributor

medikoo commented Dec 23, 2021

@antoinegomez I believe the issue is in how fs.copy (from fs-extra) operations.

In your PR you propose to replace it doing another npm install in build folder. This doesn't feel to me best solution. Ideally we should ensure that we're reusing same node_modules also user may not have npm installed (but e.g. yarn)

Ideally would be to find why fs.copy is slow and focus on that. For sure copy operation can work better

@antoinegomez
Copy link
Author

Why would it feel not the best solution to you?
When you deploy an application via CI/CD don't you do npm install --production ?

I see this as using the right tool for the job. We need to have dependencies for our build.
Copying each files one by one is super slow and inefficient. There is an official tool to install dependencies and it is much faster.

Adding a control to see if we have to use yarn instead is possible.

You are saying we should ensure we using the same node_modules. I get what you mean. For this we can also copy the package-lock file.

Here is a copy with bash (nodejs won't be faster than that):

time cp -R node_modules .build
cp -R node_modules .build  0.62s user 16.16s system 9% cpu 2:49.98 total

And npm install --production takes less than 10 seconds.
This is a huge difference and not something trivial.

The other option I see is to skip copying/installing and directly adding the files to the zip by creating the list of dependencies to include to the file. Have to check how you zip first. If you do it via an external command on the folder then you can also create symlinks, but it adds complexity.

I see this as a simple solution, using an external tool to do the job so we don't have to reinvent the wheel. This is feel right for me.

@antoinegomez
Copy link
Author

antoinegomez commented Dec 29, 2021

As I am quite new exploring the logic of this module and serverless packaging, I have some questions:

  • How does serverless is packaging node_modules to the zip?
  • Why do we need to copy them and do it here?

@medikoo
Copy link
Contributor

medikoo commented Jan 3, 2022

Why would it feel not the best solution to you?

It doesn't guarantee that what's packaged for lambda is actually installed in service. That may produce certain WTF situations. Imagine that user intentionally (for debugging purposes) changes the content of some installed package and naturally expects it'll be reflected in deployment. It'll be quite confusing not seeing those changes deployed.

Also as mentioned before user may not have npm installed at all

@medikoo
Copy link
Contributor

medikoo commented Jan 3, 2022

How does serverless is packaging node_modules to the zip?

They're added to zip archive without any fs copy operations involved

Why do we need to copy them and do it here?

I assume, it's because with the transpilation step, it's a good practice to ensure different paths for src and dist (in regular Framework packaging there's no transpilation involved. so no need to do that)

@antoinegomez
Copy link
Author

Sorry for late reply.

I understand a bit better now thanks.

Still would think that adding files to zip directly is the best way to go. I saw that in the past a link was created in the dist folder. I guess that for some reason you preferred to copy instead and had problems with the link. Will try to find in this repo history discussion around that.

For project with a few dependencies copying is fast but as you know with nodejs node_modules can quickly go over ten thousands of files, here for my project we have around 20k files, so would love to find a good solution to this problem. There is a big difference between 1min and 6-7min to package.

@antoinegomez
Copy link
Author

This is the PR I was looking for: #157

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants