Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DX: Zip only production relevant node_modules to decrease bundle size #2709

Closed
ZauberNerd opened this issue Nov 15, 2016 · 23 comments
Closed
Milestone

Comments

@ZauberNerd
Copy link

This is a Feature Proposal

Description

At the moment everything inside node_modules is being zipped and uploaded to s3.
If we would create a plugin for zipping the service (to not only be dependent on node.js), we could leverage npm (or at least their ls algorithm) to find a list of only the production dependencies (even with npm@3 and their flat directory structure).
npm ls --production --parseable creates exactly this - a list of all dependencies and their transient dependencies, that are declared through the package.json's dependencies (excluding devDependencies).

@dougmoscrop
Copy link
Contributor

I wrote a plugin for this, called serverless-plugin-include-dependencies - give it a try?

@pmuens
Copy link
Contributor

pmuens commented Nov 26, 2016

Awesome @dougmoscrop! 🎉
Thanks for sharing!

BTW here's the link https://github.com/dougmoscrop/serverless-plugin-include-dependencies

@miltador
Copy link
Contributor

@dougmoscrop cool, I guess you'll want to add it to the list of plugins.

@pmuens pmuens changed the title Zip only production relevant node_modules to decrease bundle size DX: Zip only production relevant node_modules to decrease bundle size Mar 9, 2017
@buggy
Copy link

buggy commented May 4, 2017

If you're using later versions of npm (I've only tested this with 4.5.0) then it's pretty easy to reduce the size of your deployment pages. Create a package.json with your development dependencies in the project root folder and a package.json with your runtime dependencies in your service folder (the one created by Serverless).

This keeps all of your development dependencies and the packages they depend on out of the zip file as only the node_modules from the services folder are included in the package. I have a more complete write up about this here.

@pmuens
Copy link
Contributor

pmuens commented May 4, 2017

That's neat! (sad that it only works with newer npm versions)

Thanks for posting @buggy 👍

@DavidWells
Copy link
Contributor

DavidWells commented May 5, 2017

There are a number of ways we can include only prod dependancies and exclude dev dependancies

See http://stackoverflow.com/questions/36447801/grunt-and-npm-package-all-production-dependencies

This works.

/**
 * Returns an array of the node dependencies needed for production.
 * See https://docs.npmjs.com/cli/ls for info on the 'npm ls' command.
*/
var getProdDependencies = function(callback) {
  require('child_process').exec('npm ls --prod=true --parseable=true', undefined,
      function(err, stdout, stderr) {
        var array = stdout.split('\n');
        var nodeModuleNames = [];

        array.forEach(function(line) {
          var index = line.indexOf('node_modules');
          if (index > -1) {
            nodeModuleNames.push(line.substr(index + 13));
          }
        });

        callback(nodeModuleNames);
      });
};

Are there any potential risks/downsides to implementing something like this?

@pmuens
Copy link
Contributor

pmuens commented May 5, 2017

@DavidWells great research!

we should only keep in mind to support both yarn and npm.

@DavidWells
Copy link
Contributor

DavidWells commented May 5, 2017

@pmuens I believe yarn uses npm under the hood (or at least package.json) and yarns future is in question https://twitter.com/kentcdodds/status/860516174957690880

The alternative solution would be recursively parsing the package.json prod deps ourselves and zipped up only the prod deps.

Or potentially leveraging @dougmoscrop handy work =) https://github.com/dougmoscrop/serverless-plugin-include-dependencies/blob/master/get-dependency-list.js

@pmuens
Copy link
Contributor

pmuens commented May 8, 2017

@DavidWells yes, I'm also not 100% sure if supporting both (npm and yarn) is super important here.

I came up with that because of this LOC where npm is explicitly used:

require('child_process').exec('npm ls --prod=true --parseable=true', undefined, function(err, stdout, stderr)

AFAIK we have some people relying on yarn.

Another question we've should discuss is a way how we can extend this later on to support other runtimes such as Python or JVM. Maybe we can define an interface for this so that it can be easily extended for other runtimes later down the road.

Something to keep in my although this might be out of the scope for now.

@pmuens pmuens added this to the 2.0 milestone May 8, 2017
@DavidWells
Copy link
Contributor

Another question we've should discuss is a way how we can extend this later on to support other runtimes such as Python or JVM. Maybe we can define an interface for this so that it can be easily extended for other runtimes later down the road.

I think we can do this for node to start. Either with the npm command or manually ourselves via recursively parsing prod dependancies. I will investigate

@marshall007
Copy link

@DavidWells for what it's worth, this can be accomplished with npm prune --production. That might be easier than trying to traverse the dev dependences manually.

@marshall007
Copy link

Also note, there was some discussion around an equivalent command for yarn in yarnpkg/yarn#696. For now, it appears that yarn install --production has the same behavior.

@pmuens
Copy link
Contributor

pmuens commented May 25, 2017

Thanks for the tip @marshall007 👍 That sounds good.

I'm still in favor of the more general solution without npm or yarn. But yeah, parsing the dependencies might get pretty complex.

Can't we just read the package.json dependencies and then cherry pick the corresponding node modules in the node_modules directory and include them / copy them over into the zip? 🤔

@DavidWells
Copy link
Contributor

DavidWells commented May 25, 2017

@pmuens yeah we can start with root package.json and recursively traverse down to each depedancy's package.json copying the required prod dependancies folders over

I believe this is how npm is actually doing it with that npm ls --prod=true --parseable=true command

https://github.com/npm/npm/blob/6f09d6d2d915cdccfee6f423d7f14135be89e9f9/lib/ls.js

@pmuens
Copy link
Contributor

pmuens commented Jun 6, 2017

Quick update: We just started the implementation in #3737. Feel free to test it and provide feedback 👍

@cjelger
Copy link

cjelger commented Jun 12, 2017

@pmuens very useful work, however there is a bug in the code which prevents functions to be packaged individually. I added a comment in your PR to describe the issue. Thx!

@pmuens
Copy link
Contributor

pmuens commented Jun 12, 2017

@pmuens very useful work, however there is a bug in the code which prevents functions to be packaged individually. I added a comment in your PR to describe the issue. Thx!

@cjelger thanks for looking into it and providing feedback 👍 I'll look into the issue you've described ASAP.

@ProTip
Copy link

ProTip commented Jun 12, 2017

@marshall007 This was my solution. Copy package.json and node_modules into the build dir, run prune, :shipit: .

@cjelger
Copy link

cjelger commented Jun 13, 2017

As commented in #3737, there is an issue if functions are packaged individually, and if functions are developed as separate nodejs packages, with each of these packages having its own package.json file. In this case the package.json of the main/service parent package is not relevant when packaging each function: the files to be included are defined by the include property of each function, and I think the only solution to use the right package.json file is to add a new property for that (nodejs-package-json in the example below). For example, something like:

service: my-service
package:
  individually: true
  exclude:
    - ./**

functions:
  function1:
    package:
      nodejs-package-json: src/package-for-function1/package.json
      include:
        - src/package-for-function1/**
      
  function2:
    package:
      nodejs-package-json: src/package-for-function2/package.json
      include:
        - src/package-for-function2/**

where function1 and function2 are developed as 2 separate nodejs packages. In case nodejs-package-json is missing, the packaging code could fallback to using the package.json of the service project.

@HyperBrain
Copy link
Member

Doesn't this approach in general break the concept of microservices propagated by the Serverless framework? In my understanding a Serverless service represents a microsoervice, that is self-contained.
Putting different dependencies per function leads to a Serverless service containing multiple microservices - which is somehow contradictory to the base concept.
Nevertheless I agree to exclude the development dependencies, but have the production dependencies in common for the Serverless service.

There are plugins that solve the issue that you try to solve here already. E.g. the serverless-webpack plugin, which only deploys the used dependencies per function, if you enable packaging per function.

@cjelger
Copy link

cjelger commented Jun 13, 2017

@HyperBrain Yes right you have a point, maybe our development model doesn't really fit with some (implicit?) assumptions, but except for this issue with devDependencies (not specific to our particular case), it works just fine. IMHO I think that assuming that the serverless service is just composed of one nodejs package introduces a new explicit restriction about the structure of the code. That's a design decision, but technically it's possible to support both (which is already the case). Note that even in the case of the single nodejs package, the code in #3737 already assumes that the package.json file is in the same directory as the serverless.yml file and this might be too restrictive in some cases. Having the possibility to define this at both the service and function level (for both under package:nodejs-package-json) would provide the maximum flexibility and would not assume any "project style".

@dougmoscrop
Copy link
Contributor

@HyperBrain - Pragmatically, I don't think it violates anything, since the term "microservice" is a buzzword that has a lot of interpretation to it ;)

We regularly have separate functions that use different libraries even within a 'self contained system' since those functions usually do very different things. Wanting to minimize size seems reasonable.

@pmuens
Copy link
Contributor

pmuens commented Jun 16, 2017

Hey everyone! Thanks for the discussion here. 👍

I just pushed some changes to the PR (#3737) so that nested directories with dependencies are now supported out of the box. Feel free to give it a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants