Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider installing devDependencies by default #431

Closed
jmorrell opened this issue Jun 13, 2017 · 42 comments · Fixed by #519
Closed

Consider installing devDependencies by default #431

jmorrell opened this issue Jun 13, 2017 · 42 comments · Fixed by #519

Comments

@jmorrell
Copy link
Contributor

Pro: Many users getting started with Heroku try to run a build step with webpack, gulp, etc only to find that it's in their devDependencies and not installed by default on Heroku. They then have to figure out how to tell Heroku to install these. Defaulting to installing everything makes for a smoother onboarding.

Con: Longer install times and larger slugs for existing large apps

First step: gather metrics

@edmorley
Copy link
Member

edmorley commented Jun 13, 2017

For users that don't commit build assets to their source directory (building dynamically is the better approach IMO), I see the building assets step as a production dependency, not a development dependency - which makes putting them in devDependencies a bug. After all, in development users will often not perform a full build at all and instead use a dev-server to dynamically serve assets.

An alternative to ease onboarding might be for the buildpack to check (on a non-zero return from the preinstall/postinstall/heroku-* package.json run scripts) whether the failed command is in devDependencies, and if so, output an appropriate message suggesting they use dependencies instead?

@jmorrell
Copy link
Contributor Author

@edmorley I don't necessarily disagree, but there is a big difference between what users should do, and what the actually do in practice. Checking in built assets has its own issues, and goes against established community best practices. We should meet people where they are when we can.

If a warning doesn't already exists I'm definitely for adding that, but I'd prefer for things to "just work" rather than make a non-trivial percentage of users either modify their app or set an environment variable to run on Heroku.

@edmorley
Copy link
Member

Reading back I can see it was ambiguous -- by "the better approach" I meant I prefer not committing assets. I don't for one minute think suggesting people change their workflow in that regard is practical :-)

@jmorrell
Copy link
Contributor Author

Oh, I see now. I don't think forcing people to put things under dependencies is the right solution if it hits a large % of users. I suspect it does, which is why I want to gather metrics.

The larger story is that frontend build errors are a huge % of the number of support tickets that we receive for Node, and anything we can do to make that process easier is a high priority for me.

@Cryptophobia
Copy link

Cryptophobia commented Jun 15, 2017

The right solution would be to not have to rely on the user to move to devDependencies to dependencies and to follow their current nodejs workflow. But installing devDependencies and dependencies together is also not the right approach because of testing frameworks / tools and other things that should not be installed as dependencies in production - nodemon and forever.

Why not just search for a new flag in the package.json file similar to how heroku-postbuild works but call it something like heroku-build-with-dev or similar?

@jasonkarns
Copy link

npm-pack seems to fit the bill here. npm-pack runs the publish lifecycle (which is what would conventionally trigger any build/compile scripts), and it also follows the ignore/include rules for which files get included in the tarball; which means built assets don't need checked in, but they still get included in the packed tarball. This would allow the buildpack to run npm install (including devDeps), then run npm-pack (thus generating a tarball, which then essentially is the slug minus node_modules). Then deployment for production is unzipping the packed tarball, npm-install --production.

(This also gives you flexibility as to whether node_modules are included in the slug itself, since the npm-packed tarball is no longer reliant on them for building, you could get away with not including them in the slug and simply npm install --production later.)

This is pretty close to the workflow I see and advocate in most CI pipelines, anyhow. (npm install with devDeps on a build box, then npm-pack, then saving/moving/deploying the packed tarball to prod box, unzip, npm install --production)

This makes an app's package.json scripts/deps congruent with the scripts/deps for modules, and follows npm's conventions more closely. It also makes sense semantically (npm-publish a module which calls npm-pack internally; and npm-pack an app for deployment)

@jmorrell
Copy link
Contributor Author

@jasonkarns There are a couple of issues that I see there:

  • It doesn't fit with the Heroku deploy model git push heroku master. It's not a good idea for us to say "deploy this way unless you're using Node, then use this other thing"
  • apps that use Node + another platform
  • native dependencies

but mainly we have 10's of thousands of existing apps using the current deploy process.

@jasonkarns
Copy link

I don't understand why this wouldn't work. My intention is that all of the above would happen in the build pack and not have any impact on users (migration path may be necessary for some configurations, i'm unsure).

@jasonkarns
Copy link

jasonkarns commented Jun 15, 2017

A better explanation:

after git push heroku master:

the buildpack does the git checkout of master, runs npm install with devDeps (so app authors don't need to move devDeps to be prod deps). Then the buildpack runs npm pack, generating the app tarball (with with all the compiled/built assets as per the prepare npm script). The app is now fully contained and ready to execute, less node_modules. All that's left is to run npm install --production to get the prod-only node_modules tree. And there you have you slug: npm-packed app tarball + production node_modules).

A couple notes probably going unmentioned here:

  1. the prepare npm script is what users should be using for the build step. that's what npm instructs to be used (it is invoked before publish of modules, and on postinstall when at root of packages, and on postinstall of git-dependencies, and by npm pack for apps)
  2. invoking npm install --production after a full npm install has already populated the node_modules directory (with devDeps) will simply prune all the dev Deps away. So the workflow described above would have the devDeps installed such that npm pack can be invoked, thus triggering the prepare npm script, thus compiling/building the app. then npm install --production would prune away the devDeps, leaving only prod deps in the node_modules for the slug.

Hopefully that clarifies my suggestion.

@nickmccurdy
Copy link

From the package.json docs:

Please do not put test harnesses or transpilers in your dependencies object. See devDependencies.

For this reason, I disagree with the devDependencies section of Heroku Node.js Support. To follow npm's recommendations and avoid confusion, I think it would be better to either set NPM_CONFIG_PRODUCTION=true by default or build the source with devDependencies included in advance (as with @jasonkarns' suggestion of npm pack).

@fenduru
Copy link

fenduru commented Aug 3, 2017

The way I see dependencies vs devDependencies is the difference between runtime dependencies and build time dependencies. For instance, if you are working on an NPM package and you list a transpiler in dependencies then every consumer of that package will pull in the transpiler even though their is no need. I do not think there is a fundamental difference between typical NPM pacakges and those being deployed via Heroku.

@nickmccurdy
Copy link

I agree, as long as you can treat libraries only used on the client side as build time dependencies because a bundler is inlining them for you.

@jmorrell
Copy link
Contributor Author

jmorrell commented Aug 3, 2017

invoking npm install --production after a full npm install has already populated the node_modules directory (with devDeps) will simply prune all the dev Deps away.

If this is quick even for large dependency trees then it sounds like a great solution 👍

@nickmccurdy
Copy link

nickmccurdy commented Aug 3, 2017

That sounds good, the --production flag could just be toggled in an env variable or something like that. But for most apps, adding dev dependencies to the production build shouldn't cause problems (whereas the opposite, which sometimes happens now, can cause problems).

@bbatha
Copy link

bbatha commented Aug 3, 2017

Instead of doing the whole install, pack, reinstall workflow. Maybe it should just run npm prune --production?

@jasonkarns
Copy link

@bbatha at what point? without a non-production install, what would be getting pruned?

how would the app get built? (ie, the babel compilation, etc) this is done automatically by npm-pack via the prepare hook that npm invokes.

@nickmccurdy
Copy link

Also, I could be wrong but I think --production sets NODE_ENV to production, not doing this (including using prune) could break production builds or enable dangerous development features in production.

@jasonkarns
Copy link

@nickmccurdy

--production sets NODE_ENV to production

Nope. Neither npm install nor npm prune mutate your environment.

@jasonkarns
Copy link

Well, I should clarify. It sets NODE_ENV=production during the install. But that doesn't affect runtime.

An example package with a postinstall script: echo node_env: $NODE_ENV

$ echo node_env: $NODE_ENV && npm install --production && echo node_env: $NODE_ENV
node_env:

> prod@0.0.0 postinstall /private/tmp/prod
> echo node_env: $NODE_ENV

node_env: production
npm WARN prod@0.0.0 No description
npm WARN prod@0.0.0 No repository field.

up to date in 0.118s
node_env:

@fenduru
Copy link

fenduru commented Aug 3, 2017

@bbatha at what point? without a non-production install, what would be getting pruned?

This is just an alternative to using npm pack and then having to run npm install a second time (why install things that were already installed moments ago?). This would install everything (including devDependencies) first, then run the build script, then prune out the no longer necessary devDependencies.

@nickmccurdy
Copy link

@jasonkarns I agree that it doesn't affect runtime directly, but if a build tool behaves differently in production (webpack often does depending on its configuration), the code and dependencies may be different at runtime in ways that affect performance, security, stability, etc.

@jasonkarns
Copy link

jasonkarns commented Aug 3, 2017

running npm install twice doesn't actually install anything twice. It builds the virtual dep tree in memory and then ensures that the filesystem matches it. Running npm-prune is functionally equivalent.

So npm i; npm pack; npm prune --production would be the same as running npm i; npm pack; npm i --production. (I personally have less confidence in prune working as intended due to npm's history, which is why I originally suggested install. But these days, I guess I'd trust it. :) )

I was thrown off by the comment that "prune" could replace "install, pack, reinstall".

Instead of doing the whole install, pack, reinstall workflow. Maybe it should just run npm prune --production?

The first install (with devDeps) is obviously still necessary such that devDeps exist in order to build the app. The pack step is necessary to compile the app while devDeps are available. So, a more accurate interpretation of that comment is that "prune" just replaces the last "reinstall". Which I'll concede.

npm i && npm pack && npm prune --production

edit - prune will result in a slightly different node_modules tree because it doesn't re-flatten any duplicated deps. So I'm maintaining my hesitation for using prune in favor of running install a second time (which doesn't run any slower). Though practically, they should still be the same. I just trust npm-install a bit more.

@bbatha
Copy link

bbatha commented Aug 3, 2017

I was thrown off by the comment that "prune" could replace "install, pack, reinstall".

Sorry if that's unclear. What I meant is that install, pack and then prune can be just install then prune. There's no need for the pack step unless your going to upload the built package somewhere else to npm or from CI system as a built slug.

@nickmccurdy
Copy link

nickmccurdy commented Aug 3, 2017

To clarify, those two sets of commands are only the same if your build tools executed in npm install behave the same in production. Running npm install twice doesn't install all dependencies again, but it could rerun hook scripts that may generate different code.

@jasonkarns
Copy link

jasonkarns commented Aug 3, 2017

There's no need for the pack step

@bbatha heroku would need to run the compilation step, regardless. having heroku specify some custom-to-heroku npm script as the build step is inferior to using the conventional (and established norm) of using prepare. And since heroku needs to make a slug anyway of the compiled assets, running npm-pack does the compilation and zipping in one go. Further, npm-pack would respect the package's rules about which files get included in the tarball. Authors would then be in control of whether unit tests and examples and docs, for instance, are included in the slug. This would lead to smaller slugs, and keep heroku in-line with how npm uses pack for publishing.

@nickmccurdy There shouldn't be any build tools being executed during npm-install. The only things running during npm-install are postinstall hooks. Which are widely advised against and should be avoided. It's a strong antipattern to do compilation during postinstall. Any build tools running would be executed by npm-pack (through the prepare script), or directly by some other npm-script that heroku dictates. Both of which would be running after install, where NODE_ENV is not mutated by npm.

@nickmccurdy
Copy link

That's cool, I didn't know npm pack ran prepare. 👍 for this.

I was implying that npm install runs hook scripts like prepare, not that the internal install command calls special scripts directly. npm install under the latest version of npm triggers prepare automatically. Sorry if that was unclear.

@jasonkarns
Copy link

jasonkarns commented Aug 3, 2017

@nickmccurdy Ah! Okay! Now I get you. 👍 thanks for being patient while I caught up :)

So yeah, in that case, I suppose the --production flag could impact the build tools! I'm not certain how I feel about build tools knowing/caring about NODE_ENV. Though if they do alter behavior, production is what we'd want. So the compilation triggered by the first npm-install is probably undesirable. I'd still prefer to use npm-pack because of how it specifically includes only the necessary (specified) files in the final tarball. But I suppose this would mean that:

npm i && npm pack && npm prune --production would actually execute the prepare script twice :(

EDIT:

  • The prepare script is not executed when --production is passed to npm install. Which means doing npm i --production instead of prune would be safe. (wouldn't run prepare a third time) and would also ensure a more consistent node_modules tree.

  • --production is honored when passed to npm pack, so if we want NODE_ENV=production during the compilation step, then npm pack --production would work.

@fenduru
Copy link

fenduru commented Aug 3, 2017

@jasonkarns it is not uncommon for build tools to alter behavior based on NODE_ENV, as seen here

@bbatha
Copy link

bbatha commented Aug 3, 2017

@bbatha heroku would need to run the compilation step, regardless. having heroku specify some custom-to-heroku npm script as the build step is inferior to using the conventional (and established norm) of using prepare. And since heroku needs to make a slug anyway of the compiled assets, running npm-pack does the compilation and zipping in one go. Further, npm-pack would respect the package's rules about which files get included in the tarball. Authors would then be in control of whether unit tests and examples and docs, for instance, are included in the slug. This would lead to smaller slugs, and keep heroku in-line with how npm uses pack for publishing.

Most of the logic here is sound for running pack. 👍 for reusing existing specification mechanisms. However, I don't really see how pack fits in the buildpack. If you're in a run of the buildpack, running pack would necessitate wiping the local directory, unzipping the zip and then running npm install. Then rezipping the whole thing. This seems to duplicate quiet a lot of work for little gain. Running prepare directly might be the right semantics you want inside of the buildpack's execution environment.

If your someone who builds on CI and uploads the slug to heroku. npm pack is a better fit, however it would not include the node_modules directory so you'd still need to run a buildpack to run NPM_CONFIG_PRODUCTION=true npm install. And as far as I know its not possible to have heroku run a buildpack on an uploaded slug like that.

The third thing running pack in the directory doesn't really account for is the multi-buildpack case where you may also want to include other directories in the slug that aren't listed in your package.json. This is also an issue if you want to defer cleanup until the end of all your buildpacks. The later is also an issue for running npm prune --production.

@jasonkarns
Copy link

Yeah, my original suggestion would have resulted in two tarballs that the buildpack would need to handle: the npm-packed tarball, and the node_modules directory itself. Not as clean as just having a single tarball, but I was thinking that two would be easily managed.

Re: multi-buildpacks. I wasn't even thinking about those. Good call.

@nickmccurdy
Copy link

That's unfortunate, should we avoid pack then or is there a way to get it to work with multiple buildpacks?

@Ajaxy
Copy link

Ajaxy commented Feb 9, 2018

Is there any suggestion as a result of this topic?
I'd love to:

  1. Install dev deps
  2. Run my build script (which uses babel) with npm run build
  3. Save built app without dev deps

@jasonkarns
Copy link

@Ajaxy (quick random tidbit: prepare script is run automatically as part of install, so it's generally recommended to have prepare be your "build" script. or at least to have "prepare": "npm run build")

@jmorrell
Copy link
Contributor Author

jmorrell commented Feb 9, 2018

@Ajaxy I'm afraid other work has taken priority, but I'll be working on a PR next week that will make that the default

In the mean time you could run npm prune --production as part of your heroku-postbuild script

@Ajaxy
Copy link

Ajaxy commented Feb 9, 2018

Thanks, so I ended up with:

{
    "build": "babel src -d build",
    "deploy:staging": "git push heroku master -f",
    "deploy:production": "heroku pipelines:promote",
    "prepare": "npm run build",
    "heroku-postbuild": "npm prune --production"
}

@Ajaxy
Copy link

Ajaxy commented Feb 9, 2018

Also, it seems that if I add src folder to .slugignore then I'm not able to build, so I had to change the last one for: "heroku-postbuild": "npm prune --production && rm -rf ./src"

@jmorrell
Copy link
Contributor Author

jmorrell commented Feb 9, 2018

@Ajaxy I would recommend doing everything in heroku-postbuild:

"heroku-postbuild": "npm run build && npm prune --production"

In the next few months I'm going to try to implement a solution to #479 which will run build by default unless heroku-postbuild is defined

@Ajaxy
Copy link

Ajaxy commented Feb 9, 2018

It hasn't worked for me.

remote: -----> Building dependencies
remote: Installing node modules (yarn.lock)
remote: yarn install v1.4.0
remote: [1/4] Resolving packages...
remote: success Already up-to-date.
remote: Done in 1.08s.
remote: Running heroku-postbuild (yarn)
remote: yarn run v1.4.0
remote: $ npm run build && npm prune --production && rm -rf src
...
Error: Cannot find module 'babel-runtime/core-js/object/keys'

So seems dev packages are not installed.
If I move npm run build back to "prepare", then it does install dev packages again:

remote: -----> Building dependencies
remote: Installing node modules (yarn.lock)
remote: yarn install v1.4.0
remote: [1/4] Resolving packages...
remote: [2/4] Fetching packages...
...
remote: [3/4] Linking dependencies...
...
remote: [4/4] Building fresh packages...
remote: $ npm run build
...
remote: Done in 29.02s.
remote: Running heroku-postbuild (yarn)
remote: yarn run v1.4.0
remote: $ npm prune --production && rm -rf src
remote: removed 426 packages in 5.632s
remote: Done in 6.21s.

@Ajaxy
Copy link

Ajaxy commented Feb 9, 2018

Well, it seems the last one was some unrelated issue. Now it's gone.

Also, for those using yarn: according to this topic yarnpkg/yarn#696 you need to use

yarn install --production --ignore-scripts --prefer-offline

instead of

npm prune --production

@lirbank
Copy link

lirbank commented Feb 20, 2018

@jmorrell I just noticed that setting heroku-postbuild to yarn build && yarn --production or yarn build && yarn --production && rm -fR ./src won't work if you're using Heroku CI. The tests will fail since the dev deps are not there (where you'd typically have your test runners etc).

Figured I'd mention it for consideration when you work on #519

@jmorrell
Copy link
Contributor Author

@lirbank I think in that case you can check the value of the $CI environment variable and add a condition. Something like this:

yarn build && [ $CI <> true ] && yarn --production && rm -fR ./src

In #519 I'll be sure to do the reasonable thing in test-compile. I've added a new bullet point for that. Thanks for calling this out 👍

@lirbank
Copy link

lirbank commented Feb 20, 2018

Quick update, for anyone interested, here is my final solution that works with Heroku CI too:

"heroku-postbuild": "yarn build && test \"$CI\" != \"true\" && yarn --production --ignore-scripts --prefer-offline"

jmorrell added a commit that referenced this issue Mar 1, 2018
Fixes #431 

Many users getting started with Heroku try to run a build step with webpack, gulp, etc only to find that it's in their devDependencies and not installed by default on Heroku. They then have to figure out how to tell Heroku to install these. Defaulting to installing everything makes for a smoother onboarding.

Changes the default behavior to install `devDependencies`, run the build step, and then prune the `devDependencies` once the build step is complete.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants