Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't resolve symlinks when requiring #3402

Closed
VanCoding opened this issue Oct 16, 2015 · 204 comments
Closed

Don't resolve symlinks when requiring #3402

VanCoding opened this issue Oct 16, 2015 · 204 comments
Labels
discuss Issues opened for discussions and feedbacks. module Issues and PRs related to the module subsystem.

Comments

@VanCoding
Copy link
Contributor

Currently, Node.js resolves symlinks when requiring and then uses the real location of the package/file as its __filename and __dirname instead of the symlinked one.

This is a problem because symlinked modules don't act the same as locally copied modules.

For example:

app
    index.js //require("dep1")
    node_modules
        dep1
            index.js //require("dep2")
        dep2
            index.js //console.log('fun!'):

works, but

app
    index.js //require("dep1")
    node_modules
        dep1 -> ../../dep1
        dep2
            index.js
dep1
    index.js //require("dep2")

does not, because dep1 does not act like it's located in the node_modules directory of module and thus cannot find dep2.

This is especially a problem when one wants to symlink modules that have peer-dependencies.

Therefore, I suggest changing the behavior to no longer resolve symlinks.
What do you think?

@VanCoding VanCoding changed the title Don't resolve symlinks Don't resolve symlinks when requiring Oct 16, 2015
@mscdex mscdex added the module Issues and PRs related to the module subsystem. label Oct 16, 2015
@Trott Trott added feature request Issues that request new features to be added to Node.js. and removed feature request Issues that request new features to be added to Node.js. labels Oct 16, 2015
@Trott
Copy link
Member

Trott commented Oct 16, 2015

This would break npm link, wouldn't it?

@MylesBorins
Copy link
Member

if you are requiring dep1, and it is a symlink, and it doesn't have dep2 in its path then something is off is it not? Should the module not include its own node_module folder?

That being said flat dependencies in npm@3 do offer a weird edge case to this. But it is worth bringing up that I am pretty sure they are deprecating peer dependencies.

@VanCoding
Copy link
Contributor Author

@Trott why would it? In theory, everything that works with the real path should also work fine with the virtual path. The only real problem that i see is when a module gets required multiple times through different symlinks it would not be cached by npm and get loaded multiple times. But if that really happens, something is badly designed i think.

@thealphanerd
No, because its a peer dependency.
I don't think NPM is depreciating peer dependencies in gerenal, but changing its behavior to just warn when a peer dependency is not fullfilled. Maybe they're even gonna drop the peerDependencies property, but that does not prevent people to peer depend on things. Devs will then just need to manage them on their own.

I don't think the need for peer dependencies will go away in the future.

@VanCoding
Copy link
Contributor Author

I've also asked for a new npm command that would only work if we fix the behavior of require like i suggested here: npm/npm#10000 (comment)

@Trott
Copy link
Member

Trott commented Oct 17, 2015

@VanCoding I was thinking of certain edge cases but I wasn't thinking very deeply about it, so yeah, that particular comment may be a non-issue.

The real point I was sorta kinda trying to gently make was better put by @othiym23 (emphasis added):

I do think you're right that require()'s behavior is unhelpful in this case, but I also think that given how important having that behavior locked down is to the stability of the whole Node ecosystem, it's going to be very tough to change now.

Any semver-major change to require() could result in all sorts of (potentially hard-to-predict) ecosystem breakage. So the bar is very, very high.

@VanCoding
Copy link
Contributor Author

I know, but I think the current behavior maybe even counts as a bug because
there seems to be no reason to behave like this..
Am 17.10.2015 7:06 nachm. schrieb "Rich Trott" notifications@github.com:

Moreover, the Modules API (which require is a part of) is Locked which
means:

Only fixes related to security, performance, or bug fixes will be accepted.
Please do not suggest API changes in this area; they will be refused.


Reply to this email directly or view it on GitHub
#3402 (comment).

@bnoordhuis
Copy link
Member

The litmus test is this: is there a non-zero chance the proposed change is going to break someone's existing application? If the answer is 'yes' (and I think it is), then it should be rejected.

@VanCoding
Copy link
Contributor Author

Well, how are we going to calculate that chance? And what does a zero
chance mean? Zero like not even module is going to break or not 1 % or more
of the pakages? If it's the latter, then i really doubt that 1% of the
modules would be affected by that change. If there really are modules that
break with this change then theyre making use of undocumented (in my
opinion buggy) behavior and therefore it's okay when it breaks.

Let's look at it like this: one can always get the real path from the
virtual path, but if one gets the real path, the virtual path is lost.

So even if we break some modules, isn't the right decision to fix this? I
really think it would help fix issues for npm as well.

But it of course is your decision.
Am 17.10.2015 7:34 nachm. schrieb "Ben Noordhuis" <notifications@github.com

:

The litmus test is this: is there a non-zero chance the proposed change is
going to break someone's existing application? If the answer is 'yes' (and
I think it is), then it should be rejected.


Reply to this email directly or view it on GitHub
#3402 (comment).

@dlongley
Copy link

@VanCoding, maybe hard links would help. I haven't tried it.

@asbjornenge
Copy link

@dlongley You can't hardlink a directory I think?

I am having the exact same issue as @VanCoding and I'm also a bit confused by the current behaviour...?

I would expect a linked package to resolve dependencies based on the location it was linked to. It should of course check it's own node_modules first, but when going back "up the tree" I would expect it to check the folder structure it was linked to, not where it was linked from.

@dlongley
Copy link

@asbjornenge,

You can't hardlink a directory I think?

No, you can't, but maybe you could write a script to create a mock directory and hard link any js files. I think that would be all it would take for simple modules. It's not a perfect solution but maybe it would help in some cases. Clearly there needs to be a better way to link peer dependencies together during development.

I would expect a linked package to resolve dependencies based on the location it was linked to.

Me too, but I believe this has been discussed before -- and there may be projects out there that are depending on the current behavior. We'll just need to come up with a way to specify that the other behavior is desired.

@VanCoding
Copy link
Contributor Author

and there may be projects out there that are depending on the current behavior

As I said before:

  • while there most likely are such modules, there are only a few (estimated way below 1% of the modules)
  • they'll be easy to fix, while it's impossible to work around the current behavior
  • they're making use of undocumented behavior

And as it seems...

  • developers already expect it to behave like i suggested, yet they'd be surprised when they find out how it really behaves

@GeorgeBailey
Copy link

We'll just need to come up with a way to specify that the other behavior is desired.

@dlongley, I don't like it. I'd rather 'keep it simple ..' in the long-term.

@VanCoding makes a compelling argument.

Perhaps we should make this minor 'breaking change', and just add an opt-out capability. The opt-out should be removed in a future version..

@dlongley
Copy link

I agree with @VanCoding's argument.

Perhaps we should make this minor 'breaking change', and just add an opt-out capability. The opt-out should be removed in a future version.

👍 I'm in favor of this if it can get consensus.

@Fishrock123
Copy link
Member

Perhaps we should make this minor 'breaking change', and just add an opt-out capability. The opt-out should be removed in a future version.

That is an addition, so semver-minor. I will warn you that we are very conservative about doing anything with the module system.

@Trott
Copy link
Member

Trott commented Oct 30, 2015

If someone wants to put together a PR for this (with the understanding that, again, the bar for this is very, very high and so even if it's a Great Idea and an Excellent Implementation, it's still very likely that this Will Not Happen), it might be interesting to use https://github.com/nodejs/citgm (if that tool is ready for general use?!) to see if anything Too Big To Fail in the ecosystem chokes on the change.

(To head off the obvious and explain the bold comment above: Yes, it's easy to fix any modules that break. That's not good enough though. Here's the deal: Let's say this breaks the async module. Fixing it in a new version of async is theoretically no problem. But then getting the thousands and thousands of modules that depend on async to update to the fixed version is a huge problem. And getting everyone using those thousands and thousands of modules to update those modules to fixed versions... This is why a stable and imperfect module API is vastly preferable to a dynamic module API.)

Of course, if there was a way to confirm that no modules on npm depend on the current behavior, that would go a long way (in my opinion, anyway) towards making the change palatable. I have no idea how one would go about demonstrating that, though. But you can use citgm to at least confirm that it won't break any of a handful of the most important ones.

@GeorgeBailey
Copy link

@Fishrock123, I tend to think my suggestion should wait until the next semver-major since I suggested a breaking change with 'opt-out'?

@Trott, These are good points. The last thing sysadmins want is trouble when upgrading a datacenter to a new version of Node.js.

We should also consider any arguments for current behavior besides compatibility. Are there cases where the current behavior is more intuitive than the proposed change? If not then I think the proposed change should be implemented sooner or later.

I know that Modules are at Stability 3 - Locked.

Only fixes related to security, performance, or bug fixes will be accepted.
Please do not suggest API changes in this area; they will be refused.

There are good ways to introduce a breaking change over a period of time. Statistics would indicate the best course of action - even if it means extra work on the front end before this change can be implemented.

@fenduru
Copy link

fenduru commented Nov 2, 2015

This is causing me pain right now. I'm working on an enhancement to a library that automagically loads plugins by doing require(pluginName). But since I've used npm link to install the library, the require fails because pluginName can't be found in the tree.

I think this behavior is unexpected, especially given the interaction with npm link. I think a symlinked package should behave identically to an installed package in the same location.

From the modules docs:

If it is not found there, then it moves to the parent directory, and so on, until the root of the file system is reached.

I suppose this is ambiguous, but I would argue that the "parent directory" should be relative to the path used to load the module. If I do require('/home/me/project/node_modules/foo/index.js'), and index.js does require('something-else'), I think it is incorrect to look up something-else relative to anything other than the full path in the first require.

@Trott it might be worth taking a look at https://github.com/brson/taskcluster-crater to see how they test the rust ecosystem against potentially dangerous changes.

@dlongley
Copy link

dlongley commented Nov 3, 2015

@fenduru 👍

Looking at the examples from the documentation, it is, IMO, unexpected behavior for a file that happens to be a symlink to behave differently. I would expect it to behave the same as any other file, which is as @fenduru described.

Furthermore, the documentation gives one good reason for the way module searching is performed:

This allows programs to localize their dependencies, so that they do not clash.

The fact that searching does not work this way with symlink'd modules makes it very difficult to make use of localized dependencies (this is the chief complaint). As pointed out above, it also causes different (and unexpected) behavior with npm link. Perhaps this issue is better classified as an "unexpected behavior" bug rather than a request for change.

@dlongley
Copy link

dlongley commented Nov 3, 2015

Just a note here that a fix to the unexpected behavior shouldn't change the existing behavior for module identifiers that are native or that begin with '/', '../', or './'. In those latter three cases it does make sense to use realpath; changing that would be incorrect and I imagine would wreak havoc.

@kzc
Copy link

kzc commented Nov 3, 2015

I also found this require/symlink behavior to be unexpected and undesirable. It goes against common UNIX practice.

Out of curiosity - does node on Windows also have this require/symlink behavior? (At one time Windows did not support symlinks). If node behavior on UNIX and Windows differ in this regard that would strengthen to case against symlink resolving in require() on UNIX to be consistent.

At the very least an optional node command line flag could be introduced to disable resolving symlinks when require()ing.

@dlongley
Copy link

dlongley commented Nov 3, 2015

@kzc,

Out of curiosity - does node on Windows also have this require/symlink behavior?

Assuming that fs.realPathSync dereferences symlinks on windows then it seems so (but I only checked quickly): https://github.com/nodejs/node/blob/master/lib/module.js#L150

It looks like the problem is that no distinction is made between loading a module with an identifier that starts with '/', '../', or './' and one that doesn't -- all are given the realpath treatment (and it doesn't matter which OS you're using). If, instead, realpath were only used in the former cases, then symlinks would be treated like other files (I think) and you'd get the expected behavior and the ability to localize dependencies for symlink'd modules.

@kzc
Copy link

kzc commented Nov 3, 2015

Regardless of whether require()d modules have relative or absolute paths they still could be converted to canonicalized absolute paths without resolving symlinks. I see the need for the module cache to use absolute paths so that it knows when it's dealing with the same module referenced from various different relative paths. But if someone explicitly introduces a symlink in the path of a module then I'd argue that most would expect it to be treated differently than the symlink resolved version according to its non-resolved directory position.

@dlongley
Copy link

dlongley commented Nov 3, 2015

@kzc,

Regardless of whether require()d modules have relative or absolute paths they still could be converted to canonicalized absolute paths without resolving symlinks.

This is problematic for the case shown here: http://grokbase.com/t/gg/nodejs/12cbaygh7a/require-resolving-symlink

This is required because of the way that executables are typically
linked. On Unix, you have something like:

/usr/local/node_modules/foo/bin/foo.js

which does `require("../lib/foo-internals.js")`.

That script is linked to

/usr/local/bin/foo

If we didn't realpath, then it'd be looking for
/usr/local/lib/foo-internals.js instead of
/usr/local/node_modules/foo/lib/foo-internals.js.

@kzc
Copy link

kzc commented Nov 3, 2015

The node start script is a special case that could be symlink resolved if the file has a shebang or a non .js suffix.

@akyoto
Copy link

akyoto commented Nov 10, 2015

I'm having the same problem as @fenduru.

  1. Lib installs plugins via npm install, they end up in the correct directory (user project directory).
  2. I also checked require.main.paths[0], it's 100% correct and contains the user project directory.
  3. Nonetheless: require and require.resolve both throw.

This is unexpected.

It only happens when symlinked / npm link'd. In a normal installation everything works as expected.
I don't want to npm publish every single change - how am I gonna continue developing the lib?

Since the directory is actually included in require.main.paths this issue is undoubtedly a bug if you ask me. At the very least it's worth fixing.

👍

@ghost
Copy link

ghost commented Dec 4, 2016

Symlinking can be fixed, and works great :)

@majid4466
Copy link

Since this issue has the discuss label, I thought I could ask here ...

The second best thing to being able to do const WebSocket = require('../vendors/node/node_modules/ws'); (which cannot be done) is having a symlinked node_modules directory like below.

Which version of the behavior would allow the following?

/var/www/project/
├── app
│   └── ws
│       ├── node_modules -> /var/www/project/vendors/node/node_modules
│       └── wsserver.js
└── vendors
    └── node
        ├── node_modules
        └── package.json

nightkr added a commit to nightkr/yarn2nix that referenced this issue Jun 15, 2018
zimbatm pushed a commit to nix-community/yarn2nix that referenced this issue Jun 17, 2018
* Preliminary support for local dependencies

* Fixed building packages with no dependencies

* Transitive local dependencies

* Don't resolve workspace dependencies separately

* Typo fix

* Test case for workspace dependencies

* Copy workspace dependencies, otherwise their dependencies can't be resolved

See nodejs/node#3402 for more details

* Let mkYarnModules determine its own name

* Don't use passAsFile

* Got rid of a leftover comment
@mk-pmb
Copy link

mk-pmb commented Mar 14, 2019

@majid4466 symlinking that node_modules dir should work with the current node.

As for the original opening question: dep1 probably shouldn't require dep2, but instead provide a factory that can produce whatever based on dep2. The factory could have a meta data property to declare which other modules the app (or a plugin manager) needs to provide. It's one flavor of dependency injection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issues opened for discussions and feedbacks. module Issues and PRs related to the module subsystem.
Projects
None yet
Development

No branches or pull requests