Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process.chdir does not affect working directory used by skott for creating node id paths #149

Open
mattkindy opened this issue Mar 8, 2024 · 18 comments
Labels
bug Something isn't working

Comments

@mattkindy
Copy link

Summary

When running an analysis with the API and specifying a directory via cwd, the node ids end up being paths relative to the parent node process's current working directory. For example, if running a node script index.js from /usr/application which runs

const api = await skott({
  cwd: '/some/other/path'
  incremental: false,
  circularMaxDepth: 100
});

const { graph } = await api.getStructure();

Node ids would follow the form ../../some/other/path/index.js instead of the expectation of index.js (or even just /some/other/path/index.js). Whether this is intentional or not is difficult to tell.

However, even when running process.chdir('/some/other/path') prior to the above, it doesn't affect the node ids. I would expect that to have some effect.

Reproduction steps

I tried to make a codesandbox that fails, but cannot reproduce there. However, running the following locally causes me the issue

  1. Create index.js file at /tmp/repo/index.js
  2. Run the following with node script.js

script.js

import process from 'process';
import skott from 'skott';

process.chdir('/tmp/repo');

const api = await skott({
  cwd: '/tmp/repo',
  incremental: false,
});

const { graph } = await api.getStructure();
console.log(JSON.stringify(graph, null, 2) + '\n');

Expected result: Graph with single node with id index.js

Actual result: Depending on where running from, graph with single node with id like ../../../tmp/repo/index.js

Details

This appears to be related to the following line:
https://github.com/antoine-coulon/skott/blob/295d4b8aed0593f0f7a738e4d0214cd4520ba32a/packages/skott/src/skott.ts#L158C3-L158C18

this.#baseDir = '.';

This #baseDir is updated during buildFromEntrypoint, but not in buildFromRootDirectory (which is invoked here). In resolveNodePath, this ends up returning path.relative(this.#baseDir, nodePath). I expect the solution here is one of the following:

  • Add an extra parameter to skott that allows the specification of the baseDir of the repo for analysis explicitly. This makes sense when used in conjunction with cwd in the context of a monorepo. Right now, skott implicitly assumes via the API that it's being executed in the root of the project under analysis.
  • When running buildFromRootDirectory, change the #baseDir to be the cwd specified in the options. This may have some interactions when accessing scope "above" the cwd that I'm not aware of, however.

Standard questions

Question Answer
skott installed version? 0.33.0
Operating system? Mac
Would you consider contributing a PR? Yes
Node.js version (node -v)? 18.19.1
@mattkindy mattkindy added the bug Something isn't working label Mar 8, 2024
@antoine-coulon
Copy link
Owner

Hello @mattkindy,

It seems that this is not a bug and actually the expected behavior when using cwd. Just to be sure I'm understanding correctly your point, let me do a quick recap of the situation:

Given a folder such as <root>/package-a/index.js and script.js invoking skott API from there such as:

import skott from "skott";

skott({ cwd: "package-a" })

skott uses cwd as a way of knowing a starting point from the analysis, but not really from a process.cwd standpoint, more like a base directory to know what files should be analyzed. So maybe the problem there is that cwd name is misleading as it lets the user think that after using cwd: "package-a" the process.cwd will be set from <root> to <root>/package-a. So keeping that base directory from node's process where skott is being executed is a way of having node ids closer to a meaningful file system tree where instead of having relative paths everywhere, you just start from the universal directory where the script was run from.

Consequently, resolved files under cwd will still be containing <root> in the path given that the real starting point from a file-system perspective is the directory from which the script is invoked, that is <root>.

The generated graph would then be:

{
   "package-a/index.js": { }
}

While you would expect it to be

{
   "index.js": {}
}

given the cwd option?

This is basically the current difference when providing or not an entrypoint, because if you did skott({ entrypoint: "package-a/index.js }) the graph would actually be

{
   "index.js": {}
}

because it gets rid of the base directory from which the script was started and use the base directory from the entrypoint to settle the analysis base directory.

Reproduction

I just want to be sure that we are talking about the same thing and because I was not able to fully grasp the problems you faced with the provided examples, here is a StackBlitz reproduction sandbox that describes your current use case and the expected behavior https://stackblitz.com/edit/stackblitz-starters-abyuzx?file=script.js (you can run the script by using npm run skott in the terminal).

Please if it does not match the issue you had in mind try to reproduce it there and send me the forked version

@mattkindy
Copy link
Author

Hey, @antoine-coulon

Yeah, sorry for the confusion, I think there are two parts here... one which may be a bug, one which seems to be more of a feature request.

skott uses cwd as a way of knowing a starting point from the analysis, but not really from a process.cwd standpoint, more like a base directory to know what files should be analyzed. So maybe the problem there is that cwd name is misleading as it lets the user think that after using cwd: "package-a" the process.cwd will be set from <root> to <root>/package-a. So keeping that base directory from node's process where skott is being executed is a way of having node ids closer to a meaningful file system tree where instead of having relative paths everywhere, you just start from the universal directory where the script was run from.

This was only slightly confusing -- in the docs, you mention this within the context of a monorepo, so this made sense. I originally tried to use cwd thinking it would behave differently, but ultimately, this is understood. (Side note, I do find it somewhat confusing given the lack of a "base directory" option).

From a possible bug perspective, though, I would expect if I run process.chdir('/something') prior to running skott, that we should expect /something to be the base directory ("universal directory where the script was run from") because that is the directory of the node process from which skott is executed at the time of dispatch. This is where I was having trouble creating a reproduction, as it consistently occurs for me locally (in Docker and native macOS).

I was testing with both of these situations:

script.js

process.chdir('/tmp/repo');
const api = await skott({
  incremental: false,
});

const graph = await api.getStructure();

script_cwd.js

process.chdir('/tmp/repo');
const api = await skott({
  cwd: '/tmp/repo',
  incremental: false,
});

const graph = await api.getStructure();

The first ends up performing analysis from the directory the original script was executed from (e.g. if I ran script.js from /usr/local, it would attempt analysis of /usr/local). The second gives me an analysis of /tmp/repo, but with relative paths from /usr/local.

You can see my attempt to reproduce this behaviour here, but for some reason, I cannot replicate this behaviour in CodeSandbox. Another example forked from your Stackblitz, where I cannot reproduce!


From a feature request perspective, it would be nice to have the opportunity to communicate to skott this "base directory", which could default to '.' for backwards compatibility. That way, we could take this

instead of having relative paths everywhere, you just start from the universal directory where the script was run from

and make sure it is true.


Ultimately, I'm happy to convert this to a feature request or something at this point, as I cannot reproduce outside of my local setup, despite everything being consistent from my perspective. Let me know if this sufficiently clarifies what I'm experiencing.

@mattkindy mattkindy changed the title Node ids are created with paths relative to . instead of cwd Node working directory is not used by skott for creating node id paths Mar 8, 2024
@mattkindy mattkindy changed the title Node working directory is not used by skott for creating node id paths process.chdir does not affect working directory used by skott for creating node id paths Mar 8, 2024
@mattkindy
Copy link
Author

For now I'm dealing with this with

    const normalizeId = (id: string) => path.relative(cwd, id);

    const { graph } = await api.getStructure();
    return Object.fromEntries(
      Object.entries(graph).map(([id, graph]) => {
        const basePath = normalizeId(id);
        return [
          basePath, 
          { 
            ...graph, 
            id: basePath, 
            adjacentTo: graph.adjacentTo.map(normalizeId) 
          }
        ];
      })
    );

@antoine-coulon
Copy link
Owner

Thanks for such a complete feedback @mattkindy!

This was only slightly confusing -- in the docs, you mention this within the context of a monorepo, so this made sense. I originally tried to use cwd thinking it would behave differently, but ultimately, this is understood. (Side note, I do find it somewhat confusing given the lack of a "base directory" option).

I agree, as already stated I think it's a bit misleading, cwd lets the user think that the working directory from a process perspective will be updated, but it's not.

Consequently from an analysis standpoint, the behavior is strictly the same and the graph should end up being the same (if not then it's a bug). So in the end it's a matter of what type of node path you want, whether relative from the cwd or kind of absolute from skott script location.

It turns out that most of the time I found the DX to be better and the graph to be more visible when getting rid of all the relative path madness ../../module.js -> ../../../lib/whatever.js, but as you mentioned the solution might be to let the choice of what should be the base directory (so in the same spirit as when using an --entrypoint, you have the choice of keeping the base directory or to prune it) and it would kind of unify both behaviors (entrypoint vs root dir).

In addition to the option that should be available, there is also the fact of changing the current working directory outside of skott that you mentioned: using process.chdir. This one should normally work when not providing any cwd as the default runtime config takes process.cwd() as the default so it might be a bug indeed.

I won't have the opportunity to dive in this week-end but I'll be able to spend some serious time on this issue starting from the beginning of next week :) Also if you ever want to land PRs around these topics, feel free to do so!

@antoine-coulon
Copy link
Owner

For now I'm dealing with this with

    const normalizeId = (id: string) => path.relative(cwd, id);



    const { graph } = await api.getStructure();

    return Object.fromEntries(

      Object.entries(graph).map(([id, graph]) => {

        const basePath = normalizeId(id);

        return [

          basePath, 

          { 

            ...graph, 

            id: basePath, 

            adjacentTo: graph.adjacentTo.map(normalizeId) 

          }

        ];

      })

    );

Glad to know you found a way to circumvent that! Could you please paste there some pieces of both the before/after graphs when using the normalization process?

@mattkindy
Copy link
Author

Glad to know you found a way to circumvent that! Could you please paste there some pieces of both the before/after graphs when using the normalization process?

Happy to do so once I get a minute!

In addition to the option that should be available, there is also the fact of changing the current working directory outside of skott that you mentioned: using process.chdir. This one should normally work when not providing any cwd as the default runtime config takes process.cwd() as the default so it might be a bug indeed.

Yeah, this one is quite strange. Given the reproduction difficulties, I'm tempted to say there's something off about my local environment, but can't figure out what it is or why it would behave in this way.

Also if you ever want to land PRs around these topics, feel free to do so!

👍

@mattkindy
Copy link
Author

Graphs --

Before

 "../../../../../../../var/folders/8p/k8mkhgrs4xjdvhk3l5rgy3fh0000gn/T/clt97320x0000xdnrquarnpd3-repositoriesICHELB/account/repo/aws-events.ts": {
    "id": "../../../../../../../var/folders/8p/k8mkhgrs4xjdvhk3l5rgy3fh0000gn/T/clt97320x0000xdnrquarnpd3-repositoriesICHELB/account/repo/aws-events.ts",
    "adjacentTo": [],
    "body": {
      "size": 439,
      "thirdPartyDependencies": [],
      "builtinDependencies": []
    }
  },
  "../../../../../../../var/folders/8p/k8mkhgrs4xjdvhk3l5rgy3fh0000gn/T/clt97320x0000xdnrquarnpd3-repositoriesICHELB/account/repo/jest.ci.config.js": {
    "id": "../../../../../../../var/folders/8p/k8mkhgrs4xjdvhk3l5rgy3fh0000gn/T/clt97320x0000xdnrquarnpd3-repositoriesICHELB/account/repo/jest.ci.config.js",
    "adjacentTo": [
      "../../../../../../../var/folders/8p/k8mkhgrs4xjdvhk3l5rgy3fh0000gn/T/clt97320x0000xdnrquarnpd3-repositoriesICHELB/account/repo/jest.config.js"
    ],
    "body": {
      "size": 179,
      "thirdPartyDependencies": [],
      "builtinDependencies": []
    }
  },

VS

After

  "aws-events.ts": {
    "id": "aws-events.ts",
    "adjacentTo": [],
    "body": {
      "size": 439,
      "thirdPartyDependencies": [],
      "builtinDependencies": []
    }
  },
  "jest.ci.config.js": {
    "id": "jest.ci.config.js",
    "adjacentTo": [
      "jest.config.js"
    ],
    "body": {
      "size": 179,
      "thirdPartyDependencies": [],
      "builtinDependencies": []
    }
  },

@antoine-coulon
Copy link
Owner

Thanks for providing these graphs @mattkindy.
I am able to reproduce the behavior consistently so I should be able to land something confidently.

Let's just recap on the situation to be sure it will fit your expectations.

Given an absolute path such as /var/folders/3v/some-project (basically a temp folder) and a skott's process being run in the following folder /Users/johnskott/blabla/some-folder with that:

skott({ cwd: "/var/folders/3v/some-project" })

skott will have graph node paths written in a relative way from where it was initially executed (/Users/johnskott/blabla/some-folder) such as ../../../../../../../../../var/folders/3v/some-project/some-file.js.

So the graph is correctly computed but paths are ugly as the location of the folder is higher FS-wise to the location where skott was executed. Regarding that, we could achieve normalization (what you pasted there) by default and otherwise when using --includeBaseDir (which is currently disabled when using --cwd), paths would look exactly as it is now, because including the base directory in that case means that paths should contain the information where skott was run.

To come back a little bit on "process.chdir", I found the problem with skott and I'll land a fix (this is orthogonal to the subject just above)

It was indeed due to process.cwd() being eagerly executed there

cwd: withDefaultValue(process.cwd())(D.string),

so the following would work for you as expected and would provide the expected paths that is directly starting from the temp location:

process.chdir("/var/folders/3v/lr71j2ks5flgnjw2n82qf49w0000gn/T/skott_store/knex/knex@7363211");

skott().then(({ getStructure }) => {
  return getStructure();
})

@mattkindy
Copy link
Author

Yes, that's exactly right. Dang, I was looking around the repository for what you found in the config file, but missed it. Nice!

@antoine-coulon
Copy link
Owner

That's usually the type of small bug that can make you waste hours 😆

I just released 0.33.1 including the small patch for process.chdir, let me know how that goes for you on that side

@mattkindy
Copy link
Author

Unfortunately, this didn't seem to fix my particular issue. I ran some logging statements in the node_modules for skott itself and for some reason, a process.chdir from my main script didn't affect the process.cwd from within skott itself 🤯

@antoine-coulon
Copy link
Owner

Hello @mattkindy, are you sure you're using the 0.33.1 version? From the tests I did, it indeed seems to alter the cwd skott is using. Here is a StackBlitz reproduction with the behavior (if I'm not mistaken) you expected https://stackblitz.com/edit/stackblitz-starters-abyuzx?file=script.js

Note that combining both process.chdir and cwd won't work, as the later will override the former.

❯ node script.js

With default process.cwd {
  'package-a/lib/index.js': {
    id: 'package-a/lib/index.js',
    adjacentTo: [],
    body: { size: 0, thirdPartyDependencies: [], builtinDependencies: [] }
  },
  'package-a/src/index.js': {
    id: 'package-a/src/index.js',
    adjacentTo: [ 'package-a/lib/index.js' ],
    body: { size: 67, thirdPartyDependencies: [], builtinDependencies: [] }
  },
  'script.js': {
    id: 'script.js',
    adjacentTo: [],
    body: { size: 342, thirdPartyDependencies: [], builtinDependencies: [] }
  }
}

With mutated process.cwd {
  'lib/index.js': {
    id: 'lib/index.js',
    adjacentTo: [],
    body: { size: 0, thirdPartyDependencies: [], builtinDependencies: [] }
  },
  'src/index.js': {
    id: 'src/index.js',
    adjacentTo: [ 'lib/index.js' ],
    body: { size: 67, thirdPartyDependencies: [], builtinDependencies: [] }
  }
}

If you are not able to reproduce the behavior then it might be another issue.

@mattkindy
Copy link
Author

I meant to respond to this after I did a more thorough investigation, but the short answer is -- yes, I was using 0.33.1, confirmed by looking at the actual node_modules and seeing the change you made.

I ran some logging statements in the node_modules for skott itself and for some reason, a process.chdir from my main script didn't affect the process.cwd from within skott itself 🤯

I actually went and and modified node_modules to add these logging statements and for some reason beyond my comprehension, the values of process.cwd run from within the node_modules vs within my code which calls to skott were different.

@antoine-coulon
Copy link
Owner

antoine-coulon commented Apr 5, 2024

Hello @mattkindy, were you able to check the StackBlitz repository I linked with an example just above? Does it match the behavior you're actually reproducing with your own code base?

In the meantime I'll investigate this time in a deeper way to see if I can catch any inconsistency regarding process.cwd management

@mattkindy
Copy link
Author

@antoine-coulon the StackBlitz repository you sent appears to function as I wanted mine to. It doesn't match what I'm seeing locally on my mac

@antoine-coulon
Copy link
Owner

Hello @mattkindy, I'm also using a Mac and it seems to work as expected out of StackBlitz, I'll investigate to see if something related to process.cwd is still eagerly executed but if I don't have a way to reproduce the failure it will be hard to be guarantee a fix for it

@mattkindy
Copy link
Author

mattkindy commented May 8, 2024

Went ahead and tested different node versions to make sure that wasn't the issue (the stackblitz was on 18.18.0, I was on ~18.19.0). Still distinct behaviour between the two, and not working locally for me, but working fine in the stackblitz.

@antoine-coulon
Copy link
Owner

Hey @mattkindy, the provided example on StackBlitz is working fine for me and I'm running on MacOS Sonoma (14). I tried with both Node.js v18.17.0 and v20.9.0 and they seem to work fine :/

When testing on your Mac, does process.chdir have an effect or not at all? What is the difference of behavior you're witnessing between StackBlitz and your Mac? Also given that process is a global object and side-effects can be processed even in library code, were you able to isolate skott and only skott when running examples?

I ran some logging statements in the node_modules for skott itself and for some reason, a process.chdir from my main script didn't affect the process.cwd from within skott itself 🤯

Were you using the skott API? I went through the codebase and there is no eager evaluation of skott anymore from the core, so any change of chdir should be reflected. The only places left would be if using the scripts from the /bin folder used by the CLI but from what I could understand you're not doing that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants