Workspace support #1505

Open
frostming opened this issue Nov 9, 2022 · 22 comments
Assignees
Labels
⭐ enhancement Improvements for existing features

Comments

@frostming
Collaborator

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

@frostming frostming added the ⭐ enhancement Improvements for existing features label Nov 9, 2022
@noirbizarre
Member

That would be awesome.
Do you already have something in mind? The same approach as Cargo?

@frostming
Collaborator Author

frostming commented Nov 19, 2022

Yes it would be similar to Cargo:

  • pdm init --workspace to add a tool.pdm.workspace setting in pyproject.toml
  • When the section is present, commands switch to workspace mode and perform the same operation in each subpackage.
  • All subpackages will be installed in editable mode.
  • --include/--exclude options to control the range of subpackages
  • The lockfile will lock the dependencies in all subpackages.

For example, the content of pyproject.toml:

[tool.pdm.workspace]
packages = ["packages/*"]

File structure:

.
├── pyproject.toml
└── packages
   ├── foo
   │   └── pyproject.toml
   └── bar
       └── pyproject.toml

pdm add click will add click to the dependencies of both foo and bar.
pdm run test will run the test script (if it exists) for both foo and bar.
pdm add --include foo: foo only.
pdm add --exclude foo: all but foo.
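
The member selection described above can be sketched in Python. This is only an illustration of the proposal, not PDM code: find_members is a hypothetical helper, and matching --include/--exclude against the directory name is an assumption.

```python
from fnmatch import fnmatch
from pathlib import Path


def find_members(root, patterns, include=(), exclude=()):
    """Return workspace member directories under root, sorted by name.

    Hypothetical helper: `patterns` plays the role of the proposed
    `packages` setting, `include`/`exclude` the proposed CLI options.
    """
    members = set()
    for pattern in patterns:
        for candidate in Path(root).glob(pattern):
            # Only directories that carry their own pyproject.toml count.
            if (candidate / "pyproject.toml").is_file():
                members.add(candidate)

    selected = []
    for member in sorted(members, key=lambda path: path.name):
        # Assumption: filters are matched against the directory name.
        if include and not any(fnmatch(member.name, pat) for pat in include):
            continue
        if any(fnmatch(member.name, pat) for pat in exclude):
            continue
        selected.append(member)
    return selected
```

With the file structure shown earlier, find_members(".", ["packages/*"], include=["foo"]) would select only packages/foo, and exclude=["foo"] would select everything else.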

@sanmai-NL
Contributor

What would be the best workaround for now to emulate this feature at the least cost, @frostming?

@frostming
Collaborator Author

frostming commented Dec 12, 2022

What would be the best workaround for now to emulate this feature at the least cost, @frostming?

Using editable installs. Here is a simple example.

parent pyproject.toml:

[tool.pdm.dev-dependencies]
workspace = [
  "-e file:///${PROJECT_ROOT}/packages/foo#egg=foo",
  "-e file:///${PROJECT_ROOT}/packages/bar#egg=bar",
]

Metadata of foo: packages/foo/pyproject.toml:

[project]
name = "foo"
version = "0.1.0"

[build-system]
requires = ["pdm-pep517"]
build-backend = "pdm.pep517.api"

Metadata of bar: packages/bar/pyproject.toml:

[project]
name = "bar"
version = "0.1.0"
dependencies = ["foo"]  # specify dependency of other packages inside your workspace as named requirement

[build-system]
requires = ["pdm-pep517"]
build-backend = "pdm.pep517.api"

Then run pdm install in the parent project. It will generate a pdm.lock and install foo and bar in editable mode into the environment. Since they are in editable mode, any modification to the foo or bar packages takes effect without reinstallation.

This should work NOW. The proposal in this issue will be a wrapper around the above with a friendlier UI.
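
For a workspace with many sub-packages, the entries of the workspace group could be generated instead of typed by hand. A minimal sketch, assuming every sub-package lives directly under packages/ and that its directory name equals its project name (both are assumptions, and editable_requirements is a hypothetical helper):

```python
from pathlib import Path


def editable_requirements(root, packages_dir="packages"):
    """Build the editable requirement strings for the parent pyproject.toml."""
    entries = []
    for child in sorted((Path(root) / packages_dir).iterdir()):
        # Skip anything that is not a package with its own metadata.
        if (child / "pyproject.toml").is_file():
            # Assumption: the egg name is the directory name.
            entries.append(
                f"-e file:///${{PROJECT_ROOT}}/{packages_dir}/{child.name}#egg={child.name}"
            )
    return entries


if __name__ == "__main__":
    # Print the lines to paste into the workspace dev-dependency group.
    for entry in editable_requirements("."):
        print(f'  "{entry}",')
```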

@lqhuang

lqhuang commented Feb 4, 2023

Background: mainly Python, sometimes Scala (sbt) and Rust (cargo)

I'd also like to propose a file layout. Here is an example of my ideal PDM workspace:

.
├── docs
├── package-01
│   ├── docs
│   ├── pyproject.toml
│   ├── src
│   │   └── mynamespace
│   │       └── foo
│   └── tests
├── package-02
│   ├── docs
│   ├── pyproject.toml
│   ├── src
│   │   └── mynamespace
│   │       └── bar
│   └── tests
├── package-03
│   ├── docs
│   ├── mynamespace
│   │   └── baz
│   ├── pyproject.toml
│   └── tests
├── package-04
│   ├── docs
│   ├── pyproject.toml
│   ├── src
│   │   └── package_04
│   └── tests
├── package-05
│   ├── docs
│   ├── package_05
│   ├── pyproject.toml
│   └── tests
├── pyproject.toml
└── tests

And let me add some notes:

  1. In my case, the reason I want to use the mono-repo pattern for Python is that I maintain several packages that share the same top-level namespace (for enterprise internal tools, this could also be an org name), but they are currently developed and released in different repos. So in my proposal, package-01, package-02 and package-03 follow PEP 420 and share the same namespace, except that package-03 doesn't use the src layout.
  2. package-04 and package-05 represent the standard case without a shared namespace, with and without the src layout respectively.
  3. Each package has its own tests and docs directories and its own pyproject.toml.

Other thoughts:

  1. I place all packages at the top level because I'm worried about overly deep directory nesting when the src + namespace layout is applied. I think that's fine if the mono-repo is a pure Python project, but projects developed in multiple languages can have more complex structures. So the location could be configurable, like:

    [tool.pdm.workspace] # pyproject.toml in root dir
    # pick one of the following alternatives:
    packages = ["./*"]  # search by glob pattern
    packages = ["packages/*"]  # nodejs style
    packages = ["modules/*"]  # jvm style
    packages = ["src/*"]  # rust/c/c++ style

    packages = ["packages/foo", "packages/bar", "packages/baz"]  # manually specified

  2. (Optional) Every sub-package in the mono-repo could be developed or run as if it were an individual project, without the top-level tool. This would affect the schema design of the sub-package's pyproject.toml. Of course, this is probably not necessary unless somebody wants to run simple development routines on a sub-package directly.

  3. Sub-packages could share the version defined in the top-level pyproject.toml, or use an individual version controlled by the sub-package itself. For example, some packages would be released at the same time, while some auxiliary utilities could be released on different schedules.

  4. There are still docs and tests directories at the top level, serving as a general entry point for the documentation site and for integration tests across all sub-packages.

I created a repo (which could be transferred to the PDM org) to illustrate the current proposals for a PDM workspace. Maybe we could write some specifications there, or use the repo for unit tests and examples in the future.

Feedback is welcome and appreciated. I'm willing to help develop the workspace feature because I could use it in my projects right now. My concern is that I'm not yet an expert in Python packaging, so guidance from a mentor would probably be required.

Finally, thanks for @frostming's efforts!

@lqhuang

lqhuang commented Feb 6, 2023

The next problem I'm struggling with is how to make linter tools (pre-commit hooks, mypy, ruff, etc.) fit a mono-repo.

@carderne

carderne commented Oct 10, 2023

Just sharing my experience of trying to use pdm for a monorepo setup in case it's useful to anyone.
I used the pdm-example-monorepo as a starting point.

  • You have to be careful running pdm add ... or anything else in a sub-package, or it will try to create a brand new pdm project there for you (.venv, pdm.lock etc). This is annoying, but I'm okay with adding dependencies manually.
  • pdm run my_script doesn't work from sub-packages; it just creates a fresh .venv etc. and then fails. We could get around this by activating the venv and using Make, but that's not great.
  • Activating venvs from sub-packages doesn't really work...

Basically you can probably technically make it work, but you'll need to write a bunch of custom scripts to make the DX tolerable.

But it doesn't seem far to go, just some CLI sugar! I guess monas is your (frostming) experiment to resolve this?

@jacksonwb

jacksonwb commented Nov 14, 2023

How does building wheels of packages with path dependencies work in this context?

@frostming
Collaborator Author

frostming commented Nov 15, 2023

How does building wheels of packages with path dependencies work in this context?

I would prefer to specify dependency versions in sub-packages rather than path dependencies. Those dependencies, if they are included in the workspace, will be installed from their local paths.

Something like:

[project]
dependencies = [
    "foo==${workspace_version}"
]

When the package is built, the version variable will be replaced with the real version in the current workspace.

However, this is not valid PEP 621 metadata, which doesn't allow ${...} in the version part. So we would either break the standard or use our own table to specify metadata, like poetry. Still brainstorming.
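
The build-time replacement could look roughly like the sketch below. It only illustrates the idea under discussion: render_dependencies is a hypothetical helper, and the placeholder name workspace_version is taken from the example above, not from any existing PDM feature.

```python
from string import Template


def render_dependencies(dependencies, workspace_version):
    """Replace ${workspace_version} in each requirement string.

    safe_substitute leaves requirements without the placeholder untouched
    and does not raise on other, unknown ${...} variables.
    """
    return [
        Template(dep).safe_substitute(workspace_version=workspace_version)
        for dep in dependencies
    ]


# Example: the rendered list is what would end up in the built metadata.
rendered = render_dependencies(["foo==${workspace_version}", "click>=8"], "0.1.0")
```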

@sanmai-NL
Contributor

Imitating other popular tools may be a wise way to converge on a standard over time.

@DavidVujic

DavidVujic commented Jan 24, 2024

I think this could be related to what I just wrote in a PDM discussions thread about monorepos:
#1861 (comment)

@flyingleafe

@frostming any progress with the above proposal? Is there any way to help with this coming to fruition?

The workaround using editable installs is not working well. pdm install in the root monorepo directory does not detect changes in the subpackages' pyproject.toml files: when I add a new dependency to the file manually, it does not get installed. I have to remove and add the editable package in the monorepo config each time. Scripting workarounds that make it work are possible, but they are very ugly.

@frostming
Collaborator Author

I have to remove and add the editable package in the monorepo config each time.

No, just run pdm update to pick them up.

@sanmai-NL
Contributor

A fruitful initial design could be to accept multiple project root directories on the CLI and complete the operation in parallel. A later workspace definition-based feature could reuse a lot of this early work by running a subprocess under the hood.
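
A rough sketch of that idea, using only the standard library: the same command is run once per project root, in parallel, via subprocesses. run_in_projects is a hypothetical name, and the choice of a thread pool and of returning exit codes is an assumption, not a design agreed on in this thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_in_projects(command, project_roots, max_workers=4):
    """Run `command` in every project root and map each root to its exit code."""

    def run_one(root):
        # Each subprocess uses the project root as its working directory.
        completed = subprocess.run(
            command, cwd=root, capture_output=True, text=True
        )
        return root, completed.returncode

    # Threads are enough here: the work happens in the child processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, project_roots))
```

For example, run_in_projects(["pdm", "install"], ["packages/foo", "packages/bar"]) would run the install once in each sub-package and report which ones failed.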

@flyingleafe

@frostming Okay, pdm update works for my issue, thanks! Could you give some advice on how to also pick up the dev dependencies of the sub-packages, defined in the [tool.pdm.dev-dependencies] section of each sub-package's pyproject.toml?

@alexcochran

@frostming any progress with the above proposal? Is there any way to help with this coming to fruition?

I would also be happy to help with any development

@frostming frostming self-assigned this Apr 2, 2024
@frostming
Collaborator Author

@alexcochran Many things are still undetermined, such as how to reuse or refer to the config from the parent project in subprojects, and that will inevitably bring some new fields to the [project] table. I am wondering whether at least part of it can be standardized, along the lines of PEP 735: Dependency groups.

@frostming frostming pinned this issue Apr 2, 2024
@alexcochran

@frostming I think that's a great direction. On the Node side, I think PNPM is a great model to reference. Their workspace design makes monorepo setup pretty trivial, and you can specify production and development dependencies for each project in its own package.json, very similar to how you have things working already.

@DavidVujic

There are several ways of organizing a monorepo (of course), and I suggest having a look at the way the Polylith Architecture solves these kinds of problems. I understand the need for PDM features and a Cargo-like way of doing that. Polylith has a different take and focuses on sharing code between projects and on a nice "single project"-like developer experience. This means that linting and all other dev-related tooling are the same across all code in the monorepo. You can already use this architecture with PDM today.

I am the developer of the tooling support for this in Python, and there is a PDM-specific hook available. Really nice work with this way of interacting with PDM, by the way 👏 ⭐ The hook system has made adding tooling like this really simple.

@frostming
Collaborator Author

frostming commented Apr 2, 2024

@DavidVujic IIUC, does it look like https://github.com/GreyElaina/Mina ?

BTW, polylith is a wonderful project, good job!

@DavidVujic

@DavidVujic IIUC, does it look like https://github.com/GreyElaina/Mina ?

BTW, polylith is a wonderful project, good job!

Thank you!

I haven't seen the Mina repo before, but will have a look to learn if there are similarities.

@damymetzke

I have some interest in this feature. I'm currently working with a monorepo using poetry at work; if workspaces are added to PDM, I will immediately start migrating. I'd like to support its development wherever I can.

For my use case, it's especially important that the feature works well with package registries. This is because I'm using a monorepo to manage multiple packages which need to be uploaded to an index. Dependencies would be specified as usual, using regular version requirements. In development it should prefer workspace packages if the version is valid, but when packaged and uploaded it should resolve versions using the index. I believe this behavior matches pnpm workspaces, which have been mentioned before.

I also want to clarify what is expected to happen when you install from a project rather than the workspace root. I would expect it to detect the workspace and function accordingly.
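
One way the detection could work is to walk up from the current project until a pyproject.toml with the proposed workspace table is found. This is a sketch only: find_workspace_root is a hypothetical helper, and [tool.pdm.workspace] is the table proposed earlier in this thread, not an existing PDM setting.

```python
from pathlib import Path

try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API


def find_workspace_root(start):
    """Return the nearest ancestor (or start itself) that declares a workspace."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        pyproject = directory / "pyproject.toml"
        if not pyproject.is_file():
            continue
        with pyproject.open("rb") as handle:
            data = tomllib.load(handle)
        # Assumption: the presence of the table alone marks a workspace root.
        if "workspace" in data.get("tool", {}).get("pdm", {}):
            return directory
    return None
```

Running a command from packages/foo would then resolve to the repository root, and the command could continue in workspace mode from there.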

Finally I want to clarify how non-pdm backends are treated. I think it makes sense to support alternative PEP 517-compliant backends when they are specified in the pyproject.toml file. Not only does this make migration easier, it may be required in some cases, such as when you want to link to Rust code using maturin. I feel like this would be a common enough occurrence in practice that it warrants explicit support.

10 participants