install optional dependencies by default #397

Open
pmeier opened this issue Apr 29, 2024 · 11 comments
Labels
type: RFD ⚖️ Decision making

Comments

@pmeier
Member

pmeier commented Apr 29, 2024

We currently suggest people install ragna[all] by default and only drop the [all] extra if they want fine-grained control

ragna/docs/install.md

Lines 9 to 23 in e0fe014

You can install Ragna and all recommended dependencies with:
```bash
pip install 'ragna[all]'
```
If you want to install a minimal version[^1]:
```bash
pip install ragna
```
[^1]:
The minimal version is for users who want fine-grained control over the dependencies
needed for the builtin components.

pip install ragna, which is what most people will try first, only gives them a bare-bones version, i.e. almost no builtin components other than the demo ones. This means new users have to consult the documentation to get Ragna into a reasonable state. Since "fine-grained control" is a power-user feature, the difficulty of installing Ragna is flipped: the simplest command serves power users, while new users need the more involved one.
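To see what a plain install leaves out, one could inspect the installed metadata. This is a minimal sketch using the standard library's `importlib.metadata.requires`. The helper names `filter_by_extra` and `requirements_behind_extra` are hypothetical, and the string matching on the `extra` marker is a simplification; a more robust tool would parse the markers with `packaging`.

```python
from importlib.metadata import PackageNotFoundError, requires


def filter_by_extra(declared: list[str], extra: str) -> list[str]:
    """Keep only the requirements that are gated behind the given extra."""
    # Metadata stores extras as environment markers, e.g. `ijson; extra == "all"`.
    # Both quoting styles occur in practice, so accept either one.
    markers = (f'extra == "{extra}"', f"extra == '{extra}'")
    return [
        requirement.split(";")[0].strip()
        for requirement in declared
        if any(marker in requirement for marker in markers)
    ]


def requirements_behind_extra(distribution: str, extra: str) -> list[str]:
    """List what an installed distribution only pulls in with the given extra."""
    try:
        declared = requires(distribution) or []
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    return filter_by_extra(declared, extra)


# Prints an empty list if ragna is not installed in the current environment.
print(requirements_behind_extra("ragna", "all"))
```

Everything this prints is what a new user currently misses after `pip install ragna`.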

We could create a ragna-base package, which corresponds to what we currently have for ragna, and make the default ragna what is currently ragna[all]. Note that we need to do something like that for other package managers anyway, e.g. conda, since they do not support optional dependencies. See #396 for a discussion (cc @kklein).

This of course has the downside that the packaging configuration and release workflow will get more complicated as we now have to release two packages in unison rather than one. We have #367 to automate the release process so this in theory shouldn't be too bad.

Regarding naming, I've borrowed the -base postfix from conda-store packages. The core postfix, e.g. botocore or pydantic-core, is not really an option, since we already have a ragna.core namespace and I don't want to associate this single namespace with the package. I'm open to other suggestions though.

@pmeier pmeier added the type: RFD ⚖️ Decision making label Apr 29, 2024
@arjxn-py
Contributor

arjxn-py commented May 2, 2024

Hi @pmeier, I'm not sure if I should be giving my opinion on an Issue labelled as RFD: Decision Making, but I really like this idea of giving the user fine-grained control with pip install ragna & give them a bare-bones version with a package say ragna-base

After reading through this issue I did a bit of web digging to learn how others handle this and whether there are best practices that need to be followed along the way. While doing this I found this interesting package:

I also found these:

  • Unidic, which takes around 1 GB on disk after install. They also distribute unidic-lite side by side, but it looks like they've been doing it manually
  • pytorch-lightning, which looks like a third-party distribution.

I'd be happy to work around this issue and I'd personally prefer to divide the problem statement into 2-3 smaller chunks & thus target it as a whole, i.e.:

  • Create a release & publish workflow for the existing infrastructure (I believe it would be nice to have "improve release process" #367 done first)
  • Then modify the workflow to cover both ragna & ragna-base
  • Replicate the above for conda as well

In addition, besides the -base suffix, we could also consider -lite or -slim.

I'd love to start working on #367 once you give the go-ahead. Thanks a lot 💐

@pmeier
Member Author

pmeier commented May 2, 2024

I'm not sure if I should be giving my opinion on an Issue labelled as RFD: Decision Making

You absolutely should. RFD is short for "request for discussion".

fine-grained control with pip install ragna & give them a bare-bones version with a package say ragna-base

You mixed something up here. "Fine-grained control" means the bare-bones version, since the user is then in total control of which dependencies they want to install. The ragna package should be "batteries included", i.e. all optional dependencies are installed without giving the user any choice.

This way, a new user has all the builtin features available right from the start and only when you have a more serious use case for Ragna, you can install the -base version and only the dependencies you need.

I'd be happy to work around this issue and I'd personally prefer to divide the problem statement into 2-3 smaller chunks & thus target it as a whole, i.e.:

Sounds good in general. Let me ask for some feedback on this offline before we start, though, to make sure we aren't heading in a direction that users don't actually want. Furthermore, I would not start with the release workflow. Let's create a proof of concept for the two packages ragna and ragna-base first and only afterwards create the workflow. I reckon the release process is fairly different.

@pmeier
Member Author

pmeier commented May 2, 2024

Emoji vote for the bare bones postfix. Multiple votes allowed per person. Option with most votes wins.

  • 🚀 -base
  • ❤️ -slim
  • 🎉 -lite
  • 👀 Something else, please comment.

@hameerabbasi

How about -core, like Dask?

@arjxn-py
Contributor

arjxn-py commented May 2, 2024

Thanks for the suggestion @hameerabbasi but in Philip's comment above:

The core postfix, e.g. botocore or pydantic-core is not really an option, since we already have a ragna.core namespace and I don't want to associate this single namespace with the package.

@pmeier
Member Author

pmeier commented May 3, 2024

Let's make a decision.

  1. There were no opposing voices to this change, so we are going to make the switch explained above.

  2. The poll right now looks like

    [image: screenshot of the poll reactions]

    In addition, some people commented offline without reacting here: 1 vote each for 🚀 -base and ❤️ -slim, and 2 votes for 🎉 -lite.

    This brings us to this total

    • 🚀 -base: 4
    • ❤️ -slim: 1
    • 🎉 -lite: 4

    Since we are tied, I'm going to decide and pick -base.

@arjxn-py You can go ahead and send a PoC PR to have the two packages. If we have that we can decide on a release workflow.

@arjxn-py
Contributor

arjxn-py commented May 5, 2024

@arjxn-py You can go ahead and send a PoC PR to have the two packages. If we have that we can decide on a release workflow.

Hi, I've already started trying things out and digging into this. Hopefully I'll be able to open a PoC PR this week. I'll also let you know in case I want to discuss something specific or need help.

@pmeier
Member Author

pmeier commented May 8, 2024

@arjxn-py I realized we haven't written down exactly what we want to achieve. So here it is:

  • pip install ragna-base should install our package as well as all hard dependencies that we currently have

    ragna/pyproject.toml

    Lines 23 to 44 in d7e4783

    dependencies = [
        "aiofiles",
        "emoji",
        "fastapi",
        "httpx",
        "importlib_metadata>=4.6; python_version<'3.10'",
        "packaging",
        "panel==1.3.8",
        "pydantic>=2",
        "pydantic-core",
        "pydantic-settings>=2",
        "PyJWT",
        "python-multipart",
        "redis",
        "questionary",
        "rich",
        "sqlalchemy>=2",
        "starlette",
        "tomlkit",
        "typer",
        "uvicorn",
    ]
  • The ragna distribution should have exactly the same version as ragna-base and hard-pin ragna-base to this exact version as a dependency. In addition, pip install ragna should install all optional dependencies that we currently have

    ragna/pyproject.toml

    Lines 53 to 65 in d7e4783

    [project.optional-dependencies]
    # to update the array below, run scripts/update_optional_dependencies.py
    all = [
        "chromadb>=0.4.13",
        "httpx_sse",
        "ijson",
        "lancedb>=0.2",
        "pyarrow",
        "pymupdf>=1.23.6",
        "python-docx",
        "python-pptx",
        "tiktoken",
    ]
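
The pinning rule in the second bullet can be sketched as follows. This is only an illustration under stated assumptions: the helper `metapackage_dependencies` is hypothetical and not part of Ragna, the version `0.3.0` is a placeholder, and the optional list is a subset of the current `all` extra.

```python
def metapackage_dependencies(version: str, optional: list[str]) -> list[str]:
    """Build the dependency list for the batteries-included ``ragna`` distribution.

    ``ragna-base`` is hard-pinned with ``==`` to the exact same version, followed
    by every dependency that is optional today.
    """
    return [f"ragna-base=={version}", *optional]


# Subset of the current ``all`` extra, for illustration only.
optional = ["chromadb>=0.4.13", "httpx_sse", "ijson"]

# "0.3.0" is a placeholder version, not an actual release number.
dependencies = metapackage_dependencies("0.3.0", optional)
print(dependencies)
```

With the exact pin, installing a given ragna release always resolves to the ragna-base release that was published alongside it.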

@arjxn-py
Contributor

arjxn-py commented May 8, 2024

I realized we haven't written down exactly what we want to achieve. So here it is

No worries, the good news is that I've already been working with exactly these requirements in mind. Thanks for further clarifying :), tbh the issue was very well written and self-explanatory 🚀. Kudos

@pmeier
Member Author

pmeier commented May 17, 2024

@arjxn-py Just thought of another requirement that we have: when inside the project root, pip install . needs to work. This is not just for setting up your development environment, but also SCM installs, e.g. pip install git+https://github.com/Quansight/ragna@my-branch.

Meaning one of the pyproject.toml files needs to be at the project root. I'm not sure yet which. Leaning towards the batteries included one, i.e. ragna.

@arjxn-py
Contributor

I'll make sure that this requirement is also fulfilled by testing it on my fork's poc branch. I'm still working on the shim setup.py, though (it's not behaving as expected at the moment, but I'm happy to keep trying on my end).

Meaning one of the pyproject.toml files needs to be at the project root. I'm not sure yet which. Leaning towards the batteries included one, i.e. ragna.

I'm also in favor of having the one with batteries included, you can confirm the decision :) (It'll be a small change even if we change our minds in future)

3 participants