Pipe-based dataframe manipulation library that can also transform data in SQL databases.
To install the package locally in development mode, you first have to install Poetry. After that, install pydiverse transform like this:
git clone https://github.com/pydiverse/pydiverse.transform.git
cd pydiverse.transform
# Create the environment, activate it and install the pre-commit hooks
poetry install
poetry shell
pre-commit install
After installation, you should be able to run:
poetry run pytest
For publishing to PyPI with Poetry, see: https://www.digitalocean.com/community/tutorials/how-to-publish-python-packages-to-pypi-using-poetry-on-ubuntu-22-04
Packages are first released on test.pypi.org:
- see https://stackoverflow.com/questions/68882603/using-python-poetry-to-publish-to-test-pypi-org
- poetry version prerelease or poetry version patch
- push increased version number to main branch
- poetry build
- poetry publish -r test-pypi
- verify with https://test.pypi.org/search/?q=pydiverse.transform
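The `-r test-pypi` option only works if Poetry already knows a repository with that name. A minimal one-time setup might look like the following sketch; the repository name matches the command above, and the token value is a placeholder that you need to replace with a token created on test.pypi.org.

```shell
# One-time setup so that `poetry publish -r test-pypi` knows where to upload.
poetry config repositories.test-pypi https://test.pypi.org/legacy/

# Store an API token for that repository (placeholder, substitute your own).
poetry config pypi-token.test-pypi "<your-test-pypi-token>"
```

Both settings are stored in Poetry's user configuration, so they do not need to be repeated for each release.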
Finally, they are published via:
git tag <version>
git push --tags
poetry publish
Conda-forge packages are updated via:
- https://github.com/conda-forge/pydiverse-transform-feedstock#updating-pydiverse-transform-feedstock
- update recipe/meta.yaml
- test meta.yaml in transform repo: conda-build build ../pydiverse-transform-feedstock/recipe/meta.yaml
- commit recipe/meta.yaml to branch of fork and submit PR
To facilitate easy testing, we provide a Docker Compose file to start all required servers.
Just run docker compose up in the root directory of the project to start everything, and then run pytest in a new terminal tab.
Afterwards you can run:
poetry run pytest --postgres --mssql
For running @pytest.mark.ibm_db2 tests, you need to spin up a docker container without docker compose, since it needs the --privileged option, which docker compose does not offer.
docker run -h db2server --name db2server --restart=always --detach --privileged=true -p 50000:50000 --env-file docker_db2.env_list -v /Docker:/database ibmcom/db2
Then check docker logs db2server | grep -i completed until you see (*) Setup has completed.
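Checking the logs by hand can be replaced with a small polling loop. The following is only a sketch: the function names, the 10 second polling interval and the default timeout are assumptions, not part of the project.

```shell
# Succeeds if the text on stdin contains the DB2 completion marker.
setup_completed() {
  grep -qi 'setup has completed'
}

# Poll `docker logs` until the marker appears or the timeout (in seconds) elapses.
# Arguments: container name (default: db2server), timeout (default: 600).
wait_for_db2() {
  container="${1:-db2server}"
  timeout="${2:-600}"
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if docker logs "$container" 2>&1 | setup_completed; then
      echo "DB2 setup completed after ~${waited}s"
      return 0
    fi
    sleep 10
    waited=$((waited + 10))
  done
  echo "Timed out waiting for DB2 setup" >&2
  return 1
}
```

After sourcing these functions, `wait_for_db2 db2server 900 && poetry run pytest --ibm_db2` would start the tests as soon as the container reports that setup has finished.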
Afterwards you can run:
poetry run pytest --ibm_db2
To prepare and tag a release:
- poetry version prerelease or poetry version patch
- set correct release date in changelog.md
- push increased version number to main branch
- tag commit with git tag <version>, e.g. git tag 0.7.0
- git push --tags
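The tagging step can be guarded so that a malformed tag is never pushed. This is a minimal sketch, assuming versions look like 0.7.0 or carry a prerelease suffix such as 0.7.1a0; both function names are hypothetical helpers and not part of the project's tooling.

```shell
# Hypothetical helper: succeeds only if the argument looks like X.Y.Z,
# optionally followed by a prerelease suffix such as a0, b1 or rc2.
is_release_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+((a|b|rc)[0-9]+)?$'
}

# Tag the current commit and push the tag, refusing malformed versions.
tag_release() {
  version="$1"
  if ! is_release_version "$version"; then
    echo "refusing to tag: '$version' does not look like a version" >&2
    return 1
  fi
  git tag "$version" && git push --tags
}
```

One possible use is `tag_release "$(poetry version -s)"`, where `poetry version -s` prints only the version number from pyproject.toml.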