General-purpose structure template for DS/ML/CV/DL/AI projects and Development Style Guide

NikolasEnt/AI-project-template

Python base project template

This is a basic Python project that can be used as a starting point for any Data Science or AI projects.

The template includes:

  • Project structure, defined by the repo
  • Docker-based development environment, described below
  • Development Style Guide

Project structure

  • data - Data, including datasets for training and saved models. In most cases, the contents of this folder should not be tracked by Git
  • src - The project source code
  • scripts - Small standalone scripts
  • configs - Configuration files: YAML, JSON, TOML, etc.
  • tests - Unit tests
  • docs - Documentation
  • notebooks - Jupyter notebooks for data exploration and visualisation
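
As a rough illustration, the layout above could be recreated with a few lines of Python. This is a sketch, not part of the template: the `scaffold` helper and the `.gitkeep` placeholder files are assumptions introduced here, and only the folder names come from the list above.

```python
from __future__ import annotations

from pathlib import Path

# Folder names taken from the project structure listed above.
PROJECT_DIRS = ("data", "src", "scripts", "configs", "tests", "docs", "notebooks")


def scaffold(root: str | Path) -> list[Path]:
    """Create the template folders under `root` and return their paths.

    Hypothetical helper for illustration only; the template itself
    defines the structure through the repo.
    """
    root = Path(root)
    created = []
    for name in PROJECT_DIRS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        # An empty placeholder file lets Git track an otherwise empty folder.
        (folder / ".gitkeep").touch()
        created.append(folder)
    return created
```

Calling `scaffold(".")` in an empty directory would produce the seven folders; running it again is harmless because existing folders are left in place.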

Environment setup

The project uses Docker to provide a reproducible environment for running the code. The environment is controlled by a Makefile, which can be customized to the project's needs.

The provided Docker environment is a basic Python 3.11 image, but it can be configured by editing the Dockerfile to include any additional Linux packages required for the project. Alternatively, one can use a different base Docker image, for example nvidia/cuda. Edit requirements.txt to include any additional Python packages.

To build the environment, run in the project home directory:

make build

Once the image is built, start the container with:

make run

The container has the project root directory mounted to /workdir, so all the local files can be accessed in the folder from within the container. Files saved inside /workdir will be saved in the project root directory of the host machine.

It is a good practice to develop inside the container with one's favorite IDE (e.g., VS Code or PyCharm) and execute the code from within the container.

Note: the template is meant to be used as a development environment and for running the code in experimental setups. Production scenarios may require further modifications to suit one's needs, including security features.

Environment variables

To provide environment variables, such as secrets, it is common practice to define them in a .env file and pass that file to the docker run command in the Makefile with --env-file=.env. A sample structure of the .env file can be provided in a .env.sample file to make it easier for the user to fill in the required values.
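
One way to catch a forgotten variable early is to compare the running environment against .env.sample when the code starts. The sketch below is an assumption, not something the template ships: `read_env_keys` and `missing_env_vars` are hypothetical helpers, and the parser handles only plain KEY=VALUE lines, blank lines, and `#` comments.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping


def read_env_keys(path: str | Path) -> list[str]:
    """Return variable names from a KEY=VALUE file, skipping blanks and comments."""
    keys = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Simplification: lines without "=" are ignored.
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.append(line.split("=", 1)[0].strip())
    return keys


def missing_env_vars(
    sample_path: str | Path,
    environ: Mapping[str, str] = os.environ,
) -> list[str]:
    """List variables named in the sample file that are unset or empty in `environ`."""
    return [key for key in read_env_keys(sample_path) if not environ.get(key)]
```

A script could call `missing_env_vars(".env.sample")` at startup and exit with a clear message if the returned list is not empty.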

X11 support

In order to run the code in a container with X11 support, for example, to enable interactive visualisation, the docker run command in the Makefile should include the following lines:

        -v /tmp/.X11-unix:/tmp/.X11-unix \
        -v $(HOME)/.Xauthority:/root/.Xauthority:rw \
        -e DISPLAY=$(DISPLAY)

This allows use of the Linux host X11 server to access the display. Note that other host types may require a different approach.
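
Code that opens interactive windows may want to check whether the display was actually passed through before trying to use it. The following is a minimal sketch under stated assumptions: `x11_available` is a hypothetical helper, and it only checks that DISPLAY is set and that the mounted socket directory contains something. It does not verify that the .Xauthority credentials are accepted by the X server.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping


def x11_available(
    environ: Mapping[str, str] = os.environ,
    socket_dir: str | Path = "/tmp/.X11-unix",
) -> bool:
    """Return True if DISPLAY is set and the X11 socket directory is non-empty."""
    if not environ.get("DISPLAY"):
        return False
    socket_dir = Path(socket_dir)
    # The host directory is mounted at the same path inside the container.
    return socket_dir.is_dir() and any(socket_dir.iterdir())
```

When the function returns False, a script could fall back to saving figures to files under /workdir instead of showing them on screen.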

Pre-commit hooks

The project provides some basic pre-commit hooks that help with code formatting and linting before committing. The hooks are only a convenience: with or without them, the final code must be formatted correctly, follow PEP8, and adhere to the Development Style Guide.

To install pre-commit hooks, run in the project home directory:

pip install pre-commit
pre-commit install
pre-commit install-hooks

The pre-commit hooks are defined in .pre-commit-config.yaml and configured in pyproject.toml. Feel free to customize them as needed.

Links

The template draws on years of experience in various development and research teams, as well as on inspiration from multiple successful ML competition projects.
