Python library for interacting with the SpatiaFi API. Also included is `gdal_auth`, a CLI tool to help with GCP authentication for GDAL.
Install from PyPI:

```bash
pip install spatiafi
```
Example usage:

```python
from spatiafi import get_session

session = get_session()

# The `session` object works just like `requests` but will automatically
# refresh the authentication token when it expires
params = {"item_id": "wildfire-risk-current-global-v1.0"}
url = "https://api.spatiafi.com/api/info"
response = session.get(url, params=params)
```
If you're using the client in a constrained environment, you may need to pass in proxies to the `get_session` function:
```python
from spatiafi import get_session

proxies = {
    "http": "http://fake-proxy.example.com:8080",
    "https": "https://fake-proxy.example.com:443",
}

session = get_session(proxies=proxies)

params = {"item_id": "wildfire-risk-current-global-v1.0"}
url = "https://api.spatiafi.com/api/info"
response = session.get(url, params=params)
```
```bash
gdal_auth --help
```

`gdal_auth` is a CLI tool to help with GCP authentication for GDAL.
This command can be used in three ways:

- Write the environment variables to a file that can be sourced:

  ```bash
  gdal_auth --file
  source /tmp/gdal_auth.env
  ```

- Print a command that can be run to set the environment variables once:

  ```bash
  gdal_auth --line
  ```

  Running this command outputs a line that you need to copy and run. This sets the GDAL environment variables for anything run afterward in the same terminal session (e.g. run the command, then launch another tool such as `qgis` from the same terminal).

- Print instructions for setting up aliases. These aliases allow `gdal_*` commands to be run as normal, with authentication handled automatically:

  ```bash
  gdal_auth --alias
  ```
Note: for all options except `--alias`, the authentication will eventually expire. This is because the tokens generated by the Google Cloud SDK expire after 1 hour. The aliases will automatically refresh the authentication tokens when they expire.
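For example, a one-off session using `--line` might look like this (the bucket path passed to `gdalinfo` is purely illustrative):

```bash
gdal_auth --line
# Copy and run the printed command, then use GDAL tools as usual, e.g.:
gdalinfo /vsigs/my-bucket/raster.tif
```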
The `AsyncQueue` class is a helper for running many (up to ~1 million) API queries in parallel. To use it, you must:

- Create a task function
- Create an `AsyncQueue` object
- Enqueue tasks
- Fetch results

tl;dr: See `tests/test_async_queue.py` for an example.
A valid AsyncQueue task must:

- Be an async function
- Take a single argument
- Take an optional async `session` argument (if not provided, an async session will be created)
- Return a single, serializable object (a `dict` is recommended)
If your task function requires multiple arguments, you can:

- Use a wrapper function or closure (may not work on Windows or with 'spawn' multiprocessing)
- Create a new function using `functools.partial` (as shown in `tests/test_async_queue.py` and sketched below)
- Pass a tuple as the argument and unpack it in the task function, e.g.:

```python
async def task(args, session=None):
    arg1, arg2, arg3 = args
    ...

with AsyncQueue(task) as async_queue:
    async_queue.enqueue((arg1, arg2, arg3))
```
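For the `functools.partial` route, a sketch might look like the following (the extra `item_id` parameter here is hypothetical):

```python
import functools


async def task(point, item_id, session=None):
    ...


# Bind the extra argument so AsyncQueue sees a single-argument task:
task_one_arg = functools.partial(task, item_id="wildfire-risk-current-global-v1.0")
```

The complete example below shows the tuple-unpacking pattern against the live API: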
```python
from spatiafi.async_queue import AsyncQueue
from spatiafi.session import get_async_session


async def get_point(point, session=None):
    """Get a point from the SpatiaFi API."""
    # Unpack the `point` tuple because we can only pass
    # a single argument to the task function.
    lon, lat = point

    # Create an async session if one is not provided.
    if session is None:
        session = await get_async_session()

    # Create the url.
    url = f"https://api.spatiafi.com/api/point/{lon},{lat}"
    params = {"item_id": "wildfire-risk-current-global-v1.0"}

    r = await session.get(url, params=params)

    # We want to raise for all errors except 400 (bad request)
    if not (r.status_code == 200 or r.status_code == 400):
        r.raise_for_status()

    return r.json()
```
`AsyncQueue` takes a task function as an argument and launches multiple instances of that task in parallel. The `AsyncQueue.enqueue` method takes a single argument and is used to add tasks to the queue. The `AsyncQueue.results` property returns a list of results in the order they were enqueued.
When starting the `AsyncQueue`, it is highly recommended that you specify the number of workers/CPUs via the `n_cores` argument. The default is the minimum of 4 and the number of CPUs on the machine.
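For example, to cap the queue at eight worker processes (a sketch; pick a value appropriate for your machine):

```python
async_queue = AsyncQueue(get_point, n_cores=8)
```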
This queue is designed to be used with the `with` statement. Entering the `with` block starts the subprocess and event loop; exiting it waits for all tasks to finish and then stops the event loop and subprocess.
For example:

```python
from spatiafi.async_queue import AsyncQueue

# `df` is assumed to be a pandas DataFrame with `lon` and `lat` columns.
with AsyncQueue(get_point) as async_queue:
    for _, row in df.iterrows():
        async_queue.enqueue((row["lon"], row["lat"]))

results = async_queue.results
```
Alternatively, you can use the `start` and `stop` methods:

```python
from spatiafi.async_queue import AsyncQueue

async_queue = AsyncQueue(get_point)
async_queue.start()

for _, row in df.iterrows():
    async_queue.enqueue((row["lon"], row["lat"]))

async_queue.stop()
results = async_queue.results
```
Development should be done in a virtual environment. It is recommended to use the virtual environment manager built into PyCharm. To create a new virtual environment:

- Open the project in PyCharm and select `File > Settings > Project: spfi-api > Python Interpreter`.
- In the top right corner of the window, click the gear icon and select `Add Interpreter > Add Local Interpreter...`.
In PyCharm, mark the `src` folder as a source root so that you can import modules from `src` without relative imports: right-click the `src` folder and select `Mark Directory as > Sources Root`.
Run `./scripts/bootstrap_dev.sh` to install the package and development dependencies. This will also set up access to our private PyPI server, generate the first `requirements.txt` (if required), and install `pre-commit` hooks.

Protip: This script can be run at any time if you're afraid you've messed up your environment.
Tests can be run locally via the `scripts/test.sh` script:

```bash
./scripts/test.sh
```

Any additional arguments are passed through to pytest, which lets you do things such as run a single test:

```bash
./scripts/test.sh -k test_async_queue
```
Dependencies are managed in `setup.cfg` using the `install_requires` and `extras_require` sections.

To add a new dependency:

- Install the package in the virtual environment with `pip install <package_name>` (Hint: use the terminal built into PyCharm)
- Run `pip show <package_name>` to get the package name and version
- Add the package name and version to `setup.cfg` in the `install_requires` section, using the compatible release syntax `package_name ~= version`.
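For example, a hypothetical `setup.cfg` entry (package name and version are illustrative):

```ini
[options]
install_requires =
    requests ~= 2.31
```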
DO NOT add the package to the `requirements.txt` file. This file is automatically generated by `scripts/gen_requirements.sh`.
If the dependency is only needed for development, add it to the `dev` section of `extras_require` in `setup.cfg`.
tl;dr: run `./scripts/build_docker.sh`.

We need to inject a GCP access token into the Docker build to access private PyPI packages. This requires BuildKit (enabled by default in recent versions of Docker) and passing the token as a build argument.
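The script encapsulates this, but the underlying invocation looks roughly like the following sketch (the build-arg name `GCP_TOKEN` is an assumption; check `scripts/build_docker.sh` for the real one):

```bash
# BuildKit is on by default in recent Docker; set it explicitly for older versions.
DOCKER_BUILDKIT=1 docker build \
  --build-arg GCP_TOKEN="$(gcloud auth print-access-token)" \
  -t spatiafi .
```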
This project uses `pre-commit` to run a series of checks before each git commit. To install the `pre-commit` hooks, run `pre-commit install` in the virtual environment. (This is done automatically by `./scripts/bootstrap_dev.sh`.)

To format all your code manually, run `pre-commit run --all-files`.

Note: If your code does not pass the `pre-commit` checks, automatic builds may fail.
To update local dependencies, run `pip-sync` in the virtual environment. This will make sure your virtual environment is in sync with the `requirements.txt` file, including uninstalling any packages that are not listed there.
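A typical update flow, assuming `scripts/gen_requirements.sh` regenerates the pin file as described above:

```bash
./scripts/gen_requirements.sh  # regenerate requirements.txt from setup.cfg
pip-sync                       # align the virtual environment with requirements.txt
```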
The project uses semantic versioning. Package versions are automatically generated from git tags. Create your first tag with `git tag 0.1.0` and push it with `git push --tags`.
tl;dr: `./scripts/install_package.sh`

For development, it is recommended to install the package in editable mode with `pip install -e .[dev]`.