Pyper

An (experimental) ETL toolkit based on DuckDB and PRQL.

Usage

Pyper describes workflows in YAML, and uses pydantic models to define the structure a workflow file is expected to have. Here is an example of a simple workflow with one extract, one transform, and one load step:

```yaml
# myworkflow.yaml
extract:
  provider: local
  uri: file:///mnt/ssd/projects/pyper/invoices.csv
  register: my_data_source

transform:
  lang: prql
  backend: duckdb
  query: |
    from my_data_source
    filter billing_country == "USA"
    group [customer_id] (
      aggregate [
        sum total,
        count,
      ]
    )

load:
  provider: local
  uri: file:///mnt/ssd/projects/pyper/invoices_usa.csv
```
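
To make the `transform` query easier to read, here is a rough plain-Python equivalent of what it computes: keep only invoices billed to the USA, then sum `total` and count rows per `customer_id`. This is only an illustration of the query's meaning, not how Pyper runs it (Pyper hands the PRQL query to DuckDB). The sample rows, the output column names `sum_total` and `count`, and the helper `summarize_usa_invoices` are assumptions made for this sketch.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows standing in for invoices.csv. The real file is
# assumed to have at least these three columns.
SAMPLE_CSV = """customer_id,billing_country,total
1,USA,10.50
2,Canada,7.00
1,USA,4.50
3,USA,2.00
"""


def summarize_usa_invoices(rows):
    """Filter to USA invoices, then sum `total` and count rows per customer."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        # filter billing_country == "USA"
        if row["billing_country"] != "USA":
            continue
        # group [customer_id] (aggregate [sum total, count])
        totals[row["customer_id"]] += float(row["total"])
        counts[row["customer_id"]] += 1
    return [
        {"customer_id": cid, "sum_total": totals[cid], "count": counts[cid]}
        for cid in totals
    ]


result = summarize_usa_invoices(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for record in result:
    print(record)
```

With the sample rows above, customer `2` is dropped by the filter and the two invoices for customer `1` collapse into a single output row.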

Then run the workflow from Python:

```python
import pyper
pyper.workflow('myworkflow.yaml').exec()
```
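
Since workflow files are modeled with pydantic, a malformed file can be rejected before any data is touched. The sketch below shows what such models might look like for the example workflow. The class names (`Extract`, `Transform`, `Load`, `Workflow`) and the attribute `register_as` are hypothetical and are not taken from Pyper's source; the dictionary stands in for what a YAML parser would return for `myworkflow.yaml`.

```python
from pydantic import BaseModel, Field, ValidationError


class Extract(BaseModel):
    provider: str
    uri: str
    # Exposed under a different attribute name with an alias, because pydantic
    # may complain that a field literally named `register` shadows an
    # attribute reachable on BaseModel.
    register_as: str = Field(alias="register")


class Transform(BaseModel):
    lang: str
    backend: str
    query: str


class Load(BaseModel):
    provider: str
    uri: str


class Workflow(BaseModel):
    extract: Extract
    transform: Transform
    load: Load


# What a YAML parser would produce for the example file (hypothetical input).
raw = {
    "extract": {
        "provider": "local",
        "uri": "file:///mnt/ssd/projects/pyper/invoices.csv",
        "register": "my_data_source",
    },
    "transform": {
        "lang": "prql",
        "backend": "duckdb",
        "query": 'from my_data_source\nfilter billing_country == "USA"\n',
    },
    "load": {
        "provider": "local",
        "uri": "file:///mnt/ssd/projects/pyper/invoices_usa.csv",
    },
}

# Nested dictionaries are validated into the nested models.
workflow = Workflow(**raw)
print(workflow.extract.register_as, workflow.transform.backend)

# A file with missing sections fails validation instead of running partially.
try:
    Workflow(extract=raw["extract"])
    problems = 0
except ValidationError as err:
    problems = len(err.errors())
    print("invalid workflow:", problems, "problem(s)")
```

Modeling each section as its own class keeps the error messages tied to the part of the file that is wrong, which is the main benefit of describing the file's structure up front.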
