A tool for processing Toronto's lobbyist data.
The City of Toronto Lobbyist Registrar conveniently makes open data from its registry available in XML format on the City's open data portal.
This code repository consists of:
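- A command-line tool for converting the registry XML into spreadsheet format (a local CSV file or a Google spreadsheet) and for updating a Graph Commons visualization.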
- A configuration to run the tool nightly in the cloud.
To prepare your local development environment for the first time:
# Install Python package dependencies
$ pipenv install
# If you would like to set some configuration from environment variables,
# use this scaffold file.
$ cp sample.env .env
To process the XML file into spreadsheet format, either as a CSV file on the local filesystem or as a Google spreadsheet:
$ pipenv run python cli.py parse-xml --help
Usage: cli.py parse-xml [OPTIONS] XML_FILE
Process XML file of Toronto lobbyist registry data into a CSV file.
TODO: Document how to generate Google service account credentials.
Options:
-o, --output-file <file.csv> If provided, will write local CSV file.
Default: print to screen
--output-gsheet <url/key> If provided with writable Google spreadsheet
URL, CSV will be uploaded.
--google-creds <file> JSON keyfile with Google service account
credentials. Default: service-key.json
-h, --help Show this message and exit.
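For readers curious about what the XML-to-CSV step involves, here is a minimal sketch using only the Python standard library. It is not the tool's actual implementation: the element names (`Registration`, `SMNumber`, `Status`, `SubjectMatter`), the sample data, and the `xml_to_csv` helper are all hypothetical, and the real registry schema may be nested differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real registry schema may differ.
SAMPLE_XML = """\
<Registrations>
  <Registration>
    <SMNumber>SM0001</SMNumber>
    <Status>Active</Status>
    <SubjectMatter>Zoning</SubjectMatter>
  </Registration>
  <Registration>
    <SMNumber>SM0002</SMNumber>
    <Status>Closed</Status>
    <SubjectMatter>Transit, Parking</SubjectMatter>
  </Registration>
</Registrations>
"""


def xml_to_csv(xml_text, record_tag="Registration"):
    """Flatten the direct children of each record element into one CSV row."""
    root = ET.fromstring(xml_text)
    records = [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]

    # Build the header from tags in first-seen order across all records.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    return out.getvalue()


if __name__ == "__main__":
    print(xml_to_csv(SAMPLE_XML), end="")
```

Returning the CSV as a string keeps the sketch independent of the destination, so the same text could be printed to the screen or written to a file.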
To update a Graph Commons visualization from the XML file:
$ pipenv run python cli.py update-graphcommons --help
Usage: cli.py update-graphcommons [OPTIONS] XML_FILE
Options:
--graph-id <string> Graph Commons graph ID (find in graph url)
--api-key <string> Graph Commons API key [required]
-d, --delete Delete all data from the graph before processing
--noop Skip API calls that change/destroy data
-h, --help Show this message and exit.
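The `--delete` and `--noop` flags suggest a dry-run pattern: calls that change or destroy data are routed through a guard that skips them when `--noop` is set. The sketch below illustrates that idea under stated assumptions. `DryRunClient`, `FakeGraphClient`, and `update_graph` are hypothetical names, and nothing here is the Graph Commons API or the tool's own code.

```python
class DryRunClient:
    """Wrap a client and skip mutating calls when noop is True."""

    def __init__(self, client, noop=False):
        self.client = client
        self.noop = noop
        self.skipped = []  # names of the calls that were not executed

    def mutate(self, method_name, *args, **kwargs):
        if self.noop:
            self.skipped.append(method_name)
            return None
        return getattr(self.client, method_name)(*args, **kwargs)


class FakeGraphClient:
    """Hypothetical in-memory stand-in for a graph API client."""

    def __init__(self):
        self.nodes = ["old node"]

    def clear(self):
        self.nodes = []

    def add_node(self, name):
        self.nodes.append(name)


def update_graph(client, names, delete=False, noop=False):
    """Optionally clear the graph, then add one node per name."""
    guarded = DryRunClient(client, noop=noop)
    if delete:
        guarded.mutate("clear")
    for name in names:
        guarded.mutate("add_node", name)
    return guarded.skipped


if __name__ == "__main__":
    client = FakeGraphClient()
    skipped = update_graph(client, ["Lobbyist A", "Firm B"], delete=True, noop=True)
    print("skipped calls:", skipped)
    print("graph unchanged:", client.nodes)
```

Funnelling every write through one method means a dry run exercises the same code path as a real run, with only the final API call suppressed.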
- Python. A programming language common in scripting.
- Click. A Python library for writing simple command-line tools.
- CircleCI. A script-running service that runs scheduled tasks for us in the cloud.
This tool was initially created at the request of Bernard Rudny, in a barter deal that exchanged one month's accommodation for a web scraper. Thanks Bernard!