
Austin Bikeshare data analysis

An Apache Beam stream data analytics template featuring Google Pub/Sub and Google BigQuery, built as part of the Stream Processing and Stream Analytics project.

Project contributors

Project Disclaimer

Please note that this project is under active development. The information provided in this README and the associated documentation is subject to change without prior notice. While we strive to provide accurate and up-to-date details, certain aspects of the project may be modified, added, or removed as development progresses.

Set up the project


Follow these steps to set up the project and start the publisher and the pipeline.

1. Clone and set up requirements

git clone https://github.com/7ze/austin-bikeshare-data-analysis.git
cd austin-bikeshare-data-analysis # project's root folder

python3 -m venv env # set up virtual env
source env/bin/activate # activate the virtual env
pip3 install -U pip # update pip

pip3 install -r requirements.txt # install requirements

2. Set up environment file

touch .env

and add the following environment variables to your .env file:

PROJECT_ID="<your project id>"
BUCKET_NAME="<your google storage bucket name>"
TOPIC_ID="<your pubsub topic id>"
BIGQUERY_DATASET="<BigQuery dataset name from which you wish to query>"
DATA_SOURCE="<BigQuery dataset name to which you wish to write results>"
DATE_COLUMN="<name of the column containing the event timestamps you wish to transform>"
MAX_OFFSET_MINS="<max offset in minutes between current time and message timestamp>"
MAX_SLEEP_SECONDS="<max time to sleep between publishing messages>"
ROWS_LIMIT="<max number of rows to query at a time>"

At this point you can look around the project and make any changes you wish. To sanity-check your configuration first, see the sketch below.
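
Before running anything, it can help to verify that the configuration is actually visible to Python. The snippet below is a minimal sketch assuming the python-dotenv package is available; the project's own scripts may load configuration differently.

# config check: a minimal sketch, assuming python-dotenv is installed
import os
from dotenv import load_dotenv

load_dotenv()  # read variables from .env into the process environment

required = [
    "PROJECT_ID", "BUCKET_NAME", "TOPIC_ID", "BIGQUERY_DATASET",
    "DATA_SOURCE", "DATE_COLUMN", "MAX_OFFSET_MINS",
    "MAX_SLEEP_SECONDS", "ROWS_LIMIT",
]
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
print("All required environment variables are set.")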

3. Run the publisher

# run with -n or --no-op to see the config options that are set
# (helpful in case you wish to debug)
python3 publisher.py --no-op

python3 publisher.py
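
For orientation, the sketch below shows the general shape of such a publisher: query a batch of rows from BigQuery and publish each one to a Pub/Sub topic, sleeping between messages to simulate a live stream. It is illustrative only; the hard-coded public bigquery-public-data.austin_bikeshare.bikeshare_trips table stands in for whatever BIGQUERY_DATASET points to, and publisher.py itself may be structured differently.

import json
import os
import random
import time

from google.cloud import bigquery, pubsub_v1

PROJECT_ID = os.environ["PROJECT_ID"]
TOPIC_ID = os.environ["TOPIC_ID"]
ROWS_LIMIT = int(os.environ["ROWS_LIMIT"])
MAX_SLEEP_SECONDS = float(os.environ["MAX_SLEEP_SECONDS"])

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)

# query a batch of rows (public table used here for illustration)
query = f"""
    SELECT * FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
    LIMIT {ROWS_LIMIT}
"""
rows = bigquery.Client(project=PROJECT_ID).query(query).result()

for row in rows:
    payload = json.dumps(dict(row), default=str).encode("utf-8")
    publisher.publish(topic_path, data=payload).result()  # wait for ack
    time.sleep(random.uniform(0, MAX_SLEEP_SECONDS))  # simulate a stream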

4. Run the pipeline

./run.sh # custom run script

Alternatively, you can run the pipeline manually with custom options:

# run with -n or --no-op to see the config options that are set
# (helpful in case you wish to debug)
python3 main.py --no-op

python3 main.py

And voilà, you have the pipeline running!
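
For reference, a minimal streaming Beam pipeline of this shape might look like the sketch below: read messages from Pub/Sub, decode them, window them, and write the results to BigQuery. It is illustrative only; the results table name and the fixed 60-second windows are assumptions, not the repo's actual configuration.

import json
import os

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows

PROJECT_ID = os.environ["PROJECT_ID"]
TOPIC_ID = os.environ["TOPIC_ID"]
DATA_SOURCE = os.environ["DATA_SOURCE"]

options = PipelineOptions(streaming=True, project=PROJECT_ID)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "Read" >> beam.io.ReadFromPubSub(
            topic=f"projects/{PROJECT_ID}/topics/{TOPIC_ID}")
        | "Decode" >> beam.Map(lambda data: json.loads(data.decode("utf-8")))
        | "Window" >> beam.WindowInto(FixedWindows(60))  # 60-second windows (assumed)
        | "Write" >> beam.io.WriteToBigQuery(
            table=f"{PROJECT_ID}:{DATA_SOURCE}.results",  # hypothetical table name
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )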