chance_of_showers

Matthew Epland, PhD

This project provides live water pressure measurements via a web dashboard running on a Raspberry Pi, logs the data, and creates time series forecasts of future water pressure.

Introduction

Living in a 5th-floor walk-up in NYC can save you on rent and gym memberships, but runs the risk of leaving you high and dry when your water pressure gives out! The pressure delivered from the city's water mains is typically sufficient to reach the 6th floor, with taller buildings needing a booster pump and one of NYC's iconic rooftop water towers. My building lacks a pump and water tower, leaving my top-floor apartment with just barely satisfactory pressure, as long as no other units are using water! As you can see in the data below, my daytime water pressure is all over the place. After being stranded soapy and cold halfway through a shower one too many times, I decided to use my data science and electronics skills to record the time series of my apartment's hot water pressure with the goal of forecasting future availability, and hence chance_of_showers was born!

chance_of_showers_dashboard_demo.mp4

Data Analysis Results

WIP

Time Series Plots

Below is a sample of the pressure data collected in November 2023. Clicking the links will open interactive Plotly plots. Please explore!

Raw analog to digital converter (ADC) values

The data acquisition (DAQ) system saves the raw pressure data from the analog to digital converter (ADC) as an integer between 0 and 65472. Note that occasionally a water hammer will increase the pressure above its steady state value, marked by the orange 100% reference line, with a subsequent decay on the order of 10 minutes. When water is flowing at the pressure sensor, the data is shown with an open purple marker. Using water reduces the pressure slightly under normal conditions, and abruptly ends overpressure events.
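The upper bound of 65472 is consistent with a 10-bit reading, as produced by the MCP3008, scaled to 16 bits by a left shift of 6 bits (1023 × 64 = 65472). Whether the DAQ's ADC library performs exactly this scaling is an assumption on my part, and the function names below are hypothetical. Under that assumption, a minimal sketch for converting a stored value back to a 10-bit count and a fraction of full scale could look like:

```python
# Assumption: the stored 0-65472 range is a 10-bit count left-shifted by 6 bits.
# This is inferred from the numbers alone, not confirmed by the DAQ source code.
ADC_BITS = 10
SHIFT = 16 - ADC_BITS  # 6 bits of padding
MAX_COUNT = (1 << ADC_BITS) - 1  # 1023
MAX_RAW = MAX_COUNT << SHIFT  # 1023 * 64 = 65472


def raw_to_count(raw: int) -> int:
    """Recover the 10-bit ADC count from a 16-bit scaled value (hypothetical helper)."""
    if not 0 <= raw <= MAX_RAW:
        raise ValueError(f"raw value {raw} is outside 0..{MAX_RAW}")
    return raw >> SHIFT


def raw_to_fraction(raw: int) -> float:
    """Express a stored value as a fraction of the ADC's full scale."""
    return raw_to_count(raw) / MAX_COUNT
```

Note that this is a fraction of the ADC's full scale, not of the steady state pressure marked by the 100% reference line.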

Normalized values

To clean the data before fitting any models, I rescale the values to lie between 0 and 1, using the steady state extrema as the endpoints. Any values that fall outside the normalization range are capped at 0 or 1.
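As a rough illustration of this normalization, the sketch below applies min-max scaling followed by capping. The function name and the lower and upper arguments are hypothetical stand-ins for the steady state extrema, not values or code taken from the project.

```python
def normalize_pressure(raw: float, lower: float, upper: float) -> float:
    """Rescale raw to [0, 1] between lower and upper, capping values outside that range.

    lower and upper are hypothetical names for the steady state extrema.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    scaled = (raw - lower) / (upper - lower)
    # Cap overpressure events (e.g. a water hammer) at 1 and anything below the minimum at 0
    return min(1.0, max(0.0, scaled))
```

For example, with illustrative extrema of 0 and 20, a raw value of 10 maps to 0.5, while 25 is capped at 1.0.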

Overall Pressure Distributions

Prophet Results

Hardware

Bill of Materials

Here is a list of the components I used in my build. With suitable alterations, the project could definitely be carried out with a wide array of other sensors, single board computers or microcontrollers, plumbing supplies, etc.

Electronics

Raspberry Pi 4 Model B 2 GB
- USB C Power Supply
- Micro SD Card
8-Channel 10-Bit ADC with SPI Interface - MCP3008
DFRobot Gravity Water Pressure Sensor - SEN0257
Water Flow Hall Effect Sensor Switch - YWBL-WH
1 kΩ and 10 kΩ Resistors
830 Point Breadboard and Dupont Jumper Wires - Included in GPIO Kit

Plumbing

Optional Components

I2C OLED Display
Geekworm Baseplate
Wiring
Cooling
- Heatsink - Geekworm P165-B
- Fan - Noctua NF-A4x20 5V PWM 4-Pin 40x20mm
- 2x20 Pin Header Kit to clear heatsink
- One M3 Screw to attach fan to heatsink
- Four M2.5 Screws to attach heatsink to Pi and baseplate

Circuit Diagram

The circuit diagram for this implementation is provided as a KiCad schematic here.

Photos

Data Acquisition (DAQ)

The DAQ system recorded 95.4% of possible data points overall, and 99.870% since implementing the cron job heartbeat monitoring.

Launching the DAQ Script

The provided start_daq bash script will start the daq.py and fan_control.py scripts in new tmux windows. You will need to update the pkg_path variable in start_daq per your installation location.

source daq/start_daq

Opening the Web Dashboard

If daq: {display_web: true} is set in config.yaml, the local IP address and port of the dashboard will be logged on DAQ startup. Open this link in your browser to see the live dashboard, as shown in the introduction.
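If you need to find the dashboard address without reading the DAQ log, one common approach is to ask the operating system which interface it would use for outbound traffic. The sketch below is not how daq.py does it as far as I know; the helper names, the http scheme, and the port in the usage note are all assumptions for illustration.

```python
import socket


def get_local_ip() -> str:
    """Best-effort lookup of this machine's LAN address (hypothetical helper)."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # Connecting a UDP socket sends no packets; it only selects the outgoing interface
            sock.connect(("10.255.255.255", 1))
            return sock.getsockname()[0]
    except OSError:
        # No usable network route, fall back to loopback
        return "127.0.0.1"


def dashboard_url(port: int) -> str:
    """Build the URL to open in a browser, assuming the dashboard is served over http."""
    return f"http://{get_local_ip()}:{port}"
```

Calling dashboard_url(5000) would return a link for port 5000, which is only a placeholder; use whichever port your DAQ logs on startup.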

Setting up cron Jobs

Jobs to restart the DAQ on boot and every 30 minutes, as well as to send heartbeat API calls (see below), are provided in the cron_jobs.txt file. Note that loading this file with crontab will overwrite any current cron jobs, so check your existing settings first with crontab -l!

crontab -l

crontab daq/cron_jobs.txt

You can verify the cron jobs are running as expected with:

grep CRON /var/log/syslog | grep "$LOGNAME"

Heartbeat Monitoring

You can use the provided heartbeat bash script to send heartbeat API calls for the DAQ script to healthchecks.io for monitoring and alerting. Configure your alert online at healthchecks.io, and then run the commands below to set up a secrets.json file with your alert's UUID. You will need to update the pkg_path variable in heartbeat per your installation location. The provided cron_jobs.txt will set up a cron job to send the heartbeat on the 15th and 45th minute of each hour.

sudo apt install jq
echo -e "{\n\t\"chance_of_showers_heartbeat_uuid\": \"YOUR_UUID_HERE\"\n}" > secrets.json
source daq/heartbeat
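For reference, the same kind of ping can be sketched in Python using only the standard library. This is an illustrative equivalent rather than the provided bash script: the function names are hypothetical, and I am assuming the standard healthchecks.io ping URL of the form https://hc-ping.com/YOUR_UUID_HERE together with the secrets.json layout created above.

```python
import json
from pathlib import Path
from urllib.request import urlopen

# Standard healthchecks.io ping host; assumed here, check your alert's settings page
PING_BASE = "https://hc-ping.com"


def build_ping_url(secrets_path: str) -> str:
    """Read the alert UUID from secrets.json and build the ping URL."""
    secrets = json.loads(Path(secrets_path).read_text(encoding="utf-8"))
    uuid = secrets["chance_of_showers_heartbeat_uuid"]
    return f"{PING_BASE}/{uuid}"


def send_heartbeat(secrets_path: str, timeout: float = 10.0) -> int:
    """Send one heartbeat and return the HTTP status code."""
    with urlopen(build_ping_url(secrets_path), timeout=timeout) as response:
        return response.status
```

A call such as send_heartbeat("secrets.json") needs network access and a valid UUID to succeed.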

Bayesian Optimization

To optimize the many hyperparameters present in this project, both those of the individual forecasting models and those controlling how the data is prepared, Bayesian optimization was used to efficiently sample the parameter space. The functions needed to run Bayesian optimization are located in bayesian_opt.py.

Unfortunately, actually running the optimization over GPU accelerated models is not as simple as calling the run_bayesian_opt() function. I have been unable to successfully detach the training of one GPU accelerated model from the next when training multiple models in a loop. The second training session will still have access to the tensors of the first, leading to GPU out-of-memory errors, even when using commands like gc.collect() and torch.cuda.empty_cache(). The torch models created by darts are very convenient, but do not provide as much configurability as building your own torch model from scratch, leaving me unable to fix this issue in a clean way.

To work around the GPU memory issues, a shell script, start_bayesian_opt, is used to repeatedly call run_bayesian_opt() via the bayesian_opt_runner.py script. In this way each model is trained in its own Python session, totally clearing memory between training iterations. A signed pickle file is used to quickly load the necessary data and settings on each iteration. Instructions for running the whole Bayesian optimization workflow are provided below.
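To illustrate what a signed pickle involves, here is a minimal sketch using HMAC-SHA256 from the standard library. It is not the project's implementation: the function names, the choice of hash, and the layout of signature followed by payload are all assumptions. The point is that the signature is verified before unpickling, since loading an untrusted pickle can execute arbitrary code.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def dumps_signed(obj: object, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 signature (hypothetical layout)."""
    payload = pickle.dumps(obj)
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return signature + payload


def loads_signed(blob: bytes, key: bytes) -> object:
    """Verify the signature, then unpickle. Raises ValueError on a mismatch."""
    signature, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking timing information during the comparison
    if not hmac.compare_digest(signature, expected):
        raise ValueError("Signature mismatch, refusing to unpickle")
    return pickle.loads(payload)
```

The resulting bytes can be written to disk once and reloaded at the start of every iteration, as long as the same key is available to both sides.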

Running Bayesian Optimization

Create the input parent_wrapper.pickle file for bayesian_opt_runner.py via the exploratory_ana.py notebook.
Configure the run in start_bayesian_opt and bayesian_opt_runner.py.
Run the shell script, logging outputs to disk via:

./ana/start_bayesian_opt 2>&1 | tee ana/models/bayesian_optimization/bayesian_opt.log
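The key idea behind the shell script is that every iteration runs in a fresh Python process, so all memory, including GPU memory, is released when that process exits. The same pattern can be sketched in Python with subprocess. This is an illustration of the isolation principle only, not a replacement for start_bayesian_opt; the function name and the stop-on-failure behavior are my own choices here.

```python
import subprocess
import sys


def run_isolated(n_iterations: int, code: str) -> list[int]:
    """Run code in a new Python process per iteration and collect the return codes.

    The iteration index is passed to the child as its first command line argument.
    """
    return_codes = []
    for i in range(n_iterations):
        # A new interpreter per iteration means no state survives between runs
        result = subprocess.run([sys.executable, "-c", code, str(i)], check=False)
        return_codes.append(result.returncode)
        if result.returncode != 0:
            # Stop early on failure, a design choice of this sketch
            break
    return return_codes
```

In practice the child command would be the runner script rather than an inline snippet, with the shared inputs loaded from disk at the start of each run.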

Dev Notes

Data Analysis Setup - Installing CUDA and PyTorch

Find the supported CUDA version (11.8.0) for the current release of PyTorch (2.0.1) here.
Install CUDA following the steps for the proper version and target platform here.
Update the poetry pytorch-gpu-src source to point to the correct PyTorch version in pyproject.toml.
- This is in place of pip install --index-url=... as provided by the PyTorch installation instructions.
Install the poetry ana group with make setupANA.
- This will install pytorch, along with the other necessary packages.
Check that PyTorch and CUDA are correctly configured with the following python commands:

import torch

if torch.cuda.is_available():
    print("CUDA is available")
    print(f"Device name: {torch.cuda.get_device_name(torch.cuda.current_device())}")
else:
    print("CUDA IS NOT AVAILABLE!")

DAQ Setup - Installing Python 3.11 on Raspbian

If Python 3.11 is not available in your release of Raspbian, you can compile it from source following the instructions here, but you will also need to enable the loadable SQLite extensions:

cd /usr/src/
sudo wget https://www.python.org/ftp/python/3.11.4/Python-3.11.4.tgz
sudo tar -xzvf Python-3.11.4.tgz
cd Python-3.11.4/
sudo apt update && sudo apt full-upgrade -y
sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libsqlite3-dev -y
./configure --enable-optimizations --enable-loadable-sqlite-extensions
sudo make altinstall

# Should be Python 3.11.4 with your compile info
/usr/local/bin/python3.11 -VV

# Link binary
sudo rm /usr/bin/python
sudo rm /usr/bin/python3
sudo ln -s /usr/local/bin/python3.11 /usr/bin/python
sudo ln -s /usr/local/bin/python3.11 /usr/bin/python3

# Should match /usr/local/bin/python3.11 -VV
python -VV
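Beyond checking the version string, you may want to confirm that the build picked up the --enable-loadable-sqlite-extensions flag. In CPython, the enable_load_extension method on sqlite3 connections is only present when the interpreter was compiled with that support, so a short check like the following sketch can help. The function name and the returned dictionary keys are hypothetical.

```python
import sqlite3
import sys


def check_python_build(min_version: tuple = (3, 11)) -> dict:
    """Report whether this interpreter is new enough and supports SQLite extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method is missing entirely if Python was built without extension support
        has_extensions = hasattr(conn, "enable_load_extension")
        if has_extensions:
            conn.enable_load_extension(True)
            conn.enable_load_extension(False)  # Restore the safer default
    finally:
        conn.close()
    return {
        "version_ok": sys.version_info >= min_version,
        "sqlite_extensions": has_extensions,
    }


if __name__ == "__main__":
    print(check_python_build())
```

Both values should be True on a correctly compiled Python 3.11 installation.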

Installing Dependencies with Poetry

Install poetry following the instructions here.

curl -sSL https://install.python-poetry.org | python3 -

Then install the python packages needed for this installation. Groups include:

daq for packages needed to run the DAQ script on a Raspberry Pi, optional
web for packages needed to run the live dashboard from the DAQ script, optional
ana for analysis tools, optional
dev for continuous integration (CI) and linting tools

poetry install --with daq,web

or

poetry install --with ana

Setting up pre-commit

It is recommended to use the pre-commit tool to automatically check your commits locally as they are created. You should just need to install the git hook scripts, see below, after installing the dev dependencies. This will run the checks in .pre-commit-config.yaml when you create a new commit.

pre-commit install

Installing Non-Python Based Linters

Markdown is linted using markdownlint-cli, JavaScript by standard, and HTML, SCSS, and CSS by prettier. You can install these JavaScript-based linters globally with:

sudo npm install -g markdownlint-cli standard prettier

Shell files are linted using shellcheck and shfmt. Follow the linked installation instructions for your system. On Fedora they are:

sudo dnf install ShellCheck shfmt

Using the Makefile

A Makefile is provided for convenience, with commands to set up the DAQ and analysis environments, make setupDAQ and make setupANA, as well as to run CI and linting tools, e.g. make black, make pylint, make pre-commit.
