Read csv with naive data, column filters and datetime/timedelta units #521

Merged: 26 commits, Nov 10, 2022. Changes shown below are from 22 of the 26 commits.

Commits:
3f950d3 Support unit conversion for python datetime and timedelta objects (Flix6x, Oct 9, 2022)
57a050a Allow to filter by column when reading in beliefs from CSV (Flix6x, Oct 9, 2022)
e0040e7 Allow to set a timezone for reading in timezone naive data (Flix6x, Oct 9, 2022)
d1e3e32 Allow throwing out NaN values when reading in beliefs (Flix6x, Oct 9, 2022)
2e2580a Support datetime unit conversion for aware datetimes with mixed offset (Flix6x, Oct 14, 2022)
d9ddd0a Raise instead of assume UTC when reading in timezone naive data witho… (Flix6x, Oct 21, 2022)
7b7fdf8 Bump timely-beliefs dependency for read_csv (Flix6x, Oct 14, 2022)
26b2131 Refactor: flake8 (Flix6x, Oct 31, 2022)
6ccb2dc CLI changelog entry (Flix6x, Oct 31, 2022)
e2eeb08 changelog entry (Flix6x, Oct 31, 2022)
994af8a mypy (Flix6x, Oct 31, 2022)
401bc1d Use sensor id field validation (Flix6x, Oct 31, 2022)
6e14acf Querying for a source with a given id no longer requires knowing the … (Flix6x, Oct 9, 2022)
9b018dc make freeze-deps (Flix6x, Oct 31, 2022)
b7ad2d6 add optional dependency: timely-beliefs[forecast] (Flix6x, Oct 31, 2022)
2df7658 Clarify help (Flix6x, Nov 10, 2022)
c65910f Merge remote-tracking branch 'origin/main' into read-csv-with-naive-d… (Flix6x, Nov 10, 2022)
d1acfe5 Mention data conversion from 'datetime' or 'timedelta' units (Flix6x, Nov 10, 2022)
74751fd Allow converting 'datetime' values to a duration other than seconds (… (Flix6x, Nov 10, 2022)
d1a8392 Refactor and make convert_time_units a private function (Flix6x, Nov 10, 2022)
e77803e Refactor and add inline comment explaining why we check to_unit for a… (Flix6x, Nov 10, 2022)
b3f700a mypy: PEP 484 prohibits implicit Optional (Flix6x, Nov 10, 2022)
6041f4f Attempt to revert bugs introduced in merge with main (Flix6x, Nov 10, 2022)
35618ee black and flake8 (Flix6x, Nov 10, 2022)
6e1510a A few more reverts (Flix6x, Nov 10, 2022)
100f05c Fix typos (Flix6x, Nov 10, 2022)
15 changes: 1 addition & 14 deletions documentation/changelog.rst
@@ -12,31 +12,18 @@ New features
* Ability to provide your own custom scheduling function [see `PR #505 <http://www.github.com/FlexMeasures/flexmeasures/pull/505>`_]
* Visually distinguish forecasts/schedules (dashed lines) from measurements (solid lines), and expand the tooltip with timing info regarding the forecast/schedule horizon or measurement lag [see `PR #503 <http://www.github.com/FlexMeasures/flexmeasures/pull/503>`_]
* The asset page also allows to show sensor data from other assets that belong to the same account [see `PR #500 <http://www.github.com/FlexMeasures/flexmeasures/pull/500>`_]
* Improved import of time series data from CSV file: 1) drop duplicate records with warning, and 2) allow configuring which column contains explicit recording times for each data point (use case: import forecasts) [see `PR #501 <http://www.github.com/FlexMeasures/flexmeasures/pull/501>`_]
* The CLI command ``flexmeasures show beliefs`` supports showing beliefs data in a custom resolution and/or timezone, and also saving the shown beliefs data to a CSV file [see `PR #519 <http://www.github.com/FlexMeasures/flexmeasures/pull/519>`_]
* Improved import of time series data from CSV file: 1) drop duplicate records with warning, 2) allow configuring which column contains explicit recording times for each data point (use case: import forecasts) [see `PR #501 <http://www.github.com/FlexMeasures/flexmeasures/pull/501>`_], 3) localize timezone naive data, 4) support reading in datetime and timedelta values, 5) remove rows with NaN values, and 6) filter by values in specific columns [see `PR #521 <http://www.github.com/FlexMeasures/flexmeasures/pull/521>`_]

Bugfixes
-----------
* The CLI command ``flexmeasures show beliefs`` now supports plotting time series data that includes NaN values, and provides better support for plotting multiple sensors that do not share the same unit [see `PR #516 <http://www.github.com/FlexMeasures/flexmeasures/pull/516>`_]
* Consistent CLI/UI support for asset lat/lng positions up to 7 decimal places (previously the UI rounded to 4 decimal places, whereas the CLI allowed more than 4) [see `PR #522 <http://www.github.com/FlexMeasures/flexmeasures/pull/522>`_]

Infrastructure / Support
----------------------

* Reduce size of Docker image (from 2GB to 1.4GB) [see `PR #512 <http://www.github.com/FlexMeasures/flexmeasures/pull/512>`_]
* Remove bokeh dependency and obsolete UI views [see `PR #476 <http://www.github.com/FlexMeasures/flexmeasures/pull/476>`_]
* Fix ``flexmeasures db-ops dump`` and ``flexmeasures db-ops restore`` incorrectly reporting a success when `pg_dump` and `pg_restore` are not installed [see `PR #526 <http://www.github.com/FlexMeasures/flexmeasures/pull/526>`_]
* Plugins can save BeliefsSeries, too, instead of just BeliefsDataFrames [see `PR #523 <http://www.github.com/FlexMeasures/flexmeasures/pull/523>`_]


v0.11.3 | November 2, 2022
============================

Bugfixes
-----------
* Fix scheduling with imperfect efficiencies, which resulted in exceeding the device's lower SoC limit. [see `PR #520 <http://www.github.com/FlexMeasures/flexmeasures/pull/520>`_]
* Fix scheduler for Charge Points when taking into account inflexible devices [see `PR #517 <http://www.github.com/FlexMeasures/flexmeasures/pull/517>`_]
* Prevent rounding asset lat/long positions to 4 decimal places when editing an asset in the UI [see `PR #522 <http://www.github.com/FlexMeasures/flexmeasures/pull/522>`_]


v0.11.2 | September 6, 2022
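Of the CSV-import improvements listed under New features above, localizing timezone naive data (point 3) is the least self-evident. A minimal sketch of what localization means here, using plain pandas; the actual code path inside timely-beliefs may differ, and the timestamps are made up:

import pandas as pd

# A tz-naive index, as read from a CSV file whose datetimes carry no UTC offset:
index = pd.to_datetime(["2022-11-10 00:00", "2022-11-10 01:00"])
series = pd.Series([1.0, 2.0], index=index)

# Localizing attaches the given timezone without shifting the clock times:
series.index = series.index.tz_localize("Europe/Amsterdam")
print(series.index[0])  # 2022-11-10 00:00:00+01:00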
1 change: 1 addition & 0 deletions documentation/cli/change_log.rst
@@ -8,6 +8,7 @@ since v0.12.0 | November XX, 2022
=================================

* Add ``--resolution``, ``--timezone`` and ``--to-file`` options to ``flexmeasures show beliefs``, to show beliefs data in a custom resolution and/or timezone, and also to save shown beliefs data to a CSV file.
* Add options to ``flexmeasures add beliefs`` to 1) read CSV data with timezone naive datetimes (use ``--timezone`` to localize the data), 2) read CSV data with datetime/timedelta units (use ``--unit datetime`` or ``--unit timedelta``), 3) remove rows with NaN values, and 4) filter read-in data by matching values in specific columns (use ``--filter-column`` and ``--filter-value`` together).
* Fix ``flexmeasures db-ops dump`` and ``flexmeasures db-ops restore`` incorrectly reporting a success when `pg_dump` and `pg_restore` are not installed.

since v0.11.0 | August 28, 2022
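The ``--unit datetime`` and ``--unit timedelta`` options treat time itself as the measured quantity: a datetime is read as a duration since the UNIX epoch, which can then be expressed in a sensor unit representing time, such as 's' or 'h'. A minimal sketch of that idea using only the standard library; the real conversion lives in timely-beliefs, so this illustrates the concept rather than its implementation:

from datetime import datetime, timedelta, timezone

# A 'datetime' value becomes a duration with respect to the UNIX epoch ...
dt = datetime(2022, 11, 10, 12, 0, tzinfo=timezone.utc)
seconds_since_epoch = dt.timestamp()  # for a sensor unit of 's'
hours_since_epoch = seconds_since_epoch / 3600  # for a sensor unit of 'h'

# ... and a 'timedelta' value is converted directly:
td = timedelta(minutes=90)
td_in_hours = td.total_seconds() / 3600  # 1.5, for a sensor unit of 'h'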
4 changes: 1 addition & 3 deletions flexmeasures/api/v1_3/tests/test_api_v1_3.py
@@ -112,9 +112,7 @@ def test_post_udi_event_and_get_device_message(
# check targets, if applicable
if "targets" in message:
start_soc = message["value"] / 1000 # in MWh
soc_schedule = integrate_time_series(
consumption_schedule, start_soc, decimal_precision=6
)
soc_schedule = integrate_time_series(consumption_schedule, start_soc, 6)
print(consumption_schedule)
print(soc_schedule)
for target in message["targets"]:
2 changes: 1 addition & 1 deletion flexmeasures/api/v2_0/tests/test_api_v2_0_assets.py
@@ -248,7 +248,7 @@ def test_post_an_asset_with_invalid_data(client, db):
in post_asset_response.json["message"]["json"]["capacity_in_mw"][0]
)
assert (
"Longitude 300.9 exceeds the maximum longitude of 180 degrees."
"greater than or equal to -180 and less than or equal to 180"
in post_asset_response.json["message"]["json"]["longitude"][0]
)
assert "required field" in post_asset_response.json["message"]["json"]["unit"][0]
4 changes: 1 addition & 3 deletions flexmeasures/api/v3_0/tests/test_sensor_schedules.py
@@ -90,9 +90,7 @@ def test_trigger_and_get_schedule(
# check targets, if applicable
if "targets" in message:
start_soc = message["soc-at-start"] / 1000 # in MWh
soc_schedule = integrate_time_series(
consumption_schedule, start_soc, decimal_precision=6
)
soc_schedule = integrate_time_series(consumption_schedule, start_soc, 6)
print(consumption_schedule)
print(soc_schedule)
for target in message["targets"]:
81 changes: 59 additions & 22 deletions flexmeasures/cli/data_add.py
@@ -38,13 +38,7 @@
MissingAttributeException,
)
from flexmeasures.data.models.annotations import Annotation, get_or_create_annotation
from flexmeasures.data.schemas import (
AwareDateTimeField,
DurationField,
LatitudeField,
LongitudeField,
SensorIdField,
)
from flexmeasures.data.schemas import AwareDateTimeField, DurationField, SensorIdField
from flexmeasures.data.schemas.sensors import SensorSchema
from flexmeasures.data.schemas.units import QuantityField
from flexmeasures.data.schemas.generic_assets import (
@@ -248,12 +242,12 @@ def add_asset_type(**args):
@click.option("--name", required=True)
@click.option(
"--latitude",
type=LatitudeField(),
type=float,
help="Latitude of the asset's location",
)
@click.option(
"--longitude",
type=LongitudeField(),
type=float,
help="Longitude of the asset's location",
)
@click.option("--account-id", type=int, required=True)
@@ -286,8 +280,9 @@ def add_initial_structure():
@click.argument("file", type=click.Path(exists=True))
@click.option(
"--sensor-id",
"sensor",
required=True,
type=click.IntRange(min=1),
type=SensorIdField(),
help="Sensor to which the beliefs pertain.",
)
@click.option(
@@ -301,6 +296,8 @@ def add_initial_structure():
required=False,
type=str,
help="Unit of the data, for conversion to the sensor unit, if possible (a string unit such as 'kW' or 'm³/h').\n"
"Measurements of time itself that are formatted as a 'datetime' or 'timedelta' can be converted to a sensor unit representing time (such as 's' or 'h'),\n"
"where datetimes are represented as a duration with respect to the UNIX epoch."
"Hint: to switch the sign of the data, prepend a minus sign.\n"
"For example, when assigning kW consumption data to a kW production sensor, use '-kW'.",
)
@@ -341,6 +338,12 @@ def add_initial_structure():
multiple=True,
help="Additional strings to recognize as NaN values. This argument can be given multiple times.",
)
@click.option(
"--keep-default-na",
default=False,
type=bool,
help="Whether or not to keep NaN values in the data.",
)
@click.option(
"--nrows",
required=False,
@@ -367,6 +370,24 @@
type=int,
help="Column number with datetimes",
)
@click.option(
"--timezone",
required=False,
default=None,
help="timezone as string, e.g. 'UTC' or 'Europe/Amsterdam'",
)
@click.option(
"--filter-column",
"filter_columns",
multiple=True,
help="Set a column number to filter data. Use together with --filter-value.",
)
@click.option(
"--filter-value",
"filter_values",
multiple=True,
help="Set a column value to filter data. Only rows with this value will be added. Use together with --filter-column.",
)
@click.option(
"--delimiter",
required=True,
@@ -396,19 +417,23 @@
)
def add_beliefs(
file: str,
sensor_id: int,
sensor: Sensor,
source: str,
filter_columns: List[int],
filter_values: List[int],
unit: Optional[str] = None,
horizon: Optional[int] = None,
cp: Optional[float] = None,
resample: bool = True,
allow_overwrite: bool = False,
skiprows: int = 1,
na_values: list[str] | None = None,
keep_default_na: bool = False,
nrows: Optional[int] = None,
datecol: int = 0,
valuecol: int = 1,
beliefcol: Optional[int] = None,
timezone: Optional[str] = None,
delimiter: str = ",",
decimal: str = ".",
thousands: Optional[str] = None,
@@ -433,17 +458,7 @@ def add_beliefs(
In case no --horizon is specified and no beliefcol is specified,
the moment of executing this CLI command is taken as the time at which the beliefs were recorded.
"""
sensor = Sensor.query.filter(Sensor.id == sensor_id).one_or_none()
if sensor is None:
print(f"Failed to create beliefs: no sensor found with ID {sensor_id}.")
return
if source.isdigit():
_source = get_source_or_none(int(source), source_type="CLI script")
if not _source:
print(f"Failed to find source {source}.")
return
else:
_source = get_or_create_source(source, source_type="CLI script")
_source = parse_source(source)

# Set up optional parameters for read_csv
if file.split(".")[-1].lower() == "csv":
@@ -458,6 +473,14 @@
elif beliefcol is None:
kwargs["belief_time"] = server_now().astimezone(pytz.timezone(sensor.timezone))

# Set up optional filters:
if len(filter_columns) != len(filter_values):
raise ValueError(
"The number of filter columns and filter values should be the same."
)
filter_by_column = (
dict(zip(filter_columns, filter_values)) if filter_columns else None
)
bdf = tb.read_csv(
file,
sensor,
Expand All @@ -472,6 +495,9 @@ def add_beliefs(
else [datecol, beliefcol, valuecol],
parse_dates=True,
na_values=na_values,
keep_default_na=keep_default_na,
timezone=timezone,
filter_by_column=filter_by_column,
**kwargs,
)
duplicate_rows = bdf.index.duplicated(keep="first")
@@ -1099,3 +1125,14 @@ def check_errors(errors: Dict[str, List[str]]):
f"Please correct the following errors:\n{errors}.\n Use the --help flag to learn more."
)
raise click.Abort


def parse_source(source):
if source.isdigit():
_source = get_source_or_none(int(source))
if not _source:
print(f"Failed to find source {source}.")
return
else:
_source = get_or_create_source(source, source_type="CLI script")
return _source
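In the ``add_beliefs`` changes above, paired ``--filter-column`` and ``--filter-value`` options are zipped into the ``filter_by_column`` dict that is passed on to ``tb.read_csv``. A short sketch of that mapping, with made-up option values:

# As if invoked with: --filter-column 4 --filter-value solar
filter_columns = (4,)
filter_values = ("solar",)

if len(filter_columns) != len(filter_values):
    raise ValueError(
        "The number of filter columns and filter values should be the same."
    )
filter_by_column = dict(zip(filter_columns, filter_values)) if filter_columns else None
print(filter_by_column)  # {4: 'solar'}: only rows whose column 4 matches 'solar' are read in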
16 changes: 9 additions & 7 deletions flexmeasures/cli/data_delete.py
@@ -12,7 +12,6 @@
from flexmeasures.data.models.generic_assets import GenericAsset
from flexmeasures.data.models.time_series import Sensor, TimedBelief
from flexmeasures.data.schemas.generic_assets import GenericAssetIdField
from flexmeasures.data.schemas.sensors import SensorIdField
from flexmeasures.data.services.users import find_user_by_email, delete_user


@@ -124,7 +123,7 @@ def delete_asset_and_data(asset: GenericAsset, force: bool):
Delete an asset & also its sensors and data.
"""
if not force:
prompt = f"Delete {asset.__repr__()}, including all its sensors and data?"
prompt = f"Delete {asset}, including all its sensors and data?"
click.confirm(prompt, abort=True)
db.session.delete(asset)
db.session.commit()
@@ -296,18 +295,21 @@ def delete_nan_beliefs(sensor_id: Optional[int] = None):
@with_appcontext
@click.option(
"--id",
"sensor",
type=SensorIdField(),
"sensor_id",
type=int,
required=True,
help="Delete a single sensor and its (time series) data. Follow up with the sensor's ID.",
)
def delete_sensor(
sensor: Sensor,
sensor_id: int,
):
"""Delete a sensor and all beliefs about it."""
n = TimedBelief.query.filter(TimedBelief.sensor_id == sensor.id).delete()
sensor = Sensor.query.get(sensor_id)
n = TimedBelief.query.filter(TimedBelief.sensor_id == sensor_id).delete()
db.session.delete(sensor)
click.confirm(f"Delete {sensor.__repr__()}, along with {n} beliefs?", abort=True)
click.confirm(
f"Really delete sensor {sensor_id}, along with {n} beliefs?", abort=True
)
db.session.commit()


10 changes: 8 additions & 2 deletions flexmeasures/cli/db_ops.py
@@ -95,7 +95,10 @@ def dump():
dump_filename = f"pgbackup_{db_name}_{time_of_saving}.dump"
command_for_dumping = f"pg_dump --no-privileges --no-owner --data-only --format=c --file={dump_filename} {db_uri}"
try:
subprocess.check_output(command_for_dumping, shell=True)
proc = subprocess.Popen(command_for_dumping, shell=True) # , env={
# 'PGPASSWORD': DB_PASSWORD
# })
proc.wait()
click.echo(f"db dump successful: saved to {dump_filename}")

except Exception as e:
@@ -122,7 +125,10 @@ def restore(file: str):
click.echo(f"Restoring {db_host_and_db_name} database from file {file}")
command_for_restoring = f"pg_restore -d {db_uri} {file}"
try:
subprocess.check_output(command_for_restoring, shell=True)
proc = subprocess.Popen(command_for_restoring, shell=True) # , env={
# 'PGPASSWORD': DB_PASSWORD
# })
proc.wait()
click.echo("db restore successful")

except Exception as e:
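Both hunks above concern how a failed ``pg_dump``/``pg_restore`` run is detected; the changelog attributes the fix for these commands incorrectly reporting success to PR #526. Presumably the ``subprocess.check_output`` side is the fixed one, since ``Popen(...).wait()`` returns the exit status without anyone checking it, whereas ``check_output`` raises on a non-zero exit. A sketch of the difference; the command is illustrative:

import subprocess

# Old behaviour: the exit status is silently discarded, so even exit code 127
# ("command not found", e.g. when pg_dump is not installed) falls through
# to the success message.
proc = subprocess.Popen("pg_dump --version", shell=True)
returncode = proc.wait()  # e.g. 127, but never inspected

# Fixed behaviour: a non-zero exit status raises CalledProcessError,
# which the surrounding try/except reports as a failure.
try:
    subprocess.check_output("pg_dump --version", shell=True)
except subprocess.CalledProcessError as e:
    print(f"db dump failed: {e}")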
7 changes: 3 additions & 4 deletions flexmeasures/data/models/planning/charging_station.py
@@ -110,10 +110,9 @@ def schedule_charging_station(
]
if inflexible_device_sensors is None:
inflexible_device_sensors = []
device_constraints = [
initialize_df(columns, start, end, resolution)
for i in range(1 + len(inflexible_device_sensors))
]
device_constraints = [initialize_df(columns, start, end, resolution)] * (
1 + len(inflexible_device_sensors)
)
for i, inflexible_sensor in enumerate(inflexible_device_sensors):
device_constraints[i + 1]["derivative equals"] = get_power_values(
query_window=(start, end),
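The two ways of building ``device_constraints`` above are not equivalent: multiplying a one-element list repeats references to a single object, so all inflexible devices would share (and mutate) the same DataFrame, while the list comprehension creates an independent frame per device. A standalone demonstration of the pitfall, with plain DataFrames standing in for ``initialize_df``:

import pandas as pd

shared = [pd.DataFrame(index=range(2))] * 3  # three references to ONE frame
shared[1]["derivative equals"] = 1.0
print(shared[0].columns.tolist())  # ['derivative equals'], leaked into all frames

independent = [pd.DataFrame(index=range(2)) for _ in range(3)]
independent[1]["derivative equals"] = 1.0
print(independent[0].columns.tolist())  # [], unaffected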
21 changes: 12 additions & 9 deletions flexmeasures/data/models/planning/solver.py
@@ -46,8 +46,8 @@ def device_scheduler( # noqa C901
derivative max: maximum flow (e.g. in MW or boxes/h)
derivative min: minimum flow
derivative equals: exact amount of flow (we do this by clamping derivative min and derivative max)
derivative down efficiency: conversion efficiency of flow out of a device (flow out : stock decrease)
derivative up efficiency: conversion efficiency of flow into a device (stock increase : flow in)
derivative down efficiency: ratio of downwards flows (flow into EMS : flow out of device)
derivative up efficiency: ratio of upwards flows (flow into device : flow out of EMS)
EMS constraints are on an EMS level. Handled constraints (listed by column name):
derivative max: maximum flow
derivative min: minimum flow
@@ -228,12 +228,10 @@ def device_derivative_up_efficiency(m, d, j):

# Add constraints as a tuple of (lower bound, value, upper bound)
def device_bounds(m, d, j):
"""Apply efficiencies to conversion from flow to stock change and vice versa."""
return (
m.device_min[d, j],
sum(
m.device_power_down[d, k] / m.device_derivative_down_efficiency[d, k]
+ m.device_power_up[d, k] * m.device_derivative_up_efficiency[d, k]
m.device_power_down[d, k] + m.device_power_up[d, k]
for k in range(0, j + 1)
),
m.device_max[d, j],
@@ -277,10 +275,12 @@ def ems_flow_commitment_equalities(m, j):
)

def device_derivative_equalities(m, d, j):
"""Couple device flows to EMS flows per device."""
"""Couple device flows to EMS flows per device, applying efficiencies."""
return (
0,
m.device_power_up[d, j] + m.device_power_down[d, j] - m.ems_power[d, j],
m.device_power_up[d, j] / m.device_derivative_up_efficiency[d, j]
+ m.device_power_down[d, j] * m.device_derivative_down_efficiency[d, j]
- m.ems_power[d, j],
0,
)

@@ -321,7 +321,10 @@ def cost_function(m):
planned_costs = value(model.costs)
planned_power_per_device = []
for d in model.d:
planned_device_power = [model.ems_power[d, j].value for j in model.j]
planned_device_power = [
model.device_power_down[d, j].value + model.device_power_up[d, j].value
for j in model.j
]
planned_power_per_device.append(
pd.Series(
index=pd.date_range(
Expand All @@ -332,7 +335,7 @@ def cost_function(m):
)

# model.pprint()
# model.display()
# print(results.solver.termination_condition)
# print(planned_costs)
# model.display()
return planned_power_per_device, planned_costs, results
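In the solver hunks above, the efficiency terms move out of the device stock bounds and into the device-to-EMS coupling. Note that the revert commits (6041f4f, 6e1510a) fall outside this 22-commit view, so the merged solver code may differ from what is shown here. Taking the shown coupling at face value, and assuming the usual convention that downwards device power is negative, a worked example with made-up numbers:

up_eff, down_eff = 0.9, 0.9

# Charging: 9 kW flows into the device, so the EMS must supply more than that.
device_power_up = 9.0
ems_power = device_power_up / up_eff  # 10.0 kW drawn from the EMS

# Discharging: 10 kW flows out of the device, but only a fraction reaches the EMS.
device_power_down = -10.0
ems_power = device_power_down * down_eff  # -9.0 kW, i.e. 9 kW delivered to the EMS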