Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CLI command flexmeasures add report to compute reports using a Reporter #642

Closed
victorgarcia98 opened this issue Apr 21, 2023 · 7 comments · Fixed by #659
Closed

Add CLI command flexmeasures add report to compute reports using a Reporter #642

victorgarcia98 opened this issue Apr 21, 2023 · 7 comments · Fixed by #659
Labels
enhancement New feature or request

Comments

@victorgarcia98
Copy link
Contributor

victorgarcia98 commented Apr 21, 2023

This would be a good add-on to allow users to create reports straight from the CLI.

I suggest this command to have the following features:

  • Accept inline json and file path to the report_config. Inline json might be useful so that other scripts can pass report_config directly.
  • Using the flag --output report.csv to output the resulting report in different formats. By default: stdout. We can just rely on the output file extension or have a separate optional flag (--output-type csv) to force the output format.
  • Output format support: csv, xlsx, stdout, (pickle?)

Inspired from issue #509 and in views of the developments of the Reporter class, I suggest the following command format:

flexmeasures add report --type <str> --config <path> --sensor-id <int> [--output <path>] [--start <datetime>] [--end <datetime>] [--resolution <timedelta>]

Please, let me know your thoughts on the naming of the command as well as the features to support..

@victorgarcia98 victorgarcia98 created this issue from a note in Reporting backend infrastructure (To do) Apr 21, 2023
@victorgarcia98 victorgarcia98 added the enhancement New feature or request label Apr 21, 2023
@nhoening
Copy link
Contributor

  • Why type over name?
  • A cron job is usually where this is triggered. There, you don't know the exact start and end time. You just trigger it regularly. That's why I added these parameters: [--last-hour <bool>] [--yesterday <bool>] [last-week <bool>] [last-month <bool>]
  • inline json is not nice to work with, it is error-prone and hard to debug.
  • Why no possibility of storing the results on a sensor? It's nice to be able to export, but for real automation we want the reporting data in the database.
  • I'd also not use stdout as the default. This clogs logging files.
  • I would complain if there is no target sensor as well as no target file specified.
  • I'd support csv and excel for now. I don't see pickle and stdout having direct use cases. Maybe a follow-up ticket, but if no-one asks for it, let's not soend time on it.

@victorgarcia98
Copy link
Contributor Author

Good points!

  • Why type over name?

I think name could induce confusion to some people: name could be the name of the report itself not the Reporter type being used.

  • A cron job is usually where this is triggered. There, you don't know the exact start and end time. You just trigger it regularly. That's why I added these parameters: [--last-hour ] [--yesterday ] [last-week ] [last-month ]

Yes, I like it. That way we avoid repeating the same code to deal with time.

  • inline json is not nice to work with, it is error-prone and hard to debug.

Very true, indeed.

  • Why no possibility of storing the results on a sensor? It's nice to be able to export, but for real automation we want the reporting data in the database.

Sure, that would be nice.

  • I'd also not use stdout as the default. This clogs logging files.

Oh yeah, true.

  • I would complain if there is no target sensor as well as no target file specified.

I would make the sensor required and allow 4 types of outputs:

  • excel (output file required)
  • csv (output file required)
  • db: saving to the database

Would you rather use flag (e.g --excel, --csv) or input argument indicating the output type (e.g --output-type csv?

  • I'd support csv and excel for now. I don't see pickle and stdout having direct use cases. Maybe a follow-up ticket, but if no-one asks for it, let's not soend time on it.

I agree. Stdout would a nice way to enable piping Linux commands but it requires to keep a very well formated output and avoid logging to stdout. Let's leave it for a future iteration.

@nhoening
Copy link
Contributor

I think name could induce confusion to some people: name could be the name of the report itself not the Reporter type being used.

The difference between reporter and report can be tricky, true. Maybe call the parameter --reporter? I'd find that nicer.

I would make the sensor required and allow 4 types of outputs

But we only need the (target) sensor when we write to the database. Also, you listed only three types of outputs. Finally, in the case of excel and csv, we'd also need a filename.

I believe we can do with one argument output.

  • if it is a number -> sensor in DB
  • otherwise, guess the format by file extension (we support .csv, .xls(x), ...)
  • if it is not given, we can complain or default to a csv file, using the current datetime (down to the minute) in the filename.

@Flix6x
Copy link
Contributor

Flix6x commented Apr 24, 2023

Maybe call the parameter --reporter? I'd find that nicer.

I'd also find that a nicer, more explicit, parameter name. However, I'm also confused, because the title of this Issue mentions that the report_config should come from a sensor attribute. If that is the case, the reporter name should also come from a sensor attribute (or even be part of the report_config itself) imo.

Best case scenario, it should be enough to call the CLI command with only a sensor ID:

  • start: can default to the sensor's most recent event_start (or event_end).
  • end: can default to datetime.now().
  • resolution: the reporter already defaults this to the sensor's resolution.

This way, it fills in the missing report since the last time.

@Flix6x
Copy link
Contributor

Flix6x commented Apr 24, 2023

With regards to output, I just wanted to note that we already have the flexmeasures show beliefs command with the --to-file option (only for CSV export). I'd suggest keeping any new export option for reports in sync with that, foremost by reusing the name of the option, but preferably also by keeping the logic in sync; the ability to create an Excel file instead of a CSV file in case the filepath extension is "xlsx" would benefit both CLI commands.

@nhoening
Copy link
Contributor

Which sensor is supposed to carry the report_config? The one to which we'll write the data (the target sensor), right? I can't think of any other, as the input sensors are often multiple.

@victorgarcia98 also mentioned a --config <path> parameter, so I believe he planned to be agnostic of where the config is. It can be provided in a file (great for developing or ad-hoc reporting work) or, if the --reporter parameter is a sensor ID, it is expected to be on the sensor if not given explicitly as file.

Victor, if I am correct (and/or you agree), let's adapt the title of this PR.

@victorgarcia98
Copy link
Contributor Author

Very true!

My fault, the name of the PR comes from the task of the project, directly, and we have made some changes since then. Still pending to close PR #641, but, at the moment, the Reporter class requires the following:

  • sensor: output sensor
  • reporter_config: description of the reporter

We can discuss whether to "attatch" a report to a sensor (e.g having report_config as attribute), so that we could call a method of the Sensor (e.g. sensor.compute_report()) that computes the report. Furthermore, we need to address the issue to allow having multiple report_config descriptions attatched to a sensor (via datasource?. Or maybe having virtual sensor representing the report. I would leave this for another version/PR. If you agree, I'll edit the title.

Moreover, regarding UX, I find that the best would be to pass the report_config and the sensor_id. If we had to rely on attached report_config, we should provide a way to update report_config per sensor and we should also allow to have multiple report_config.

Another interesting feature would be to have a CLI command that scans a folder for configuration files (reporter_class + report_config + output_sensor + start / end + resolution) and runs the command for each entry, something like:

flexmeasures add reporters .

Gathering all the ideas:

  • sensor_id [Required]
  • report_config_file [Required]
  • start [Optional]: can default to the sensor's most recent event_start (or event_end).
  • end [Optional]: can default to datetime.now().
  • resolution [Optional]: the reporter already defaults this to the sensor's resolution.
  • to-file [Optional]: detecting format (CSV or XLSX). Complain if given a wrong format.
    • by default: save as CSV with format "%Y_%m_%dT%H_%M.csv", e.g: 2023_04_24T13_00.csv. We can add the Repoter class name as well.
  • to-database [Optional]: flag that tells whether to save the report to the database or not.
  • Datetime range helpers [Optional]
    • last-hour
    • yesterday
    • last-week
    • last-month

Please, let me know if that's what you had in mind 😄

Out of scope: a similar thing could be done for the Scheduler.

@victorgarcia98 victorgarcia98 changed the title Add CLI command flexmeasures add report that reads a sensor's report_config attribute and stores the resulting data. Add CLI command flexmeasures add report to compute reports using a Reporter Apr 27, 2023
This was referenced May 1, 2023
Reporting backend infrastructure automation moved this from To do to Done May 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
3 participants