Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first version of the mail plugin #3

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

Phill93
Copy link

@Phill93 Phill93 commented Mar 15, 2021

This is a draft please do not merge!

Plugin to receive and parse emails.

Unread mails are retrieved from the server using IMAP and parsed using regexes.

Todo:

  • Parser: better error handling
  • Parser: UnhandledEvent -> flag CAUSE_UNPARSABLE_MESSAGE
  • Parser: use the AppInfo.path
  • IMAP: more settings (port, ssl)
  • IMAP: reconnect handling
  • IMAP: UnhandledEvent -> flag `CAUSE_IGNORED_SENDER
  • IMAP: prepend subject to sourceEvent.raw
  • timestamp handling
  • cleanup
    • self.__imapServer vs. self.__imapClient
  • add tests
  • docs
  • SettingsEvents
  • debug output

@Phill93
Copy link
Author

Phill93 commented Mar 15, 2021

@Sebastian-Maier can you please do a first review?

@Phill93 Phill93 mentioned this pull request Mar 15, 2021
@Sebastian-Maier
Copy link
Member

Sebastian-Maier commented Mar 15, 2021

Hi Phill93,

thank you for all your efforts. 🙂
Your PR looks excellent so far. 👍

I just assigned issue #1 to you. I hope this is ok?

You already listed most of the potential improvements:

  • addition of (parser) test cases
  • additional settings (e.g., IMAP port/folder)
  • additional error handling (e.g., invalid template file or template file name, invalid IMAP folder, IMAP search/fetch)

Apart from that, I found the following minor things:

  • How does IMAP client behave when the (internet-)connection was disconnected temporarily or if a transient error occurred (e.g., during extended use)? Do we have to reconnect to the IMAP server explicitly in this case?
  • We should probably also update self.__connected in case an error occurs during IMAP fetch.
  • Are messages flagged as "seen" automatically (after fetching them via IMAP)?
  • When receiving a message from a disallowed sender (allowlist/denylist), we typically return an UnhandledEvent with the CAUSE_IGNORED_SENDER flag set and pass it to the action handlers anyway (e.g., to forward it to an admin).
  • When parsing failed (e.g., no event or no location) and the sender was not an alarm sender (specified explicitly in the config), we typically return an UnhandledEvent with the CAUSE_UNPARSABLE_MESSAGE set and pass it to the action handlers anyway (e.g., to forward it to an admin).
  • When parsing failed (e.g., no event or no location) and the sender was not an alarm sender (specified explicitly in the config), we typically return an AlarmEvent with the FLAGS_INVALID flag set (to allow for a fallback mode that shows the raw message content).
  • I would probably prepend sourceEvent.subject to sourceEvent.raw (as first line) – otherwise, we would have to adjust database and CSV accordingly.
  • The alarmEvent.alarmTimestamp is not initialized (explicitly) – maybe AlarmEvent.fromSourceEvent() initializes it from alarmEvent.timestamp implicitly anyhow (?)
  • We should not rely on CWD being the root of the SituationBoard folder when determining the path to the template file. Instead, we should use the AppInfo.path to determine the full filename of the template file.
  • We should add a short mail-source and mail-parser section in docs/Configuration.md (mainly concerning the available config options).
  • self.__imapServer vs. self.__imapClient (?)
  • optional: support for SettingEvents (enables messages like "news=LF auf Bewegungsfahrt" to update the standby view of the frontend; compare SMS parser)
  • optional: additional (debug) output

I like your regex-/JSON-based parser solution – it's certainly more than sufficient for a first version of the mail parser and, at the same time, way more flexible than the hardcoded parser we use for the SMS source.
However, I am still thinking about a consistent and future-proof long-term solution for all the SituationBoard parsers (i.e., mail, SMS, POCSAG, ...).
It would be great to have a common template format for all sources/parsers that is flexible enough to meet all the individual needs of the different sources and yet simple enough to be written by non-tech-savvy users.

For the (rather simple) SMS parser, we currently have the following requirements:

  • verification of header against predefined text (to check whether it is a valid alarm SMS or not)
  • field definitions that are relative to a text like "Label: " at the beginning of a line
  • field definitions that are relative to other fields/lines
  • support for single- and multiline fields
  • encoding of expected date/time-formats (optional; workaround possible)

Are there reasons against ttp (or a comparable template-based parser) apart from the additional Python dependency?
What made you choose the regex/JSON-based solution instead? Did I talk you out of ttp or was there a specific limitation (e.g., the additional complexity introduced by HTML)?
Do you know if Python itself offers a similar template-based parser?

@Phill93
Copy link
Author

Phill93 commented Mar 16, 2021

Hi Sebastian,

I added your points to my ToDo in the first post.

I spent half of Sunday with ttp and then hacked the regex solution together in the evening. ttp has a few problems capturing multiline text.

For a clean result we will use a mix of ttp or regex templates and python scripts.

It would be good to have example emails from other control centers, not sure how similar they are.

I don't know another parser than ttp for python, I use ttp professionally for network automation.

Greetings

Phill93

@Sebastian-Maier
Copy link
Member

Thank you for looking into ttp.
I also had a brief look at ttp and came to the same (tentative ?) conclusion:
It is likely that Python (and regex) is required for most of our parsers needs.

The remaining question is, how to structure the parser implementations to allow for code reuse/sharing (in order to keep the effort for the parser creation/maintenance as small as possible).

Regarding mails from other control centers: Maybe @noffycws can provide you with (anonymized) mail examples?

@Phill93 Phill93 marked this pull request as ready for review February 6, 2022 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants