[WIP]Telegram Bot Notification #206

Open
wants to merge 14 commits into
base: master
2 changes: 1 addition & 1 deletion README.md
@@ -33,7 +33,7 @@ To use the template, run the following command(s):

1. Create local config file (`config.env`) to store all necessary environmental variables. There's already an example `config.env.template` in the repo that stores default env vars.

2. [Download](https://go.dev/doc/install) or upgrade to `golang 1.19`.
2. [Download](https://go.dev/doc/install) or upgrade to `golang 1.21`.

3. Install all project golang dependencies by running `go mod download`.

15 changes: 14 additions & 1 deletion alerts-template.yaml
@@ -5,15 +5,24 @@ alertRoutes:
low_oncall:
url: ""
channel: ""
telegram:
low_oncall:
bot_token: ""
chat_id: ""

medium:
slack:
medium_oncall:
url: ""
channel: ""
medium_oncall:
pagerduty:
medium_oncall:
config:
integration_key: ""
telegram:
medium_oncall:
bot_token: ""
chat_id: ""

high:
slack:
@@ -25,3 +34,7 @@ alertRoutes:
integration_key: ${MY_INTEGRATION_KEY}
medium_oncall:
integration_key: ""
telegram:
high_oncall:
bot_token: ""
chat_id: ""
2 changes: 1 addition & 1 deletion config.env.template
@@ -18,7 +18,7 @@ BOOTSTRAP_PATH=genesis.json
SERVER_HOST=localhost
SERVER_PORT=8080
SERVER_KEEP_ALIVE_TIME=10
SERVER_READ_TIMEOUT=10
SERVER_READ_TIMEOUT=10
SERVER_WRITE_TIMEOUT=10
SERVER_SHUTDOWN_TIME=10

8 changes: 8 additions & 0 deletions docs/alert-routing.md
@@ -37,6 +37,7 @@ Pessimism currently supports the following alert destinations:
| slack | Sends alerts to a Slack channel |
| pagerduty | Sends alerts to a PagerDuty service |
| sns | Sends alerts to an SNS topic defined in .env file |
| telegram | Sends alerts to a Telegram channel |

## Alert Severity

@@ -58,6 +59,13 @@ topic. The ARN should be added to the `SNS_TOPIC_ARN` variable found in the `.en
The AWS_ENDPOINT is optional and is primarily used for testing with localstack.
> Note: Currently, Pessimism only supports one SNS topic to publish alerts to.

## Publishing to a Telegram Channel

Alerts can be published to a Telegram channel by setting the channel's ID and the
bot's token in the `chat_id` and `bot_token` fields
of the `alerts-routing.` configuration file. To generate a bot token, follow this
[guide](https://core.telegram.org/bots#how-do-i-create-a-bot).

## PagerDuty Severity Mapping

PagerDuty supports the following severities: `critical`, `error`, `warning`,
4 changes: 4 additions & 0 deletions docs/architecture/alerting.markdown
@@ -29,11 +29,13 @@ subgraph AM["Alerting Manager"]
EL --> |Submit alert|SR["SeverityRouter"]
SR --> SH["Slack"]
SR --> PH["PagerDuty"]
SR --> TH["Telegram"]
SR --> CPH["CounterParty Handler"]

end
CPH --> |"HTTP POST"|TPH["Third Party API"]
SH --> |"HTTP POST"|SlackAPI("Slack Webhook API")
TH --> |"HTTP POST"|TelegramBotAPI("Telegram Bot API")
PH --> |"HTTP POST"|PagerDutyAPI("PagerDuty API")

</div>
@@ -79,6 +81,8 @@ Done! You should now see any generated alerts being forwarded to your specified

The PagerDuty alert destination is a configurable destination that allows alerts to be sent to a specific PagerDuty service via integration keys. Pessimism also uses the UUID associated with an alert as a deduplication key for PagerDuty, which ensures that PagerDuty is not spammed with duplicate incidents.

#### Telegram

### Alert CoolDowns

To ensure that alerts aren't spammed to destinations once invoked, a time based cooldown value (`cooldown_time`) can be defined within the `alert_params` of a heuristic session config. This time value determines how long a heuristic session must wait before being allowed to alert again.
31 changes: 31 additions & 0 deletions internal/alert/interpolator.go
@@ -39,6 +39,23 @@ const (
`
)

// Telegram message format
const (
TelegramMsgFmt = `
*%s %s*

Network: *%s*
Severity: *%s*
Session UUID: *%s*

*Assessment Content:*
%s

*Message:*
%s
`
)

type Interpolator struct{}

func NewInterpolator() *Interpolator {
@@ -62,3 +79,17 @@ func (*Interpolator) PagerDutyMessage(a core.Alert) string {
a.Net.String(),
a.Content)
}

func (*Interpolator) TelegramMessage(a core.Alert, msg string) string {
sev := cases.Title(language.English).String(a.Sev.String())
ht := cases.Title(language.English).String(a.HT.String())

return fmt.Sprintf(TelegramMsgFmt,
a.Sev.Symbol(),
ht,
a.Net.String(),
sev,
a.HeuristicID.String(),
fmt.Sprintf(CodeBlockFmt, a.Content), // Reusing the Slack code block format
msg)
}
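
The `*…*` markers in `TelegramMsgFmt` suggest the message is meant to be sent with a Markdown parse mode, although the client that sets `parse_mode` is not shown in this diff, so that is an assumption. If Telegram's legacy `Markdown` mode is used, the characters `_`, `*`, `` ` `` and `[` in interpolated values (outside code entities) need a preceding backslash, or the Bot API may reject the message. A minimal sketch of such an escaper, with the hypothetical helper name `escapeLegacyMarkdown`:

```go
package main

import (
	"fmt"
	"strings"
)

// legacyMarkdownEscaper prefixes the four characters that Telegram's legacy
// "Markdown" parse mode treats as entity delimiters with a backslash.
var legacyMarkdownEscaper = strings.NewReplacer(
	"_", `\_`,
	"*", `\*`,
	"`", "\\`",
	"[", `\[`,
)

// escapeLegacyMarkdown is a hypothetical helper, not part of the PR.
// It is intended for dynamic values placed outside code blocks.
func escapeLegacyMarkdown(s string) string {
	return legacyMarkdownEscaper.Replace(s)
}

func main() {
	// A policy message containing an underscore and an asterisk.
	fmt.Println(escapeLegacyMarkdown("balance_enforcement on op*stack"))
	// prints: balance\_enforcement on op\*stack
}
```

In `TelegramMessage`, a helper like this would apply to values such as `msg` before they are passed to `fmt.Sprintf`. MarkdownV2 has a larger set of reserved characters and would need a different replacer.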
33 changes: 33 additions & 0 deletions internal/alert/manager.go
@@ -165,6 +165,35 @@ func (am *alertManager) handleSNSPublish(alert core.Alert, policy *core.AlertPol
return nil
}

func (am *alertManager) handleTelegramPost(alert core.Alert, policy *core.AlertPolicy) error {
telegramClients := am.cm.GetTelegramClients(alert.Sev)
if telegramClients == nil {
am.logger.Warn("No telegram clients defined for criticality", zap.Any("alert", alert))
return nil
}

// Create Telegram event trigger
event := &client.AlertEventTrigger{
Message: am.interpolator.TelegramMessage(alert, policy.Msg),
Alert: alert,
}

for _, tc := range telegramClients {
resp, err := tc.PostEvent(am.ctx, event)
if err != nil {
return err
}

if resp.Status != core.SuccessStatus {
return fmt.Errorf("client %s could not post to telegram: %s", tc.GetName(), resp.Message)
}
am.logger.Debug("Successfully posted to Telegram", zap.String("resp", resp.Message))
am.metrics.RecordAlertGenerated(alert, core.Telegram, tc.GetName())
}

return nil
}

// EventLoop ... Event loop for alert manager subsystem
func (am *alertManager) EventLoop() error {
ticker := time.NewTicker(time.Second * 1)
@@ -229,6 +258,10 @@ func (am *alertManager) HandleAlert(alert core.Alert, policy *core.AlertPolicy)
if err := am.handleSNSPublish(alert, policy); err != nil {
am.logger.Error("could not publish to sns", zap.Error(err))
}

if err := am.handleTelegramPost(alert, policy); err != nil {
am.logger.Error("could not post to telegram", zap.Error(err))
}
}

// Shutdown ... Shuts down the alert manager subsystem
33 changes: 33 additions & 0 deletions internal/alert/routing.go
@@ -3,19 +3,24 @@
package alert

import (
"go.uber.org/zap"

"github.com/base-org/pessimism/internal/client"
"github.com/base-org/pessimism/internal/core"
"github.com/base-org/pessimism/internal/logging"
)

// RoutingDirectory ... Interface for routing directory
type RoutingDirectory interface {
GetPagerDutyClients(sev core.Severity) []client.PagerDutyClient
GetSlackClients(sev core.Severity) []client.SlackClient
GetTelegramClients(sev core.Severity) []client.TelegramClient
InitializeRouting(params *core.AlertRoutingParams)
SetPagerDutyClients([]client.PagerDutyClient, core.Severity)
SetSlackClients([]client.SlackClient, core.Severity)
GetSNSClient() client.SNSClient
SetSNSClient(client.SNSClient)
SetTelegramClients([]client.TelegramClient, core.Severity)
}

// routingDirectory ... Routing directory implementation
@@ -26,6 +31,7 @@ type routingDirectory struct {
pagerDutyClients map[core.Severity][]client.PagerDutyClient
slackClients map[core.Severity][]client.SlackClient
snsClient client.SNSClient
telegramClients map[core.Severity][]client.TelegramClient
cfg *Config
}

@@ -36,6 +42,7 @@ func NewRoutingDirectory(cfg *Config) RoutingDirectory {
pagerDutyClients: make(map[core.Severity][]client.PagerDutyClient),
slackClients: make(map[core.Severity][]client.SlackClient),
snsClient: nil,
telegramClients: make(map[core.Severity][]client.TelegramClient),
}
}

@@ -49,6 +56,11 @@ func (rd *routingDirectory) GetSlackClients(sev core.Severity) []client.SlackCli
return rd.slackClients[sev]
}

// GetTelegramClients ... Returns the telegram clients for the given severity level
func (rd *routingDirectory) GetTelegramClients(sev core.Severity) []client.TelegramClient {
return rd.telegramClients[sev]
}

// SetSlackClients ... Sets the slack clients for the given severity level
func (rd *routingDirectory) SetSlackClients(clients []client.SlackClient, sev core.Severity) {
copy(rd.slackClients[sev][0:], clients)
@@ -67,6 +79,12 @@ func (rd *routingDirectory) SetPagerDutyClients(clients []client.PagerDutyClient
copy(rd.pagerDutyClients[sev][0:], clients)
}

// SetTelegramClients ... Sets the telegram clients for the given severity level
func (rd *routingDirectory) SetTelegramClients(clients []client.TelegramClient, sev core.Severity) {
rd.telegramClients[sev] = make([]client.TelegramClient, len(clients))
copy(rd.telegramClients[sev], clients)
}
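
`SetTelegramClients` allocates the destination slice before calling `copy`. This matters because the built-in `copy` transfers only `min(len(dst), len(src))` elements, so copying into an empty or nil slice (as the existing `SetSlackClients` and `SetPagerDutyClients` context lines do when nothing was registered for that severity) stores nothing. The sketch below shows the difference with plain strings. The helper names `setWithoutAlloc` and `setWithAlloc` are hypothetical.

```go
package main

import "fmt"

// setWithoutAlloc mirrors the copy-only pattern: the destination keeps its
// current length, so a missing map entry (nil slice) receives zero elements.
func setWithoutAlloc(m map[string][]string, key string, clients []string) int {
	return copy(m[key][0:], clients)
}

// setWithAlloc mirrors the pattern used for Telegram: allocate a slice of the
// right length first, then copy into it.
func setWithAlloc(m map[string][]string, key string, clients []string) int {
	m[key] = make([]string, len(clients))
	return copy(m[key], clients)
}

func main() {
	clients := []string{"oncall_a", "oncall_b"}

	m := make(map[string][]string)
	fmt.Println(setWithoutAlloc(m, "high", clients), len(m["high"])) // prints: 0 0

	fmt.Println(setWithAlloc(m, "high", clients), m["high"]) // prints: 2 [oncall_a oncall_b]
}
```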

// InitializeRouting ... Parses alert routing parameters for each severity level
func (rd *routingDirectory) InitializeRouting(params *core.AlertRoutingParams) {
rd.snsClient = client.NewSNSClient(rd.cfg.SNSConfig, "sns")
@@ -104,4 +122,19 @@ func (rd *routingDirectory) paramsToRouteDirectory(acc *core.AlertClientCfg, sev
rd.pagerDutyClients[sev] = append(rd.pagerDutyClients[sev], client)
}
}

if acc.Telegram != nil {
for name, cfg := range acc.Telegram {
conf := &client.TelegramConfig{
Token: cfg.Token.String(),
ChatID: cfg.ChatID.String(),
}
client, err := client.NewTelegramClient(conf, name)
if err != nil {
logging.NoContext().Error("Failed to create Telegram client", zap.String("name", name), zap.Error(err))
continue
}
rd.telegramClients[sev] = append(rd.telegramClients[sev], client)
}
}
}
50 changes: 50 additions & 0 deletions internal/alert/routing_test.go
@@ -26,6 +26,12 @@ func getCfg() *config.Config {
URL: "test1",
},
},
Telegram: map[string]*core.AlertConfig{
"test1": {
Token: "test1",
ChatID: "test1",
},
},
},
Medium: &core.AlertClientCfg{
PagerDuty: map[string]*core.AlertConfig{
@@ -39,6 +45,12 @@ func getCfg() *config.Config {
URL: "test2",
},
},
Telegram: map[string]*core.AlertConfig{
"test2": {
Token: "test2",
ChatID: "test2",
},
},
},
High: &core.AlertClientCfg{
PagerDuty: map[string]*core.AlertConfig{
@@ -59,6 +71,16 @@ func getCfg() *config.Config {
URL: "test3",
},
},
Telegram: map[string]*core.AlertConfig{
"test2": {
Token: "test2",
ChatID: "test2",
},
"test3": {
Token: "test3",
ChatID: "test3",
},
},
},
},
},
@@ -84,11 +106,14 @@ func Test_AlertClientCfgToClientMap(t *testing.T) {
cm.InitializeRouting(cfg.AlertConfig.RoutingParams)

assert.Len(t, cm.GetSlackClients(core.LOW), 1)
assert.Len(t, cm.GetTelegramClients(core.LOW), 1)
assert.Len(t, cm.GetPagerDutyClients(core.LOW), 0)
assert.Len(t, cm.GetSlackClients(core.MEDIUM), 1)
assert.Len(t, cm.GetTelegramClients(core.MEDIUM), 1)
assert.Len(t, cm.GetPagerDutyClients(core.MEDIUM), 1)
assert.Len(t, cm.GetSlackClients(core.HIGH), 2)
assert.Len(t, cm.GetPagerDutyClients(core.HIGH), 2)
assert.Len(t, cm.GetTelegramClients(core.HIGH), 2)
},
},
{
@@ -107,6 +132,27 @@ func Test_AlertClientCfgToClientMap(t *testing.T) {
assert.Len(t, cm.GetPagerDutyClients(core.MEDIUM), 0)
assert.Len(t, cm.GetSlackClients(core.HIGH), 2)
assert.Len(t, cm.GetPagerDutyClients(core.HIGH), 2)
assert.Len(t, cm.GetTelegramClients(core.HIGH), 2)
},
},
{
name: "Test AlertClientCfgToClientMap Telegram Nil",
description: "Test AlertClientCfgToClientMap doesn't fail when telegram is nil",
testLogic: func(t *testing.T) {
cfg := getCfg()
cfg.AlertConfig.RoutingParams.AlertRoutes.Medium.Telegram = nil
cm := alert.NewRoutingDirectory(cfg.AlertConfig)
assert.NotNil(t, cm, "client map is nil")

cm.InitializeRouting(cfg.AlertConfig.RoutingParams)
assert.Len(t, cm.GetSlackClients(core.LOW), 1)
assert.Len(t, cm.GetPagerDutyClients(core.LOW), 0)
assert.Len(t, cm.GetTelegramClients(core.MEDIUM), 0)
assert.Len(t, cm.GetSlackClients(core.MEDIUM), 1)
assert.Len(t, cm.GetPagerDutyClients(core.MEDIUM), 1)
assert.Len(t, cm.GetSlackClients(core.HIGH), 2)
assert.Len(t, cm.GetPagerDutyClients(core.HIGH), 2)
assert.Len(t, cm.GetTelegramClients(core.HIGH), 2)
},
},
{
@@ -125,6 +171,7 @@ func Test_AlertClientCfgToClientMap(t *testing.T) {
assert.Len(t, cm.GetPagerDutyClients(core.MEDIUM), 1)
assert.Len(t, cm.GetSlackClients(core.HIGH), 2)
assert.Len(t, cm.GetPagerDutyClients(core.HIGH), 2)
assert.Len(t, cm.GetTelegramClients(core.HIGH), 2)
},
},
{
@@ -142,10 +189,13 @@ func Test_AlertClientCfgToClientMap(t *testing.T) {

assert.Len(t, cm.GetSlackClients(core.LOW), 0)
assert.Len(t, cm.GetPagerDutyClients(core.LOW), 0)
assert.Len(t, cm.GetTelegramClients(core.LOW), 0)
assert.Len(t, cm.GetSlackClients(core.MEDIUM), 0)
assert.Len(t, cm.GetPagerDutyClients(core.MEDIUM), 0)
assert.Len(t, cm.GetTelegramClients(core.MEDIUM), 0)
assert.Len(t, cm.GetSlackClients(core.HIGH), 0)
assert.Len(t, cm.GetPagerDutyClients(core.HIGH), 0)
assert.Len(t, cm.GetTelegramClients(core.HIGH), 0)
},
},
}