Collect Metrics #419

Open
LionelJouin opened this issue May 23, 2023 · 2 comments

@LionelJouin
Member

LionelJouin commented May 23, 2023

Is your feature request related to a problem? Please describe.

Add metrics to Meridio to improve observability tools.
Here are some slides:
https://docs.google.com/presentation/d/1yuiDj7H4NZTea7dJAKPK4SBvkHtZyWrn5HisNe1shuI
And deployment instructions for the OpenTelemetry/Prometheus/Grafana stack:
https://gist.github.com/LionelJouin/cfa15a569f1f23d8a84d43dc73b5f373

Describe the solution you'd like

Interface metrics in stateless-lb-frontend / Proxy / TAPA

<interface.metric>: rx_packets, tx_packets, rx_bytes, tx_bytes, rx_errors, tx_errors, rx_dropped, tx_dropped

  • Name: meridio.interface.<interface.metric>
  • Description: <interface.metric> metrics for the network interface
  • Type: Counter
  • Value: <interface.metric>
  • Attributes:
    • Pod Name
    • Trench
    • Conduit (optional)
    • Attractor (optional)
    • Interface Name
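
One possible source for these counters is the Linux sysfs statistics directory of each interface. The sketch below is only an illustration of that idea, not the Meridio implementation: the function name, the `sysfs_root` parameter and the returned dictionary layout are assumptions made for the example.

```python
from pathlib import Path

# Counter names taken from the <interface.metric> list above.
INTERFACE_METRICS = (
    "rx_packets", "tx_packets", "rx_bytes", "tx_bytes",
    "rx_errors", "tx_errors", "rx_dropped", "tx_dropped",
)


def read_interface_counters(interface, sysfs_root="/sys/class/net"):
    """Return {"meridio.interface.<metric>": value} for one interface.

    Hypothetical helper: counters whose statistics file is missing or
    unreadable are skipped instead of raising.
    """
    stats_dir = Path(sysfs_root) / interface / "statistics"
    counters = {}
    for metric in INTERFACE_METRICS:
        try:
            raw = (stats_dir / metric).read_text()
            counters[f"meridio.interface.{metric}"] = int(raw.strip())
        except (OSError, ValueError):
            continue
    return counters
```

The attributes (Pod Name, Trench, Interface Name, ...) would be attached by whatever exports these values; they are left out here to keep the sketch short.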

Interface status in stateless-lb-frontend / Proxy / TAPA ?

  • Name: meridio.interface.status
  • Description: Network interface status
  • Type: Gauge (Health Metric)
  • Value: Status of the interface (1 if up, 0 if down)
  • Attributes:
    • Pod Name
    • Trench
    • Conduit (optional)
    • Attractor (optional)
    • Interface Name
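
A minimal sketch of how the 1/0 value could be derived, assuming the status is read from the sysfs `operstate` file. The function name and the choice to report a missing interface as 0 are assumptions for the example. Some devices report `unknown` rather than `up` (loopback is one), so a real collector might need a different rule.

```python
from pathlib import Path


def interface_status(interface, sysfs_root="/sys/class/net"):
    """Return 1 if the interface operstate is "up", otherwise 0."""
    operstate = Path(sysfs_root) / interface / "operstate"
    try:
        state = operstate.read_text().strip()
    except OSError:
        # Assumption: an interface that cannot be read is reported as down.
        return 0
    return 1 if state == "up" else 0
```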

Stream status in conduit instance

  • Name: meridio.conduit.stream.status
  • Description: Stream status in the conduit instance
  • Type: Gauge (Health Metric)
  • Value: Status of the stream (1 if configured)
  • Attributes:
    • Pod Name
    • Trench
    • Conduit
    • Stream
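
To show what such a gauge could look like once scraped by Prometheus, here is a small sketch that renders one sample in the Prometheus text exposition format. The label names (`pod_name`, `trench`, `conduit`, `stream`) and the example values are hypothetical, and the only name translation done is replacing dots with underscores; an actual OpenTelemetry exporter may apply further rules.

```python
def _escape(value):
    # Label values escape backslash, double quote and newline.
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def prometheus_sample(name, value, attributes):
    """Render one metric sample as a Prometheus text-format line."""
    metric = name.replace(".", "_")
    labels = ",".join(
        f'{key}="{_escape(val)}"' for key, val in sorted(attributes.items())
    )
    return f"{metric}{{{labels}}} {value}"


if __name__ == "__main__":
    print(prometheus_sample(
        "meridio.conduit.stream.status",
        1,
        {
            "pod_name": "stateless-lb-frontend-abc",  # hypothetical values
            "trench": "trench-a",
            "conduit": "conduit-a-1",
            "stream": "stream-a-i",
        },
    ))
```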

Flows configured in conduit instance

  • Name: meridio.conduit.stream.flow.status
  • Description: Flow status in the conduit instance
  • Type: Gauge (Health Metric)
  • Value: Status of the flow (1 if configured) (Counter with nfqlb matches_count instead?)
  • Attributes:
    • Pod Name
    • Trench
    • Conduit
    • Stream
    • Flow

Targets configured in conduit instance

  • Name: meridio.conduit.stream.target.status
  • Description: Target status in the conduit instance
  • Type: Gauge (Health Metric)
  • Value: Status of the target (1 if configured, 0 if pending) (Counter with nftables prerouting on fwmark instead?)
  • Attributes:
    • Pod Name
    • Trench
    • Conduit
    • Stream
    • Target (identifier + IPs)
nft add table inet meridio-metrics
nft add chain inet meridio-metrics target-hits { type filter hook postrouting priority 0 \; }
nft add rule inet meridio-metrics target-hits meta mark 0x13dc counter
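
The counter created by the rule above could then be read back from the JSON output of nft (for example `nft -j list chain inet meridio-metrics target-hits`). The sketch below parses such output and maps each fwmark to its packet and byte counts. The `SAMPLE` document is hand-written for illustration, following my understanding of the libnftables JSON schema, and its counter values are made up; the function name is also an assumption.

```python
import json

# Hand-written sample (not captured from a real system); 5084 == 0x13dc.
SAMPLE = """
{"nftables": [
  {"metainfo": {"json_schema_version": 1}},
  {"chain": {"family": "inet", "table": "meridio-metrics",
             "name": "target-hits", "handle": 1}},
  {"rule": {"family": "inet", "table": "meridio-metrics",
            "chain": "target-hits", "handle": 2,
            "expr": [
              {"match": {"op": "==",
                         "left": {"meta": {"key": "mark"}},
                         "right": 5084}},
              {"counter": {"packets": 42, "bytes": 3528}}
            ]}}
]}
"""


def counters_by_fwmark(ruleset_json):
    """Map fwmark -> (packets, bytes) for rules that match a mark and count."""
    result = {}
    for item in json.loads(ruleset_json).get("nftables", []):
        rule = item.get("rule")
        if rule is None:
            continue
        mark, counter = None, None
        for expr in rule.get("expr", []):
            match = expr.get("match")
            if (
                isinstance(match, dict)
                and match.get("left") == {"meta": {"key": "mark"}}
                and isinstance(match.get("right"), int)
            ):
                mark = match["right"]
            if isinstance(expr.get("counter"), dict):
                counter = expr["counter"]
        if mark is not None and counter is not None:
            result[mark] = (counter["packets"], counter["bytes"])
    return result
```

Each target would need its own fwmark and rule for the counts to be attributable to a single target.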

Target connectivity in conduit instance

  • Name: meridio.conduit.target.connectivity.status
  • Description: Target connectivity (ping round-trip time) in the conduit instance
  • Type: Gauge (UpDownCounter instead?)
  • Value: Ping round-trip time in ms (-1 if no reply)
  • Attributes:
    • Pod Name
    • Trench
    • Conduit
    • IP
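
As a rough illustration of the proposed value (round-trip time in ms, -1 if no reply), the sketch below extracts the round-trip time from the textual output of a single ping. It assumes iputils-style output containing `time=<value> ms`; the function name is hypothetical, and how the ping is actually executed is left out.

```python
import re

# Matches "time=0.045 ms" or "time<1 ms" on a reply line.
_RTT = re.compile(r"time[=<]([0-9]+(?:\.[0-9]+)?) ?ms")


def ping_latency_ms(ping_output):
    """Return the round-trip time in ms, or -1.0 if no reply was found."""
    match = _RTT.search(ping_output)
    return float(match.group(1)) if match else -1.0
```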

Gateways configured in attractor instance

  • Name: meridio.attractor.gateway.status
  • Description: Gateway status in the attractor instance
  • Type: Gauge (Health Metric)
  • Value: Status of the gateway (1 if running, 0 if failing)
  • Attributes:
    • Pod Name
    • Trench
    • Attractor
    • Gateway

Describe alternatives you've considered
/

Additional context
https://opentelemetry.io/docs/specs/otel/metrics/semantic_conventions/

@tedlean
Collaborator

tedlean commented Oct 24, 2023

meridio.interface.METRIC_TYPE (Planned):

  • Which interfaces exactly are included here: proxy interfaces? Application pod interfaces? The FE external interface?
  • Are IPv4 and IPv6 interface counters to be included? I guess the proposal is to have L2 counters.
  • How is the traffic on this interface to be mapped to counters at the NSM/infra/SNIA level? By using MAC addresses?

meridio.conduit.stream.status (Planned):

  • What would be the status presented here?

meridio.conduit.stream.flow.matches:

  • It would be nice to have a counter for non-matched packets, or maybe a counter for the total number of packets handled.

meridio.conduit.stream.target.packet.hits (Planned)

  • How can packets towards a target be mapped to the interface used for sending the traffic onwards?

meridio.conduit.stream.target.latency (Planned)

  • Does this require some test traffic to be injected internally?
  • What results could be seen? And how to interpret these results in a general fashion?

meridio.attractor.gateway.status (Planned)

  • Is the proposal to have an up/down indication? It could be nice to have this for BGP and BFD separately.
  • Besides this, the number of learned BGP routes could be of interest.
  • The number of sent/received BFD and BGP packets/bytes would give a clearer view of the "real payload"
    (yes, there can be other kinds of traffic such as ARP, but still...)

@LionelJouin
Member Author

For non-matched packets, nolb_fwmark in nfqlb could be used to blackhole the traffic with a policy route; the packets could then be counted with nftables in the same way as for the targets.

Status: 🏗 In progress