Loggregator Release

Loggregator is a BOSH release deployed as a part of cf-deployment. Loggregator provides a highly available (HA) and secure stream of logs and metrics for all applications and components on Cloud Foundry. It does so without disrupting the behavior of the applications and components on the platform (i.e., without applying "backpressure").

The Loggregator Design Notes present an overview of Loggregator components and architecture.

If you have any questions, or want to get attention for a PR or issue, please reach out on the #logging-and-metrics channel in the Cloud Foundry Slack.

Streaming Application Logs

Any user of Cloud Foundry can experience Loggregator by using two simple interfaces for streaming application-specific logs. These do not require any special User Account and Authentication (UAA) scope.

Using the CF CLI

The fastest way to see your logs is by running the cf logs command using the CF CLI. Check the Cloud Foundry CLI docs for more details.

Forwarding to a Log Drain

If you’d like to save all logs for an application in a third-party or custom tool that expects the syslog format, a log drain allows you to do so. Check the Cloud Foundry docs for more details.

Log Ordering

Loggregator does not provide any guarantees around the order of delivery of logs in streams. That said, the timestamp provided by Diego is precise enough that streaming clients can batch and order streams as they receive them. The CF CLI and most other streaming clients do this.
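The batch-and-order approach described above can be sketched in Go as follows. This is a minimal illustration, not the CF CLI's actual implementation: the `LogLine` type and the `OrderBatch` function are hypothetical names, and the timestamp is assumed to be an integer count of nanoseconds since the Unix epoch.

```go
package main

import (
	"fmt"
	"sort"
)

// LogLine is a hypothetical, simplified stand-in for a received log message.
// It is not a type from any Loggregator library.
type LogLine struct {
	Timestamp int64  // assumed unit: nanoseconds since the Unix epoch
	Message   string // log payload
}

// OrderBatch returns a copy of batch sorted by ascending timestamp.
// A stable sort keeps arrival order for lines with equal timestamps.
func OrderBatch(batch []LogLine) []LogLine {
	ordered := make([]LogLine, len(batch))
	copy(ordered, batch)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Timestamp < ordered[j].Timestamp
	})
	return ordered
}

func main() {
	// Lines as they might arrive from the stream, out of order.
	received := []LogLine{
		{Timestamp: 300, Message: "third"},
		{Timestamp: 100, Message: "first"},
		{Timestamp: 200, Message: "second"},
	}
	for _, line := range OrderBatch(received) {
		fmt.Println(line.Timestamp, line.Message)
	}
}
```

A client would typically collect lines for a short window, call something like `OrderBatch` on that window, and then print the result. A longer window tolerates more reordering at the cost of added display latency.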

Consuming the Firehose

The firehose is an aggregated stream of all application logs and component metrics on the platform. This allows operators to ensure they capture all logs within a microservice architecture as well as monitor the health of their platform. See the Firehose README.

User Account and Authentication Scope

In order to consume the firehose you’ll need the doppler.firehose scope from UAA. For more details see the Firehose README.

Nozzle Development

Once you have configured the appropriate authentication scope, you are ready to start developing a nozzle for the firehose. See our Nozzle community page for more details about existing nozzles and how to get started.

Metrics

Loggregator and other Cloud Foundry components emit regular messages through the Firehose that report on the health, throughput, and details of a component's operations. For more details about Loggregator’s metrics see our Loggregator Metrics README.

Emitting Logs and Metrics into Loggregator

For components of Cloud Foundry or standalone BOSH deployments, Loggregator provides a set of tools for emitting Logs and Metrics.

Reverse Log Proxy (RLP)

The RLP is the v2 implementation of the Loggregator API. This component is intended to be a replacement for traffic controller.

RLP Gateway

By default, the RLP communicates with clients via gRPC over mutual TLS. To enable HTTP access to the Reverse Log Proxy, deploy the RLP Gateway.

Loggregator API

The Loggregator API is a replacement for the Dropsonde Protocol. The Loggregator API defines an envelope structure which packages logs and metrics in a common format for distribution throughout Loggregator. See the Loggregator API README for more details.

Loggregator Agents

Loggregator Agents receive logs and metrics on VMs, and forward them on to the Firehose. For more info see the loggregator-agent release.

Statsd-injector

The statsd-injector receives metrics from components in the statsd metric aggregator format. For more info see the statsd-injector README.
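For readers unfamiliar with the statsd format, the sketch below builds metric lines in the common `name:value|type` shape, where `g` marks a gauge and `c` marks a counter. The helper names `FormatGauge` and `FormatCounter` and the metric names are hypothetical. How and where the lines are sent (typically over UDP) depends on your deployment and is covered in the statsd-injector README.

```go
package main

import (
	"fmt"
	"strconv"
)

// FormatGauge builds a statsd gauge line: "<name>:<value>|g".
// This is a hypothetical helper, not part of the statsd-injector.
func FormatGauge(name string, value float64) string {
	// 'f' with precision -1 uses the fewest digits that represent value exactly.
	return name + ":" + strconv.FormatFloat(value, 'f', -1, 64) + "|g"
}

// FormatCounter builds a statsd counter line: "<name>:<delta>|c".
// This is a hypothetical helper, not part of the statsd-injector.
func FormatCounter(name string, delta int64) string {
	return name + ":" + strconv.FormatInt(delta, 10) + "|c"
}

func main() {
	// Metric names here are made up for illustration.
	fmt.Println(FormatGauge("myapp.queue.depth", 42))
	fmt.Println(FormatCounter("myapp.requests", 1))
}
```

Each line would be written as a single datagram to the address the statsd-injector listens on. That address is left out here because it is set by the deployment configuration.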

Syslog Release

For some components (such as UAA) it makes sense to route logs separately from the Firehose. The syslog release uses rsyslog to accomplish this. For more information see the syslog-release README.

Tools for Testing and Monitoring Loggregator

Loggregator provides a set of tools for testing the performance and reliability of your Loggregator installation. See the loggregator tools repo for more details.

Troubleshooting Reliability

Scaling

In addition to the scaling recommendations above, it is important that the resources for Loggregator are dedicated VMs with similar footprints to those used in our capacity tests. Even if you are within the bounds of the scaling recommendations, it may be useful to scale Loggregator and Nozzle components aggressively to rule out scaling as a major cause of log loss.

Noise

Another common cause of log loss is an application producing so many logs that it drowns out the logs from other applications on the cell it is running on. To identify and monitor for this behavior, the Loggregator team has created a Noisy Neighbor Nozzle and CLI Tool. This tool helps operators quickly identify and take action on noise-producing applications. Instructions for deploying and using this nozzle are in the repo.