Notes, related work and references

Upstream RSyslog repo with latest stable

RHEL-CentOS RSyslog RPMs

RSyslog action queues

RSyslog supports sophisticated queue options. At a minimum, read both of the following to comprehend them more fully:

Also helpful: https://rsyslog.readthedocs.io/en/latest/whitepapers/queues_analogy.html

A quick summary of queue parameters as they apply to actions (not rulesets):

| Param | Action Q Default | Custom default | Disk-assisted in-memory Q | In-memory Q | Disk Q |
| --- | --- | --- | --- | --- | --- |
| queue.size | 1000 | 100000 | in-memory message count | overall message count | N/A |
| queue.maxFileSize | 1M | | on-disk file fragment size | N/A | on-disk file fragment size |
| queue.maxDiskSpace | 0 (no limit) | 1073741824 (1G) | on-disk message storage use | N/A | message storage use |
| queue.highWaterMark | 90% of queue.size (900) | | in-memory message count | N/A | N/A |
| queue.lowWaterMark | 70% of queue.size (700) | | in-memory message count | N/A | N/A |
| queue.discardMark | 80% of queue.size (800) | 95000 (95%) | in-memory message count | overall message count | N/A? |
| queue.fullDelayMark | 97% of queue.size (970) | | in-memory message count | overall message count | N/A? |
| queue.lightDelayMark | 70% of queue.size (700) | 0 (disabled) | in-memory message count | overall message count | N/A? |

Potential problems with the defaults, as they apply to disk-assisted queue mode (see the sketch after this list):

  • Light delay (70%) starts well before the queue uses disk buffering (90% highWaterMark), and source client systems (inputs) may not have large buffers, so they can end up discarding messages client-side.
  • Messages begin to be discarded before the queue uses disk buffering if queue.discardSeverity is set to anything lower than the default of 8.
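
As a minimal sketch (the action type, target host and spool file name are illustrative), a disk-assisted action queue tuned along the lines of the custom defaults above might look like:

    action(
        type="omfwd" target="relay.example.net" port="514" protocol="tcp"
        queue.type="LinkedList"          # in-memory linked-list queue ...
        queue.filename="fwd_relay_q"     # ... made disk-assisted by giving it a spool file name
        queue.size="100000"
        queue.maxDiskSpace="1g"
        queue.highWatermark="90000"      # start spooling to disk here
        queue.lowWatermark="70000"       # stop using disk once drained back to here
        queue.lightDelayMark="0"         # per the custom default above: don't throttle inputs early
        queue.discardMark="95000"
        queue.discardSeverity="8"        # 8 = discard nothing (the default)
        queue.saveOnShutdown="on"
    )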

Threading.

| Param | RSyslog Default | Custom default |
| --- | --- | --- |
| queue.workerThreads | 1 | 4 |
| queue.workerThreadMinimumMessages | queue.size / queue.workerThreads | 0.8 * queue.size / queue.workerThreads (20000) |

Batching.

| Param | Ruleset Q | Action Q | Custom default |
| --- | --- | --- | --- |
| queue.dequeueBatchSize | 1024 | 128 | 5000 |
| queue.minDequeueBatchSize | 0 | 0 | 50 |
| queue.minDequeueBatchSize.timeout | 1000 | 1000 | 100 |
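
A minimal sketch combining these threading and batching values with the same illustrative action as above (the 20000 value follows the 0.8 * queue.size / queue.workerThreads rule of thumb from the table):

    action(
        type="omfwd" target="relay.example.net" port="514" protocol="tcp"
        queue.type="LinkedList" queue.filename="fwd_relay_q" queue.size="100000"
        queue.workerThreads="4"
        queue.workerThreadMinimumMessages="20000"
        queue.dequeueBatchSize="5000"
        queue.minDequeueBatchSize="50"
        queue.minDequeueBatchSize.timeout="100"   # ms to wait before dequeuing a smaller batch
    )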

Forwarding formats and RSyslog templates

When forwarding via syslog, RFC5424 is obviously the most sensible format to relay in. RFC3164 is not a proper standard and made poor choices with regard to the time-stamp convention. While "Mmm dd hh:mm:ss" is human readable, it lacks the year, sub-second accuracy and timezone info.
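
For example, a minimal relay action using rsyslog's built-in RFC5424-style output template (the target host is illustrative):

    action(
        type="omfwd" target="relay.example.net" port="514" protocol="tcp"
        template="RSYSLOG_SyslogProtocol23Format"   # built-in RFC5424-style format
    )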

When forwarding to kafka, format choices are more complex:

The current choice is to leave the message in RFC5424 text format as the payload. TODO: if mapped to JSON, the full raw payload could be preserved in a dedicated field in case downstream systems already expect raw syslog. Future work could consider more efficient binary encodings like Avro.

Rsyslog omkafka

http://www.rsyslog.com/doc/master/configuration/modules/omkafka.html
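
A minimal omkafka sketch, assuming the RFC5424 text is kept as the payload (the broker, topic and template names are illustrative; RFC5424WithMeta matches the template sketch further below):

    action(
        type="omkafka"
        broker=["kafka1.example.net:9092"]
        topic="syslog"
        template="RFC5424WithMeta"       # illustrative RFC5424 template with meta-data added
        partitions.auto="on"
        queue.type="LinkedList" queue.filename="omkafka_q"   # disk-assisted, as per the queue notes above
        action.resumeRetryCount="-1"     # retry indefinitely if kafka is unavailable
    )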

Rsyslog into kafka and logstash

Rsyslog meta-data

As a use-case for provenance, the message's origin IP, DNS reverse lookup, and time received field would be useful meta-data. Unfortunately, this doesn't seem to be well standardised, and in addition, it's not clearly documented whether rsyslog has a property for the source system's client certificate.

Adding meta-data to every message also increases per-message size with highly redundant data - e.g. an extra 200 bytes of structured data elements added to each message. This additional overhead can be justified if per-message provenance is important, and compression can help compensate.

RFC5424 and structured data elements to add meta-data about an origin

Syslog RFC5424 allows for structured data elements (none or many). The standard already defines meta-data that can be added (Section 7 of RFC5424). Below is a quick summary of standard structured data elements:

origin:

  • ip
  • software
  • swVersion

meta:

  • sequenceId
  • sysUpTime
  • language

The 'origin' and 'meta' standard structured data elements don't cover the intended provenance use case fully. Both are provided by the client/log source/origin. E.g. the ip field of origin can occur multiple times, once for each IP interface a given host has, but in a provenance use case we want the server to note the IP the collector/syslog server received the message from.

See "6.3.5. Examples" in the standard as a good point to jump into it. Basically, via an example:

Given no appropriate standard, a custom structured data element rsyslog_relay@16543 can be pre-pended to include the following meta-data about how the log event was received:

| rsyslog property | Splunk CIM | Use |
| --- | --- | --- |
| inputname | app | rsyslog input module used, e.g. imudp, imptcp, imtcp, imrelp, etc. |
| fromhost | src | Source DNS name of the connected syslog client, or its IP if reverse DNS lookup fails |
| fromhost-ip | src_ip | Source IP of the connected syslog client |
| $myhostname | dest | Server instance processing the message |
| timegenerated | message_received_time | When the message was received |
| protocol-version | ?? | 1 if input conformed to RFC5424, or 0 if input was guessed to be RFC3164 |
| custom variable: $!tls | ?? | true/false flag for SSL/TLS |
| custom variable: $!authenticated_client | ?? | whether client authentication was applied, i.e. the 'authMode' applied for SSL/TLS - limited to true/false |

When rsyslog doesn't have a property available, we can generate it with our own custom variables.
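
For instance, a minimal sketch that binds each input to a ruleset which records how the message arrived (the ports, ruleset names and the assumption that the TLS listener enforces client authentication are all illustrative):

    module(load="imudp")
    module(load="imtcp")   # assumed to be configured with a TLS stream driver and authMode

    input(type="imudp" port="514" ruleset="remote_udp")
    input(type="imtcp" port="6514" ruleset="remote_tls")

    ruleset(name="remote_udp") {
        set $!tls = "false";
        set $!authenticated_client = "false";
        # ... common actions, or a call to a shared ruleset, go here
    }

    ruleset(name="remote_tls") {
        set $!tls = "true";
        set $!authenticated_client = "true";
        # ... common actions, or a call to a shared ruleset, go here
    }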

If rsyslog supported extracting properties from the input's TLS/SSL library, authentication information such as the following could be included (it is not, given there are no such documented features in RSyslog):

  • ssl_issuer_common_name:
  • ssl_subject_common_name:
  • ssl_serial:

Knowing which format the source message was in is complicated because rsyslog by default applies multiple parsers in a chain so that it can accept both RFC3164 and RFC5424 messages. Syslog Parsing and Message Parser have more details. In summary, legacy syslog (3164) was never properly standardised: any printable string is acceptable according to RFC3164, which is not an IETF standard, just an informational document. So if there were a 'format' field to indicate the source type of a message before storing it as RFC5424 with meta-data added, there are a number of scenarios for the format value that could usefully be added as meta-data:

  • rfc3164_unconventional
  • rfc3164
  • rfc5424
  • rfc5424_malformed

'unconventional' indicates that something didn't respect the conventions laid out by section 5 of RFC3164.

NB: 'rfc5424_malformed' may not be possible, given that if 5424 parsing fails, rsyslog next applies the 3164 parser, which accepts any printable string. Since a 3164 message priority (e.g. <177>) doesn't have a version, and a 5424 message priority does (e.g. <177>1), it's possible to check RFC5424 conformance, except for the highly rare edge case of an unconventional 3164 message whose content begins with "1 ".

More specific parsers should go earlier in the parser chain, and the default parser chain is rsyslog.rfc5424 followed by rsyslog.rfc3164. Parser chains are applied after input modules.

Options of interest for the parsing:

  • force.tagEndingByColon: default is off, but this can help avoid the case where a badly formatted RFC3164 message doesn't tag properly and RSyslog incorrectly pulls out something that was not intended as a tag
  • remove.msgFirstSpace

However, there isn't any documented property to indicate which parser was used. A workaround is using the protocol-version property, as sketched below.
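
A minimal sketch of such a parser chain, with a custom pmrfc3164 instance using the options above and the protocol-version workaround (the parser and ruleset names, and the $!format variable, are illustrative):

    # pmrfc3164 and pmrfc5424 are built-in parser modules and need no module() load
    parser(name="custom.rfc3164" type="pmrfc3164"
           force.tagEndingByColon="on"
           remove.msgFirstSpace="on")

    ruleset(name="remote_parse" parser=["rsyslog.rfc5424", "custom.rfc3164"]) {
        # protocol-version is "1" only when the RFC5424 parser matched
        if $protocol-version == "1" then {
            set $!format = "rfc5424";
        } else {
            set $!format = "rfc3164";
        }
    }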

The above helps establish a server perspective and view of the provenance/origin of a message. However, since syslog messages can be relayed, the information above is not complete provenance, just information about the peer.

TODO: Dealing with syslog relay scenarios is a future concern. Prepending more structured data elements for each relay won't work, given "the same SD-ID MUST NOT exist more than once in a message" (as per the RFC), so it's not that simple. Structured data parameters (fields) within an element can be repeated, however, so adding the server (dest) name as a field, and appending a known sequence of the same fields per relay, could better establish a full chain of provenance. This might entail using the mmpstrucdata module to check for an existing data element, at increased overhead...

Note further, templates are not affected by if-statements or config nesting.

If the structured data element follows Splunk CIM field names, the template string would be as follows:

[received@16543 message_received_time=\"%timegenerated:::date-rfc3339%\" src=\"%fromhost%\" src_ip=\"%fromhost-ip%\" dest=\"%$myhostname%\" app=\"rsyslog_%inputname%\" ssl=\"%$!tls%\" authenticated_client=\"%$!authenticated_client%\"]

However, elsewhere within the message from the client, information about the event might supersede fields such as src and dest.

If the structured data element follows rsyslog's default property names, the template string would be as follows:

[received@16543 timegenerated=\"%timegenerated:::date-rfc3339%\" fromhost=\"%fromhost%\" fromhost-ip=\"%fromhost-ip%\" myhostname=\"%$myhostname%\" inputname=\"%inputname%\" tls=\"%$!tls%\" authenticated_client=\"%$!authenticated_client%\"]
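
As a minimal sketch, either string can be embedded in a template() definition that emits an RFC5424-style message with the custom element included (the template name is illustrative, the example assumes the original message carried no structured data of its own, and exact spacing around %msg% depends on the parser options used):

    template(name="RFC5424WithMeta" type="string"
        string="<%PRI%>1 %TIMESTAMP:::date-rfc3339% %HOSTNAME% %APP-NAME% %PROCID% %MSGID% [received@16543 timegenerated=\"%timegenerated:::date-rfc3339%\" fromhost=\"%fromhost%\" fromhost-ip=\"%fromhost-ip%\" myhostname=\"%$myhostname%\" inputname=\"%inputname%\" tls=\"%$!tls%\" authenticated_client=\"%$!authenticated_client%\"] %msg%\n"
    )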

RFC3164 (legacy) converted to RFC5424 with meta-data pre-pended to structured data

RSyslog template string (see Rsyslog examples)

"<%PRI%>%TIMESTAMP% %HOSTNAME% %syslogtag%%msg%"

Example source RFC3164 message (obviously without meta-data) via UDP

<177>Jul 14 12:52:12 05766jpvanriel-vm test: FINDME using UDP from 127.0.0.1 to 127.0.0.1:514

After, with meta-data added

<177>1 2015-07-14T12:52:12 experience test - - - [received@16543 message_received_time="2017-07-18 12:52:13.116658983+02:00" src="experience.standardbank.co.za"
src_ip="196.8.110.47" dest="plogcs1p.standardbank.co.za" app="rsyslog_imudp" syslog_version="0" ssl="false"] FINDME using UDP from 127.0.0.1 to 127.0.0.1:514

RFC5424 (modern) converted with meta-data appended to structured data

RFC5424 message with an existing structured data element, but no meta-data added yet

<177>1 2015-07-14T12:52:12.916345047+02:00 05766jpvanriel-vm test - - - [testsd@16543 test_msg="RELP client with TLS"] FINDME using RELP from 127.0.0.1 to 127.0.0.1:514

After, RFC5424 message with meta-data prepended to the structured elements

<177>1 2015-07-14T12:52:12 test - - - [received@16543 message_received_time="2017-07-18 12:52:13.116658983+02:00" src="experience.standardbank.co.za"
src_ip="196.8.110.47" dest="plogcs1p.standardbank.co.za" app="rsyslog_imrelp" syslog_version="0" ssl="true" authentication_client="true"][testsd@16543 test_msg="RELP client with TLS"] FINDME using RELP from 127.0.0.1 to 127.0.0.1:514

Another interesting pre-defined structured data element (SD-ID) is 'timeQuality', but the source needs to populate it, and duplicating time quality data in every message would be overly redundant.

RSyslog allows for referencing properties. See rsyslog Properties. Properties of interest are:

  • Protocol version (only applicable to RFC5424)
    • protocol-version 1 (no obvious splunk CIM fieldname)
  • Time when log was received
    • date-rfc3339
    • timegenerated (message_received_time)
  • Log source/sender:
    • fromhost (src)
    • fromhost-ip (src_ip)

  • Input module name:
    • inputname (app)

Splunk CIM is useful for field naming conventions when we add metadata. The Protocols that map to Splunk CIM has a fuller listing of the Splunk CIM namespace across multiple categories ('Protocols').

Dealing with malformed RFC3164

A well-formed RFC3164 message should have a valid syslog priority (e.g. "<13>") and a header consisting of the timestamp "Mmm dd hh:mm:ss" (note: no year) and a hostname (usually up to 64 characters long).

Some regex examples are below (note they fail to account for an IPv4 or IPv6 address being allowed instead of a hostname).

Strict validation (67 steps)

"^<(([0-9]{1,2})|(1(([0-8][0-9])|([9][01]))))>((Jan)|(Feb)|(Mar)|(Apr)|(May)|(Jun)|(Jul)|(Aug)|(Sep)|(Oct)|(Nov)|(Dec)) (([ 0]?[1-9])|([12][0-9])|(3[01])|[1-9]) (2[0123]|[01][0-9]):([0-5][0-9]):(([0-5]?[0-9]|60)([:.,][0-9]+)?) (20[0-9]{2} )?[a-zA-Z0-9-]{1,63} "

The priority portion of the pattern on its own (it validates PRI values 0-191):

^<(([0-9]{1,2})|(1(([0-8][0-9])|([9][01]))))>

More relaxed validation that still checks for a sane date and time (57 steps)

"^<[0-9]{1,3}>((Jan)|(Feb)|(Mar)|(Apr)|(May)|(Jun)|(Jul)|(Aug)|(Sep)|(Oct)|(Nov)|(Dec)) (([ 0]?[1-9])|([12][0-9])|(?:3[01])|[1-9]) (2[0123]|[01][0-9]):([0-5][0-9]):(([0-5]?[0-9]|60)([:.,][0-9]+)?) [a-zA-Z0-9-]{1,63}"

More relaxed and faster (10 steps), but allows pattern that could be an invalid date (unlikely, but possible)

^<[0-9]{1,3}>[A-Z][a-z]{2} [ ]?[0-9]{1,2} [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2} [a-zA-Z0-9-]{1,63}

Given the regex complexity, comparing rsyslog properties, e.g. checking whether rawmsg-after-pri begins with the hostname, might be a faster way to see if the timestamp was obviously missing and was generated by the server instead of being sent by the host.
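
As a minimal sketch of that property-based shortcut (the check below tests for a conventional month prefix rather than comparing against the hostname, and the $!format variable name is illustrative):

    if not re_match($rawmsg-after-pri, "^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) ") then {
        # no conventional RFC3164 timestamp at the start of the raw message
        set $!format = "rfc3164_unconventional";
    }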

Related docker examples

RSyslog images

Changes needed for RHEL 7 default rsyslog config to be able to run in a container:

On docker hub:

Well matched examples for our use-case (specifically wanting to output to kafka)

Recently, as of Oct 2018, docker-rsyslog-bin from camptocamp seems to take a mature approach where:

  • Configuration is managed with an entrypoint wrapper script approach that allows other purpose-specific entrypoint scripts to be called in order, e.g. one to clean up stale rsyslog PID files.
  • The rsyslog config can be externally generated and mounted into the container while an entrypoint script waits for it.

Some of the above images are lightweight and expect a DIY config supplied by the user, e.g.:

  • Don't define expected data volumes for rsyslog queues
  • Don't have pre-defined inputs for various protocols like RELP and TLS options

Other examples (but no github source)

Almost suitable example for our use-case, but it uses a lesser-known/less-tested Go syslog implementation:

Other interesting examples

Docker specific (not syslog related; these bypass syslog and use kafka directly)

Container config methods

Configuration file templates inside of a container

As per Environment Variable Templating, it's a good idea to help legacy apps become more container- and micro-service-friendly by 'templating' configuration files. Some tools (in order of GitHub popularity) to help with this are:

  • Confd, uses Golang text/templates and TOML config files
  • dockerize, uses Golang text/templates
  • Tiller, a ruby gem which presumably uses ERB and can support reading environment variables

Confd:

While rsyslog does have a getenv function built into its config language, its usage seems limited.

See Re: set config values from env var

no.

there are ways to assign variables to be equal to environment variables with the set command, but those cannot be used for most config options.
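
For what it can do, a minimal getenv() sketch (KAFKA_TOPIC is an illustrative environment variable; the resulting variable can feed templates or if-statements, but not most module/action parameters):

    set $.kafka_topic = getenv("KAFKA_TOPIC");
    if $.kafka_topic == "" then {
        set $.kafka_topic = "syslog";   # fallback default
    }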

In any event, something like confd offers more flexibility for deployments that might use distributed config management and orchestration tools like etcd, zookeeper, etc.