Add basic Syslog UDP collector bot #1611

Draft
wants to merge 2 commits into
base: develop

creideiki
Contributor

Extremely basic, probably too slow, but simple and working Syslog collector over UDP.

We will probably not be running this in production, but I had already written it as a proof of concept, and thought it marginally more useful to share the code than quietly disposing of it. Especially since the existing pull request for a Syslog collector in #848 no longer works because of changes in both IntelMQ and Python.

ghost added this to the 2.3.0 milestone Oct 10, 2020
ghost added the "component: bots" and "feature request" labels Oct 10, 2020
ghost self-assigned this Oct 10, 2020
@ghost

ghost commented Oct 10, 2020

Thanks for your contribution!

This PR is marked as draft, is this intentional?

Even if it is not perfect, I'm fine with merging it as long as it is functional. I'd add some explanation in Bots.md, also linking rsyslog's documentation as a hint for set-up (e.g. https://www.rsyslog.com/doc/master/configuration/examples.html, which contains examples).

@creideiki
Contributor Author

> This PR is marked as draft, is this intentional?

Yes, as I do not consider this functionality even remotely ready for production. I made this as a proof of concept, but for production we'll be sending syslog traffic using AMQP through a RabbitMQ server. Obvious deficiencies in this bot include:

  1. Only does UDP, not TCP.
  2. Synchronous single-threaded design means abysmal performance and probably dropped messages under even light load.
  3. Doesn't validate the syslog data format at all. Luckily, syslog is simple enough that treating it as a plain text string sort of works, but doing so loses information such as the reporting hostname and the timestamp.
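
For illustration only, and not the code from this PR: a minimal sketch of the kind of synchronous, single-threaded UDP receive loop described in the list above. The function names, buffer size, and timeout are assumptions made up for this example.

```python
import socket


def open_udp_socket(host="127.0.0.1", port=0):
    """Bind a UDP socket. Port 0 lets the OS pick a free port (handy for tests)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_datagrams(sock, count, bufsize=65535, timeout=2.0):
    """Read up to `count` datagrams, one at a time, and return them as text.

    Nothing else reads the socket while a datagram is being handled, so
    under load the kernel receive buffer can fill up and further datagrams
    are dropped silently. That is the weakness named in item 2.
    """
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(bufsize)
        except socket.timeout:
            break  # nothing arrived within the timeout; stop waiting
        # Treat the payload as opaque text; no syslog validation happens here.
        messages.append(data.decode("utf-8", errors="replace"))
    return messages
```

Each returned string is the raw datagram payload, so anything encoded in the syslog header (item 3) is still in there but not extracted.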

> Even if it is not perfect, I'm fine with merging it as long as it is functional. I'd add some explanation in Bots.md, also linking rsyslog's documentation as a hint for set-up (e.g. https://www.rsyslog.com/doc/master/configuration/examples.html, which contains examples).

I'm wary of people ignoring the documented caveats and using this code for things it wasn't designed for, losing data in the process.

@ghost

ghost commented Oct 16, 2020

Thanks for your response. I think the collector should be called "UDP", not "Syslog", as syslog is just the data format (relevant for parsing), not the transport protocol.

@creideiki
Contributor Author

Something like this (which is totally untested)?

This does present the problem that there is already a collector named "tcp", which accepts IntelMQ messages, not raw bytes. Maybe this one should be called "udp_text" or "udp_raw" to distinguish them, and to make clear that two other possible bots (IntelMQ messages over UDP and raw text over TCP) are not implemented?

Extremely basic, probably too slow, but simple and working Syslog
collector over UDP.
The bot really doesn't care about the Syslog data format, just that
it can receive text in UDP packets. Handling Syslog is the job
of a later parser bot.
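
As a rough idea of what such a later parsing step might do, here is a hedged sketch that pulls the priority, timestamp, and hostname out of a BSD-style (RFC 3164) syslog line. It is not an existing IntelMQ parser; the regular expression and the `parse_syslog_line` name are assumptions for illustration, and real-world syslog senders deviate from this layout often enough that a production parser would need to be more tolerant.

```python
import re

# Assumed layout: "<PRI>Mmm dd hh:mm:ss HOSTNAME rest-of-message"
SYSLOG_3164 = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<message>.*)$",
    re.DOTALL,
)


def parse_syslog_line(line):
    """Return a dict of header fields, or None if the line does not match."""
    match = SYSLOG_3164.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        # The priority value encodes facility * 8 + severity.
        "facility": pri // 8,
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "hostname": match.group("hostname"),
        "message": match.group("message"),
    }
```

Lines that do not match return `None`, so a caller can still pass them on as plain text instead of dropping them.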
@codecov-io

codecov-io commented Oct 20, 2020

Codecov Report

Merging #1611 into develop will decrease coverage by 0.05%.
The diff coverage is 35.71%.

@@             Coverage Diff             @@
##           develop    #1611      +/-   ##
===========================================
- Coverage    75.55%   75.50%   -0.06%     
===========================================
  Files          391      392       +1     
  Lines        19700    19728      +28     
  Branches      2708     2709       +1     
===========================================
+ Hits         14885    14895      +10     
- Misses        4230     4248      +18     
  Partials       585      585              
Impacted Files Coverage Δ
intelmq/bots/collectors/udp/collector.py 35.71% <35.71%> (ø)

@ghost

ghost commented Oct 21, 2020

Concerning the TCP collector issue: previously we had no use case for the TCP collector other than the IntelMQ-to-IntelMQ connection. If we have more, I'd be in favour of offering both functionalities: the collector could then receive arbitrary input (like syslog) as well as the IntelMQ "flavor" (with the "Ok" message).

cc @e3rd (tcp collector/output author & user)

@e3rd
Member

e3rd commented Oct 21, 2020

If I remember correctly, the TCP output has the parameter counterpart_is_intelmq. Depending on that, it waits for an "Ok" message after each message it sends.
The TCP collector just sends "Ok" after every message it gets, but I assumed this would not pose a problem for arbitrary input. If it does, a counterpart_is_intelmq parameter could easily be added so that the collector stops sending "Ok".
That was the question, right?
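
A minimal sketch of the idea, assuming a hypothetical `counterpart_is_intelmq` switch on the receiving side. This is not the existing TCP collector: the `handle_connection` helper is made up, and the newline-delimited framing is chosen purely to keep the example short, not because it matches IntelMQ's wire format.

```python
import socket


def handle_connection(conn, counterpart_is_intelmq=True):
    """Read newline-delimited messages from `conn` until the peer stops sending.

    When `counterpart_is_intelmq` is true, acknowledge every message with
    "Ok"; when false, stay silent so arbitrary senders are not confused by
    unexpected replies.
    """
    messages = []
    with conn.makefile("rb") as reader:
        for raw in reader:
            messages.append(raw.rstrip(b"\n").decode("utf-8", errors="replace"))
            if counterpart_is_intelmq:
                conn.sendall(b"Ok")  # acknowledgement expected by the sender
    return messages
```

With the flag off, the receiver behaves like a plain text sink, which is roughly what a raw-text TCP collector would need.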

ghost removed this from the 2.3.0 milestone Feb 5, 2021
ghost removed their assignment Aug 20, 2021
Labels
component: bots, feature request

3 participants