WIP: Added document to explain measurements in RITA #482

Open
wants to merge 38 commits into
base: master
from

Conversation

carrohan
Contributor

Added a document that explains measurements in RITA
Based on the documents I was given, I wasn't sure whether we wanted a document like this or one that literally went through and explained the column titles given in the show-* commands. If we want something else, I can easily rework this into that.

@ethack
Collaborator

ethack commented Aug 14, 2019

What you have is good too. I don't think the interval score and data size score are actually displayed in the beacon output, but we do store those in the database. I'm not sure if we should explain those in the doc or not.

But I would like something that goes through and literally explains the column titles in the show-* commands. Maybe something like this could go under each section in the document (e.g. "## Beacons") before each of the in-depth explanations. Each list item could be a link (if applicable) to the relevant section in the document, or have a short description if there is no section. If you need help figuring out how to get the links right, let me know. I think there are examples in our other docs though.

The following are the column titles you will see in the output of rita show-beacons. You can click them for more information.

  • Score (just link to the score section)
  • Source IP - IP address that initiated the connections.
  • Destination IP - IP address that received the connections.
  • Connections - The total number of connections between the source and destination IPs.
  • Avg Bytes - etc.
  • Intvl Range
  • Size Range
  • Top Intvl
  • Top Size
  • Top Intvl Count
  • Top Size Count
  • Intvl Skew (just link to the section)
  • Size Skew (just link to the section)
  • Intvl Dispersion (just link to the section)
  • Size Dispersion (just link to the section)

ethack and others added 2 commits August 14, 2019 12:57
Update the link to documentation about Security Onion with RITA to point to the new version of Security Onion's wiki.
@lisaSW lisaSW changed the title added document to explain measurements in RITA WIP: added document to explain measurements in RITA Sep 4, 2019
@Spriithy

Spriithy commented Jan 3, 2020

I fully agree with @ethack. I think documenting each column would provide more insight. As I suggested in issue #273, a similar approach could be useful for the analyzers, to help future or independent maintainers who might not have prior insider knowledge.

@Spriithy

Spriithy commented Jan 3, 2020

As I dug through the code and tried to make sense of the several indicators and scores used in the analyzer, I really wished I had some documentation to back my intuitions, mostly regarding the choices made.

Why pick 30 seconds in the computation of tsMadmScore and use 32 seconds right after for dsMadmScore?

Anyways, I think the indicators are all straightforward. Just the scores might need some explanation.

@ethack
Collaborator

ethack commented Jan 6, 2020

@Spriithy Thanks for the feedback!

To answer your question about these lines,

rita/pkg/beacon/analyzer.go

Lines 164 to 174 in d7f7b17

//lower dispersion is better, cutoff dispersion scores at 30 seconds
tsMadmScore := 1.0 - float64(tsMadm)/30.0
if tsMadmScore < 0 {
	tsMadmScore = 0
}
//lower dispersion is better, cutoff dispersion scores at 32 bytes
dsMadmScore := 1.0 - float64(dsMadm)/32.0
if dsMadmScore < 0 {
	dsMadmScore = 0
}

ts stands for timestamp and refers to the connection interval metrics. ds stands for data size and refers to the connection size metrics. In both cases the divisor sets the value that the score is normalized against. So anything with an interval dispersion greater than or equal to 30 seconds will have the same tsMadmScore of 0. Likewise, anything with a data size dispersion greater than or equal to 32 bytes will have the same dsMadmScore of 0.
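
As a minimal, standalone sketch of that normalization: the helper name madmScore and the sample dispersion values below are illustrative assumptions, not part of RITA's code. Only the cutoffs of 30 seconds and 32 bytes come from the snippet quoted above.

```go
package main

import "fmt"

// madmScore is a hypothetical helper (not a RITA function) that mirrors the
// quoted logic: the score falls linearly from 1 at zero dispersion to 0 at
// the cutoff, and is clamped to 0 for any dispersion at or beyond the cutoff.
func madmScore(madm, cutoff float64) float64 {
	score := 1.0 - madm/cutoff
	if score < 0 {
		return 0
	}
	return score
}

func main() {
	// Interval dispersion in seconds, normalized against the 30 second cutoff.
	for _, tsMadm := range []float64{0, 15, 30, 45} {
		fmt.Printf("tsMadm=%v s -> tsMadmScore=%.2f\n", tsMadm, madmScore(tsMadm, 30.0))
	}
	// Data size dispersion in bytes, normalized against the 32 byte cutoff.
	for _, dsMadm := range []float64{0, 8, 32, 64} {
		fmt.Printf("dsMadm=%v B -> dsMadmScore=%.2f\n", dsMadm, madmScore(dsMadm, 32.0))
	}
}
```

Because of the clamp, the score cannot distinguish between two connections whose dispersion both exceed the cutoff: 45 seconds and 450 seconds score the same.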

I'm not entirely sure how 30 seconds and 32 bytes were picked. It could be they were just arbitrary choices that tended to work well.

@Spriithy

Spriithy commented Jan 6, 2020

Thanks for the feedback! Is there some sort of roadmap for the project? Anything we could maybe contribute to?

@ethack
Collaborator

ethack commented Jan 6, 2020

No public roadmap :( But any issue marked "good first issue" would be very helpful and contributions would be welcome. If you're interested in any of them just start commenting on the issue with questions or a proposed solution.

@carrohan carrohan changed the title WIP: added document to explain measurements in RITA Added document to explain measurements in RITA Feb 14, 2020
@carrohan
Contributor Author

carrohan commented Feb 14, 2020

I updated the doc I'd written before to hopefully explain some of the headers better, but I didn't have a chance to get it proofread. I'm not totally sure it's what you were hoping for, @Spriithy, so please feel free (but not obligated by any means) to give any feedback. Apologies for taking so long to get back to this!

@lisaSW lisaSW self-requested a review February 14, 2020 20:22
@lisaSW lisaSW self-assigned this Feb 14, 2020
@lisaSW lisaSW changed the title Added document to explain measurements in RITA WIP: Added document to explain measurements in RITA Apr 9, 2020
@lisaSW lisaSW removed their assignment May 12, 2020
@thibaultbl

thibaultbl commented Apr 21, 2021

I think the document still needs to explain what counts as a good or bad beacon score.

  • It is not immediately clear that the score is not a probability.
  • It is not clear what threshold can be used to separate beacons from non-beacons (or even whether using a threshold makes sense).

10 participants