Support github's security-severity in SARIF #2123

Open
Jiri-Stary opened this issue Dec 19, 2023 · 8 comments

Comments

@Jiri-Stary

Jiri-Stary commented Dec 19, 2023

SARIF files uploaded to GitHub are recommended to have the security-severity field set for security issues, to allow mapping to critical, high, medium, and low severities.

Currently, SAF ignores this field when reading a SARIF file and maps the issue to a severity based only on pass/fail when converting to HDF or producing a summary. I think it would be nice to take it into account so severity is properly mapped.

https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#reportingdescriptor-object

Code scanning translates numerical scores as follows: over 9.0 is critical, 7.0 to 8.9 is high, 4.0 to 6.9 is medium and 3.9 or less is low.
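For reference, here is a minimal sketch (TypeScript, not the SAF mapper itself) of the score-to-label translation GitHub documents for the SARIF security-severity property; how the exact 9.0 boundary is handled follows the usual CVSS convention and is an assumption on my part:

```typescript
// Sketch of GitHub code scanning's documented translation of the
// `security-severity` score (a CVSS-style 0-10 value, stored as a string
// in the rule's properties) into a severity label.
type Severity = 'critical' | 'high' | 'medium' | 'low';

function githubSeverityBucket(score: number): Severity {
  if (score >= 9.0) return 'critical'; // "over 9.0"; 9.0 itself assumed critical per CVSS
  if (score >= 7.0) return 'high';     // 7.0 - 8.9
  if (score >= 4.0) return 'medium';   // 4.0 - 6.9
  return 'low';                        // 3.9 or less
}

// Example: a rule with "security-severity": "8.1" shows up as high.
console.log(githubSeverityBucket(parseFloat('8.1'))); // "high"
```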

@aaronlippold
Member

Happy to take a look - is this on our side SARIF to HDF or on the MS project side HDF to SARIF?

@Jiri-Stary
Author

@aaronlippold SARIF to HDF, definitely; I'm not sure about the other direction. When you import issues, they should ideally keep the same severity. Currently, importing from SARIF to HDF does not, since it does not take this property into account.

@aaronlippold
Member

The standard OHDF scale is the standard CVSS scale, normalized to 0-1:

0.0: not applicable / info
0.1 - 0.3: low
0.4 - 0.6: medium
0.7 - 0.8: high
0.9 - 1.0: critical

It seems like we just have a few boundary issues on the data so we could take a look at that.
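As a sketch (assuming impacts sit on the normalized 0-1 scale above; this is illustrative, not the actual SAF code), the bucketing looks like:

```typescript
// Illustrative bucketing of a normalized 0-1 impact value into the
// labels listed above. Boundary handling here is an assumption.
function impactToLabel(impact: number): string {
  if (impact >= 0.9) return 'critical';
  if (impact >= 0.7) return 'high';
  if (impact >= 0.4) return 'medium';
  if (impact >= 0.1) return 'low';
  return 'not applicable / info'; // 0.0
}
```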

That being said, here's the thinking I've come around to after doing this for a number of years.


The idea of OHDF is to try to standardize exactly these kinds of discrepancies, and when we get into the micro details of whether something right on the edge of one of these boundaries gets called one thing or the other, we are starting to get into splitting-hairs territory :-) Do you really sort findings at that level of detail?

Or are the buckets you really care about low, medium, and high-and-above?

Especially at the higher end, prioritizing a critical over a high really becomes a distinction without a difference, because the general guidance is to take all high and critical things (i.e. the things that can lead to privilege escalation) off the table first.

When I see data that separates findings out to the hundredths of a decimal point, I'm not really sure how much value those distinctions add in operational reality.

To take that one step further, even drawing a distinction within, say, the medium category between a 4 and a 6, rather than just calling everything in that set a 5, goes to the same point.

How we really think about these things is in one of four buckets:

0, 3, 5, 7/9+

Most policies operate around the concept of prioritization of risk:

Today, take all the high and higher findings out of the risk chain.

Tomorrow, get to any new high and higher issues, then take care of the mediums.

Thursday, see if you have some time to take care of the lows and nuisance things, but still address any new high and moderate findings that come in.

Friday, make sure you take care of any critical and high things that came in, make sure there are no new mediums, and if you have time take care of the lows and NAs you haven't yet gotten around to.

Come Monday, start the dance again.

@Jiri-Stary
Author

@aaronlippold

What I do care about are indeed the labeled buckets, plus one other thing: data consistency. I am ingesting data from a couple of different tools in SARIF format, and if I import them now they end up in different "buckets", so all criticals become "high" and so on. This throws off the reporting, and users are asking me to "fix" it because they see different numbers when they look in the tool directly. The resulting statistics over the data become more "fuzzy".

I agree in principle that criticals and highs should be treated first, but I am looking for a way to convert all the data I have into a common processing pipeline without losing any of it, and in my mind this decreases the data precision. What I am asking for is to use the existing value when it is provided, so that the severity is preserved.

Let's say there is an SLA and I am measuring how long it takes to deal with criticals, highs, and mediums; if I lose the severity precision, I am unable to do that.

@aaronlippold
Member

So the easiest way to fix it is just to adjust the data in the mapper so that when it gets SARIF data, the boundaries on the arrays are right.

@aaronlippold
Member

I was just commenting on my thinking and theory behind the whole normalization, which is the point of OHDF of course, so don't take my theory and my soapbox as a no :-) That's just the brain space I'm in overall.

It should be a simple PR, because it's just an adjustment to how the mapper takes the incoming data and maps it to the outgoing HDF impacts.
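A hedged sketch of what that adjustment could look like (the names ruleToHdfImpact, SarifRule, and DEFAULT_IMPACT are hypothetical, not the real mapper's identifiers): prefer security-severity when a rule provides it, otherwise keep the current fallback behavior.

```typescript
// Hypothetical helper: derive the HDF impact (0-1) for a finding from the
// SARIF rule's security-severity property when present, otherwise fall back
// to the existing default behavior.
interface SarifRule {
  properties?: { 'security-severity'?: string };
}

const DEFAULT_IMPACT = 0.5; // assumed fallback value; the real default may differ

function ruleToHdfImpact(rule: SarifRule): number {
  const raw = rule.properties?.['security-severity'];
  const score = raw !== undefined ? parseFloat(raw) : NaN;
  if (Number.isNaN(score)) return DEFAULT_IMPACT;
  // security-severity is a 0-10 CVSS-style score; HDF impact is 0-1.
  return Math.min(Math.max(score / 10, 0), 1);
}
```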

@Jiri-Stary
Author

Thanks, and I think it is awesome for all the tools that don't provide any numbers by default, like Checkov.

@aaronlippold
Member

So what's your suggested resolution? Happy to find a path.
