Spam filter to work with flags from Rspamd and bogofilter #300

Open
wants to merge 1 commit into base: master

Conversation

Titan-C
Contributor

@Titan-C Titan-C commented Oct 18, 2020

I started using bogofilter as it is low on resources, and enough for my email load. I realized I was missing other header matching options for spam.

This also Closes #299, for Rspamd

P.S. Happy if you also give this PR a hacktoberfest-accepted label

@codecov

codecov bot commented Oct 18, 2020

Codecov Report

Merging #300 into master will decrease coverage by 0.21%.
The diff coverage is 25.00%.

@@            Coverage Diff             @@
##           master     #300      +/-   ##
==========================================
- Coverage   47.08%   46.86%   -0.22%     
==========================================
  Files          30       30              
  Lines        1079     1084       +5     
==========================================
  Hits          508      508              
- Misses        571      576       +5     
Flag Coverage Δ
#unittests 46.86% <25.00%> (-0.22%) ⬇️

Impacted Files Coverage Δ
afew/filters/HeaderMatchingFilter.py 33.33% <0.00%> (-8.78%) ⬇️
afew/filters/SpamFilter.py 60.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c3aa987...5c58e18.

@GuillaumeSeren
Collaborator

GuillaumeSeren commented Oct 19, 2020

Hello @Titan-C
thank you for this patch.

It looks fine to me, but this filter does not have a test (yet), so I will wait for a second opinion before merging it.

@flokli, would you like to take a look?


@nlewo nlewo left a comment

Wow, perfect timing (I submitted #299)! Thank you ;)

Member

@flokli flokli left a comment

This also needs updates to the documentation describing the behaviour, and a test (especially considering that this changes HeaderMatchingFilter, which is used by users directly, and in ListMailsFilter too).

self.header = [self.header]

try:
    value = next(filter(None, map(message.get_header, self.header)))
Member

This is pretty hard to read. I assume it'll call message.get_header() with every header configured in the filter, and filter out results that are an empty string (which can be the case if one of the headers doesn't exist)?

What's the next() in front of everything doing here? Why is value still a single element? What happens if the filter returns multiple elements?

Contributor Author

Yes, you read it right.

value is a single element: the value of the first header that is not empty. This is because HeaderMatchingFilter matches one string against one pattern, and until now it only looked at a single header. Spam is different: there are many spam-flagging tools, and each one sets its own header, so the filter needs to be able to search several headers.
I assume sysadmins don't run several spam filters at once; they just pick one tool. So I take the first non-empty value (that is what next() does), which may well be the only one. After that, the filter's existing logic runs unchanged and matches the pattern against that string, which is where the regex helps: SpamAssassin and Rspamd mark their spam with YES, while bogofilter uses Spam.
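
To make the behaviour concrete, here is a minimal sketch of the same idiom outside of afew. The dict-based lookup, the header names, and the pattern are assumptions chosen for illustration; they are not taken from the patch.

```python
import re

def first_header_value(get_header, names):
    # map() looks up every configured header, filter(None, ...) drops
    # empty results, and next() returns the first value that is left.
    return next(filter(None, map(get_header, names)))

# Illustrative header names and pattern (assumptions, not from the patch).
names = ['X-Spam-Flag', 'X-Spam', 'X-Bogosity']
pattern = re.compile(r'^(yes|spam)', re.IGNORECASE)

def lookup_for(mail):
    # Stand-in for message.get_header: a missing header gives ''.
    return lambda name: mail.get(name, '')

only_bogofilter = {'X-Bogosity': 'Spam, tests=bogofilter'}
two_tools = {'X-Spam-Flag': 'NO', 'X-Bogosity': 'Spam'}

value = first_header_value(lookup_for(only_bogofilter), names)
is_spam = bool(pattern.search(value))

# When several headers are set, only the first non-empty one is
# considered; the later 'Spam' value is never looked at.
first_of_two = first_header_value(lookup_for(two_tools), names)

# When no header is set at all, next() raises StopIteration.
try:
    first_header_value(lookup_for({}), names)
    raised = False
except StopIteration:
    raised = True
```

Note the second case: a mail carrying headers from two tools is judged only by whichever configured header comes first, which is the trade-off behind the "pick one tool" assumption.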

Member

While it makes sense in the scope of SpamFilter, this changes semantics in HeaderMatchingFilter in a non-intuitive and undocumented way.

Can we expose this in HeaderMatchingFilter in a less confusing way, and have it documented?
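
One possible shape for this, purely as a sketch and not a proposal for the actual afew API: spell the lookup out in a small helper with an explicit loop and a documented "nothing found" result, and normalise the configured header once. The names first_present_header and normalise_headers are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Union

def normalise_headers(header: Union[str, Iterable[str]]) -> List[str]:
    """Accept a single header name or several, and always return a list."""
    return [header] if isinstance(header, str) else list(header)

def first_present_header(get_header: Callable[[str], str],
                         names: Iterable[str]) -> Optional[str]:
    """Return the value of the first header that is set and non-empty.

    Headers are tried in the order given; None means none of them was set.
    """
    for name in names:
        value = get_header(name)
        if value:
            return value
    return None

# Usage with a plain dict standing in for a message (an assumption).
mail = {'X-Bogosity': 'Spam'}
lookup = lambda name: mail.get(name, '')

single = first_present_header(lookup, normalise_headers('X-Bogosity'))
several = first_present_header(lookup, normalise_headers(['X-Spam-Flag', 'X-Bogosity']))
missing = first_present_header(lookup, normalise_headers('X-Spam-Flag'))
```

Returning None instead of raising keeps the single-header case behaving as before, and the docstring gives the documentation a natural place to state that the order of headers matters.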

Successfully merging this pull request may close these issues.

spamfilter doesn't work when Rspamd is used instead of SpamAssassin
4 participants