I have a problem with spam messages whose Subject field is encoded as UTF-8 base64 and contains lookalike characters meant to defeat my filter rules.
Example, the raw subject of an incoming email:
Subject: =?UTF-8?B?UklGSVVU0J4gREkgUklOTtCeVtCe?=#821538
The subject as decoded by SpamAssassin contains the character О (U+041E, Cyrillic capital letter O) instead of the Latin letter O:
__SUBJ_NOT_SHORT ======> got hit: "RIFIUTО DI RINNOVO"
so this rule does not trigger:
header __SUBJECT_PHISHING_3 Subject=~ /(RIFIUTО DI RINNОVО)/i
However, email clients (Outlook or Thunderbird) display these characters as an O, so the subject reads as correct Italian and fools the user:
RIFIUTО DI RINNОVО
So the spammer inserts lookalike characters, knowing that the client will show them as normal Italian text while SpamAssassin will not trigger the rule.
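To confirm exactly which characters were swapped, the raw header can be decoded outside SpamAssassin. This is only a diagnostic sketch in Python using the standard library; it is not part of SpamAssassin, and the variable names are my own.

```python
import unicodedata
from email.header import decode_header

# Raw Subject value copied from the message above.
raw = "=?UTF-8?B?UklGSVVU0J4gREkgUklOTtCeVtCe?=#821538"

# decode_header() splits the value into (bytes_or_str, charset) parts;
# decode each part and join them back into one Unicode string.
subject = "".join(
    part.decode(charset or "ascii") if isinstance(part, bytes) else part
    for part, charset in decode_header(raw)
)

# Collect every non-ASCII character with its code point and Unicode name.
suspicious = [
    (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
    for ch in subject
    if ord(ch) > 127
]

print(subject)
for entry in suspicious:
    print(entry)
```

For this subject the list should contain three entries, all CYRILLIC CAPITAL LETTER O (U+041E), which is why a pattern written with the Latin letter O cannot match.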
Is there a way to match these characters, or to decode them the way the email client does, without having to create a new rule every time the spammer inserts a special character to bypass the filter?
I found the same problem discussed, with some hints, here: https://users.spamassassin.apache.narkive.com/LhGDKXkm/utf-8-spam-rules
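To illustrate the kind of matching I am after: fold the lookalike characters back to Latin first, then apply one ordinary pattern. The sketch below is plain Python, not a SpamAssassin feature. The LOOKALIKES table and both function names are hypothetical, and the table is far from complete. Note that standard Unicode normalization (NFKC) does not help here, since Cyrillic О is a distinct letter and not a compatibility variant of Latin O.

```python
import re

# Hypothetical, deliberately tiny table of Cyrillic letters that look Latin.
# A real filter would need a much larger mapping.
LOOKALIKES = str.maketrans({
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0415": "E",  # CYRILLIC CAPITAL LETTER IE
    "\u041E": "O",  # CYRILLIC CAPITAL LETTER O
    "\u0420": "P",  # CYRILLIC CAPITAL LETTER ER
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043E": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
})

# The phrase from my rule, written with Latin letters only.
PHISHING_SUBJECT = re.compile(r"RIFIUTO DI RINNOVO", re.IGNORECASE)


def fold_lookalikes(text: str) -> str:
    """Replace known lookalike letters with their Latin counterparts."""
    return text.translate(LOOKALIKES)


def subject_matches(decoded_subject: str) -> bool:
    """Match the phrase after folding lookalikes to Latin."""
    return PHISHING_SUBJECT.search(fold_lookalikes(decoded_subject)) is not None


# Decoded subject from the example above, with U+041E in place of each O.
spoofed = "RIFIUT\u041e DI RINN\u041eV\u041e#821538"

print(subject_matches(spoofed))                      # True
print(PHISHING_SUBJECT.search(spoofed) is not None)  # False
```

With folding, a single rule would cover any mix of Latin and Cyrillic letters, which is the behaviour I would like to get inside SpamAssassin.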