As part of managing my own web presence, including a hosted email server with a limited set of users (both in number and geography), I tend to cut large swathes of spam by simply “binning” any email that has any association with specific TLDs, like .ru, .us or .cn, where I know that my users and I have no legitimate reason to receive email coming from those TLDs or passing through servers using them.
However, some ham was getting caught, and simply looking at the email headers was not helping. cPanel’s built-in testing tool was helpful in surfacing which of my rules was triggering the spam trap, but not exactly why (or which part of the email was triggering it).
The triggering rule looked like regex, so I immediately tried to hunt down the converted/parsed filter file so that I could copy the rule in its converted regular expression form.
Poking at ~/.cpanel/filter.yaml, ~/.cpanel/filter.cache and even /etc/vfilters/<domain> did not turn up the regular expressions I was looking for.
In desperation, I took a quick look at the cPanel test tool results and decided to just copy the regex shown outright…
Unfortunately, pasting that regex directly into a regex test tool did not work…
Looking at the copied regex, several corrections were required. Specific to my rule, there were two variations per intended TLD: “$header_from: matches <TLD>” or “$message_headers matches <TLD>”.
This then led to the following required changes to the copied regex:
- replace the “never-ending” nesting (via brackets) with a simple “OR”, since precedence was not a requirement for my specific rule, as all the operators were logical OR (i.e. “||”)
- remove all references to specific fields, which have no meaning in the regex test tool (taking into account left/right space padding as well as the preceding closing bracket as per point #1), replacing them with logical OR (i.e. “||”), e.g.:
  - “) or $header_from: matches” > “||”
  - “) or $message_headers matches” > “||”
- remove the first, non-repeating instance of the fields, taking into account the “or” prefix that appears on every field instance except the first, e.g.:
  - “$header_from: matches” > (nothing)
  - “or $message_headers matches” > (nothing)
- remove the doubled escaping of the backslash (which is not needed for the regex test tool): “\\” > “\”
- remove all the leading nested brackets (just select and delete any leading “(” until the first regular expression)
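The cleanup steps above can also be scripted rather than done by hand. The following is only a rough Python sketch under stated assumptions: the RAW_SAMPLE string is a made-up expression shaped like the description above (not real test tool output), clean_filter_expression is a hypothetical helper name, and the alternatives are joined with a single “|”, which is the alternation operator in Python’s re module, rather than the “||” used in my notes.

```python
import re

# Hypothetical expression shaped like the one described in the post:
# nested leading brackets, ") or <field> matches" separators, doubled backslashes.
RAW_SAMPLE = (
    r"((($header_from: matches \\.ru\\b) or $message_headers matches \\.ru\\b)"
    r" or $header_from: matches \\.cn\\b) or $message_headers matches \\.cn\\b"
)

# The two field prefixes named in the post; the exact spacing is an assumption.
FIELD = r"\$(?:header_from:|message_headers)\s+matches"


def clean_filter_expression(raw: str) -> str:
    """Reduce a copied filter expression to a plain regex alternation."""
    expr = raw.strip()
    # Replace each ") or <field> matches" separator with a regex alternation.
    expr = re.sub(r"\s*\)\s*or\s+" + FIELD + r"\s*", "|", expr)
    # Remove the remaining (first) field reference, which has no "or" prefix.
    expr = re.sub(FIELD + r"\s*", "", expr)
    # Collapse doubled backslashes into single ones.
    expr = expr.replace("\\\\", "\\")
    # Strip the leading nested brackets left over from the nesting.
    return expr.lstrip("(")


print(clean_filter_expression(RAW_SAMPLE))
# prints: \.ru\b|\.ru\b|\.cn\b|\.cn\b
```

The duplicated alternatives are harmless; they simply reflect the two field variations per TLD. Note that stripping every leading “(” would also damage a rule whose first pattern legitimately starts with a group, so this only suits simple rules like mine.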
Testing with the cleaned-up regex now worked!
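For anyone without a regex test tool to hand, the same check can be approximated in a few lines of Python, which also answers the original question of which part of the email trips the rule. This is a minimal sketch, not what the cPanel tool does internally: find_triggers is a hypothetical helper, the headers are invented sample data, and case-insensitive matching is an assumption about how the filter behaves.

```python
import re

# Invented headers for illustration; example.* and 192.0.2.1 are reserved test values.
SAMPLE_HEADERS = """\
From: Alice <alice@example.com>
Received: from relay.example.ru (relay.example.ru [192.0.2.1])
Subject: Invoice"""


def find_triggers(pattern, raw_headers):
    """Return (line number, matched text, full line) for every regex hit."""
    compiled = re.compile(pattern, re.IGNORECASE)  # case-insensitivity is assumed
    hits = []
    for lineno, line in enumerate(raw_headers.splitlines(), start=1):
        for match in compiled.finditer(line):
            hits.append((lineno, match.group(0), line))
    return hits


for lineno, matched, line in find_triggers(r"\.ru\b|\.cn\b", SAMPLE_HEADERS):
    print(f"line {lineno}: {matched!r} in {line!r}")
```

In this sample the sender is a .com address, yet the Received line still matches twice, which is exactly the kind of “passing through servers” hit that is easy to miss when only eyeballing the From header.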