From abuse.ch one can get a plain-text file of malware-distributing URIs. I want to use this as a blacklist for the Squid proxy (I am not yet sure about the runtime behavior). It should not be too hard to convert the URI file into a regex file for `acl aclname url_regex ...` using sed, but I am struggling to find a description of Squid's regex syntax that identifies all the special characters I have to escape.

Thomas P
- The Squid site has a nice wiki: https://wiki.squid-cache.org/SquidFaq/SquidAcl – djdomi Mar 01 '22 at 17:46
- I know this wiki page, but it describes the acl syntax, not the regex syntax. – Thomas P Mar 02 '22 at 07:06
- You must be more specific and produce at least one example clearly stating what you intend to do. Anyway, assuming you just need to parse a hosts file [https://urlhaus.abuse.ch/downloads/hostfile/](https://urlhaus.abuse.ch/downloads/hostfile/), you may try this: search for `^(#.*$(\n|\r\n)?|127.*\t)` and replace with `""` – mjoao Jun 22 '22 at 10:11
- I'm looking for a description of the regex syntax itself: which metacharacters, quantifiers, modifiers, ... are allowed. This differs slightly between Perl, PHP, Java, and so on. – Thomas P Jun 24 '22 at 09:24
1 Answer
Squid understands GNU regex (Extended Regular Expressions, a.k.a. ERE).
It does not fully understand Perl-compatible regular expressions (PCRE).
For example, the following are not supported: `\w`, `\d`, `\W`, `\D`, lookahead, negative lookahead, shy (non-capturing) groups, atomic groups, etc.
Working examples:
^(outlook-[1-9]\.cdn|attachments|res\.cdn)\.office\.net$
^c[0-9]+.*(powerpoint|word|excel|visio).*[0-9]{2}\.cdn\.office\.net$
^trello-[a-zA-Z0-9]+\.s3\.amazonaws\.com$
Non-working examples (PCRE syntax that Squid does not accept):
^(outlook-\d\.cdn|attachments|res\.cdn)\.office\.net$
^c\d+.*(powerpoint|word|excel|visio).*\d{2}\.cdn\.office\.net$
^trello-\w+\.s3\.amazonaws\.com$
^rr?[1-9]-{2,4}sn-(?!.*-apn[a-z]).*\.googlevideo\.com$
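For the shorthand classes, the rewrite from PCRE to ERE is mechanical. Below is a minimal Python sketch of that idea. The helper name `pcre_shorthand_to_ere` is made up for illustration; it only handles `\d`, `\D`, `\w` and `\W` outside bracket expressions, and it cannot translate lookaheads or other PCRE-only constructs, which have to be rewritten by hand.

```python
# Hypothetical helper (not part of Squid): replace PCRE shorthand classes
# with ERE bracket expressions. Assumes ASCII semantics for \d and \w.
SHORTHAND = {
    r"\d": "[0-9]",
    r"\D": "[^0-9]",
    r"\w": "[A-Za-z0-9_]",
    r"\W": "[^A-Za-z0-9_]",
}


def pcre_shorthand_to_ere(pattern: str) -> str:
    out = []
    i = 0
    while i < len(pattern):
        pair = pattern[i:i + 2]
        if pair in SHORTHAND:
            # Known shorthand class: emit the bracket expression instead.
            out.append(SHORTHAND[pair])
            i += 2
        elif pattern[i] == "\\" and i + 1 < len(pattern):
            # Any other escape (e.g. \. or \\) is copied through unchanged.
            out.append(pair)
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(pcre_shorthand_to_ere(r"^trello-\w+\.s3\.amazonaws\.com$"))
```

Note that `\w` also matches `_`, so the result is slightly broader than the hand-written `[a-zA-Z0-9]` in the working example above.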
More info:
https://www.gnu.org/software/gnulib/manual/html_node/Regular-expressions.html
https://www.gnu.org/software/grep/manual/html_node/Regular-Expressions.html
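As for the original goal of turning a list of literal URLs into a `url_regex` file, here is a rough Python sketch of the escaping step. It assumes the input is one URL per line with `#` comment lines, and that only the ERE metacharacters `\ . ^ $ * + ? ( ) [ ] { } |` need a backslash. The function names are made up, and anchoring only at the start with `^` is my own choice, so that a listed URL still matches when something is appended to it.

```python
import re

# Characters that are special in ERE outside a bracket expression.
ERE_SPECIAL = re.compile(r"([\\.^$*+?()\[\]{}|])")


def escape_ere(literal: str) -> str:
    """Backslash-escape every ERE metacharacter in a literal string."""
    return ERE_SPECIAL.sub(r"\\\1", literal)


def urls_to_regex_lines(lines):
    """Yield one anchored regex per URL, skipping blanks and # comments."""
    for line in lines:
        url = line.strip()
        if not url or url.startswith("#"):
            continue
        yield "^" + escape_ere(url)


if __name__ == "__main__":
    sample = ["# comment line", "", "http://example.com/a.php?id=1"]
    for regex in urls_to_regex_lines(sample):
        print(regex)
```

The output, one pattern per line, could then be written to a file and referenced from the acl line; how well Squid copes with a very large regex list at runtime is a separate question.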

mjoao