I am trying to reject connections from specific user agents (by matching a substring of the User-Agent header) using an haproxy ACL with the -f option to read the patterns from a file. However, it is not working: haproxy behaves as if the rule were being ignored.

Can somebody with more haproxy experience pinpoint what I am missing, or offer some tips on how to debug this configuration?

I am running haproxy 1.4.18.

This is the excerpt from haproxy.cfg:

listen http 0.0.0.0:80
    acl abuser hdr_sub(user-agent) -f /etc/haproxy/abuser.lst
    tcp-request content reject if abuser
    mode http
    server www1 127.0.0.1:8080 maxconn 10000

This is the content of the abuser.lst file:

# annoying bots
annoyingbot1
annoyingbot2
raugfer

1 Answer

This question is old, but in case someone else runs into this problem:

Your problem comes from the fact that tcp-request content runs before HAProxy has had time to receive/read any layer 7 data.

How to fix this?

Easy: add a tcp-request inspect-delay:

listen http 0.0.0.0:80
    tcp-request inspect-delay 15s

    acl abuser hdr_sub(user-agent) -f /etc/haproxy/abuser.lst
    tcp-request content reject if abuser
    mode http
    server www1 127.0.0.1:8080 maxconn 10000

Here's the important bit about this from the HAProxy documentation:

Note that when performing content inspection, haproxy will evaluate the whole rules for every new chunk which gets in, taking into account the fact that those data are partial. If no rule matches before the aforementioned delay, a last check is performed upon expiration, this time considering that the contents are definitive. If no delay is set, haproxy will not wait at all and will immediately apply a verdict based on the available information. Obviously this is unlikely to be very useful and might even be racy, so such setups are not recommended.
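
To make the quoted behaviour concrete, here is a minimal Python sketch of it. This is a toy model, not HAProxy code: the `verdict` and `user_agent` helpers, the chunk timings, and the rule that an ACL which cannot be evaluated yet counts as "no match" are all assumptions made for illustration. It shows why a reject rule on the User-Agent can appear to be ignored when the header arrives in a later chunk and no delay is set.

```python
from typing import List, Optional, Tuple

# Same patterns as abuser.lst (hypothetical in-memory copy).
ABUSERS = ["annoyingbot1", "annoyingbot2"]


def user_agent(buffer: bytes) -> Optional[str]:
    """Return the User-Agent once the header block is complete, else None."""
    head, sep, _ = buffer.partition(b"\r\n\r\n")
    if not sep:
        return None  # headers are still partial
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"user-agent":
            return value.strip().decode("latin-1")
    return ""  # headers complete, but no User-Agent present


def verdict(chunks: List[Tuple[float, bytes]], inspect_delay: float) -> str:
    """Re-evaluate the reject rule on every chunk that arrives within the delay."""
    buffer = b""
    for arrival, data in chunks:
        if arrival > inspect_delay:
            break  # the delay expired before this chunk arrived
        buffer += data
        ua = user_agent(buffer)
        if ua is not None:
            # Substring match, like hdr_sub without -i (case-sensitive).
            return "reject" if any(s in ua for s in ABUSERS) else "accept"
    # Final check on expiration: the rule never matched, so nothing is rejected.
    return "accept"


first = b"GET / HTTP/1.1\r\nHost: example.com\r\n"
second = b"User-Agent: annoyingbot1/1.0\r\n\r\n"

print(verdict([(0.0, first), (0.05, second)], inspect_delay=0.0))   # no delay
print(verdict([(0.0, first), (0.05, second)], inspect_delay=15.0))  # with delay
```

In this model the first call decides on the first chunk alone, before the User-Agent has been seen, while the second call waits for the rest of the headers and can match the pattern. The delay is only an upper bound: the loop returns as soon as the headers are complete.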

liquidity
  • Is there any significant performance impact from using `tcp-request inspect-delay 15s`? – Gaurav Pundir Mar 15 '16 at 06:21
  • The description for `inspect-delay` is a bit thin, but I would guess that it behaves like a timeout, so the performance overhead should be potentially higher memory usage plus whatever overhead your OS has for reading the clock. I think that if the inspect delay is not used and you inspect the headers, you end up banning people whenever the inspected header does not happen to be included in the first TCP/IP segment, because that is the moment HAProxy starts to process the request. – Mikko Rantalainen Jan 27 '23 at 15:41