Consider these BPF filters in tcpdump/libpcap syntax:

1: not host x or host y
2: not (host x or host y)
3: not (host x or y)
4: not host x or y
5: (not host x) or host y
6: (not host x) or y

I believe host z matches all of the above (except 6, which has invalid syntax). My problem is with line 4. The tcpdump program treats it as equivalent to line 5, but I think that is not intuitive and therefore not correct. Lines 5 and 3 are unambiguous, but line 4 can be read either way. Because y cannot stand on its own, separate from the "host" keyword, I think it is wrong to treat line 4 like line 5.

What is the parse logic behind this? Can anyone explain why 1 == 4 == 5, but 2 != 4 and 3 != 4?

Cheatah
  • http://biot.com/capstats/bpf.html: "Negation has highest precedence". (1) could be viewed as `(not host x) or (host y)`, essentially. – Marc B Feb 01 '16 at 19:12
  • I agree in case (1). However, since y cannot be written separately, my brain builds the parse tree as not host (x or y), because putting brackets around "not host x" would lead to invalid syntax. In my mind, you should always be able to put brackets around what you actually meant without breaking the syntax. Therefore 4 is more like 3 and not like 1. – Cheatah Feb 01 '16 at 19:21

1 Answer

"I think that is not intuitive and therefore not correct."

Perhaps. But often intuition is in the eye of the beholder, and a precise specification is always more useful than "the parser does the intuitive thing". (Unless you like Perl, I suppose. But then you need the correct intuitions.)

Having said that, I can't find a precise specification of the pcap grammar, but `man pcap-filter` does explain how combinations of primitives and boolean operators are disambiguated:

"Negation has highest precedence. Alternation and concatenation have equal precedence and associate left to right."
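To see what that precedence rule means for the filters in the question, here is a rough sketch in Python. It models only the boolean logic, not libpcap itself: a "packet" is reduced to a single host name, and both function names are made up for illustration.

```python
# Hypothetical model: a packet is represented only by the host it involves.

def grouped_like_5(host):
    # (not host x) or host y -- the grouping that results when negation
    # binds tightest, as for filters 1, 4 and 5.
    return (not (host == "x")) or (host == "y")

def grouped_like_3(host):
    # not (host x or host y) -- the grouping of filters 2 and 3.
    return not ((host == "x") or (host == "y"))

for host in ("x", "y", "z"):
    print(host, grouped_like_5(host), grouped_like_3(host))
# x False False
# y True False
# z True True
```

Host z passes under both groupings, which is why it matches filters 1 to 5. The two groupings only disagree for host y.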

Many primitives consist of a keyword followed by an identifier, but the keyword may be omitted:

"If an identifier is given without a keyword, the most recent keyword is assumed."

That has no impact on grouping. The omitted keyword is inserted without changing the parse. The example makes that clear:

"For example,
    not host vs and ace
is short for
    not host vs and host ace
which should not be confused with
    not ( host vs or ace )"

What the description doesn't really make clear is why your example 6 is a syntax error. The reason is that a parenthesized expression is parsed recursively, so keywords inside the parentheses do not change "the most recent keyword" outside them. The bare y in example 6 therefore has no keyword to inherit.
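The rules above can be put together in a toy recursive-descent parser. This is only a sketch of the behaviour described here, not libpcap's actual grammar: the `explain` function, the tiny `KEYWORDS` set, and the error messages are all assumptions made for illustration.

```python
import re

KEYWORDS = {"host", "net", "port"}      # small illustrative subset
RESERVED = {"and", "or", "not", "(", ")"}

def explain(filter_text):
    """Return the filter fully parenthesized, with omitted keywords filled in."""
    tokens = re.findall(r"[()]|[^\s()]+", filter_text)
    pos = 0
    keyword = None                       # "the most recent keyword"

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        # 'and' and 'or' share one precedence level, left to right.
        left = term()
        while peek() in ("and", "or"):
            op = take()
            right = term()
            left = f"({left} {op} {right})"
        return left

    def term():
        nonlocal keyword
        tok = take()
        if tok == "not":                 # negation binds tightest
            return f"(not {term()})"
        if tok == "(":
            saved = keyword
            inner = expr()
            if take() != ")":
                raise SyntaxError("missing ')'")
            keyword = saved              # keywords inside do not leak out
            return inner
        if tok in KEYWORDS:
            keyword = tok
            tok = take()
        if tok is None or tok in RESERVED:
            raise SyntaxError("expected an identifier")
        if keyword is None:
            raise SyntaxError(f"no keyword to apply to {tok!r}")
        return f"{keyword} {tok}"

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return result

print(explain("not host x or y"))        # ((not host x) or host y)
print(explain("not (host x or y)"))      # (not (host x or host y))
```

In this model, filters 1, 4 and 5 all come out as `((not host x) or host y)`, filters 2 and 3 come out as `(not (host x or host y))`, and filter 6 raises a `SyntaxError` because the `host` keyword was only seen inside the parentheses.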

rici