I want to extract a specific MAC address from a log file, where it can appear in different formats.

For example, on these three lines:

Jun 16 10:24:28 (2248) Login OK: cli 88-c9-d0-fd-13-65 via TLS tunnel)

Jun 16 10:24:35 (2258) Login OK: cli f8:a9:d0:72:0a:dd via TLS tunnel)

Jun 16 10:24:44 (2273) Login OK: cli 485a.3f12.a35a via TLS tunnel)

with this regex:

([[:xdigit:]]{2}[:.-]?){5}[[:xdigit:]]{2} 

I can find every MAC address when searching inside the Linux command less.

Assuming I want to search for 48:5a:3f:12:a3:5a, how do I apply the same syntax to a specific MAC address in Python?

I tried to write something like this:

regex = re.compile(r'([[:xdigit:]]{2}[:.-]?){5}[[:xdigit:]]{2}')

for line in file:
   match = regex.search(line)

but obviously it doesn't work.

bonzix
  • Do you want to find a specific MAC address in a file? If so, would `cat exampleFile | grep exampleMACAddress` work for you? – Paradox Jul 19 '16 at 15:36
  • If I understand this correctly, you're looking for `48[:.-]?5a[:.-]?3f[:.-]?12[:.-]?a3[:.-]?5a`? If the answer is yes, you should probably take a look at some regex tutorials. – Aran-Fey Jul 19 '16 at 15:36
  • I'm looking for **48:5a:3f:12:a3:5a** but it may appear in the format **485a.3f12.a35a** or **48-5a-3f-12-a3-5a** – bonzix Jul 19 '16 at 15:45

1 Answer

You may use

r'\b[a-f0-9]{2}(?:([:-]?)[a-f0-9]{2}(?:\1[a-f0-9]{2}){4}|(?:\.?[a-f0-9]{2}){5})\b'

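Python's re module does not support POSIX bracket classes such as [[:xdigit:]], so the pattern spells the class out as [a-f0-9]; compile the regex object with the re.I flag so that uppercase hex digits match too. See the regex demo.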

Explanation:

  • \b - leading word boundary
  • [a-f0-9]{2} - 2 xdigits
  • (?: - start of a non-capturing group with 2 alternative patterns:
    • ([:-]?)[a-f0-9]{2}(?:\1[a-f0-9]{2}){4}:
      • ([:-]?) - Group 1 capturing an optional delimiter: a :, a -, or nothing
      • [a-f0-9]{2} - 2 xdigits
      • (?:\1[a-f0-9]{2}){4} - 4 sequences, each made of the same delimiter that Group 1 captured followed by 2 xdigits
    • | - or
    • (?:\.?[a-f0-9]{2}){5} - 5 sequences of an optional (1 or 0) dot (\.?) and 2 xdigits
  • ) - end of the non-capturing group
  • \b - trailing word boundary
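
To make the role of the \1 backreference concrete, here is a small sketch that runs the pattern against a handful of strings. The test strings are hypothetical (they are not taken from the log in the question), and `MAC_RE`, `candidates` and `results` are just illustrative names.

```python
import re

# Pattern copied from the answer above; re.IGNORECASE also accepts A-F.
MAC_RE = re.compile(
    r'\b[a-f0-9]{2}(?:([:-]?)[a-f0-9]{2}(?:\1[a-f0-9]{2}){4}|(?:\.?[a-f0-9]{2}){5})\b',
    re.IGNORECASE,
)

# Hypothetical test strings, chosen only to exercise each branch.
candidates = [
    "48:5a:3f:12:a3:5a",   # colons throughout
    "48-5A-3F-12-A3-5A",   # dashes throughout, uppercase
    "485a.3f12.a35a",      # dotted groups of four
    "485a3f12a35a",        # no delimiter at all
    "48:5a-3f:12-a3:5a",   # mixed delimiters
]

# fullmatch() requires the whole string to be a MAC address.
results = {c: MAC_RE.fullmatch(c) is not None for c in candidates}
for text, ok in results.items():
    print(text, ok)
```

Only the last string should be rejected: once Group 1 has captured a colon, every later \1 must also be a colon, so a mix of : and - cannot match.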

Sample Python demo:

import re

# Case-insensitive pattern: one consistent ':' or '-' delimiter (or none) between pairs, or optional dots
p = re.compile(r'\b[a-f0-9]{2}(?:([:-]?)[a-f0-9]{2}(?:\1[a-f0-9]{2}){4}|(?:\.?[a-f0-9]{2}){5})\b', re.IGNORECASE)
s = "Jun 16 10:24:28 (2248) Login OK: cli 88-c9-d0-fd-13-65 via TLS tunnel)\nJun 16 10:24:35 (2258) Login OK: cli f8:a9:d0:72:0a:dd via TLS tunnel)\nJun 16 10:24:44 (2273) Login OK: cli 485a.3f12.a35a via TLS tunnel)"
# Collect every MAC address found in the sample log text
print([x.group() for x in p.finditer(s)])
# =>  ['88-c9-d0-fd-13-65', 'f8:a9:d0:72:0a:dd', '485a.3f12.a35a']
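
The question asks about one specific address (48:5a:3f:12:a3:5a) rather than all of them. One possible way to get there, sketched below under the assumption that stripping the delimiters and lowercasing is an acceptable normalization, is to find every MAC-shaped token with the pattern above and compare its normalized form with the normalized target. The helper names `normalize_mac` and `lines_with_mac` are made up for this example and are not part of any library.

```python
import re

# Same pattern as in the answer, compiled case-insensitively.
MAC_RE = re.compile(
    r'\b[a-f0-9]{2}(?:([:-]?)[a-f0-9]{2}(?:\1[a-f0-9]{2}){4}|(?:\.?[a-f0-9]{2}){5})\b',
    re.IGNORECASE,
)

def normalize_mac(text):
    """Hypothetical helper: lowercase and drop everything except hex digits."""
    return re.sub(r'[^0-9a-f]', '', text.lower())

def lines_with_mac(lines, target):
    """Hypothetical helper: yield the lines that contain `target` in any format."""
    wanted = normalize_mac(target)
    for line in lines:
        for m in MAC_RE.finditer(line):
            if normalize_mac(m.group()) == wanted:
                yield line
                break  # one hit per line is enough

# The three sample lines from the question.
log_lines = [
    "Jun 16 10:24:28 (2248) Login OK: cli 88-c9-d0-fd-13-65 via TLS tunnel)",
    "Jun 16 10:24:35 (2258) Login OK: cli f8:a9:d0:72:0a:dd via TLS tunnel)",
    "Jun 16 10:24:44 (2273) Login OK: cli 485a.3f12.a35a via TLS tunnel)",
]

hits = list(lines_with_mac(log_lines, "48:5a:3f:12:a3:5a"))
print(hits)
```

With a real file, `log_lines` could be replaced by the open file object, since iterating over it yields one line at a time.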
Wiktor Stribiżew