I'm new to Python and regex, and I have been trying to hide the IP addresses logged in a txt file. I should avoid for loops and if checks if possible, because the txt file is huge (158 MB).

(All the IP addresses start with 172.)

This is the code I tried:

import re
txt = "test"
x = re.sub(r"^172\.*", "XXX.\", txt)
print(x)

Sample txt file:

ABCDEFGHIJKLMNOPRST172.12.65.10RSTUVYZ
ASDG172.56.23.14FSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!ÂSDBSDF172.23.23.23SADASFSA
ASGFGD 172.12.23.56 ASDSAFASFDASSADSA

Desired output:

ABCDEFGHIJKLMNOPRSTXXX.XXX.XXX.XXXRSTUVYZ
ASDGXXX.XX.XX.XXFSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!ÂSDBSDFXXX.XXX.XXX.XXXSADASFSA
ASGFGD XXX.XXX.XXX.XXX ASDSAFASFDASSADSA
korimusk
  • ``re.sub(r'(172\.\d{1,3}\.\d{1,3}\.\d{1,3})', "XXX.XXX.XXX.XXX", text)`` – sushanth Aug 09 '20 at 07:21
  • Here is a dupe, https://stackoverflow.com/a/30654313/4985099 – sushanth Aug 09 '20 at 07:22
  • Another question: in the declaration I'm assigning "test" to the txt variable as a string. However, I want to read it from the file; what should I do? I tried: txt = open("test.txt", "r+"); x = re.sub(r'(172\.\d{1,3}\.\d{1,3}\.\d{1,3})', "XXX.XXX.XXX.XXX", txt) but it raises a type error: TypeError: expected string or bytes-like object – korimusk Aug 09 '20 at 08:38

2 Answers

You should indeed use re.sub.

re.sub(r"(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})", "XXX.XXX.XXX.XXX", tested_addr)

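An explanation of the regex (you don't really need the groups for what you've requested, but they are a nice way to understand its parts). Note that the ^ and $ anchors shown here restrict the match to a string (or, in multiline mode, a line) that contains only the address; the re.sub call above omits them so that addresses embedded in other text are matched: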

^(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})$

^ asserts position at the start of a line
1st Capturing Group (172)
  172 matches the characters 172 literally (case sensitive)
2nd Capturing Group (\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})
  \. matches the character . literally (case sensitive)
  Non-capturing group (?:[0-9]{1,3}\.){2}
    {2} Quantifier: matches exactly 2 times
    Match a single character present in the list below [0-9]{1,3}
      {1,3} Quantifier: matches between 1 and 3 times, as many times as possible, giving back as needed (greedy)
      0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
    \. matches the character . literally (case sensitive)
  Match a single character present in the list below [0-9]{1,3}
    {1,3} Quantifier: matches between 1 and 3 times, as many times as possible, giving back as needed (greedy)
    0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
$ asserts position at the end of a line
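
To make the role of the anchors concrete, here is a minimal sketch comparing the anchored pattern from the breakdown with the unanchored one passed to re.sub. The names UNANCHORED, ANCHORED, MASK, embedded and bare are illustrative only and do not come from the question.

```python
import re

# Unanchored: matches a 172.x.x.x address anywhere in the text.
UNANCHORED = re.compile(r"(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})")
# Anchored: matches only when the whole string is the address.
ANCHORED = re.compile(r"^(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})$")

MASK = "XXX.XXX.XXX.XXX"

embedded = "ASDG172.56.23.14FSDGHSFSDFDSFHSF"  # sample line from the question
bare = "172.56.23.14"                          # the address on its own

masked_embedded = UNANCHORED.sub(MASK, embedded)  # address is replaced
untouched = ANCHORED.sub(MASK, embedded)          # no match, text unchanged
masked_bare = ANCHORED.sub(MASK, bare)            # whole string is replaced

print(masked_embedded)
print(untouched)
print(masked_bare)
```

For log lines like the ones in the question, where the address sits in the middle of other characters, the unanchored form is the one that applies.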
Or Y

Use: 172(?:\.\d{1,3}){3}

Code:

import re

string = r'''ABCDEFGHIJKLMNOPRST172.12.65.10RSTUVYZ
ASDG172.56.23.14FSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!SDBSDF172.23.23.23SADASFSA
ASGFGD 172.12.23.56 ASDSAFASFDASSADSA'''

print(re.sub(r'172(?:\.\d{1,3}){3}', "XXX.XXX.XXX.XXX", string))

Output:

ABCDEFGHIJKLMNOPRSTXXX.XXX.XXX.XXXRSTUVYZ
ASDGXXX.XXX.XXX.XXXFSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!SDBSDFXXX.XXX.XXX.XXXSADASFSA
ASGFGD XXX.XXX.XXX.XXX ASDSAFASFDASSADSA
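
The TypeError reported in the comments under the question comes from passing the file object returned by open() to re.sub, which expects a string. One possible way to apply the same pattern to a large file is sketched below. The function name mask_file, its parameters, and the UTF-8 encoding are assumptions made for illustration, not part of the original question. It streams the file line by line so the whole 158 MB never has to sit in memory at once.

```python
import re

# Same pattern as above, compiled once and reused for every line.
IP_PATTERN = re.compile(r"172(?:\.\d{1,3}){3}")
MASK = "XXX.XXX.XXX.XXX"


def mask_file(src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path with every 172.x.x.x address masked.

    Hypothetical helper: the encoding default is an assumption and may
    need to be changed to match the actual log file.
    """
    with open(src_path, encoding=encoding) as src, \
         open(dst_path, "w", encoding=encoding) as dst:
        for line in src:  # iterating a file yields str lines, not the file object
            dst.write(IP_PATTERN.sub(MASK, line))
```

A call such as mask_file("test.txt", "masked.txt") would write the masked copy to a second file. If memory is not a concern, reading everything with src.read() and calling IP_PATTERN.sub once on the resulting string should also work.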

Demo & explanation

Toto