
This is the regex I am using: `(?m)^.*(10(?:\.\d+){3}\/\d+)\s.*\s(\d+).*$`. It is supposed to capture the subnet IP and the last number on the line. There is one exception: if the subnet IP is directly followed by a newline, the capture should continue on the next line.

Example data:

*> 10.118.130.98/32 172.17.128.161 0 65000 4809 23 8705 8705 8705 8705 i *> 10.118.130.102/32 172.17.128.161 0 65000 4809 23 285 i

Capture group 1 should contain 10.118.130.98/32 and 10.118.130.102/32, and capture group 2 should contain 8705 and 285. This works well on regex101.com; in PowerShell, however, it catches only the first line.

  • your data - both the input and the desired output - is BADLY garbled. please, add code formatting around it so that it becomes readable ... and usable for testing. – Lee_Dailey Nov 28 '19 at 16:38
  • Are you sure this isn't just because you are missing the `-Pattern` parameter on the `Select-String` cmdlet? e.g. `$file | Select-String -Pattern '(.*/..)(\n)'` – Panomosh Nov 28 '19 at 16:39
  • Once you have time to test out my suggestion, please drop a comment below my answer, will you? – Wiktor Stribiżew Nov 29 '19 at 13:08

1 Answer

I suggest reading the file in as a single string rather than line by line, using `-Raw`. Then use a regex to find lines that contain only an IP-like string with a prefix length (the `/32` part) followed by line breaks, and remove the line breaks at those locations so each such line is joined with the one after it:

(Get-Content $file -Raw) -replace '(?m)^(\d+(?:\.\d+){3}/\d+)[\r\n]+', '$1' | Set-Content $file

Pattern details

  • (?m) - MULTILINE modifier option
  • ^ - start of a line
  • (\d+(?:\.\d+){3}/\d+) - Group 1: 1+ digits and then 3 repetitions of a dot and 1+ digits, then / and 1+ digits
  • [\r\n]+ - 1 or more CR or LF symbols.

The $1 is a placeholder containing Group 1 value.
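
To make the two steps concrete, here is a minimal sketch that joins the wrapped line and then applies the regex from the question to pull out both capture groups. The sample text is made up for illustration: it assumes each route line starts with the prefix (as the pattern above requires) and that the wrapped continuation line is indented. The property names `Prefix` and `LastNumber` are hypothetical.

```powershell
# Hypothetical sample: the first route wraps right after the prefix, the second does not.
$text = @(
    '10.118.130.98/32'
    '                    172.17.128.161 0 65000 4809 23 8705 8705 8705 8705 i'
    '10.118.130.102/32 172.17.128.161 0 65000 4809 23 285 i'
) -join "`r`n"

# Step 1: remove the line break after a line that holds only a prefix.
$joined = $text -replace '(?m)^(\d+(?:\.\d+){3}/\d+)[\r\n]+', '$1'

# Step 2: apply the regex from the question to the joined text.
$pattern = '(?m)^.*(10(?:\.\d+){3}\/\d+)\s.*\s(\d+).*$'
$routes = foreach ($m in [regex]::Matches($joined, $pattern)) {
    [pscustomobject]@{
        Prefix     = $m.Groups[1].Value   # capture group 1
        LastNumber = $m.Groups[2].Value   # capture group 2
    }
}
$routes
```

With this sample, `$routes` should hold two objects, one per route. If the real file has lines that begin with something other than the prefix, the step 1 pattern would need adjusting.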

Wiktor Stribiżew
  • This is what I tried: `(Get-Content 'D:\BGP.txt' -Raw) -replace '(?m)^(\d+(?:\.\d+){3}/\d+)[\r\n]+', '$1' | Set-Content .\test.txt`. The result is not good: `*> 10.118.130.98/32 172.17.128.161 4809 23 8705 8705 8705 8705 i *> 10.118.130.102/32 172.17.128.161 0 65000 4809 23 285 i` – Jozef Trubac Nov 29 '19 at 15:10
  • @JozefTrubac Yes, and? Did it work as expected? I have two lines after this: `10.118.130.98/32 172.17.128.161 0 65000 4809 23 8705` and `10.118.130.102/32 172.17.128.161 0 65000 4809 23 285` – Wiktor Stribiżew Nov 29 '19 at 15:11
  • @JozefTrubac So, you also get two lines, and isn't that expected? If your file looks different from what you showed in the question please update the question, do not use comments – Wiktor Stribiżew Nov 29 '19 at 15:15
  • If I run this in PowerShell it does not return anything, so I guess the regex does not work in PowerShell? `(Get-Content 'D:\BGP.txt' -Raw) | Select-String '(?m)^(\d+(?:\.\d+){3}/\d+)[\r\n]+'` – Jozef Trubac Nov 29 '19 at 15:43
  • @JozefTrubac It works in PS; I tested it in PowerShell. `(Get-Content 'D:\BGP.txt' -Raw) -replace '(?m)^(\d+(?:\.\d+){3}/\d+)[\r\n]+', '$1'` displays the changed file contents on screen. Please make sure you run the tests on the right file. Do not use `Select-String`. Use my code. – Wiktor Stribiżew Nov 29 '19 at 15:43
  • OK it works fine. I am sorry this was my mistake. I was running the regex in the part of my code where the input data were differently formatted. Thanks a lot guys. Big respect for finding time to help newbies like me. – Jozef Trubac Nov 29 '19 at 16:03
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/203334/discussion-between-jozef-trubac-and-wiktor-stribizew). – Jozef Trubac Nov 29 '19 at 16:10