
This is the data I am trying to parse:

10.186.128.0/20 172.17.128.161 0 65000 8788

10.186.128.0/20 172.17.128.161 0 65000 878

10.186.128.0/20 172.17.128.161 0 65000 87

Ideally, the output should contain the IP address from the beginning of each line together with the last 2, 3, or 4 digits. Example of the desired output:

10.186.128.0/20 8788

10.186.128.0/20 878

10.186.128.0/20 87

I have a regex that matches the IP address: `10\.\d*\.\d*\.\d*\/\d\d`

And I have a second regex that matches the last 2, 3, or 4 digits: ` \d{4}$| \d{3}$| \d{2}$`

How can I combine these two regular expressions in PowerShell to achieve the desired result?

Thanks

  • Maybe `$s -replace '^(10(?:\.\d+){3}/\d+)\s.*\s(\d+)$', '$1 $2'`? Or do you mean the text contains lines that need to be extracted first? – Wiktor Stribiżew Nov 26 '19 at 13:54
  • If you are trying to match an IP address, there will never be more than 3 digits between the dots, and never fewer than 1. You should change your regex to `10\.\d{1,3}\.\d{1,3}\.\d{1,3}\/\d{2}` – Nicolas Nov 26 '19 at 13:58
  • Cool. Thanks. I modified my IP matching regex. However, how do I achieve the desired outcome of getting the IP and also the last 2, 3, or 4 digits in the string? – Jozef Trubac Nov 26 '19 at 14:02
  • I've created a [regex 101](https://regex101.com/r/imc77v/1) for you to test more cases. Basically, I've added a space and a match for 2 to 4 digits. – Nicolas Nov 26 '19 at 14:10
  • `Select-String '(?m)^(10(?:\.\d+){3}/\d+)\s.*\s(\d+)\r?$' -input $txt -AllMatches | Foreach {$_.matches} | Foreach {$_.groups[1].value + " " + $_.groups[2].value}`? Note the `$txt` here is a multiline string input. It outputs expected result in PS 6.1.3 – Wiktor Stribiżew Nov 26 '19 at 14:21
  • Try also `Get-Content $filepath | Select-String '^(10(?:\.\d+){3}/\d+)\s.*\s(\d+)$' -AllMatches | Foreach-Object {$_.Matches} | Foreach-Object {$_.Groups[1].Value + " " + $_.Groups[2].Value}` – Wiktor Stribiżew Nov 26 '19 at 14:29
  • The last comment from Wiktor Stribiżew worked exactly the way I needed. Many thanks. Feel free to post it as an answer. – Jozef Trubac Nov 26 '19 at 15:01

3 Answers

You may use

Get-Content $filepath | Select-String '^(10(?:\.\d+){3}/\d+)\s.*\s(\d+)$' -AllMatches | Foreach-Object {$_.Matches} | Foreach-Object {$_.Groups[1].Value + " " + $_.Groups[2].Value}

The `^(10(?:\.\d+){3}/\d+)\s.*\s(\d+)$` regex (see its online demo) matches:

  • `^` - start of string
  • `(10(?:\.\d+){3}/\d+)` - Group 1: `10`, then three repetitions of a dot followed by 1+ digits, then `/` and 1+ digits
  • `\s.*\s` - a whitespace, any 0+ chars other than newline (as many as possible), and a whitespace
  • `(\d+)` - Group 2: 1+ digits
  • `$` - end of string.

So,

  • `Get-Content $filepath` reads the file and sends it down the pipeline line by line
  • `Select-String '^(10(?:\.\d+){3}/\d+)\s.*\s(\d+)$' -AllMatches` finds all matches of the pattern in each of those lines
  • `Foreach-Object {$_.Matches}` emits the match objects one by one
  • `Foreach-Object {$_.Groups[1].Value + " " + $_.Groups[2].Value}` concatenates the Group 1 and Group 2 values, separated by a space.
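
To try the pattern without a file, the same regex can be run against an in-memory array. This is only a sketch: `$lines`, `$pattern`, and `$result` are illustrative names that do not appear in the answer, and the sample lines are copied from the question. It uses the `-match` operator and the automatic `$Matches` variable instead of `Select-String`.

```powershell
# Sample lines from the question; the variable names are illustrative only.
$lines = @(
    '10.186.128.0/20 172.17.128.161 0 65000 8788'
    '10.186.128.0/20 172.17.128.161 0 65000 878'
    '10.186.128.0/20 172.17.128.161 0 65000 87'
)

$pattern = '^(10(?:\.\d+){3}/\d+)\s.*\s(\d+)$'

# -match on a single string fills the automatic $Matches hashtable
# with the captured groups (index 1 and 2 here).
$result = foreach ($line in $lines) {
    if ($line -match $pattern) {
        '{0} {1}' -f $Matches[1], $Matches[2]
    }
}

$result
```

Lines that do not match the pattern are silently skipped, which is usually what you want when the input also contains headers or blank lines.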
Wiktor Stribiżew

Using -split seems much simpler if all of your data is consistently in the posted format.

Get-Content -Path file.txt |
    Foreach-Object { [string]($_ -split ' ')[0,-1] }

Explanation:

`-split` uses regex matching to split a string into an array of strings. Here the string is split on a single space. `[0,-1]` selects the first (index 0) and last (index -1) elements of the array.

`[string]` casts the two-element array to a single string. Because PowerShell joins array elements with a space (the default `$OFS` separator) when an array is cast to a string, this is just a shortcut for joining them explicitly.
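
The individual steps can be seen in isolation in the following sketch. The variable names are illustrative, and the `$ragged` line with irregular spacing is a made-up variation, not data from the question. It assumes `$OFS` has not been changed from its default.

```powershell
# One sample line from the question.
$line = '10.186.128.0/20 172.17.128.161 0 65000 8788'

$parts        = $line -split ' '       # array of 5 strings
$firstAndLast = $parts[0, -1]          # first and last element
$joined       = [string]$firstAndLast  # joined with $OFS (a space by default)
$explicit     = $firstAndLast -join ' ' # same result, independent of $OFS

# Hypothetical input with repeated and trailing spaces: splitting on ' '
# would yield empty elements, so the unary form of -split, which splits
# on runs of whitespace, is used instead.
$ragged    = '10.186.128.0/20   172.17.128.161  0 65000 8788 '
$fromRagged = (-split $ragged)[0, -1] -join ' '

$joined
$explicit
$fromRagged
```

If the spacing in the real data is not guaranteed to be a single space, the unary `-split` form is the safer of the two.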

AdminOfThings

If a quick way to modify the string is all that's needed, a simple replace with one of these two regexes will do:

$string -replace '(?<=\b10\.\d{0,3}\.\d{0,3}\.\d{0,3}/\d{2}).*(?=[ \t]\d{1,4}\b)', ''

or

$string -replace '(?m)(?<=^[ \t]*10\.\d{0,3}\.\d{0,3}\.\d{0,3}/\d{2}).*(?=[ \t]\d{1,4}[ \t]*$)', ''
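
To see what the replacement leaves behind, the first pattern can be applied to the sample lines from the question. This is a minimal sketch with illustrative variable names (`$lines`, `$pattern`, `$trimmed`). The lookbehind and lookahead only anchor the match, so `-replace` deletes just the text between the prefix and the final number.

```powershell
# Sample lines from the question; the variable names are illustrative only.
$lines = @(
    '10.186.128.0/20 172.17.128.161 0 65000 8788'
    '10.186.128.0/20 172.17.128.161 0 65000 878'
    '10.186.128.0/20 172.17.128.161 0 65000 87'
)

# First pattern from the answer: match everything between the
# 10.x.x.x/nn prefix (lookbehind) and the trailing number (lookahead).
$pattern = '(?<=\b10\.\d{0,3}\.\d{0,3}\.\d{0,3}/\d{2}).*(?=[ \t]\d{1,4}\b)'

# Applied to an array, -replace processes each element and returns an array.
$trimmed = $lines -replace $pattern, ''

$trimmed
```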
  • This solution would effectively remove the part that I am looking to retrieve. The solution provided by Wiktor Stribiżew works perfectly fine so it should be marked as the right answer IMHO. – Jozef Trubac Nov 27 '19 at 09:46
  • @JozefTrubac - The resulting string is the IP address from the beginning of the line and also last 2 or 3 or 4 digits. –  Nov 27 '19 at 12:47
  • @JozefTrubac - I never read or look at other answers as I consider myself an expert. See Possible [duplicate](https://meta.stackoverflow.com/questions/390629/should-stack-overflow-remove-the-regex-tag). –  Nov 27 '19 at 12:50