2

Please advise how to extract only the valid IP addresses (up to 255.255.255.255) from file.txt and write only those valid addresses to the VALID_IP.txt file.

  • ( see VALID_IP.txt for example )

The solution should be implemented in my ksh script (so perl, sed, or awk is also fine).

more file.txt

     e32)5.500.5.5*kjcdr
     ##@$1.1.1.1+++jmjh
     1.1.1.1333
     33331.1.1.1
     @5.5.5.??????
     ~3de.ede5.5.5.5
     1.1.1.13444r54
     192.9.30.174
     &&^#%5.5.5.5
     :5.5.5.5@%%^^&*
     :5.5.5.5:
     **22.22.22.22
     172.78.0.1()*5.4.3.277

example of VALID_IP.txt file

     1.1.1.1
     192.9.30.174
     5.5.5.5
     5.5.5.5
     5.5.5.5
     22.22.22.22
     172.78.0.1
yael

3 Answers

4

The following is a suitable regex, split onto 4 different lines for the sake of my own sanity.

(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.
(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.
(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.
(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])

Output:

egrep -o "$(cat regex)" infile   # regex holds all four lines above joined into one line, with no spaces

1.1.1.1
1.1.1.133
31.1.1.1
5.5.5.5
1.1.1.134
192.9.30.174
5.5.5.5
5.5.5.5
5.5.5.5
22.22.22.22
172.78.0.1
5.4.3.27

Obviously this doesn't match your example. Why? Because the regex alone can't tell that a 3 doesn't belong with the 1 next to it. As you can see, garbage digits adjacent to an address can't be cleanly guessed at.
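
For use inside a ksh script, the same pattern can be assembled from a single octet sub-pattern held in a variable. This is a minimal sketch, assuming a grep that supports `-o` and `-E` (GNU or BSD grep); the names `octet`, `ipv4_re`, and `grep_ipv4` are illustrative and do not come from the answer.

```shell
#!/bin/sh
# Sketch only: build the four-octet pattern once instead of keeping it in a file.
# octet, ipv4_re and grep_ipv4 are hypothetical names chosen for this example.
octet='(25[0-5]|2[0-4][0-9]|1?[0-9]?[0-9])'
ipv4_re="$octet\\.$octet\\.$octet\\.$octet"

# Print every substring that matches the pattern, one per output line.
# Assumes grep supports -o (print only the match) and -E (extended regex).
grep_ipv4() {
    grep -oE "$ipv4_re" "$@"
}
```

Called as `grep_ipv4 file.txt > VALID_IP.txt`, this should behave like the egrep command above, including the partial matches such as 1.1.1.133.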

Jeff Ferland
  • 20,547
  • 2
  • 62
  • 85
  • but how to add this to my ksh script ? – yael Nov 15 '12 at 08:51
  • @yael use `egrep -o` – Jeff Ferland Nov 15 '12 at 09:03
  • If you use `grep -oP` you can use look-around constraints to limit the invalid first and last octets: `grep -oP '(?<!\d)(1?\d?\d|2[0-4]\d|25[0-5])(\.(1?\d?\d|2[0-4]\d|25[0-5])){3}(?!\d)'` – glenn jackman Nov 15 '12 at 12:28
  • @glennjackman Good point. It didn't click to me that none of the "extra digits" numbers were included in the example output, probably from a combination of it being very late at night my time and the syntax highlighting. Definitely should use negations in this circumstance, yet be aware that one extra digit of nonsense could still make a valid IP depending on the digit. – Jeff Ferland Nov 15 '12 at 22:56
  • @Jeff your solution is fine for Linux, but the egrep flags on Solaris are different. Can you please advise what the equivalent solution is for Solaris machines? – yael Nov 18 '12 at 06:48
  • @yael The syntax should be the same for Perl. Just surround it with a match grouping and reference the grouping ($1). – Jeff Ferland Nov 18 '12 at 09:21
3

It's slightly cleaner with Perl:

#!/usr/bin/perl
use Regexp::Common qw/net/;
while (<>) {
      print $1, "\n" if /($RE{net}{IPv4})/;
}

but it still gets false positives:

1.1.1.1
1.1.1.133
31.1.1.1
5.5.5.5
1.1.1.134
192.9.30.174
5.5.5.5
5.5.5.5
5.5.5.5
22.22.22.22
172.78.0.1

Perl one liner

perl -e 'use Regexp::Common qw/net/;while (<>) {print $1, "\n" if /($RE{net}{IPv4})/;}' infile
user9517
2

I recommend using range checking instead of hairy regexes. You can do this in ksh without using an external utility or another language. Although Iain's Perl solution is nice, Regexp::Common is not a core module.

Here's pure ksh. There's no need to make it a one-liner, just use a function. Code like this is easier to understand, easier to check for correctness and easier to maintain.

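#!/usr/bin/ksh
# Print the argument and return success only if it is a valid dotted-quad address.
validate_ip () {
    typeset ip="$*"
    typeset IFS=. valid=1
    typeset octets=($ip) octet value    # split the address on dots
    typeset digits='^[[:digit:]]+$'

    if (( ${#octets[@]} == 4 ))
    then
        for ((octet = 0; octet <= 3; octet++))
        do
            value=${octets[octet]}
            # 10# forces base 10 so that a leading zero is not read as octal
            if [[ ! "$value" =~ $digits ]] || ((10#$value > 255))
            then
                valid=0
            fi
        done
    else
        valid=0
    fi

    if ((valid))
    then
        printf '%s\n' "$ip"
    fi

    # Shell convention: 0 means success (valid), non-zero means failure
    return $((!valid))
}

# Each input line is tested as a whole, so it must contain an address and nothing else
while read -r line
do
    validate_ip "$line"
done #< file.txt > VALID_IP.txt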

This is written for ksh 93; I haven't tested it in ksh 88. It also runs unchanged in Bash 3.2 or higher.
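
Range checking can also be applied to lines that contain surrounding garbage by first splitting each line into runs of digits and dots. The sketch below is one possible way to do that in plain POSIX-style shell, not part of the answer's script; it assumes a POSIX `tr`, and the names `is_valid_ipv4` and `extract_valid_ips` are hypothetical.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Succeed if $1 is exactly four dot-separated decimal octets, each 0-255.
# Note: rest, count and octet are plain global variables here.
is_valid_ipv4() {
    case $1 in
        '' | *[!0-9.]* | .* | *. | *..*) return 1 ;;   # empty, bad chars, stray dots
    esac
    rest=$1
    count=0
    while :; do
        octet=${rest%%.*}                  # text before the first dot
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
        count=$((count + 1))
        case $rest in
            *.*) rest=${rest#*.} ;;        # drop the octet just checked
            *) break ;;
        esac
    done
    [ "$count" -eq 4 ]
}

# Read text on stdin, turn every character that is not a digit, dot or
# newline into a newline, then print the tokens that pass the range check.
extract_valid_ips() {
    tr -c '0-9.\n' '[\n*]' | while IFS= read -r token; do
        if is_valid_ipv4 "$token"; then
            printf '%s\n' "$token"
        fi
    done
}
```

A possible invocation is `extract_valid_ips < file.txt > VALID_IP.txt`. Because a whole run of digits and dots is checked at once, a token such as 1.1.1.1333 is rejected outright rather than trimmed to a shorter match.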

Dennis Williamson