I've written a Perl script to validate email addresses so that my marketing team can send campaigns.

The script is behaving erratically.

For example, I validated 135 email addresses on various dates and got these results:

******************************************
Date       |  Valid  | Invalid | Total
******************************************
23-Dec-13  |   45    |   90    | 135
******************************************
24-Dec-13  |   90    |   45    | 135
******************************************
25-Dec-13  |  133    |    2    | 135
******************************************

I'm unable to figure out where it goes wrong.

Code:

 #!/usr/bin/perl
 use Data::Dumper;

 %lookup_cache = ();

 sub valid_address {
   my($addr) = @_;
   my($domain, $valid);

   # Lower-case address
   $addr = lc($addr);

   # Validate format of address
   return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[a-z]{2,4}$/);

   # Grab domain
   $domain = (split(/@/, $addr))[1];

   # Lookup and return cached result if it exists
   $cached_result = $lookup_cache{$domain};
   if ($cached_result ne '')
   {
     #print "[cached_result] ";
     return $cached_result;
   }

   # Do domain lookup
   $valid = 0;
   if (open(DNS, "nslookup -q=any $domain |"))
   {
     while (<DNS>) {
       $valid = 1 if (/^$domain.*\s(mail exchanger|internet address)\s=/i);
     }
   }

   # Store cached result for later
   $lookup_cache{$domain} = $valid;

   return $valid;
 }

 while (<>) {
   $addy = $_;
   $addy =~ s/\s+$//;
   if ($addy)
   {
     print "$addy " . (valid_address($addy) ? 'valid' : 'invalid') . "\n";
   }
 }
dg99
bcrajkumar

2 Answers

Email address syntax can be quite complicated, so validation is tricky and easy to get wrong. I suggest exploring a proper library on CPAN.

Email::Valid seems to support domain checks and TLD checks too. Disclaimer: I have not used this module personally but it seems to be actively maintained.

The output of nslookup may have changed between runs of the script, which would explain the inconsistent results. I suggest adding more log statements so that you can pinpoint what is going on.
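
As a rough illustration of the logging idea, here is a minimal sketch. The helper name `check_lines_with_log` and the log format are assumptions made up for this example, not part of the original script; the match pattern is the one from the question, with the domain escaped. It records every line of lookup output together with whether it matched, so two runs can be compared afterwards.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan lines of lookup output for a record that starts
# with $domain, writing each line and its match result to the $log filehandle.
sub check_lines_with_log {
    my ($domain, $log, @lines) = @_;
    my $valid = 0;
    for my $line (@lines) {
        chomp(my $text = $line);
        # \Q...\E keeps the dots in the domain from acting as regex wildcards
        my $hit = $text =~ /^\Q$domain\E.*\s(mail exchanger|internet address)\s=/i ? 1 : 0;
        print {$log} "[$domain] match=$hit line=$text\n";
        $valid = 1 if $hit;
    }
    print {$log} "[$domain] result=" . ($valid ? 'valid' : 'invalid') . "\n";
    return $valid;
}
```

Inside the script's lookup step, the lines read from the nslookup pipe could be passed to this helper with `\*STDERR` (or a log file handle) as the second argument.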

Gowtham

I'd recommend using strict and warnings in all Perl scripts.

Ask for mail exchanger records explicitly with nslookup -q=MX to make the script stable. The output of nslookup -q=any may or may not include the MX record (I suppose it returns whatever record types it finds, not necessarily MX).
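
A minimal sketch of an MX-only check, under some assumptions: the sub names `mx_in_output` and `domain_has_mx` are hypothetical, and the "mail exchanger" pattern is the one already used in the question's script. The list form of `open` is used so the domain is not passed through a shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if any line of the lookup output is a
# "mail exchanger" line that starts with this domain.
sub mx_in_output {
    my ($domain, @lines) = @_;
    for my $line (@lines) {
        # \Q...\E stops the dots in the domain from matching any character
        return 1 if $line =~ /^\Q$domain\E.*\smail exchanger\s=/i;
    }
    return 0;
}

# Hypothetical wrapper: run nslookup for MX records only and scan its output.
sub domain_has_mx {
    my ($domain) = @_;
    open(my $dns, '-|', 'nslookup', '-q=MX', $domain) or return 0;
    my @lines = <$dns>;
    close($dns);
    return mx_in_output($domain, @lines);
}

# DNS is only queried when domains are given on the command line.
for my $domain (@ARGV) {
    print "$domain ", (domain_has_mx($domain) ? 'has MX' : 'no MX found'), "\n";
}
```

Keeping the parsing in its own sub means it can be checked against saved nslookup output without touching the network.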

Edit: This script works for me:

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my %lookup_cache = ();

sub valid_address {
  my($addr) = @_;
  my($domain, $valid);

  # Lower-case address
  $addr = lc($addr);

  # Validate format of address
  return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[a-z]{2,4}$/);

  # Grab domain
  $domain = (split(/@/, $addr))[1];

  # Lookup and return cached result if it exists
  my $cached_result = $lookup_cache{$domain};
  if (defined $cached_result)
  {
    return $cached_result;
  }

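  # Do domain lookup (ask only for MX records)
  $valid = 0;
  if (open(my $dns, "nslookup -q=MX $domain |"))
  {
    while (<$dns>) {
      # \Q...\E keeps the dots in the domain from acting as regex wildcards
      $valid = 1 if (/^\Q$domain\E.*\s(mail exchanger|internet address)\s=/i);
    }
    close($dns);
  }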

  # Store cached result for later
  $lookup_cache{$domain} = $valid;

  return $valid;
}

while (<>) {
  my $addy = $_;
  $addy =~ s/\s+$//;
  if ($addy)
  {
    print "$addy " . (valid_address($addy) ? 'valid' : 'invalid') . "\n";
  }
}
ales_t
  • @alest_t If a line did not match the regex, $valid will remain unchanged. However, breaking early makes sense because remaining lines need not be checked. – Gowtham Dec 26 '13 at 13:36
  • @Gowtham thanks, good point. I don't know what I was thinking :) – ales_t Dec 26 '13 at 13:39
  • I made the changes as per your suggestion, but the results are all showing invalid. Is there anything to add? `# Do domain lookup $valid = 0; if (open(DNS, "nslookup -q=Mx")) { while (<DNS>) { $valid = 1 if (/^$domain.*\s(mail exchanger|internet address)\s=/i); } }` – bcrajkumar Dec 26 '13 at 14:17
  • What happens when you run `nslookup -q=MX` with one of your email addresses on the command line? – ales_t Dec 26 '13 at 14:18
  • bcrajkumar@localhost:~$ nslookup -q=Mx arun@gmail.com Server: 192.168.1.1 Address: 192.168.1.1#53 ** server can't find arun@gmail.com: NXDOMAIN Actually I've configured 192.168.1.1 local router address as DNS/Gateway for my system – bcrajkumar Dec 26 '13 at 14:22
  • Sorry -- I meant domains. E.g. `nslookup -q=MX gmail.com` will surely include the string "mail exchanger". – ales_t Dec 26 '13 at 14:24