
Scenario: I have to transfer approximately 3000 files, 30 to 35 MB each, from one server to another (both servers are IBM AIX). The files are in .gz format and have to be unzipped at the destination with the gunzip command before they can be used.

The way I am doing it now: I have made .sh files containing ftp scripts, 500 files each. When run, these .sh files transfer the files to the destination. At the destination I keep checking how many files have arrived; as soon as 100 files are there, I run gunzip on those 100, then on the next 100, and so on. I gunzip in batches of 100 simply to save time.

What I have in mind: I am looking for a command, or any other way, that will ftp my files to the destination and start unzipping each batch of 100 files as soon as it has arrived, BUT without the unzipping pausing the transfer of the remaining files.

Script that I tried:

ftp -n 192.168.0.22 << EOF
quote user username
quote pass password
cd /gzip_files/files
lcd /unzip_files/files
prompt n
bin
mget file_00028910*gz
! gunzip file_00028910*gz
mget file_00028911*gz
! gunzip file_00028911*gz
mget file_00028912*gz
! gunzip file_00028912*gz
mget file_00028913*gz
! gunzip file_00028913*gz
mget file_00028914*gz
! gunzip file_00028914*gz
bye
EOF

The drawback in the above code is that while the line

! gunzip file_00028910*gz

is executing, the ftp for the next batch, i.e. for file_00028911*gz, is paused, wasting a lot of time and bandwidth. (The ! is used to run operating system commands from within the ftp prompt.)
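
To make the idea concrete, the kind of flow I am hoping for is roughly the following: keep the FTP transfer running on its own, and have a separate watcher loop on the destination pick up completed files in batches of 100 and gunzip them in the background. This is only a rough, untested sketch; the directory names, batch size and sleep interval are placeholders.

#!/bin/sh
# Untested sketch: run the existing FTP .sh transfer scripts as before, and
# run this watcher in parallel on the destination host. Directory names,
# the batch size and the sleep interval are placeholders.

ARRIVE=/unzip_files/files        # where ftp drops the .gz files
WORK=/unzip_files/files/work     # staging area so each file is handled once
mkdir -p "$WORK"

while :
do
    cd "$ARRIVE" || exit 1
    # pick up to 100 .gz files (in practice also check that a file's size has
    # stopped changing, so partially transferred files are not picked up)
    set -- `ls *.gz 2>/dev/null | head -100`
    if [ $# -gt 0 ]
    then
        mv "$@" "$WORK"                    # take them out of the arrival area
        ( cd "$WORK" && gunzip "$@" ) &    # unzip this batch in the background
    fi
    sleep 30                               # the transfer keeps running meanwhile
done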

I hope I have explained my scenario properly. I will update the post if I find a solution; if anyone already has one, please do reply.

Regards, Yash.

2 Answers


Since you seem to be doing this on a UNIX system, you probably have Perl installed. You might try the following Perl code:

use strict;
use warnings;
use Net::FTP;

my @files = @ARGV; # get files from command line

my $server = '192.168.0.22';
my $user   = 'username';
my $pass   = 'password';

my $gunzip_after = 100; # collect up to 100 files

my $ftp = Net::FTP->new($server) or die "failed to connect to server: $@";
$ftp->login($user,$pass) or die "login failed";

my $pid_gunzip;
while (1) {
    my @collect4gunzip;

    GET_FILES:
    while (my $file = shift @files) {
        my $local_file = $ftp->get($file);
        if ( ! $local_file ) {
            warn "failed to get $file: ".$ftp->message;
            next;
        }
        push @collect4gunzip,$local_file;
        last if @collect4gunzip == $gunzip_after;
    }

    @collect4gunzip or last; # no more files ?

    while ( $pid_gunzip && kill(0,$pid_gunzip)) {
        # gunzip is still running, wait because we don't want to run multiple
        # gunzip instances at the same time
        warn "wait for last gunzip to return...\n";
        wait();

        # instead of waiting for gunzip to return we could go back to retrieve
        # more files and add them to @collect4gunzip
        # goto GET_FILES;
    }

    # last gunzip is done, start to gunzip collected files
    defined( $pid_gunzip = fork()) or die "fork failed: $!";
    if ( ! $pid_gunzip ) {
        # child process should run gunzip
        # maybe one needs to split this into multiple gunzip calls to make
        # sure that the command line does not get too long!
        system('gunzip', @collect4gunzip);
        # child will exit once done
        exit(0);
    }

    # parent continues with getting more files
}

It's not tested, but at least it passes the syntax check.
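
Assuming the script is saved as, say, ftp_gunzip.pl (the name is only illustrative), the remote file names are passed on the command line. Since the files exist only on the remote server, a shell wildcard will not expand to them locally, so you would feed the script an explicit list, for example:

# file_list.txt holds one remote file name per line (illustrative name)
perl ftp_gunzip.pl `cat file_list.txt`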

Steffen Ullrich

One of two solutions: don't call gunzip directly. Instead call "blah", where "blah" is a script:

#!/bin/sh
gunzip "$@" &

so gunzip is put into the background, the script returns immediately, and you continue with the FTP. The other thought is to just add the & to the shell command itself -- I bet that would work just as well, i.e. within the ftp script, do:

! gunzip file_00028914*gz &

But... I believe you are somewhat leading yourself astray. rsync and other solutions are the way to go for many reasons.
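
For example, assuming rsync and ssh are available on both AIX hosts (they are not always installed by default) and that you pull from the source server, an illustrative invocation would be:

# paths and user name are placeholders; --partial keeps interrupted
# transfers resumable
rsync -av --partial -e ssh user@192.168.0.22:/gzip_files/files/ /unzip_files/files/

The unzipping can then run as a completely separate background job (for instance a loop that calls gunzip on whatever has arrived), so it never holds up the transfer.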

pedz