4

I'd like to transfer a directory with uncompressed files over using ssh, gzip'ing them individually along the way. Can anyone recommend a simple oneliner to achieve this?

eg.

fileA.ext -> ssh/pipe -> fileA.ext.gz

I've been piping tar over ssh with compression, but then the files are uncompressed at the end of the pipe. In this case, I'd like them to stay compressed.

Compressing beforehand would be possible, but would require space locally, or require a connection per file(?)

There are 6000+ files, and I'd prefer a solution where all the files can be transferred over a single connection (although I do use keys for authentication!)

grojo
  • Do you want the compression just for the transfer, or do you want them stored compressed on the remote server? – Phil Apr 07 '11 at 21:38
  • Do you actually want to end up with compressed files at the other end, or do you just want to reduce network overhead while transferring them? – Daniel Lawson Apr 07 '11 at 21:40
  • I want them compressed at the other end, see updates – grojo Apr 08 '11 at 06:22

4 Answers

6

I suppose you'd rather gzip them first, then send them across.

gzip -c dafile | ssh remote "cat > dafile.gz"

Repeat with relevant loop construct. E.g.

find . -type f | while read -r fname ; do gzip -c "$fname" | ssh remote "cat > '$fname.gz'" ; done

... or something to that effect. You'll need to set up public-key access, or this will be one massive password-speedtyping exercise.
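Minus the ssh hop, the loop can be sanity-checked locally; here a second directory stands in for the remote host (all paths below are throwaway examples, not real data):

```shell
# Throwaway source and destination trees.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dst"
printf 'payload' > "$tmp/src/fileA.ext"

# Same shape as the loop above, with a plain redirect standing in for
# `ssh remote "cat > ..."`. Note `read -r` and the quoted variables,
# which keep odd filenames from breaking the loop.
(cd "$tmp/src" && find . -type f) | while read -r fname; do
    gzip -c "$tmp/src/$fname" > "$tmp/dst/$fname.gz"
done

# Each file arrives individually gzipped, as the question asks.
gunzip -c "$tmp/dst/fileA.ext.gz"
```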

EDIT:

On the subject of a single connection, see the ControlMaster option in man ssh_config. Multiplexing means you save the overhead of negotiating 5999 of those 6000 SSH sessions.
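A minimal multiplexing setup in ~/.ssh/config might look like this (the Host alias and socket path are placeholders; ControlPersist needs a reasonably recent OpenSSH):

```
Host remote
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 10m
```

With this in place, the first ssh to the host opens the master connection and every later one rides along over the existing socket instead of renegotiating.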

EDIT2:

Haha! I win!

tar zcf - /dir/of/files | ssh remote "tar zxf - --to-command=mygzip.sh"

mygzip.sh, present on the remote machine, looks like this:

#!/bin/sh
mkdir -p "$(dirname "$TAR_FILENAME")"
gzip -c > "${TAR_FILENAME}.gz"
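The mechanism can be exercised entirely locally by piping one tar into another, with no ssh in the middle (requires GNU tar for --to-command; the directory and file names below are made up):

```shell
# Throwaway tree plus a local copy of the mygzip.sh helper.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub" "$tmp/dst"
printf 'hello' > "$tmp/src/fileA.ext"
printf 'world' > "$tmp/src/sub/fileB.ext"

cat > "$tmp/gz.sh" <<'EOF'
#!/bin/sh
mkdir -p "$(dirname "$TAR_FILENAME")"
gzip -c > "${TAR_FILENAME}.gz"
EOF
chmod +x "$tmp/gz.sh"

# One tar packs, the other unpacks through the helper: tar sets
# TAR_FILENAME and pipes each member's contents to the script, so every
# file lands under dst as an individual .gz.
(cd "$tmp/src" && tar cf - fileA.ext sub/fileB.ext) \
    | (cd "$tmp/dst" && tar xf - --to-command="$tmp/gz.sh")

gunzip -c "$tmp/dst/sub/fileB.ext.gz"
```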
Bittrance
  • It feels like one should be able to create a much more beautiful solution built on: "tar zcf - . | ssh whisper 'cd tmp; tar zxf -'" – Bittrance Apr 07 '11 at 21:26
  • As an alternative to file redirection, there's always dd. It tends to require less ugly quoting. `cat dafile | gzip -c9 | ssh remote dd of=dafile.gz` Also, for a slightly safer `read`, add `-r`, it stops read from interpreting any backslashes in the filename should you decide to use them. – penguin359 Apr 07 '11 at 22:03
  • And quote any variables, of course: `find -type f | while read -r fname ; do cat "$fname" | gzip -c | ssh remote "cat > '$fname.gz'" ; done` or using dd: `find -type f | while read -r fname ; do cat "$fname" | gzip -c | ssh remote dd of="'$fname.gz'"; done` When using ssh, it requires two levels of quotes since both the local shell and remote shell have a chance to expand variables. **Note**: this will break on filenames with single quotes. If you replace the single quotes in the ssh command with \" then it will instead break on filenames with $, ", or \` (back-tick) – penguin359 Apr 07 '11 at 22:17
4

GNU Parallel can do this easily:

find . -type f | parallel "gzip -c {} | ssh server \"cat>{}.gz\""

Note, however, that this will also find and send files in directories below the starting directory.

Alternatively, if you want to send only the regular files in a single directory:

parallel "gzip -c {} | ssh server \"cat>{}.gz\"" ::: *
Phil Hollenback
  • Ok, interesting! how does it compare to xargs? Any value in running N number of jobs in parallel as opposed to sequentially? – grojo Apr 08 '11 at 18:18
  • yes, one file per job by default, and numcpu jobs allowed to run at once by default. parallel is a lot of fun. – Phil Hollenback Apr 08 '11 at 22:04
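The xargs comparison raised in the comments can be sketched with `xargs -P`, which caps how many pipelines run at once (the rough analogue of parallel's per-CPU default); the ssh hop is dropped here so the example is self-contained, and the paths are throwaway:

```shell
# Throwaway files to compress.
tmp=$(mktemp -d)
printf 'aaa' > "$tmp/one.ext"
printf 'bbb' > "$tmp/two.ext"

# Up to four gzip pipelines at a time; the -name filter keeps freshly
# written .gz files from being picked up mid-run.
find "$tmp" -type f -name '*.ext' -print0 \
    | xargs -0 -P4 -I{} sh -c 'gzip -c "{}" > "{}.gz"'

gunzip -c "$tmp/one.ext.gz"
```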
0

Everyone's got their own tricks for copying files without SCP. Here's mine (it does not use SSH): http://linuxtipsandtricks.com/file-manipulation/transferring-files-with-netcat-nc/

Secondly, if you insist on using SSH, you may want to add '-c arcfour' to your SSH command line.
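If the transfer is CPU-bound, the cipher can also be pinned per host in ~/.ssh/config rather than on the command line (the Host alias is a placeholder; note that arcfour is cryptographically weak and has been dropped from modern OpenSSH releases):

```
Host remote
    Ciphers arcfour
```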

HTH

Brent
0

You should check out this page.

It compares the execution time for this type of file transfer using scp, tar+ssh, and tar+nc, with and without SSH or gzip compression.

Besides providing several options for completing your task, the results are interesting.

mikewaters
  • The data was random, so the files will not be compressible (random data is high-entropy to begin with), and therefore transferring uncompressed will obviously be faster. – James Wakefield Oct 17 '19 at 03:50