1

I am trying to create a backup of a remote server. This is my configuration:

Server1  (webserver)
Server2  (backupserver)

This is my little script. It runs on server2:

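#!/bin/bash
# Runs on server2 (backup server): pulls one tarball per site from server1.

date=$(date +%F)
basepath=/var/backup
webfolder=$basepath/$date/websites

# The target directory must exist before redirecting output into it.
mkdir -p "$webfolder"

# List only the symlinks in /var/www, one name per line (GNU find).
for f in $(ssh root@server1 "find /var/www/ -maxdepth 1 -type l -printf '%f\n'")
do
    if [[ $f = *.* ]]
    then
        echo "processing $f"
        ssh root@server1 "tar zcf - /var/www/$f/web/" > "$webfolder/$f.tar.gz"
    fi
done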

The problem is that it is too slow! How can I speed up this script?

Updates:

I have already tried rsync without success. This is the command that I use:

/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --rsh="/usr/bin/ssh -p 22" root@123.123.123.123:/var/www \
    /home/backups/daily.0/webserver/ 

The servers are connected through a Dell Gigabit switch. Both servers have Gigabit network cards, and they are in the same subnet.

rsync solution:

In the end, thanks to the suggestions, I followed this path:

  1. Install rsync on all the Debian boxes
  2. Install rsnapshot on the backup server
  3. Configure the rsync daemon on each Debian box (excluding the backup server)
  4. Set up the rsnapshot cron configuration file

The first backup takes a long time to complete.
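
The four steps above could look roughly like the sketch below. It is not the exact configuration used here: the host name server1, the module name www, the allowed address 192.0.2.10, the retention of 7 dailies and the 03:30 schedule are all assumptions for illustration. The script only writes example fragments into a scratch directory so they can be reviewed before being merged into the real files.

```shell
# Sketch only: write example config fragments into a scratch directory.
# Host name "server1", module name "www" and every path/address below
# are assumptions, not values taken from the real servers.
outdir=${OUTDIR:-$(mktemp -d)}

# Step 3: rsync daemon module for each web server (belongs in /etc/rsyncd.conf).
cat > "$outdir/rsyncd.conf" <<'EOF'
[www]
path = /var/www/
read only = yes
uid = root
gid = root
hosts allow = 192.0.2.10
EOF

# Step 2: rsnapshot.conf lines for the backup server. rsnapshot requires
# TAB-separated fields, so printf is used to make the tabs explicit.
{
    printf 'snapshot_root\t/home/backups/\n'
    printf 'retain\tdaily\t7\n'
    printf 'backup\trsync://server1/www/\twebserver/\n'
} > "$outdir/rsnapshot.conf"

# Step 4: cron entry (for example in /etc/cron.d/rsnapshot) for a nightly run.
echo '30 3 * * * root /usr/bin/rsnapshot daily' > "$outdir/rsnapshot.cron"

echo "fragments written to $outdir"
```

With a layout like this, rsnapshot would keep its snapshots under /home/backups/daily.0/webserver/, /home/backups/daily.1/webserver/ and so on. Running rsnapshot configtest after editing the real rsnapshot.conf checks the syntax, including the tabs.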

Distro: Debian Servers

Michelangelo

4 Answers

4

You are reinventing the wheel. You should try using rsync. rsync builds the file list for you and uses a delta-transfer algorithm that sends only the differences between source and destination, so it stays fast even over slow links or over encrypted connections that add overhead.

It is very easy to run as well. From server2 (rsync cannot copy between two remote hosts, so the destination is a local path): rsync -vvarP root@server1:/var/www/ /var/backup/

Tim
  • Hi Tim, I have already tested the rsync option in this way: `/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \ --rsh="/usr/bin/ssh -p 22" root@123.128.251.100:/var/www \ /home/backups/daily.0/webserver/` It is so slow! Where am I going wrong? – Michelangelo Dec 08 '11 at 16:07
  • Since they are on the same network, the SSH/RSH overhead (from the encryption) must be what is slowing you down; on a 100MB network, SSH will only pull and push around 2 MB/s. Try using the rsync daemon and client. – Tim Dec 08 '11 at 16:17
  • You might want to mention how to speed up building the incremental file list with the --delete-before or --delete-after options. – Tim Brigham Dec 08 '11 at 16:18
  • @timbrigham I doubt at this point the file list is his bottleneck. And I don't use that option regularly, maybe you would be better suited to instruct him ;) – Tim Dec 08 '11 at 16:20
  • Thanks Tim and timbrigham. I am a little bit confused because htop shows the process at 0% (zero) CPU, and it seems that the process is paused. – Michelangelo Dec 08 '11 at 16:22
  • @tim how do I tell rsync that I don't want to use SSH? – Michelangelo Dec 08 '11 at 16:23
3

I don't think this is the most likely explanation, but having read the trouble you're having with rsync, it's just possible that you're suffering from a duplex mismatch on one or both of the NIC-switch connections.

Try doing a netstat -in on both servers, and check the error counts on transmission. Non-zero TX-errors often signal a duplex mismatch, and one effect of those is to permit slow, small-packet (interactive) connections unimpeded, but brutally restrict full-speed bulk-data connections.
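
As a rough sketch of that check, the helper below filters netstat -in output down to the interfaces whose TX-ERR counter is non-zero. The function name tx_errors is made up for this example, and it assumes the Linux net-tools output format with a header row starting with Iface.

```shell
# Print every interface whose TX-ERR counter is non-zero.
# Reads `netstat -in`-style output on stdin. The TX-ERR column is located
# by its header name because its position differs between net-tools versions.
# (tx_errors is a hypothetical helper name, not part of net-tools.)
tx_errors() {
    awk '
        # Header row: remember which column holds TX-ERR.
        $1 == "Iface" {
            for (i = 1; i <= NF; i++) if ($i == "TX-ERR") col = i
            next
        }
        # Data rows: report interface name and error count when non-zero.
        col && ($col + 0) > 0 { print $1, $col }
    '
}

# Usage, on each server:
#   netstat -in | tx_errors
```

No output means no transmit errors were counted. Any interface that is printed is worth a closer look, for instance by comparing the speed and duplex settings on the NIC and on the switch port.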

Edit (following your comment below): OK, that's not symptomatic of duplex mismatch, so ignore my suggestion. It would still be useful to find out what the bottleneck is when you try an rsync-over-ssh right now, since it's not CPU.

MadHatter
  • Good input sir. – Tim Dec 08 '11 at 16:42
  • Why, thank you, sir. I already +1'd you because I agree that once he's got his network sorted out, rsync is definitely the way forward for him – MadHatter Dec 08 '11 at 16:51
  • @MadHatter thanks for the suggestion, this is the result: http://pastie.org/2986834 – Michelangelo Dec 08 '11 at 17:04
  • @MadHatter This is the first run of rsync, and the verbose output suggested by Tim shows a file transfer speed ranging from about 613.89kB/s to 46.88MB/s. At the moment I have started rsync with the **-e none** parameter. – Michelangelo Dec 08 '11 at 17:15
1

Since your two servers are on the same switch and the same network segment, my suggestion would be to set up an rsync daemon on your backup box and avoid the use of SSH altogether.

My suggested settings for your rsync daemon would be as follows. I'd give a little more specific instruction but you did not mention your distribution.

[yourshare]
path = /yourpath/
read only = no
list = yes
uid = youruser
gid = youruser
hosts allow = you.rip.add.res

This can be restricted down to only being accessible from the servers you want to back up from. From there you should be able to schedule a rsync job directly to your destination without the use of SSH, eliminating that issue.
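
A job like that might be sketched as follows, run on the web server. The function name push_to_daemon, the host name server2 and the choice of flags are assumptions for illustration; yourshare is the module name from the settings above.

```shell
# Push a local directory to an rsync daemon module, without SSH.
# push_to_daemon is a hypothetical helper; "server2" and "yourshare"
# in the usage line are placeholders.
push_to_daemon() {
    src=$1      # local directory; a trailing slash copies its contents
    host=$2     # host running the rsync daemon
    module=$3   # module name from rsyncd.conf, e.g. yourshare
    # An rsync:// URL (or host::module) talks to the daemon directly
    # on TCP port 873, so no SSH session is involved.
    # RSYNC can be overridden (e.g. RSYNC=echo) to preview the command.
    ${RSYNC:-rsync} -a --delete --numeric-ids "$src" "rsync://$host/$module/"
}

# Usage, on the web server:
#   push_to_daemon /var/www/ server2 yourshare
```

Because the daemon protocol is unencrypted, this only makes sense on a trusted network, which is why restricting hosts allow matters.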

If your site consists of a great many files, the rsync process may appear to hang at "sending incremental file list". If so, the --delete-before or --delete-after options may prove beneficial.

There are also some configurations where the files are first copied and then analyzed locally. I haven't used rsync over SSH in a while but it is possible that the settings you are trying are having this effect.

Tim Brigham
1

I would suggest using rsnapshot. It is based on rsync as well. I use it to back up many remote servers. The first run takes some time, but after that it is very fast if your data doesn't change a lot. It is fully customisable and quite fast (in my case the network is the bottleneck).

Yann Sagon
  • Hi @Yann Sagon, thanks for the suggestion. I am using rsnapshot on my servers, and now, thanks to all the contributors here, it seems to work like a charm! I have configured the rsync server side on each box and then set up rsnapshot on my backup server. Last night it created daily.0 and daily.1. Thanks again! – Michelangelo Dec 10 '11 at 07:53