I am having a difficult time using rsync to copy a folder from my computer to a remote server. The folder has a few subdirectories and a lot of small files. On my computer the total size of the directory (using `du -h`) is 6.4GB. When I rsync it to the remote server the transfer seems to work fine, but when I check the size there it is only 1.3GB. All the subdirectories and files appear to be there, but there are too many files for me to check manually whether anything is missing.

I have also tried using `ls -al` and `ls -l` to see the directory sizes. On my computer I get this:

(base) MacBook-Pro:LaCie $ ls -al 2year_coarsened_cropped/
total 1024
drwxrwxrwx  1   staff  131072 19 Jun 14:18 .
drwxrwxrwx@ 1   staff  131072 31 Dec  1979 ..
drwxrwxrwx  1  staff  131072 19 Jun 14:32 test
drwxrwxrwx  1  staff  131072 19 Jun 14:23 train
(base) MacBook-Pro:LaCie $ ls -l 2year_coarsened_cropped/
total 512
drwxrwxrwx  1  staff  131072 19 Jun 14:32 test
drwxrwxrwx  1  staff  131072 19 Jun 14:23 train 

On the remote server this is what it says:

remote:~$ ls -al 2year_coarsened_cropped/
total 16
drwxrwxr-x  4 user 4096 Jul 17 13:51 .
drwxr-x--- 14 user 4096 Jul 17 13:51 ..
drwxrwxr-x  5 user 4096 Jul 17 13:51 test
drwxrwxr-x  5 user 4096 Jul 17 13:51 train
remote:~$ ls -l 2year_coarsened_cropped/
total 8
drwxrwxr-x 5 user 4096 Jul 17 13:51 test
drwxrwxr-x 5 user 4096 Jul 17 13:51 train

I haven't used rsync before; I usually use scp, but I got the exact same results with it. I am not sure how or why the size has been reduced this much, so any advice would be greatly appreciated. My rsync command was just `rsync -r -v 2year_coarsened_cropped user@remote:/home/`

S.Beale
  • Do you have any information about the file systems and block sizes on the local and remote system? It looks like the local machine is set to use a very large (128KB) block size. – tsc_chazz Jul 17 '23 at 22:21
  • Yes, I think the block size is 131072 bytes. It is an external hard drive connected to my laptop. Is that what causes the difference? The block size on the remote system seems to be 4096 when I use `stat -f %k .` – S.Beale Jul 17 '23 at 22:31

1 Answer

The local size is larger because the local drive has a block size of 128KB, whereas the remote machine's block size is 4KB. Although file sizes are shown in bytes, a file must occupy an integral number of blocks; there is no block sharing between files in common file systems. So a 2KB file occupies one block on either file system, but locally that leaves 126KB of the block unused, whereas on the remote it leaves only 2KB unused. Because `du` is concerned with the space actually occupied on the disk, it shows the total space taken by all file blocks, not the sum of the byte counts. The difference between 6.4GB locally and 1.3GB on the remote is the roughly 5.1GB of additional unused space in the blocks assigned to the files on the local drive, compared with the blocks assigned to the same files on the remote drive.
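The rounding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `allocated_bytes` is made up for the example, and it assumes the simple model from this answer (whole blocks per file, no block sharing, no file system metadata).

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Round file_size up to a whole number of blocks (simplified model)."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


KIB = 1024
file_size = 2 * KIB  # the 2KB file used as the example in the text

for label, block_size in (
    ("local, 128 KiB blocks", 128 * KIB),
    ("remote, 4 KiB blocks", 4 * KIB),
):
    used = allocated_bytes(file_size, block_size)
    unused = used - file_size
    print(f"{label}: occupies {used // KIB} KiB, {unused // KIB} KiB unused")

# local, 128 KiB blocks: occupies 128 KiB, 126 KiB unused
# remote, 4 KiB blocks: occupies 4 KiB, 2 KiB unused
```

With many small files the per-file overhead adds up, which is consistent with the gap between the two `du` totals.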

I would not normally suggest a block size as large as 128KB for any file system, but when you're dealing with very large files, such as recorded video, it can be more efficient.
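If you want some reassurance that nothing went missing, one option is to compare the sum of the byte counts, which should match on both machines, rather than the occupied space. Below is a rough sketch, not a definitive tool: the function name `tree_sizes` is hypothetical, it counts only the entries `os.walk` reports as files (so its totals will not match `du` exactly, since `du` also counts directories), and it assumes a Unix-like system where `st_blocks` is available and expressed in 512-byte units.

```python
import os


def tree_sizes(root):
    """Return (file_count, apparent_bytes, allocated_bytes) for files under root."""
    file_count = 0
    apparent = 0   # sum of byte counts, independent of block size
    allocated = 0  # space occupied on disk, depends on block size
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            file_count += 1
            apparent += st.st_size
            # Assumption: st_blocks counts 512-byte units (Unix-like systems).
            allocated += st.st_blocks * 512
    return file_count, apparent, allocated


# Example call, run once on each machine against its copy of the directory:
# print(tree_sizes("2year_coarsened_cropped"))
```

If the file count and the apparent byte total agree on both sides while only the allocated total differs, the difference is down to block size rather than missing data.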

tsc_chazz