
I just finished migrating a mail server from CentOS 7 over to CentOS 8. The mail store was in /var/vmail.

The data was migrated with rsync: `rsync -rltDPH /var/vmail/* root@new-hostname:/var/vmail/`

I ran du -skh inside /var/vmail on the old (CentOS 7) server, then ran the same command on the new CentOS 8 server. A number of directories had different sizes, but one in particular stood out to me.

On the old server, it shows that it is using 26G.
On the new server, it shows that it is using 33G.

The CentOS 8 server's filesystem is ext4. If I recall correctly, I was also using ext4 on the CentOS 7 server -- but I have already destroyed that server, so I can't confirm.

I have an off-site CentOS 7 server used for backup purposes. That backup server's filesystem is XFS. It shows the backup of the directory in question, taken from the new server, as also using 26G.

The backup server uses the same rsync command (rsync -rltDPH) to copy things down.

I became more curious, so I copied this directory to an ext4 partition on my Ubuntu laptop (Ubuntu 18.04). Ubuntu is also reporting a usage of 26G.

I am already familiar with the usual explanation of why df and du report different values.

Why does du on the CentOS 8 ext4 filesystem report this directory as about 7G larger than it does on an ext4 filesystem on Ubuntu 18.04, on an XFS filesystem on CentOS 7, and on another ext4 filesystem on CentOS 7?

David W
  • How did you copy the files? Please post the exact command. – shodanshok Feb 03 '20 at 06:25
  • Files were (and still are) copied with `rsync -rltDPH`, and the question has been updated. The usage I see on the backup server is based on the new server AND the old server (i.e. the directory rsynced from the new server is the same size as it was when rsynced from the old server, according to `du` on the backup server). – David W Feb 03 '20 at 10:50
  • It is possible that the original files were sparse. Can you identify two files with different sizes and report the output of `stat ` for both? – shodanshok Feb 03 '20 at 22:22
  • I compared the `du` for /var/vmail/example.com/user@example.com/cur. On the backup server, it is using about 9.4G, and on the new CentOS 8 server, it is using 12G. In both cases, they have about the same number of files (almost 39,000 email files). I ran a `ls -lhrS` on both, and can confirm that the largest files are the same size from the same date. – David W Feb 04 '20 at 00:50
  • Can you share the `stat` output of two files (not dirs) with different real sizes? Please be aware that `ls` does not take sparse files into account; you need to use `stat` or `du`. To identify candidate files, you can use something like `find // -printf "%s %k %S %h/%f\n" | sort -nr` and compare the output between the two systems. – shodanshok Feb 04 '20 at 08:24
  • What is the filesystem block size? Larger blocks could easily explain this. To check, run `stat -fc %s /var/vmail`. (If I recall correctly, standard 4k-sized blocks will only allow ext4 to address 16TB of storage, so if you have more than that it might be using larger blocks.) – Moshe Katz Feb 06 '20 at 18:25
  • Both systems' block size is the same (4096). I did try running the `find` command suggested earlier to identify potential sparse files. I didn't have a whole lot of time to investigate that as of yet, but some initial checks seemed like both systems were identical. I'll stick the results of that find command into a text document and run a diff for comparison. – David W Feb 06 '20 at 21:03
  • I can't really explain this, but as of today, `du` on both the new server and the backup server shows the same value (33GB). The backup server uses `--link-dest` against the previous day's backup in order to save space, so that may have something to do with why I saw a discrepancy for a while. It still doesn't explain why the `du` figure jumped overnight after simply rsyncing files from the old server to the new server. – David W Feb 29 '20 at 11:39
  • Using the find command posted earlier in the comments section, I take it that to identify sparse files, I need to pay attention to the 3rd column (%S) and look for any numbers that are below 1 -- correct? – David W Feb 29 '20 at 12:00
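
For anyone wanting to run the per-file comparison discussed in the comments, here is a minimal sketch. It assumes GNU find (for `-printf`), and `size_report` is a hypothetical helper name, not a standard tool. For each file it prints the apparent size in bytes, the allocated 512-byte blocks, the hard-link count, and a flag. The "sparse" flag is only a heuristic: it fires when fewer bytes are allocated than the apparent size, which is the same condition as GNU find's `%S` sparseness value being below 1, but filesystem compression or inline data could also trigger it.

```shell
#!/bin/sh
# Sketch only: size_report is a hypothetical helper, not part of any package.
# Requires GNU find for -printf.
size_report() {
    # %s = apparent size in bytes
    # %b = allocated space in 512-byte blocks
    # %n = hard-link count
    # %P = path relative to the starting directory, so that reports
    #      from two hosts use comparable names
    find "$1" -type f -printf '%s %b %n %P\n' |
    awk '{
        # Fewer allocated bytes than the apparent size suggests holes
        flag = ($2 * 512 < $1) ? "sparse" : "-"
        # Strip the three numeric fields to recover the path,
        # which may itself contain spaces
        path = $0
        sub(/^[0-9]+ [0-9]+ [0-9]+ /, "", path)
        print $1, $2, $3, flag, path
    }' | LC_ALL=C sort -k5
}

# Example (run on each host, then diff the two reports):
#   size_report /var/vmail/example.com > report-$(hostname).txt
```

Lines that differ only in the block column point at allocation differences (sparseness, block size), while a hard-link count above 1 marks files that `du` counts only once within a single run, which may matter when comparing `--link-dest` snapshots.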
