
I have a filesystem with 40 million files in a 10-level tree structure (around 500 GB in total). The problem I have is the backup: an incremental backup (Bacula) takes 9 hours (around 10 GB) with very low performance. Some directories have 50k files, others 10k. The HDs are in a hardware RAID, and I have the default Ubuntu LV on top. I think the bottleneck here is the number of files (the huge number of inodes). I'm trying to improve the performance (a full backup of the same FS takes 4+ days, at 200k/s read speed).

- Do you think that partitioning the FS into several smaller filesystems would help? I could have 1000 smaller filesystems...
- Do you think that moving from HD to SSD would help?
- Any advice?

Thanks!

Sergio

1 Answer


Moving to SSD will improve the speed of the backup, but the SSD will wear out very quickly, and then you will really need that backup...
Can't you organise things so that you know where to look for changed/new files? That way you only need to run the incremental backup on those folders.
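For example, a minimal sketch of that idea (the directory names, the stamp file and the /backup location are made up here, not part of your setup):

```bash
#!/bin/sh
# Sketch only: archive just the folders known to receive new/changed files,
# so the incremental backup never walks the other ~40M inodes.
# ACTIVE_DIRS, /backup and the stamp file are hypothetical names.
ACTIVE_DIRS="/data/incoming /data/current"   # the folders you know are changing
STAMP=/backup/last-incr.stamp                # marks the previous incremental run
                                             # (touch it once before the first run)

# Pick up only files newer than the previous run and tar them.
find $ACTIVE_DIRS -type f -newer "$STAMP" -print0 |
    tar --null -T - -czf "/backup/incr-$(date +%F).tar.gz"

touch "$STAMP"   # remember when this run finished
```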

Is it necessary that all your files are online? Can you keep old trees, 3 levels deep, as tar files?
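A sketch of what that could look like, assuming a hypothetical /data mount point; the depth and your own test for "old enough to archive" would need tuning:

```bash
#!/bin/sh
# Sketch only: replace each subtree 3 levels deep with a single tar file,
# so backups and other scans meet one inode instead of thousands.
# /data is a hypothetical path; add your own check for which trees are old.
find /data -mindepth 3 -maxdepth 3 -type d | while read -r dir; do
    tar -czf "$dir.tar.gz" -C "$(dirname "$dir")" "$(basename "$dir")" &&
        rm -rf "$dir"      # remove the tree only after the tar succeeded
done
```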

I guess a `find -mtime -1` will take hours as well.
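For reference, this is the kind of scan meant here (the /data path is just an example); it still has to stat every inode in the tree, hence the hours:

```bash
# List files modified in the last 24 hours; /data is a hypothetical path.
find /data -type f -mtime -1
```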

I hope that the backup is not using the same partition as the tree structure (putting everything under /tmp is a very bad plan); any temporary files the backup makes should be on a different partition.

Where are the new files coming from? When all files are changed by a process you control, that process can write a logfile with a list of the changed files.
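A minimal sketch of that approach, assuming the writing process appends every new file name to a hypothetical /var/log/new-files.log (one name per line) and /backup is a separate partition:

```bash
#!/bin/sh
# Sketch only: back up exactly the files named in the log, no tree scan at all.
# /var/log/new-files.log and /backup are hypothetical names.
LIST=/var/log/new-files.log

tar -czf "/backup/incr-$(date +%F).tar.gz" -T "$LIST" &&
    : > "$LIST"            # start a fresh log only if the backup succeeded
```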

Walter A
  • Hi Walter... every filesystem operation that traverses the FS crawls... I was talking about reorganizing the structure and archiving the unneeded folders... Today I've counted at least 4 million files that the users don't need to have online. Sadly, yes, the backup stores its volumes under /backup and the MySQL DB is also on the same gigantic FS; moving those two things could improve the situation. I don't have any direct control over the generated files, but it is normal to have between 10k and 50k files per folder... I also think any SSD would be worn out very fast... – Sergio Feb 13 '15 at 02:32
  • Removing those 4M files is a start. I do not know the requirements for keeping access to them; can you use `find /yourpath -atime +30 -type f -exec rm {} \; ` or automatically move them into a tar and extract the requested files later (a sketch of this follows after these comments)? – Walter A Feb 13 '15 at 08:19
  • The server where this monstrosity is held has only 16 GB of RAM... gathering slab data, I've found that the ext4 inode cache is full when it reaches 5 GB. I was running some tests on a new server I've received (96 GB of RAM) and found I needed 45 GB just to cache the FS metadata... so I will upgrade the RAM to 64 GB or 96 GB to check if there is a significant change in performance. – Sergio Feb 14 '15 at 18:58
  • I hope the additional memory helped. Removing obsolete data will help in all cases; try to figure out how you can reduce the number of files without loss of service. – Walter A Feb 19 '15 at 21:28
  • Still fighting with the beast... after upgrading the RAID firmware and adding 96 GB of RAM, read performance didn't change... it still crawls... Our Bacula log: ==== FD Files Written: 7,034 SD Files Written: 7,034 FD Bytes Written: 537,896,121 (537.8 MB) SD Bytes Written: 539,479,171 (539.4 MB) Rate: 20.7 KB/s ==== I'm starting to think ext4 is not the best FS for this scenario... I will test XFS or ReiserFS and compare... – Sergio Mar 16 '15 at 15:47
  • An important performance trick is to skip what you don't need or have already done before. Do you need all files online? Can you skip some dirs? Can you collect files in tar files so you do not need to read so many files? Can you combine several files into one file (or a database)? – Walter A Mar 16 '15 at 19:24
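A minimal sketch of the "move them into a tar" suggestion from the comments above, assuming hypothetical /data and /archive paths and GNU tar:

```bash
#!/bin/sh
# Sketch only: instead of deleting files not read for 30 days, pack them into
# one archive per run and let tar remove the originals, so they can still be
# extracted on request later. /data and /archive are hypothetical paths.
ARCHIVE="/archive/cold-$(date +%F).tar"

find /data -type f -atime +30 -print0 |
    tar --null -T - -cf "$ARCHIVE" --remove-files
```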