Have 2 folders on different disks behave as if they were 1 folder

Question

I have a server with two 1TB hard drives. I have an uploads folder at /disk2/uploads, and that folder has filled up the entire hard drive. The other disk is basically empty. I want a spill-over folder on the empty drive that hosts any additional uploads that don't fit on the first disk.

The problem is that my web application needs to know where the files are. One solution would be to have the application upload all new files to the empty disk and, when it needs to fetch a file, check the new location first and fall back to the old location if the file isn't there. I would like to solve this at the Linux level, though.
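For reference, the application-level fallback described above amounts to a two-location lookup. A minimal sketch in shell, where the function name find_upload and the NEW_DIR/OLD_DIR variables are illustrative assumptions rather than part of any existing application:

```shell
#!/bin/sh
# Hypothetical locations; override them through the environment.
NEW_DIR=${NEW_DIR:-/var/www/uploads}   # spill-over folder on the empty disk
OLD_DIR=${OLD_DIR:-/disk2/uploads}     # original folder on the full disk

# Print the full path of an upload given its path relative to the
# uploads root, preferring the new location. Returns 1 if the file
# exists in neither location.
find_upload() {
    rel=$1
    for base in "$NEW_DIR" "$OLD_DIR"; do
        if [ -e "$base/$rel" ]; then
            printf '%s\n' "$base/$rel"
            return 0
        fi
    done
    return 1
}
```

Calling find_upload folder1/foo.log prints whichever copy is found first, so a file present in both places resolves to the one on the new disk.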

Is it possible to create a symlink (or something), that would allow me to upload new files to /var/www/uploads and allow me to find the files in /disk2/uploads from the /var/www/uploads folder?

For example:

If /disk2/uploads has folder1 and folder2, and /var/www/uploads has folder3, then I would want this:

> ls /disk2/uploads
folder1 folder2
> ls /var/www/uploads
folder1 folder2 folder3

If such a solution exists, what would happen if /disk2/uploads/folder1/foo.log exists, and I try to upload foo.log to /var/www/uploads/folder1/foo.log? That probably isn't an issue for me because our files are all timestamped in their name, but I am curious.

Spooler · Answer 1 · 2017-10-22T18:11:06.620

If all you want to do is linearly allocate data across two block devices (disks, in this case) while presenting a single filesystem namespace, you could use LVM. A volume group with two physical volumes and a single logical volume allocated across all the space you need is very simple to set up, and it allows for more elastic volume resizing later. This takes care of the problem at the block level, so the filesystem you put on top is largely irrelevant. Here's a link to the Red Hat guide on how to do this in a generic sense.
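As a rough sketch of that LVM layout: the device names, the volume group and logical volume names, the choice of ext4, and the mount point below are all placeholders I've assumed for illustration, not details from the question. The script only prints the commands unless APPLY=1 is set, because running them needs root and destroys whatever is on both devices.

```shell
#!/bin/sh
# ASSUMPTIONS: every value below is a placeholder; adjust before use.
PV1=${PV1:-/dev/sdb1}
PV2=${PV2:-/dev/sdc1}
VG=${VG:-uploads_vg}
LV=${LV:-uploads_lv}
MOUNT_POINT=${MOUNT_POINT:-/var/www/uploads}

# Print each command; execute it only when APPLY=1.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

lvm_span() {
    run pvcreate "$PV1" "$PV2" &&                  # mark both as physical volumes
    run vgcreate "$VG" "$PV1" "$PV2" &&            # pool them into one volume group
    run lvcreate -l 100%FREE -n "$LV" "$VG" &&     # one logical volume over all of it
    run mkfs.ext4 "/dev/$VG/$LV" &&                # filesystem choice is an assumption
    run mount "/dev/$VG/$LV" "$MOUNT_POINT"
}

lvm_span
```

Run it once as-is to review the plan, and only then with APPLY=1 after the existing data has been copied somewhere safe.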

The same could be accomplished with BTRFS by adding multiple disks to a single filesystem pool, and BTRFS has great durability features that LVM doesn't necessarily include, such as checksumming and fully online filesystem checks. This would be addressing the problem at the filesystem level. And here's a general multi-device BTRFS guide.
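A comparable sketch for the Btrfs route, again with assumed device names and mount point. It uses the "single" data profile so data is simply spread across both disks without redundancy, plus mirrored metadata; that profile choice is mine, not something the question requires. As with any mkfs, this wipes the devices, so the script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# ASSUMPTIONS: device names and mount point are placeholders.
DEV1=${DEV1:-/dev/sdb1}
DEV2=${DEV2:-/dev/sdc1}
MOUNT_POINT=${MOUNT_POINT:-/var/www/uploads}

# Print each command; execute it only when APPLY=1 (needs root).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

# Create one filesystem spanning both devices and mount it.
btrfs_pool() {
    run mkfs.btrfs -d single -m raid1 "$DEV1" "$DEV2" &&
    run mount "$DEV1" "$MOUNT_POINT"
}

# Alternative: grow an already-mounted Btrfs filesystem by one device.
btrfs_grow() {
    run btrfs device add "$DEV2" "$MOUNT_POINT" &&
    run btrfs filesystem usage "$MOUNT_POINT"
}

btrfs_pool
```

btrfs_grow is the variant to reach for if the first disk already carries a Btrfs filesystem, since it avoids recreating it.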

Both of these solutions require that you copy your data to new filesystems, though, so neither is a full migration plan on its own. Personally, I'd probably go with BTRFS in this case unless you are also running something like a database workload from the same disks, in which case I would recommend LVM and traditional filesystems.

As far as migration is concerned, a few intermediate symlinks and a file copy are probably good enough. However, I don't know the application you're running, so your strategy may be influenced by that.
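One way the intermediate symlinks could look, assuming the layout from the question (existing top-level folders on /disk2/uploads, new ones under /var/www/uploads). The function name link_old_entries is made up for this sketch.

```shell
#!/bin/sh
# Paths taken from the question; override through the environment.
OLD_DIR=${OLD_DIR:-/disk2/uploads}
NEW_DIR=${NEW_DIR:-/var/www/uploads}

# Create a symlink in NEW_DIR for every top-level entry of OLD_DIR,
# skipping names that already exist in NEW_DIR.
link_old_entries() {
    for entry in "$OLD_DIR"/*; do
        [ -e "$entry" ] || continue          # glob matched nothing
        name=$(basename "$entry")
        if [ -e "$NEW_DIR/$name" ] || [ -L "$NEW_DIR/$name" ]; then
            echo "skipping $name: already present in $NEW_DIR" >&2
            continue
        fi
        ln -s "$entry" "$NEW_DIR/$name"
    done
}
```

After running link_old_entries, listing /var/www/uploads shows folder1, folder2 and folder3 together. The catch is that anything written into a linked folder such as folder1 still lands on the full disk; only new top-level folders end up on the empty one.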

EDIT Come to think of it, don't use BTRFS on older systems (anything before CentOS/RHEL 7 or SLES 12, for example). The early packages in kernel 2.6 are pretty spotty on feature support and stability.

EDIT JDS's answer reminded me that you may be seriously looking for a near-fully online migration, and there IS a way (several, actually) to provide fully transparent cross-filesystem overlays that will allow you to access the contents of both systems in various unified ways.

I have a fair amount of personal experience with unionfs, which works great and is fairly simple. You might also look at aufs depending on what features you need. These filesystems have been invaluable to me for application and data center migrations.
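To make the overlay idea concrete, here is a sketch using aufs. It assumes the aufs kernel module is available on the server (it is not part of the mainline kernel), and /var/www/uploads_rw is a hypothetical directory on the empty disk that I've introduced as the writable branch. The script only prints the commands unless APPLY=1 is set, since mounting needs root.

```shell
#!/bin/sh
# ASSUMPTIONS: RW_BRANCH is a made-up directory on the empty disk;
# the other two paths come from the question.
RW_BRANCH=${RW_BRANCH:-/var/www/uploads_rw}   # receives all new writes
RO_BRANCH=${RO_BRANCH:-/disk2/uploads}        # full disk, exposed read-only
MOUNT_POINT=${MOUNT_POINT:-/var/www/uploads}  # the merged view

# Print each command; execute it only when APPLY=1 (needs root).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

union_mount() {
    # Anything already under MOUNT_POINT (e.g. folder3) would need to be
    # moved into RW_BRANCH first, or the mount will hide it.
    run mkdir -p "$RW_BRANCH" "$MOUNT_POINT" &&
    run mount -t aufs -o "br=$RW_BRANCH=rw:$RO_BRANCH=ro" none "$MOUNT_POINT"
}

union_mount
```

Branches are listed in priority order, so if folder1/foo.log exists on /disk2 and a file with the same path is uploaded through the union, the new copy is stored in the writable branch and hides the old one; the original on /disk2 stays untouched.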

score 1 · Answer 2 · answered Oct 20 '17 at 18:07

There is no way, with the details you've provided, to just link the directories together and make them appear as one single directory.

There are, however, solutions to your problem. Most of them will require rebuilding the filesystem, though, which includes steps like backing up all your data, making major, destructive filesystem changes, and then restoring the data.

You haven't provided enough details about the server and the existing filesystem to know for sure what the best approach is, but I can make some educated guesses.

LVM is a good option for making a single "folder" that spans multiple physical disks. LVM is easy enough to configure, but definitely requires some planning.

The easiest way to rebuild an existing server with two separately mounted hard drives into a server with two hard drives mounted as a single LVM Volume is to back up all the data and reinstall the OS from scratch. Modern OS installers have steps in which the LVM portion can be set up automatically.

If the two 1TB drives are just data drives, and the OS lives on another disk, you can back up all your data, combine the two 1TB drives into a nearly-2TB single LVM volume, and mount that on the original location. Then restore the data.

There are other filesystems that provide a similar solution to LVM (e.g. ZFS), but they all boil down to the same thing: 1) back up; 2) rebuild the filesystem as a contiguous logical volume and mount that in the original location; 3) restore the data.

There are several ways to link directories together and make them appear to users and applications as one. Take a look at unionfs and aufs, both of which can easily accommodate this functionality. – Spooler, Oct 22 '17 at 18:13
Interesting. I've never encountered unionfs. I'm pretty sure aufs is used by Docker, but I thought it was just something related to Docker. – JDS, Oct 23 '17 at 19:21

score 0 · Answer 3 · answered Oct 20 '17 at 20:58

Am I missing something here? Why not just keep it simple: copy or move the entire upload directory to the new (empty) disk and point the server there?

L3XT3CH

Because then that disk would be full and I would need to "spill over" into the previously full disk. They are both 1 TB and I need more than 1 TB of space. Unfortunately, that is the most disk space that GoDaddy provides. Can you believe that?! – ajon Oct 20 '17 at 21:08
OK, that is crazy... – L3XT3CH Oct 23 '17 at 12:53
