
I have a series of directories on Linux and each directory contains lots of files and data. The data in those directories are automatically generated, but multiple users will need to perform more analysis on that data and generate more files, change the structure, etc.

Since these data directories are very large, I don't want several people to make a copy of the original data so I'd like to make a copy of the directory and link to the original from the new one. However, I'd like any changes to be kept only in the new directory, and leave the original read only. I'd prefer not to link only specific files that I define because the data in these directories is so varied.

So I'm wondering if there is a way to create a copy of a directory by linking to the original but keeping any changed files in the new directory only.

Shahbaz
Greg B

2 Answers


It turns out this is what I wanted:

cp -al <origdir> <newdir>

This copies the entire directory tree, creating hard links to the original files. If an original file is deleted, the linked copy still exists, and vice versa. One caveat I found: <newdir> must not already exist, otherwise cp places <origdir> inside it. As long as the original files are read-only, this gives you an identical, safe, space-efficient copy of the original directory.
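A quick way to confirm the hard-link behavior described above (directory and file names here are just illustrative):

```shell
# Set up a small sample data directory.
mkdir -p origdir
echo "measurement data" > origdir/data.txt
chmod a-w origdir/data.txt   # keep the original read-only

# Make the hard-linked copy; newdir must not already exist.
cp -al origdir newdir

# Both paths show the same inode number and a link count of 2,
# so no file data was duplicated.
ls -li origdir/data.txt newdir/data.txt
```

Deleting either path only drops one link; the data survives until the last link is removed.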

Greg B
    It's possible to do this when the original files are not read-only, but any changes made to either copy will be made to both copies equally, so hard linking is not suitable if you want to be able to modify the copied file without modifying the original. – thomasrutter Jul 22 '14 at 00:20
  • If the files are read-only, then whatever program modifies (and saves) the file must overwrite the file, rather than modify it in-place. For `cp`, `cp -f` will accomplish this. – palswim Aug 23 '20 at 05:20

However, since you are looking for a way for people to write back changes, UnionFS is probably what you want. It provides a means to combine read-only and read-write locations into one.

Unionfs allows any mix of read-only and read-write branches, as well as insertion and deletion of branches anywhere in the fan-out.
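The answer names UnionFS, but on recent systems the in-kernel OverlayFS (mainline since Linux 3.18) gives the same copy-on-write layering without extra packages. A minimal sketch, with illustrative paths, run as root:

```shell
# Combine a read-only "lower" directory with a writable "upper" one.
# /data/original is the pristine data; the other paths are illustrative.
mkdir -p /data/upper /data/work /data/merged
mount -t overlay overlay \
    -o lowerdir=/data/original,upperdir=/data/upper,workdir=/data/work \
    /data/merged
# Reads in /data/merged fall through to /data/original;
# any writes or changes land only in /data/upper.
```

Each analyst can get their own upper and merged directories over the same shared lower directory, which matches the "changes kept only in the new directory" requirement in the question.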


Originally I was going to recommend this (I use it a lot):

Assuming permissions aren't an issue (e.g. only reading is required), I would suggest bind-mounting them into place.

mount -B <original> <new-location>
# or
mount --bind <original> <new-location>

Note that <new-location> must already exist as a directory.
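If the worry is protecting the original, the bind mount itself can be made read-only with a second remount step. A sketch with illustrative paths, run as root:

```shell
# Bind-mount the original, then remount that one mount point read-only.
mount --bind /data/original /data/view
mount -o remount,ro,bind /data/view
# Writes under /data/view now fail with EROFS;
# /data/original itself is unaffected and stays writable.
```

This only restricts the view, though; unlike UnionFS it gives users nowhere to write their changes.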

0xC0000022L
  • Thanks for the suggestions! I tried the mount bind and was able to create a new mount point. However: 1) I have thousands of directories, which would require thousands of mounts 2) I created a file in the new directory and it also showed up in the original directory – Greg B Jul 12 '12 at 17:15
  • Thousands of directories but not a single root to them which could be used? You would need at most as many mounts as users in such a scenario. Even if different users have different sub-folders of that single common root, the number of mounts would still equal the number of users. – 0xC0000022L Jul 12 '12 at 17:21
  • A duplicate tree is possible... but changing things at the OS level isn't ideal. I think I may try to just recursively create links to the files and directories in the original, and just make sure the original files are read only – Greg B Jul 12 '12 at 18:45
  • @GregB: but in your question you state that you want to allow changes as well? I'm confused. – 0xC0000022L Jul 12 '12 at 18:47
  • I do want to allow changes to anything in the directory, in theory. But if all I can create is a duplicate read-only tree of links, where users can add new files but not modify existing ones, that will be good enough. – Greg B Jul 12 '12 at 19:08