Let's say you have two machines, A and B. On each machine, you export /opt/files as a Gluster brick, and set up client-side replication. You then mount the resulting volume as /mnt/gluster-files on both machines. This mount point is important!
Using that mount point, we now have a highly available file system across the two machines.
When you write a file, let's say /mnt/gluster-files/example, on machine A, two things happen:
- A copy is written to /opt/files on machine A.
- A copy is sent over the network to be written to /opt/files on machine B.
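The two steps above can be pictured with a toy model. This is only a sketch of the idea, not how Gluster is implemented: the function name replicated_write is made up for illustration, and two temporary directories stand in for /opt/files on machines A and B (a real second brick would be reached over the network).

```python
import os
import tempfile


def replicated_write(bricks, name, data):
    """Toy model of client-side replication: write the same bytes to every brick."""
    for brick in bricks:
        with open(os.path.join(brick, name), "wb") as f:
            f.write(data)


# Hypothetical stand-ins for /opt/files on machine A and machine B.
brick_a = tempfile.mkdtemp(prefix="brick-a-")
brick_b = tempfile.mkdtemp(prefix="brick-b-")

replicated_write([brick_a, brick_b], "example", b"hello")

# Both bricks now hold their own copy of the file.
for brick in (brick_a, brick_b):
    with open(os.path.join(brick, "example"), "rb") as f:
        print(brick, f.read())
```

The point to take away is that the client does the fan-out: one write through the mount point becomes one write per brick.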
This is good, because we want redundancy, which means having more than one copy of the data.
Next up, let's say we want to read the same file. Again on machine A:
- You issue a read for /mnt/gluster-files/example
- GlusterFS says "I need to check all the replica nodes to find out who has the most recent version of this file"
- GlusterFS checks every node
- It turns out that all copies are the same, because replication is working nicely
- You are returned the file from your local disk. §
(§ There is a read-subvolume client option, and it is sensible to set it to the local volume on any machine that is both a Gluster client and a server, as in this case. Otherwise, the last step could be 'you are sent the file from a random node'.)
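To see why the read path costs more than a plain local read, here is a toy model of the steps above. It is a sketch under loud assumptions: replicated_read and its preferred argument are invented names, and it compares content hashes to decide whether replicas agree, which is a simplification chosen for the demo and not what Gluster does internally. The preferred index plays the role that read-subvolume plays in the footnote.

```python
import hashlib
import os


def replicated_read(bricks, name, preferred=0):
    """Toy model: check every replica first, then serve from the preferred one."""
    digests = []
    for brick in bricks:
        # Every replica is touched on every read; this is the expensive part.
        with open(os.path.join(brick, name), "rb") as f:
            digests.append(hashlib.sha256(f.read()).hexdigest())

    if len(set(digests)) != 1:
        # Assumption for the sketch: just fail instead of modelling a heal.
        raise RuntimeError("replicas differ; a heal would be needed first")

    # All copies agree, so return the copy from the preferred (local) brick.
    with open(os.path.join(bricks[preferred], name), "rb") as f:
        return f.read()
```

Even in this simplified form, one read from the application turns into one check per replica, which is where the overhead for lots of small files comes from.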
Behind the scenes, GlusterFS keeps /opt/files on both machines in sync. Checking every node, especially for a large number of small files, adds a not-insignificant performance penalty.
The question is therefore raised: if I am running a process on one of these two machines, and I know the files are in sync, why can't I just read the files from the local share?
It's not recommended, but you can do this. Read the files from /opt/files. Manually keep track of whether you get out of sync, and if you do, do something like an ls -laR in /mnt/gluster-files, which will trigger a synchronization.
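If you would rather trigger that walk from a script than from a shell, something like the following does roughly what ls -laR does: it visits every entry under a directory and stats it. This is a minimal sketch; stat_everything is a made-up helper name, and whether touching every file through the mount is enough to trigger a sync depends on your Gluster version and configuration.

```python
import os


def stat_everything(root):
    """Walk root and lstat every file and directory, roughly like `ls -laR`."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                count += 1
            except OSError:
                # Entries can vanish or be unreadable mid-walk; skip them.
                pass
    return count
```

You would call it as stat_everything("/mnt/gluster-files"), pointing it at the mount point and not at the brick, because it is the access through Gluster that matters. The return value is just the number of entries visited.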
So, what happens if you write to /opt/files on machine A?
Gluster doesn't work that way. The file sits there unnoticed by GlusterFS, and it doesn't get onto machine B unless you happen to do something that makes Gluster notice it on machine A.
Therefore, you can't just tell Apache to read and write to /opt/files. What seems like a good compromise is telling it to read from /opt/files but write to /mnt/gluster-files. This is only possible if your application lets you specify different paths for reading and writing files, which not many do.
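If you control the application code, the split can be as small as a wrapper with two roots. The sketch below is hypothetical: SplitPathStore is an invented name, not part of Gluster or Apache, and the demo uses one temporary directory for both roles so it runs without Gluster, imitating the way a write through the mount lands in the local brick.

```python
import os
import tempfile


class SplitPathStore:
    """Read from one directory (the brick), write to another (the Gluster mount)."""

    def __init__(self, read_root, write_root):
        self.read_root = read_root
        self.write_root = write_root

    def read(self, name):
        # Fast path: plain local read that bypasses Gluster entirely.
        with open(os.path.join(self.read_root, name), "rb") as f:
            return f.read()

    def write(self, name, data):
        # Writes go through the mount so Gluster sees and replicates them.
        with open(os.path.join(self.write_root, name), "wb") as f:
            f.write(data)


# In the setup described above this would be
# SplitPathStore("/opt/files", "/mnt/gluster-files").
demo_dir = tempfile.mkdtemp(prefix="gluster-demo-")
store = SplitPathStore(read_root=demo_dir, write_root=demo_dir)
store.write("example", b"hello")
print(store.read("example"))
```

The same caveat applies as before: reads that skip the mount also skip Gluster's consistency check, so you are trusting that the local brick is up to date.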