I need to design a clustered application that runs separate instances on 2 nodes. Both nodes are Linux VMs running on VMware, and both application instances need to access a database and a set of files.

My intention is that a shared storage disk (external to both nodes) should contain the database and files. The applications would coordinate (via an RPC-like mechanism) to determine which instance is the master and which is the slave. The master would have write access to the shared storage disk and the slave would have read-only access.

I'm having trouble choosing a file system for the shared storage device, since it would need to support concurrent access from 2 nodes. A proprietary clustered file system (like GFS) is not a viable alternative owing to cost. Is there any way this can be accomplished in Linux (e.g. on ext3) by other means?

Desired behavior is as follows:

  • Instance A writes to file foo on shared disk
  • Instance B can immediately read whatever A wrote into file foo.
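In code terms, the semantics I'm after look like this (illustrated here on a local filesystem, where it holds trivially; the point is that whatever sits on the shared disk must guarantee the same thing across the two nodes — file names below are made up):

```python
import os
import tempfile

# Stand-in for the shared-disk mount point (hypothetical path).
shared = tempfile.mkdtemp()
foo = os.path.join(shared, "foo")

# Instance A (master): write to file foo and flush it out.
with open(foo, "w") as f:
    f.write("written by A")
    f.flush()
    os.fsync(f.fileno())  # force the data out of A's page cache

# Instance B (slave): must immediately see A's write.
with open(foo) as f:
    assert f.read() == "written by A"
```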

I also tried using SCSI-3 persistent reservations (PGR3), but it did not work.

viv

5 Answers


Q: Are both VMs co-located on the same physical host?

If so, why not use VMware shared folders?

Otherwise, if both are on the same LAN, what about good old NFS?
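If NFS fits, a minimal setup might look like this (hostnames and paths below are invented; note NFS's close-to-open consistency means the reader may need to reopen the file to see fresh data):

```shell
# On the node that owns the shared disk -- /etc/exports (example hosts/paths):
#   /shared  nodeA(rw,sync,no_subtree_check)  nodeB(ro,sync,no_subtree_check)

sudo exportfs -ra    # reload the export table after editing /etc/exports

# On the master node (read-write per the export above):
sudo mount -t nfs server:/shared /shared

# On the slave node (mounted read-only):
sudo mount -t nfs -o ro server:/shared /shared
```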

paulsm4

Try using Heartbeat + Pacemaker; it has a couple of built-in options for monitoring the cluster, and it should have something for watching the data too.
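As a rough sketch of what a Pacemaker setup for this could look like (using the pcs shell; resource names, devices and the IP are invented, and only one node holds the resources at a time):

```shell
# Floating service IP that follows the active node.
pcs resource create cluster_ip ocf:heartbeat:IPaddr2 \
    ip=192.168.1.100 cidr_netmask=24 op monitor interval=10s

# The shared-disk filesystem, mounted only on the active node.
pcs resource create shared_fs ocf:heartbeat:Filesystem \
    device=/dev/sdb1 directory=/shared fstype=ext3 op monitor interval=20s

# Keep both resources together and start the IP first.
pcs constraint colocation add shared_fs with cluster_ip INFINITY
pcs constraint order cluster_ip then shared_fs
```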

itsme

You might look at an active/passive setup with DRBD + (Heartbeat or Pacemaker).

DRBD gives you a distributed block device over 2 nodes, on top of which you can deploy an ext3 filesystem.

Heartbeat or Pacemaker handles which node is active and which is passive, and provides some monitoring/repair functions.

If you need read access on the "passive" node too, you could additionally configure a NAS export on the active node that the passive node mounts, e.g. via NFS or CIFS.

Handling a database like pgsql or mysql on network-attached storage might not work, though.
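A minimal DRBD resource definition for this might look like the following (node names, devices and addresses are invented; in single-primary mode only one node mounts the device, so plain ext3 suffices and no cluster filesystem is needed):

```shell
# /etc/drbd.d/shared.res -- hypothetical single-primary resource.
# Only the current primary mounts /dev/drbd0 (as ext3); on failover,
# the other node is promoted and mounts it instead.
resource shared {
  protocol C;                  # synchronous replication
  device    /dev/drbd0;
  disk      /dev/sdb1;         # backing disk on each node
  meta-disk internal;
  on nodeA { address 10.0.0.1:7789; }
  on nodeB { address 10.0.0.2:7789; }
}
```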

helmor
  • Thanks for the suggestion. But as per this [link](https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Component.3B_DRBD), it looks like DRBD requires GFS2. If this is correct, it is not an option for me. In fact, I tried setting up GFS2 separately on a storage device but it turned out to be too complex & highly coupled with RH Cluster Services. – viv Sep 04 '12 at 08:23

Are you going to be writing the applications from scratch? If so, you could consider using ZooKeeper for the coordination between master and slave. This puts the coordination logic purely into the application code.
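A minimal sketch of what that coordination could look like, assuming the kazoo client library (the server address, znode paths and the `decide_role` helper are all invented for illustration):

```python
# Master/slave election via ZooKeeper ephemeral sequential znodes:
# the instance that owns the lowest-numbered znode is the master.

def decide_role(my_id, leader_id):
    """Pure helper: the instance matching the leader znode is master."""
    return "master" if my_id == leader_id else "slave"

def run(my_id):
    # Requires a reachable ZooKeeper server (address is an assumption).
    from kazoo.client import KazooClient
    zk = KazooClient(hosts="127.0.0.1:2181")
    zk.start()
    zk.ensure_path("/app/election")
    # Ephemeral znode disappears if this instance dies, triggering failover.
    zk.create("/app/election/n_", value=my_id.encode(),
              ephemeral=True, sequence=True)
    children = sorted(zk.get_children("/app/election"))
    leader_id = zk.get("/app/election/" + children[0])[0].decode()
    return decide_role(my_id, leader_id)

print(decide_role("A", "A"))  # → master
print(decide_role("B", "A"))  # → slave
```

The master would then take the write role on the shared disk, and the slave would re-run the election whenever it gets a watch event on the election path.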

Ambar
  • Thanks. We already have a V1.0 of the applications which work in a standalone system. The idea here is to build a redundant system which provides clustering service for high availability. – viv Sep 04 '12 at 08:25

GPFS is inherently a clustered filesystem.

You set up your servers to see the same LUN(s), build the GPFS filesystem on the LUN(s), and mount the GPFS filesystem on both machines.

If you are familiar with NFS, it looks like NFS, but it's GPFS, a clustered filesystem by nature.

And if one of your GPFS servers goes down, and you defined your environment correctly, no one is the wiser and things continue to run.
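Roughly, the setup goes like this (node names, the disk descriptor file and the filesystem name are invented, and the exact `mm*` syntax varies by GPFS release):

```shell
# Hypothetical two-node GPFS cluster over a LUN visible to both nodes.
mmcrcluster -N nodeA:quorum-manager,nodeB:quorum-manager \
            -p nodeA -s nodeB -C appcluster
mmstartup -a                  # start GPFS on all nodes

mmcrnsd -F disks.txt          # disks.txt describes the shared LUN(s)
mmcrfs /gpfs fs0 -F disks.txt # create the filesystem on those NSDs
mmmount fs0 -a                # mount it on every node in the cluster
```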

Codeman
Larry