Using ZFS head node as database server?

Question

I'm using a dual-head ZFS-backed NAS for high availability cluster shared storage, based on Nexenta's recommended architecture as seen here:

The disks in one JBOD will store the database files for a single 4 TB Postgres database, and the disks in the other JBOD will store 20 TB of large raw binary flat files (cluster results for large stellar object collision simulations). In other words, the JBOD backing the Postgres files will handle mainly random workloads, while the JBOD backing the simulation results will handle mainly sequential workloads. Both head nodes have 256 GB of memory and 16 cores. The cluster has about 200 cores, each maintaining a Postgres session, so I expect about 200 concurrent sessions.

I'm wondering if it's wise in my setup to have the ZFS head nodes act simultaneously as a mirrored pair of Postgres database servers for my cluster? The only drawbacks I can see are:

Less flexibility for scaling my infrastructure.
Slightly lower level of redundancy.
Limited memory and CPU resources for Postgres.

However, the advantage I see is that ZFS is pretty dumb about automatic failover anyway, and I wouldn't have to put much work into getting each Postgres database server to detect a failed head node, since Postgres would fail together with the head node.

PostgreSQL *cannot* be run in any form of shared-storage mode. Attempts to do so will fail. Attempts to bypass the protections to stop you doing it (like moving/hiding `postmaster.pid`) will result in severe data corruption. — Craig Ringer, Jun 21 '14 at 07:48
@CraigRinger Hm, is this contradictory to https://wiki.postgresql.org/wiki/Shared_Storage? — elleciel, Jun 21 '14 at 08:00
You can run it if you absolutely guarantee that only one postmaster may ever be accessing the data directory at the same time. Good STONITH / fencing is an absolute requirement to avoid big-time data corruption. Personally there's no way I'd do it. This also eliminates the benefits you're talking about - figuring out which is the main/live server automatically, etc - because you have to manage failover. — Craig Ringer, Jun 21 '14 at 08:03
@Craig Ringer I see, it sounds to me then that it's possible, just a technically-challenging endeavor. Perhaps you'd recommend that I should just have 2 identical JBODs, each with a single head also acting as a Postgres server, and figure out the automatic failover from there instead? Would be interesting to hear other views on this. It also still doesn't really deal with my main question - should I be using the ZFS head as the Postgres server as well? — elleciel, Jun 21 '14 at 08:10
I've revised the wiki page to make it clearer; thanks for pointing it out. — Craig Ringer, Jun 21 '14 at 08:12
This doesn't make sense. Nexenta's HA solution is leveraging [RSF-1 clustering](http://www.high-availability.com/high-availability-rsf-1/). It sounds like you're doing this with ZFS on Linux without the RSF-1 piece. Mind you, ZFS on Linux doesn't really have a clustering option, so the Nexenta reference doesn't apply. What do you have to gain by having two head nodes? — ewwhite, Jun 21 '14 at 10:58

score 0 · Answer 1 · answered Mar 01 '19 at 22:12

You can't have two Postgres instances ("clusters" in the Postgres terminology) acting on the same physical files.

If you want performance, sharding may help you (have two instances, each carrying different data).
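
As a rough illustration of the sharding idea, here is a minimal Python sketch that routes each key to one of two independent Postgres instances. The connection strings, host names, and the `shard_for` helper are hypothetical and not taken from the question. The only point is that each instance owns a disjoint slice of the data, so no two postmasters ever touch the same files.

```python
import hashlib

# Hypothetical connection strings; neither host name comes from the question.
SHARDS = [
    "host=head-a dbname=sim",
    "host=head-b dbname=sim",
]


def shard_for(key: str) -> str:
    """Map a key to exactly one shard using a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings, which would route the same key
    to different instances on different clients.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]
```

A client would call `shard_for(simulation_id)` and open its session against the returned instance. Whether the workload can actually be partitioned by such a key is something only the schema can answer.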

If you want high availability, then failover with STONITH may be the solution. You need to make sure that when the failed hardware is repaired, it does not try to open the database while the second node is serving it.
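
The rule behind that can be sketched as a small decision function. This is only an illustration under assumptions: `PeerState`, `may_start_postgres`, and the `holds_pool` flag are hypothetical names, and a real setup would get these facts from the cluster manager and fencing device rather than from plain booleans.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    fenced: bool        # True only once STONITH has confirmed the peer is powered off
    heartbeat_ok: bool  # result of the most recent heartbeat from the peer


def may_start_postgres(peer: PeerState, holds_pool: bool) -> bool:
    """Return True only when it is safe for this node to start a postmaster."""
    if peer.heartbeat_ok:
        # The peer looks alive, so it may be running the postmaster already.
        # This is also the case a freshly repaired node lands in.
        return False
    if not peer.fenced:
        # Silence alone is not proof of death; the peer could be partitioned.
        return False
    # The data directory must live on a pool this node has imported.
    return holds_pool
```

A repaired node that boots while its peer is serving the database sees `heartbeat_ok=True` and therefore stays passive, which is the behaviour described above.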
