
We're running gluster on two Ubuntu hypervisors. Upgrading from Ubuntu 14.04 to 18.04 also upgraded gluster from 3.4.2 to 3.13.2. Ever since the upgrade, we've been seeing substantially higher iowait on the hosts, as measured by top and iotop, and iotop indicates that glusterfsd is the culprit. For some reason, glusterfsd is doing more disk reads, or those reads are being held up more often, or both. The guest VMs, whose images are hosted on the gluster volume, are also seeing more iowait. This is causing inconsistent responsiveness from the services hosted on the VMs.

I'm looking for any recommendations on how to troubleshoot and/or resolve this problem. We have other sites that are still running 14.04, so I can compare/contrast any configuration parameters and performance.
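One way to do that comparison is to diff the explicitly set volume options between a 14.04 site and an 18.04 site. The sketch below assumes the output of `gluster volume info` has been saved to text on each site and that it lists options as `key: value` lines under an `Options Reconfigured:` heading; the function names and the volume name `gv0` in the usage note are made up for illustration.

```python
def parse_reconfigured_options(volume_info: str) -> dict:
    """Collect the 'key: value' lines that follow 'Options Reconfigured:'.

    Assumes the text is saved output of `gluster volume info` for one volume.
    """
    options = {}
    in_section = False
    for line in volume_info.splitlines():
        stripped = line.strip()
        if stripped.startswith("Options Reconfigured"):
            in_section = True
            continue
        if not in_section:
            continue
        if not stripped:
            break  # a blank line ends this volume's block
        key, sep, value = stripped.partition(":")
        if sep:
            options[key.strip()] = value.strip()
    return options


def diff_options(old: dict, new: dict) -> dict:
    """Map each differing option to a (old_value, new_value) tuple.

    An option missing on one side is reported with None for that side.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes
```

For example, `diff_options(parse_reconfigured_options(info_1404), parse_reconfigured_options(info_1804))` would return only the options that differ for a volume such as `gv0`. This only covers options that were set explicitly; a default that changed between 3.4.2 and 3.13.2 would not show up here.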

The block scheduler was set to deadline on 14.04 and to cfq on 18.04, but changing the 18.04 scheduler to deadline didn't make any difference.
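To confirm which scheduler is actually in effect on each block device, the kernel exposes it in `/sys/block/<device>/queue/scheduler`, with the active one shown in square brackets (for example `noop [deadline] cfq`). A minimal sketch for reading it on every device follows; the function names are mine, and the bracket-free fallback is an assumption for devices that report a single value.

```python
from pathlib import Path


def active_scheduler(contents: str) -> str:
    """Return the bracketed (active) scheduler from a sysfs scheduler line."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Assumption: with no brackets, the whole line names the only choice.
    return contents.strip()


def schedulers_by_device(sys_block: str = "/sys/block") -> dict:
    """Map each block device name to its active I/O scheduler."""
    base = Path(sys_block)
    if not base.is_dir():
        return {}
    result = {}
    for path in sorted(base.glob("*/queue/scheduler")):
        # path is <sys_block>/<device>/queue/scheduler
        device = path.parent.parent.name
        result[device] = active_scheduler(path.read_text())
    return result
```

Calling `schedulers_by_device()` on both a 14.04 and an 18.04 host gives a quick side-by-side check that the scheduler change really applied to the disks backing the gluster bricks.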

I was wondering whether glusterfsd on 18.04 isn't caching as much as it should. We tried increasing performance.cache-size substantially but that didn't make any difference.

Another option we're considering but haven't tried yet is upgrading to gluster 5.3 by back-porting the package from Ubuntu 19.04 to 18.04. Does anyone think this might help?

Is there any particular debug logging we could set up or other commands we could run to troubleshoot this better? Any thoughts, suggestions, ideas would be greatly appreciated.
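For comparing sites with a consistent number rather than eyeballing top, iowait can be computed from two samples of the aggregate `cpu` line in `/proc/stat`, where the fifth value after the label is iowait. This is only a sketch of that calculation; the function names and the default 5-second interval are arbitrary choices.

```python
import time


def parse_cpu_line(stat_text: str) -> list:
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent in iowait between two /proc/stat samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Use only user, nice, system, idle, iowait, irq, softirq, steal:
    # guest time is already counted inside user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval: float = 5.0) -> float:
    """Read /proc/stat twice, `interval` seconds apart, and return iowait %."""
    with open("/proc/stat") as handle:
        before = handle.read()
    time.sleep(interval)
    with open("/proc/stat") as handle:
        after = handle.read()
    return iowait_percent(before, after)
```

Logging `sample_iowait()` periodically on a 14.04 host and an 18.04 host under similar load would make the difference measurable before and after any change.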

1 Answer


Upgrading to gluster 5.3 resolved the issue. There is a convenient PPA available for Ubuntu 18.04:

https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-5