
I'm stress-testing an application that uses MySQL. The bottleneck is the MySQL disks (separate disks for the main data and the redo log), which are written to heavily; atop indicates their usage is near 100%. If I disable the redo log (ALTER INSTANCE DISABLE INNODB REDO_LOG), disk usage becomes negligible (2% for the main data disk and, naturally, zero for the redo log disk).
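
(In case it's useful, this is roughly what I run to toggle and verify redo logging; the Innodb_redo_log_enabled status variable reports the current state.)

-- check whether redo logging is currently enabled
SHOW GLOBAL STATUS LIKE 'Innodb_redo_log_enabled';

-- disable redo logging (only acceptable because this is throwaway test data)
ALTER INSTANCE DISABLE INNODB REDO_LOG;

-- re-enable it afterwards
ALTER INSTANCE ENABLE INNODB REDO_LOG;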

I need to reduce disk usage (it doesn't need to be negligible, but it has to be considerably less) without disabling the redo logs. It's OK to lose several seconds (even minutes) of data in case of a crash. I've tried this:

innodb_flush_method = O_DSYNC
innodb_flush_log_at_trx_commit = 0
innodb_flush_log_at_timeout = 5

but I don't see any difference.
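
(For what it's worth, the last two variables are dynamic, so they can also be set at runtime and then verified, roughly like this; innodb_flush_method is not dynamic and only takes effect after a restart.)

SET GLOBAL innodb_flush_log_at_trx_commit = 0;
SET GLOBAL innodb_flush_log_at_timeout = 5;

-- confirm the values the server is actually using
SHOW GLOBAL VARIABLES LIKE 'innodb_flush%';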

Is there anything else I can do?

This is MySQL 8.0.29 on Ubuntu 20.04.

(Note: the application is Nextcloud and apparently it makes many database write requests.)

Antonis Christofides
  • There are more GV to change to reduce write frequency. Additional information request, please. RAM size, # cores, any SSD or NVME devices on MySQL Host server? Post on pastebin.com and share the links. From your SSH login root, Text results of: A) SELECT COUNT(*) FROM information_schema.tables; B) SHOW GLOBAL STATUS; after minimum 24 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; E) STATUS; not SHOW STATUS, just STATUS; for server workload tuning analysis to provide suggestions. – Wilson Hauck May 31 '22 at 15:22
  • Which Nextcloud Partner are you hosted by? – Wilson Hauck May 31 '22 at 15:31
  • We can explain disk usage for particular queries. Also, if the "stress tests" don't reflect your application, conclusions from them may be misleading. – Rick James May 31 '22 at 16:42
  • I would suggest that you take the question to dba imho – djdomi May 31 '22 at 17:31
  • This belongs in dba.stackexchange.com – Rick James Jun 20 '22 at 16:11

1 Answer


I figured out that this was more a Nextcloud issue than a MySQL issue.

By monitoring the Innodb_os_log_written counter in the output of SHOW GLOBAL STATUS, I verified that MySQL was writing 1 MB/s or more to the redo log. Apparently it was the sheer volume of data that was overwhelming my slow disk, rather than the details of when and how it was being synced.
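
(For the record, one way to get that rate is to sample the counter twice over a known interval; the 10-second sleep here is just an example.)

SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
SELECT SLEEP(10);  -- wait a known interval
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
-- (second value - first value) / 10 = bytes written to the redo log per second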

By setting general_log = 'ON' and examining the logged queries, I eventually traced the problem to a Nextcloud bug in which logging in triggers an excessive number of queries. After applying the workaround of truncating the oc_authtoken table, the rate of writing to the redo log fell to less than a tenth of what it had been.
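
(Roughly, the diagnostic and the workaround looked like this; the log file path and database name are just examples, adjust them to your setup.)

-- log every statement for a short diagnostic window (this is heavy, turn it off afterwards)
SET GLOBAL general_log_file = '/var/log/mysql/general.log';
SET GLOBAL general_log = 'ON';
-- ... reproduce the load, inspect the log ...
SET GLOBAL general_log = 'OFF';

-- the Nextcloud workaround: clear the accumulated authentication tokens
USE nextcloud;  -- or whatever your Nextcloud database is called
TRUNCATE TABLE oc_authtoken;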

Antonis Christofides