I am trying to benchmark a ZFS RAID-10 array of SATA SSDs using fio, with settings that are reasonably representative of a database workload such as PostgreSQL.

For example, for random reads:

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=randread.fio --bs=4k --iodepth=4 --numjobs=16 --size=10G --readwrite=randread

However, I am unsure what values iodepth and numjobs should be set to, and what they represent in terms of a database workload.

Can numjobs be interpreted as the number of concurrent database connections from clients? And can iodepth be interpreted as the number of concurrent/pending/queued database queries in each database connection?

What range of values of iodepth and numjobs do you recommend?

Nyxynyx

1 Answer

There is one big problem with running the job above on ZFS: ZFS on Linux currently implements O_DIRECT via (minimally) buffered I/O, so O_DIRECT requests are not always handled asynchronously. This means a libaio submission may block before the queue ever reaches the requested iodepth, and the depth you asked for may never actually be achieved.
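
One way to sidestep the question of whether the requested queue depth is really reached is to let numjobs provide all of the concurrency, using a synchronous engine so that each job has exactly one I/O outstanding. The job file below is only a sketch under that assumption: it mirrors the options of the command in the question, swaps libaio for fio's psync engine, and adds group_reporting so the 16 jobs are summarised together. The file name randread-sync.fio is hypothetical.

```ini
; randread-sync.fio (hypothetical name): same workload as the command in the
; question, but with a synchronous engine so the effective depth per job is 1.
[global]
ioengine=psync
direct=1
randrepeat=1
gtod_reduce=1
bs=4k
size=10G
filename=randread.fio
group_reporting

[randread]
rw=randread
; total outstanding I/Os is roughly numjobs, since each job blocks on its read
numjobs=16
```

Run it with `fio randread-sync.fio`. Whether this is any closer to your database's behaviour still depends on how the database issues its I/O.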

Can numjobs be interpreted as the number of concurrent database connections from clients?

Not really. One client may trigger multiple I/Os depending on the query it issues, and we also don't know what your database backend does.

And can iodepth be interpreted as the number of concurrent/pending/queued database queries in each database connection?

Again, not really, because the mapping depends on the implementation choices made in your database's backend. Some databases can be configured to use threads, processes, AIO, or even a mixture of all of these.

What range of values of iodepth and numjobs do you recommend?

There is no good answer to this. You would likely have to profile your database (at the system I/O level) and create a fio job that matches what you observe (unless you can arrange for your database vendor to help you).
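
As a rough starting point for that kind of profiling, the sketch below compares two snapshots of /proc/diskstats taken while the database is under load and reports the read rate, average read size, and average number of requests in flight for one disk. The function name io_profile and the snapshot file names are hypothetical. Note also that this observes the block device, so on ZFS it shows what reaches the disks after caching and aggregation, not the calls the database itself makes.

```shell
# io_profile BEFORE AFTER DEVICE SECONDS
# BEFORE and AFTER are copies of /proc/diskstats taken SECONDS apart.
# Example use (file names and device are placeholders):
#   cat /proc/diskstats > before; sleep 10; cat /proc/diskstats > after
#   io_profile before after sda 10
io_profile() {
    awk -v dev="$3" -v secs="$4" '
        $3 != dev { next }                      # only the requested device
        FNR == NR {                             # first file: remember counters
            reads = $4; sectors = $6; weighted = $14; next
        }
        {                                       # second file: compute deltas
            dr = $4 - reads                     # reads completed
            ds = $6 - sectors                   # 512-byte sectors read
            dw = $14 - weighted                 # weighted ms spent doing I/O
            printf "read_iops=%.1f avg_read_kb=%.1f avg_queue_depth=%.2f\n",
                dr / secs, (dr > 0 ? ds * 512 / 1024 / dr : 0), dw / (secs * 1000)
        }' "$1" "$2"
}
```

The average read size gives a hint for bs, and the average queue depth (which covers reads and writes together) gives a hint for the total number of outstanding I/Os, that is, roughly numjobs multiplied by iodepth. Treat both as starting values to refine rather than as an exact translation of the workload.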

Anon