
I have a Linux box with 4 CPUs. A single run of my process on it handles 1.5 million records in 30 minutes. Here, processing means reading from an Oracle DB, deriving some values, and writing the 1.5 million records to a file on the file system.

We are now planning to run 20 separate instances of this process on this server.

Does this mean I would process 20 x 1.5 = 30 million records in 30 minutes? lscpu reports Thread(s) per core = 1, so I believe this is not a correct assumption. What factors should be considered when comparing the number of CPUs with the number of processes?

Actually, we have received a request saying that with the 20 process instances the system should process at least 20 million records per hour per core.

I don't think these requested numbers can be met with this Linux system.

pradipti

1 Answer


Does this mean I would process 20 x 1.5 = 30 million records in 30 minutes?

No. Emphatically no.

I would treat that figure as a theoretical maximum. With 4 cores and one thread per core, at most 4 of the 20 instances can be executing at any instant, and I'd expect you to saturate the I/O channel well before reaching it.
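To make the gap concrete, here is a back-of-the-envelope sketch in Python. It assumes the job is CPU-bound and that each instance keeps one core busy; that is an assumption for illustration, not something measured on your box, and the function name and parameters are made up for this example.

```python
def best_case_records(records_per_run, minutes_per_run, instances, cores, window_minutes):
    """Upper bound on records processed in `window_minutes`.

    Assumption (hypothetical): every instance is CPU-bound, so at most
    `cores` instances make progress at full speed at any moment.
    """
    per_instance_rate = records_per_run / minutes_per_run  # records per minute
    effective_parallelism = min(instances, cores)
    return per_instance_rate * effective_parallelism * window_minutes


# Numbers from the question: 1.5 million records in 30 minutes, 4 cores, 20 instances.
cpu_bound = best_case_records(1_500_000, 30, instances=20, cores=4, window_minutes=30)
# Naive linear scaling pretends there is one core per instance.
naive = best_case_records(1_500_000, 30, instances=20, cores=20, window_minutes=30)

print(f"CPU-bound ceiling:    {cpu_bound:,.0f} records in 30 min")
print(f"Naive linear scaling: {naive:,.0f} records in 30 min")
```

If a single instance spends most of its time waiting on the database rather than computing, more than 4 instances can overlap usefully and the real figure may land somewhere between the two numbers. Only a measurement will tell you where.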

This is where a System Architect will earn their money. Lacking that, get your hardware vendor to give you access to a test system where you can benchmark the actual performance. Remember that I/O bandwidth is a typical choke point, but you may also run into memory bandwidth or even motherboard bus limits.
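If you do get time on a test system, a scaling test can be as simple as the sketch below: start N copies of the job at once, time them, and repeat for increasing N. The helper name `run_concurrently` and the placeholder command are hypothetical; substitute the real command line of your job.

```python
import subprocess
import sys
import time


def run_concurrently(command, instances):
    """Start `instances` copies of `command` at once; return elapsed seconds."""
    start = time.perf_counter()
    procs = [subprocess.Popen(command) for _ in range(instances)]
    exit_codes = [p.wait() for p in procs]  # block until every copy has finished
    elapsed = time.perf_counter() - start
    if any(code != 0 for code in exit_codes):
        raise RuntimeError(f"some instances failed: {exit_codes}")
    return elapsed


if __name__ == "__main__":
    # Placeholder workload, NOT the real job: replace with your own command.
    command = [sys.executable, "-c", "sum(range(1_000_000))"]
    for n in (1, 4, 20):
        seconds = run_concurrently(command, n)
        print(f"{n:>2} instances finished in {seconds:.2f} s")
```

With the real job plugged in, aggregate throughput is N times the records per instance divided by the elapsed time. If that number stops growing as N increases, you have found the bottleneck, whether it is CPU, disk, network, or the database.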

"You'll never know how it tastes until you bite it." applies to benchmarking too.