How many float multiplies can be performed with a single core of the current Intel architectures?

Question

To assess the performance gain from an embedded architecture, I searched for the number of floating-point multiplies that can be performed per cycle on a single core of the Core 2 and Core i7 architectures, but could not find a quick answer. Unfortunately, I am not familiar with the ISA, so I cannot tell from looking at the respective instructions. I assume it would be some kind of SIMD instruction. Any ideas?

score 3 · Accepted Answer · answered Nov 11 '11 at 01:20


One thing: Core 2 is not Intel's latest architecture. That would be Sandy Bridge.

Core 2 and Core i7 (Nehalem) can each sustain 1 SSE multiply/cycle. Each SSE instruction can handle up to 4 single-precision (SP) or 2 double-precision (DP) values. So that's 4 SP or 2 DP floating-point multiplies per cycle.

Core i7 Sandy Bridge can sustain 1 AVX multiply/cycle. AVX vectors are 256 bits wide, double the 128 bits of SSE. So that's 8 SP or 4 DP floating-point multiplies per cycle.
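The arithmetic behind these figures is just lanes per vector times vector multiplies issued per cycle. Here is a minimal sketch of that calculation; the function name `multiplies_per_cycle` and its parameters are made up for illustration, and the rate of 1 vector multiply per cycle is the sustained figure quoted above, not something the code measures.

```python
# Illustrative sketch only: the names below are hypothetical helpers,
# not part of any Intel tool or library.

SSE_BITS = 128  # width of an SSE vector register
AVX_BITS = 256  # width of an AVX vector register


def multiplies_per_cycle(vector_bits, element_bits, vector_mults_per_cycle=1):
    """Peak scalar multiplies per cycle on one core.

    vector_mults_per_cycle defaults to 1, the sustained rate assumed
    in this answer for Core 2, Nehalem, and Sandy Bridge.
    """
    lanes = vector_bits // element_bits  # elements packed into one vector
    return lanes * vector_mults_per_cycle


if __name__ == "__main__":
    for name, bits in (("SSE (Core 2, Nehalem)", SSE_BITS),
                       ("AVX (Sandy Bridge)", AVX_BITS)):
        sp = multiplies_per_cycle(bits, 32)  # single precision is 32 bits
        dp = multiplies_per_cycle(bits, 64)  # double precision is 64 bits
        print(f"{name}: {sp} SP or {dp} DP multiplies per cycle")
```

These are peak numbers: real code only reaches them if it is vectorized and keeps the multiplier busy every cycle.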


Mysticial


Is it safe to assume that current AMD processors offer the same performance? – ysap Nov 11 '11 at 01:56
Correct. I think all AMD processors since the K10 architecture have had the same SSE throughput (1 SSE multiply/cycle). For the new Bulldozer architecture, it's a little more complicated than that due to the FPU being shared within each "Bulldozer Module". – Mysticial Nov 11 '11 at 01:59
