Questions tagged [flops]

FLOPS (FLoating point Operations Per Second): a unit of measurement used to quantify the performance of the implementation of a numerical algorithm.

See Wikipedia page on FLOPS.

132 questions
1 vote, 4 answers

How many FLOPS for FFT?

I would like to know how many FLOPS a Fast Fourier Transform (FFT) performs. So, if I have a one-dimensional array of N floating-point numbers and I would like to calculate the FFT of this set of numbers, how many FLOPS need to be performed? I know that this…
thyme
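For a rough sense of scale on the FFT question above, a commonly quoted estimate for a radix-2 complex FFT of length N (a power of two) is about 5 * N * log2(N) floating-point operations: N/2 * log2(N) butterflies, each counted as 10 operations. The sketch below only evaluates that estimate. The function name `fft_flop_estimate` is hypothetical, and actual libraries use other algorithms whose operation counts differ.

```python
import math


def fft_flop_estimate(n: int) -> float:
    """Textbook radix-2 estimate: (n/2) * log2(n) butterflies, 10 flops each."""
    # Assumption: n is a power of two, as the radix-2 estimate requires.
    if n < 1 or n & (n - 1):
        raise ValueError("this estimate assumes n is a power of two")
    return 5.0 * n * math.log2(n)


print(fft_flop_estimate(1024))  # 5 * 1024 * 10 = 51200.0
```

Dividing this estimate by a measured run time gives a nominal rate, not an exact count of the operations the hardware executed.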
1 vote, 1 answer

Intel Xeon E5-2670 v2 Calculating GFlops

How can I calculate GFLOPS for this processor: Intel Xeon E5-2670 v2; clock speed: 2.5 GHz; vCPU: 2; memory: 7.5 GiB; storage: 1 * 32 SSD; networking performance: Moderate (500 Mbps). It's an AWS instance of type m3.large. I am not able to find IPC and calculate…
1 vote, 0 answers

estimating flop count for division

I am wondering why the FLOP count for division is treated differently in the literature (and on the internet). I have found this definition here on Stack Overflow (1 div = 4 flop): https://stackoverflow.com/a/329243/6059576 and another one in the book "Matrix…
boschika
1 vote, 1 answer

CPU-GPU FLOP Rate

I need to calculate how many flops per transferred value a code should provide so that running it on the GPU is worthwhile in terms of performance. Here are the flop rates and assumptions: 1. PCIe 16x v3.0 bus is able to transfer data…
Mert Şeker
1 vote, 1 answer

Counting FLOPS/GFLOPS in program - CUDA

I have already finished my application, which multiplies a CRS matrix by a vector (SpMV), and the only thing left to do is to count the FLOPS my application performed. In my opinion it's really hard to estimate the number of floating-point operations in the case of a sparse matrix -…
howdyhoward
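As an illustration of the SpMV counting problem above, one common convention charges each stored nonzero one multiply and one add, so the operation count is twice the number of nonzeros. The sketch below assumes a CRS row-pointer array is available. The name `spmv_flop_count` is hypothetical, and the convention slightly overcounts because the first add in each row could be skipped.

```python
from typing import Sequence


def spmv_flop_count(row_ptr: Sequence[int]) -> int:
    """Flops for y = A @ x with A in CRS form, counting 2 per stored nonzero."""
    # In CRS, the row pointer's last entry minus its first is the nonzero count.
    nnz = row_ptr[-1] - row_ptr[0]
    return 2 * nnz


# Hypothetical 3-row matrix with 2, 1, and 3 stored nonzeros per row.
print(spmv_flop_count([0, 2, 3, 6]))  # 2 * 6 = 12
```

Dividing this count by the measured kernel time gives a FLOPS figure that depends only on the matrix structure, not on how the kernel is scheduled.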
1 vote, 1 answer

Is FLOPS included in the number of instructions given by perf_event?

I have a program which uses perf_event.h to calculate the IPC of a specific running process. I read the INSTRUCTIONS counter and the CPU_CYCLES counter to do so. My question is about the value returned by the INSTRUCTIONS counter. Does it contain…
David Guyon
1 vote, 1 answer

Over theoretical peak FLOPS benchmark

To measure the peak FLOPS performance of a CPU I wrote a little C++ program. But the measurements give me results bigger than the theoretical peak FLOPS of my CPU. What is wrong? This is the code I wrote: #include #include…
Dominic Hofer
1 vote, 7 answers

Highly concurrent multi-threaded application requires hardware

I am looking for hardware that must run about 256 computationally intensive, real-time concurrent tasks around the clock (one multi-threaded C application). Each task takes about 40-50 MFLOPs, so all tasks require about 10 GFLOPs. CPU-RAM speed is…
psihodelia
1 vote, 2 answers

Automatic way to obtain the floating-point operation count for some piece of code

I have some rather complex and highly templated code (C++, but this may not be very relevant) of which I'd like to know the number of adds, subs, muls, divs, and sqrts at execution. Is there an automatic way to get this information (the compiler…
Walter
1 vote, 1 answer

Matrix multiplication on GPU. Memory bank conflicts and latency hiding

Edit: achievements over time are listed at the end of this question (~1 TFLOPS so far). I'm writing a kind of math library for C# using OpenCL (GPU) from a C++ DLL and have already done some optimizations on single-precision square matrix-matrix…
1 vote, 2 answers

Calculation of gflops for double precision

I have a device providing the peak GFLOPS specs and I want to measure how far my program is from it. Since all the data I used was double precision, should I multiply the number of ops by 2 to get the GFLOPS value and do the comparison?
Hailiang Zhang
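To make the comparison in the question above concrete, here is a minimal sketch that assumes the usual convention: one double-precision operation counts as one operation, and the result is compared against the device's double-precision peak rather than its single-precision figure. The function names and the numbers in the example are hypothetical.

```python
def achieved_gflops(flop_count: float, seconds: float) -> float:
    """Achieved rate in GFLOPS: operations executed divided by elapsed time."""
    return flop_count / seconds / 1e9


def fraction_of_peak(flop_count: float, seconds: float, peak_gflops: float) -> float:
    """Share of the quoted peak that the program reaches (peak in GFLOPS)."""
    return achieved_gflops(flop_count, seconds) / peak_gflops


# Hypothetical run: 2e9 double-precision operations in 0.5 s,
# on a device whose double-precision peak is quoted as 10 GFLOPS.
print(achieved_gflops(2e9, 0.5))         # 4.0
print(fraction_of_peak(2e9, 0.5, 10.0))  # 0.4
```

Under this convention the operation count is not scaled by the data width; only the peak figure used for comparison changes with precision.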
1 vote, 1 answer

Compiler skipping over loop

I am compiling flops via a loop with simple operations like this: for (i = beginvar; i < endvar; i++) { for (j = beginvar; j < endvar; j++) { num1 = ((num1 + num2) / num1); } } I never do anything with num1, however, and so the…
Jim Newtron
0 votes, 0 answers

find number of MAC operations in python

I am trying to find the number of MAC operations needed for an inference task (speech-to-text conversion). I have used thop before, but it only works for PyTorch modules. How do I find the number of operations done in the process_file…
afsara_ben
0 votes, 0 answers

What is the maximum theoretical peak of GFLOPS in single and double precision for a Xeon Silver 4210 with 40 CPU cores?

I have an Intel Xeon Silver 4210 @ 2.20 GHz with 40 cores spread across 2 NUMA nodes. I need to know the maximum theoretical GFLOPS for this architecture for single- and double-precision arithmetic. The values I have found around the web…
lilith
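As a sketch of how such a theoretical peak is usually estimated, the standard back-of-the-envelope formula multiplies the physical core count, the clock rate, and the floating-point operations each core can retire per cycle. The function name and the example numbers below are hypothetical and are not the specifications of any particular processor. The per-cycle figure depends on vector width, FMA support, and the number of FMA units, and must be looked up for the actual chip.

```python
def peak_gflops(physical_cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS = cores * GHz * flops per core per cycle."""
    return physical_cores * clock_ghz * flops_per_cycle


# Hypothetical machine: 8 physical cores at 2.0 GHz.
# Assumed per-core throughput: 16 double-precision or 32 single-precision flops/cycle.
print(peak_gflops(8, 2.0, 16))  # 256.0 (double precision)
print(peak_gflops(8, 2.0, 32))  # 512.0 (single precision)
```

Hardware threads are not counted as extra cores here, since they share the same floating-point units, and sustained clock rates under wide vector loads can be lower than the nominal frequency.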
0 votes, 0 answers

Counting FLOPS in tensorflow

Is there a way to count FLOPS for the training and prediction of TensorFlow models? The models are running on a CPU using TensorFlow 2.8.0 and I would prefer not to use an external (e.g. command line) tool.
Los