I decided to benchmark the realization of a swap function for simple types (like int
, or struct
, or a class
that uses only simple types in its fields) with a static
tmp variable in it to prevent memory allocation in each swap call. So I wrote this simple test program:
#include <iostream>
#include <chrono>
#include <utility>
#include <vector>
template<typename T>
void mySwap(T& a, T& b) //Like std::swap - just for tests
{
T tmp = std::move(a);
a = std::move(b);
b = std::move(tmp);
}
template<typename T>
void mySwapStatic(T& a, T& b) //Here with static tmp
{
static T tmp;
tmp = std::move(a);
a = std::move(b);
b = std::move(tmp);
}
class Test1 { //Simple class with some simple types
int foo;
float bar;
char bazz;
};
class Test2 { //Class with std::vector in it
int foo;
float bar;
char bazz;
std::vector<int> bizz;
public:
Test2()
{
bizz = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
}
};
#define Test Test1 //choosing class
const static unsigned int NUM_TESTS = 100000000;
static Test a, b; //making it static to prevent throwing out from code by compiler optimizations
template<typename T, typename F>
auto test(unsigned int numTests, T& a, T& b, const F swapFunction ) //test function
{
std::chrono::system_clock::time_point t1, t2;
t1 = std::chrono::system_clock::now();
for(unsigned int i = 0; i < NUM_TESTS; ++i) {
swapFunction(a, b);
}
t2 = std::chrono::system_clock::now();
return std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count();
}
int main()
{
std::chrono::system_clock::time_point t1, t2;
std::cout << "Test 1. MySwap Result:\t\t" << test(NUM_TESTS, a, b, mySwap<Test>) << " nanoseconds\n"; //caling test function
t1 = std::chrono::system_clock::now();
for(unsigned int i = 0; i < NUM_TESTS; ++i) {
mySwap<Test>(a, b);
}
t2 = std::chrono::system_clock::now();
std::cout << "Test 2. MySwap2 Result:\t\t" << std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count() << " nanoseconds\n"; //This result slightly better then 1. why?!
std::cout << "Test 3. MySwapStatic Result:\t" << test(NUM_TESTS, a, b, mySwapStatic<Test>) << " nanoseconds\n"; //test function with mySwapStatic
t1 = std::chrono::system_clock::now();
for(unsigned int i = 0; i < NUM_TESTS; ++i) {
mySwapStatic<Test>(a, b);
}
t2 = std::chrono::system_clock::now();
std::cout << "Test 4. MySwapStatic2 Result:\t" << std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count() << " nanoseconds\n"; //And again - it's better then 3...
std::cout << "Test 5. std::swap Result:\t" << test(NUM_TESTS, a, b, std::swap<Test>) << " nanoseconds\n"; //calling test function with std::swap for comparsion. Mostly similar to 1...
return 0;
}
Some results with Test
defined as Test1
(g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2 called as g++ main.cpp -O3 -std=c++11):
Test 1. MySwap Result: 625,105,480 nanoseconds
Test 2. MySwap2 Result: 528,701,547 nanoseconds
Test 3. MySwapStatic Result: 338,484,180 nanoseconds
Test 4. MySwapStatic2 Result: 228,228,156 nanoseconds
Test 5. std::swap Result: 564,863,184 nanoseconds
My main question: is it good to use this implementation for swapping of simple types? I know that if you use it for swapping types with vectors, for example, then std::swap
is better, and you can see it just by changing the Test
define to Test2
.
Second question: why are the results in test 1, 2, 3, and 4 so different? What am I doing wrong with the test function implementation?