I've successfully implemented a function that copies an arbitrary number of values, starting from an arbitrary point in a ring buffer, to a contiguous array, but I would like to make it more efficient. Here's a minimal example of my code:
#include <string.h>
#include <cstdlib> //For malloc and free
#include <iostream>
#include <chrono>
#include <thread>
using namespace std;
/*Foo: a function*/
void Foo(int * print_array, int print_amount){
/*Simulate overhead*/
this_thread::sleep_for(chrono::microseconds(1000));
int sum = 0;
for (int i = 0; i < print_amount; i++){
sum += print_array[i]; //Linear operation
// cout << print_array[i] << " "; //Uncomment to check for correct functionality
}
}
/*Example function*/
int main(){
/*Initialize ring buffer*/
int ring_buffer_elements = 32; //A largeish size
int ring_buffer_size = ring_buffer_elements * sizeof(int);
int * ring_buffer = (int *) malloc(ring_buffer_size);
for (int i = 0; i < ring_buffer_elements; i++)
ring_buffer[i] = i; //Fill buffer with ordered numbers
/*Initialize array*/
int array_elements = 16; //A smaller largeish size
int array_size = array_elements * sizeof(int);
int * array = (int *) malloc(array_size);
/*Set reference pointers*/
int * start_pointer = ring_buffer;
int * end_pointer = ring_buffer + ring_buffer_elements;
/*Set moving copy pointer*/
int * copy_pointer = start_pointer;
/*Set "random" amount to be copied at each iteration*/
int copy_amount = 11;
/*Set loop amount to check functionality or run time*/
int loop_amount = 1000; //Set lower if checking functionality
/***WORKING METHOD***/
/*Start timer*/
auto start_time = chrono::high_resolution_clock::now();
/*"Continuous" loop*/
for (int i = 0; i < loop_amount; i++){
/*Copy loop*/
for (int j = 0; j < copy_amount; j++){
array[j] = *copy_pointer; //Copy value from ring buffer
copy_pointer++; //Move pointer
if (copy_pointer >= end_pointer)
copy_pointer = start_pointer; //Reset pointer if reached end of ring buffer
}
Foo(array, copy_amount); //Call a function
}
/*Check run time*/
chrono::duration<double> run_time_ticks = chrono::high_resolution_clock::now() - start_time;
double run_time = run_time_ticks.count();
/*Print result*/
cout << endl << run_time << endl;
/***NAIVE METHOD***/
/*Reset moving pointer*/
copy_pointer = start_pointer;
/*Start timer*/
start_time = chrono::high_resolution_clock::now();
/*"Continuous" loop*/
for (int i = 0; i < loop_amount; i++){
/*Compute how many elements must be copied after reaching end of ring buffer*/
int copy_remainder = copy_pointer + copy_amount - end_pointer; //Ugly pointer arithmetic?
/*Check if we need to loop back or not*/
if (copy_remainder <= 0){
Foo(copy_pointer, copy_amount); //Call function
copy_pointer += copy_amount; //Move pointer
} else {
Foo(copy_pointer, copy_amount-copy_remainder); //Call function with part of values from copy pointer
Foo(start_pointer, copy_remainder); //Call function with remainder of values from start of ring buffer
copy_pointer = start_pointer + copy_remainder; //Move pointer
}
}
/*Check run time*/
run_time_ticks = chrono::high_resolution_clock::now() - start_time;
run_time = run_time_ticks.count();
/*Print result*/
cout << endl << run_time << endl;
/***memcpy METHOD***/
/*Reset moving pointer*/
copy_pointer = start_pointer;
/*Initialize size reference*/
int int_size = (int) sizeof(int);
/*Start timer*/
start_time = chrono::high_resolution_clock::now();
/*"Continuous" loop*/
for (int i = 0; i < loop_amount; i++){
/*Compute how many elements must be copied after reaching end of ring buffer*/
int copy_remainder = copy_pointer + copy_amount - end_pointer; //Ugly pointer arithmetic?
/*Check if we need to loop back or not*/
if (copy_remainder <= 0){
memcpy(array, copy_pointer, copy_amount*int_size); //Use memcpy
copy_pointer += copy_amount; //Move pointer
} else {
memcpy(array, copy_pointer, (copy_amount-copy_remainder)*int_size); //Use memcpy with part of values from copy pointer
memcpy(array+(copy_amount-copy_remainder), start_pointer, copy_remainder*int_size); //Use memcpy with remainder of values from start of ring buffer
copy_pointer = start_pointer + copy_remainder; //Move pointer
}
/*Call a function*/
Foo(array, copy_amount);
}
/*Check run time*/
run_time_ticks = chrono::high_resolution_clock::now() - start_time;
run_time = run_time_ticks.count();
/*Print result*/
cout << endl << run_time << endl;
/*Release buffers*/
free(array);
free(ring_buffer);
}
The ring buffer holds a continuously updating stream of audio data, so the latency introduced must be kept to a minimum, which is why I'm trying to improve it.
I was thinking that copying the values in WORKING METHOD is redundant and that it should be possible to pass the original ring buffer data straight through. My naive approach was to write directly from the original data and, whenever the data wrapped around, write a second time (see NAIVE METHOD).
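As a minimal sketch of the pass-through idea, a read from the ring buffer can be described as at most two contiguous runs, which is exactly why the naive approach ends up calling Foo twice when the data wraps. The names Segment, RingRead and split_read are hypothetical and not part of the code above, and the sketch uses indices instead of the moving pointer:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: one contiguous run inside the ring buffer.
struct Segment {
    const int* data;
    std::size_t count;
};

// Hypothetical helper type: the (at most two) contiguous runs of one read.
struct RingRead {
    Segment first;   // from the read position towards the end of the buffer
    Segment second;  // wrapped part from the start of the buffer; count may be 0
};

// Splits a read of `count` elements starting at index `pos` into contiguous runs.
// Assumes pos < ring_elements and count <= ring_elements (at most one wrap).
RingRead split_read(const int* ring, std::size_t ring_elements,
                    std::size_t pos, std::size_t count) {
    assert(pos < ring_elements && count <= ring_elements);
    const std::size_t until_end = ring_elements - pos;
    const std::size_t first_count = count < until_end ? count : until_end;
    return RingRead{{ring + pos, first_count}, {ring, count - first_count}};
}
```

A caller would pass first to the consumer and then, only if second.count is non-zero, pass second as well. No data is copied, but the consumer is invoked twice on a wrap.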
Indeed, in this minimal example this improvement is orders of magnitude quicker. However, in my real application Foo is replaced with a function that writes to a hardware buffer and has quite a lot of overhead. The end result is slower than the WORKING METHOD code, meaning that I should never call it (or Foo in this case) more than once per audio write. (EDIT: a simulated overhead was added to Foo to accurately depict this issue.)
My question is therefore: is there a quicker way to copy data from a ring buffer to a single, contiguous array?
(Also, the ring buffer will never need to wrap around more than once per write: copy_amount is always less than ring_buffer_elements.)
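Under that constraint, the copy into the contiguous array never needs more than two block copies. The following is a minimal sketch of that, assuming an index-based read position. The name copy_from_ring is hypothetical, and this is the same technique as the memcpy METHOD packaged as a function, not a claim that it is any faster:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: copies `count` elements starting at index `pos` of the
// ring into the contiguous array `out` and returns the next read index.
// Relies on count < ring_elements, so the read wraps at most once.
std::size_t copy_from_ring(const int* ring, std::size_t ring_elements,
                           std::size_t pos, int* out, std::size_t count) {
    assert(pos < ring_elements && count < ring_elements);
    const std::size_t first = std::min(count, ring_elements - pos);
    std::copy_n(ring + pos, first, out);            // run towards the end of the ring
    std::copy_n(ring, count - first, out + first);  // wrapped run, possibly empty
    return (pos + count) % ring_elements;
}
```

The returned index plays the role of copy_pointer in the code above: it is fed back in as pos on the next iteration, after which Foo is called once on the filled array.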
Thanks!
EDIT: Replaced original code snippet with a minimal example as per Passer By's suggestion.
EDIT 2: Added a simulated overhead and a memcpy method as per duong_dajgja's suggestion. In the example, the memcpy method and the working method have essentially the same performance (with the latter having a slight edge). In my application, memcpy is about 3-4% quicker than the working method when using the smallest buffer possible. So quicker, but regrettably far from significant.