I'm encountering a hard-to-track-down bug on macOS in an OpenCL-based application. In a release build my code crashes with a SIGABRT at some point; in a debug build I get an EXC_BAD_INSTRUCTION on a thread that apparently manages libdispatch / GCD work (com.apple.libdispatch-manager). Note that I do not call any GCD functions myself, so I assume this is done by the Apple OpenCL runtime in the background.
The context is a benchmarking application that measures the latency between enqueuing CL commands and receiving the CL_COMPLETE callback for various ways of accessing the CL buffers. You'll find the code below. The error only occurs for one of the three available CL devices in my MacBook Pro (AMD Radeon Pro 555 Compute Engine).
Relevant part of the code:
nlohmann::json performTestUseHostPtr()
{
    nlohmann::json results;
    std::vector<cl::Event> inputBufferEvent (1);
    std::vector<cl::Event> outputBufferEvent (1);
    std::vector<cl::Event> kernelEvent (1);
    for (auto size : testSizes)
    {
        std::vector<float> inputBufferHost (size);
        std::vector<float> outputBufferHost (size);
        cl::Buffer inputBuffer (context, CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY, size * sizeof (float), inputBufferHost.data());
        cl::Buffer outputBuffer (context, CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY, size * sizeof (float), outputBufferHost.data());
        void* inputBufferMapped = queue.enqueueMapBuffer (inputBuffer, CL_TRUE, CL_MAP_WRITE_INVALIDATE_REGION, 0, size * sizeof (float));
        std::memcpy (inputBufferMapped, testData.data(), size * sizeof (float));
        kernel.setArg (0, inputBuffer);
        kernel.setArg (1, outputBuffer);
        for (int i = 0; i < numTests; ++i)
        {
            startTimes[i] = my::HighResolutionTimer::now();
            queue.enqueueUnmapMemObject (inputBuffer, inputBufferMapped, nullptr, &inputBufferEvent[0]);
            inputBufferEvent[0].setCallback (CL_COMPLETE, setTimestampCallback, &unmapCompletedTimes[i]);
            queue.enqueueNDRangeKernel (kernel, cl::NullRange, cl::NDRange (size), cl::NullRange, &inputBufferEvent, &kernelEvent[0]);
            kernelEvent[0].setCallback (CL_COMPLETE, setTimestampCallback, &kernelCompletedTimes[i]);
            void* outputBufferMapped = queue.enqueueMapBuffer (outputBuffer, CL_FALSE, CL_MAP_READ, 0, size * sizeof (float), &kernelEvent, &outputBufferEvent[0]);
            outputBufferEvent[0].setCallback (CL_COMPLETE, setTimestampCallback, &mapCompletedTimes[i]);
            inputBufferMapped = queue.enqueueMapBuffer (inputBuffer, CL_TRUE, CL_MAP_WRITE_INVALIDATE_REGION, 0, size * sizeof (float), &kernelEvent, nullptr);
            // --- Release build error seems to happen somewhere here ---
            queue.finish();
            std::memcpy (inputBufferMapped, outputBufferMapped, size * sizeof (float));
            queue.enqueueUnmapMemObject (outputBuffer, outputBufferMapped);
            queue.finish();
        }
        queue.enqueueUnmapMemObject (inputBuffer, inputBufferMapped);
        results["vecSize=" + std::to_string (size)] = calculateTimes();
        queue.finish();
    }
    return results;
}
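The callback setTimestampCallback is not part of the excerpt. As a rough illustration only, a callback of that shape could look like the sketch below, assuming it does nothing but store the current time in the slot passed as user data. The name storeTimestamp, the clock type, and the EventHandle / StatusCode aliases (stand-ins for cl_event and cl_int so the snippet compiles without the OpenCL headers) are hypothetical and not taken from the actual application.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Stand-ins so the sketch compiles without the OpenCL headers.
// With the real headers these would be cl_event and cl_int.
using EventHandle = void*;
using StatusCode  = int;

// Assumed timestamp type; the application's own timer type is not shown in the post.
using TimePoint = std::chrono::steady_clock::time_point;

// Hypothetical callback: writes the current time into the slot passed as user data.
// The runtime may invoke it from one of its own threads, so the slot must stay
// valid until the callback has actually run.
void storeTimestamp (EventHandle /*event*/, StatusCode /*status*/, void* userData)
{
    *static_cast<TimePoint*> (userData) = std::chrono::steady_clock::now();
}
```

Called directly with a pointer to an element of a std::vector<TimePoint>, it only touches that one element, which mirrors how &kernelCompletedTimes[i] is passed in the loop above.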
Notes:
- I checked the error codes of all CL calls; all of them return CL_SUCCESS. I only removed the checks from the code above for a better overview.
- I marked the line where I roughly assume the error to happen. This is based on inserting print statements in the release build and watching which points of the code were completed before the fault occurs. Inserting a print statement above the queue.finish(); statement furthermore makes the bug disappear, so this is likely to be something timing related.
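For readers wondering what the removed error checks might look like, here is a minimal sketch. The helper name throwOnClError is hypothetical, and the constant mirrors CL_SUCCESS (defined as 0 in the OpenCL headers) so the snippet compiles without an OpenCL include. The real code may check errors differently.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Mirrors CL_SUCCESS, which the OpenCL headers define as 0.
constexpr int clSuccess = 0;

// Hypothetical helper: throws if a CL call reported anything other than success.
inline void throwOnClError (int errorCode, const std::string& callName)
{
    if (errorCode != clSuccess)
        throw std::runtime_error (callName + " failed with error code " + std::to_string (errorCode));
}
```

With the C++ wrapper, the error code would come from the optional cl_int* err parameter of calls such as enqueueMapBuffer, or from the return value of the calls that return cl_int.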
Update:
When I insert a short sleep at the line where I assumed the error to happen and run a debug build, it now also triggers a SIGABRT. Additionally, I find the following output on the console:
OpenCLLatencyTests(17903,0x10012a5c0) malloc: tiny_free_list_remove_ptr: Internal invariant broken (next ptr of prev): ptr=0x1003052d0, prev_next=0x0
OpenCLLatencyTests(17903,0x10012a5c0) malloc: *** set a breakpoint in malloc_error_break to debug
Signal: SIGABRT (signal SIGABRT)
E0412 11:55:02.898913 233472000 ProtobufClient.cpp:63] No such process
Question:
- Can anyone spot an obvious error in my code?
- If not, are there any known bugs in the Apple OpenCL implementation that could cause errors like that?