Why use C++ to speed up Python
C++ code compiles into machine code. This makes it faster compared to interpreter languages (however not every code written in C++ is faster than Python code if you don't know what you are doing). in C++ you can access the data pointers directly and use SIMD instructions on them to make them multiple times faster. You can also multi-thread your loops and codes to make them run even faster (either explicit multi-threading or tools like OpenMP). You can't do these things (at least properly) in a high level language).
When to use C++ to speedup Python
Not every part of the code is worth optimizing. You should only optimize the parts that are computationally expensive and are a bottleneck of your program. These parts can be written in C or C++ and exposed to python by using bindings (by using pybind11 for example). Big machine learning libraries like PyTorch and TensorFlow do this.
Dedicated Hardware
Sometimes having a well optimized C++ CPU code is not enough. Then you can assess your problem and if it is suitable, you can use dedicated hardware. These hardware can go from low-level (like FPGA) to high-level hardware like dedicated graphics cards we usually have on our system (like CUDA programming for NVIDIA GPUs).
Regular Code Difference in Low and High Level Languages
Using a language that compiles has great advantages even if you don't use multi-threading or SIMD operations. For example, looping over a C array or std::vector
in C++ can be more than 100x faster compared to looping over Python arrays or using for
in MATLAB (recently JIT compiling is being used to speed up high-level languages but still, the difference exists). This has many reasons, some of which are basic data types that are recognized at compile time and having contiguous arrays. This is why people recommend using Numpy vectorized operations over simple Python loops (same is recommended for MATLAB).