1

Here is the Black (Black Scholes less the dividend) option pricing model for options on futures written in Cython with actual multi-threading, but I can't run it. (NOW FIXED, SEE LATER POST BELOW FOR ANSWER). I am using Python 3.5 with Microsoft Visual Studio 2015 compiler. Here is the serial version that takes 3.5s for 10M options: Cython program is slower than plain Python (10M options 3.5s vs 3.25s Black Scholes) - what am I missing?

I attempted to make this parallel by using nogil but after compiling, I cannot access the internal function CyBlackP. There are several issues with this (at least on Windows). 1) Cython when generating the OpenMP code assumes you are beyond v2.0 but Microsoft Visual Studio 2015 is stuck on the old version which requires signed iterators. The workaround I have is after first attempting to build the code, it will error out, then open the output CyBlackP.cpp file in Microsoft Visual Studio 2015, search for size_t __pyx_t_2 (line 1430), then change it to ssize_t __pyx_t_2, and change the next line from size_t __pyx_t_3 to ssize_t __pyx_t_3 to get rid of signed/unsigned errors, and compile again. 2) You can't directly go from NumPy arrays into the function as nogil only works on pure C/C++ functions, so I have several helper functions to convert the NumPy array inputs into C++ vector format, pass those to a C++ function, then convert the returned vector back to a NumPy array. I'm posting the parallel code here for others to use and I'm sure someone out there can figure out why I can't access the parallel function from Python - the non-parallel version was accessed like this from CyBlackP.CyBlackP import CyBlackP.

Code is here with steps on how to build. First file save as CyBlackP.pyx [note the exposed function to Python here is CyBlackP, which converts the NumPy input arrays into C vectors through the helper functions, then passes the C vectors to the C function CyBlackParallel, which runs with nogil and OpenMP. The results are then converted back to a NumPy array and returned from CyBlackP back to Python]:

import numpy as np
cimport cython
from cython.parallel cimport prange
from libcpp.vector cimport vector

cdef extern from "math.h" nogil:
    double exp(double)
    double log(double)
    double erf(double)
    double sqrt(double)

cdef double std_norm_cdf(double x) nogil:

    return 0.5*(1+erf(x/sqrt(2.0)))

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
cdef CyBlackParallel(vector[double] Black_PnL, vector[double] Black_S, vector[double] Black_Texpiry, vector[double] Black_strike, vector[double] Black_volatility, vector[double] Black_IR, vector[int] Black_callput):
    cdef int i
    N = Black_PnL.size()

    cdef double d1, d2

    for i in prange(N, nogil=True, num_threads=4, schedule='static'):
        d1 = ((log(Black_S[i] / Black_strike[i]) + Black_Texpiry[i] * (Black_volatility[i] * Black_volatility[i]) / 2)) / (Black_volatility[i] * sqrt(Black_Texpiry[i]))
        d2 = d1 - Black_volatility[i] * sqrt(Black_Texpiry[i])
        Black_PnL[i] = exp(-Black_IR[i] * Black_Texpiry[i]) * (Black_callput[i] * Black_S[i] * std_norm_cdf(Black_callput[i] * d1) - Black_callput[i] * Black_strike[i] * std_norm_cdf(Black_callput[i] * d2)) 

    return Black_PnL

cdef vector[double] arrayToVector(np.ndarray[np.float64_t,ndim=1] array):
    cdef long size = array.size
    cdef vector[double] vec
    cdef long i
    for i in range(size):
        vec.push_back(array[i])

    return vec

cdef vector[int] INTarrayToVector(np.ndarray[np.int64_t,ndim=1] array):
    cdef long size = array.size
    cdef vector[int] vec
    cdef long i
    for i in range(size):
        vec.push_back(array[i])

    return vec

cdef np.ndarray[np.float64_t, ndim=1] vectorToArray(vector[double] vec):
    cdef np.ndarray[np.float64_t, ndim=1] arr = np.zeros(vec.size())
    cdef long i
    for i in range(vec.size()):
        arr[i] = vec[i]

    return arr

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
cpdef CyBlackP(ndarray[np.float64_t, ndim=1] PnL, ndarray[np.float64_t, ndim=1] S0, ndarray[np.float64_t, ndim=1] Texpiry, ndarray[np.float64_t, ndim=1] strike, ndarray [np.float64_t, ndim=1] volatility, ndarray[np.float64_t, ndim=1] IR, ndarray[np.int64_t, ndim=1] callput):
    cdef vector[double] Black_PnL, Black_S, Black_Texpiry, Black_strike, Black_volatility, Black_IR
    cdef ndarray[np.float64_t, ndim=1] Results
    cdef vector[int] Black_callput 
    Black_PnL = arrayToVector(PnL)
    Black_S = arrayToVector(S0)
    Black_Texpiry = arrayToVector(Texpiry)
    Black_strike = arrayToVector(strike)
    Black_volatility = arrayToVector(volatility)
    Black_IR = arrayToVector(IR)
    Black_callput = INTarrayToVector(callput)
    Black_PnL = CyBlackParallel (Black_PnL, Black_S, Black_Texpiry, Black_strike, Black_volatility, Black_IR, Black_callput)
    Results = vectorToArray(Black_PnL)

    return Results

Next code piece save as setup.py for use by Cython:

try:
    from setuptools import setup
    from setuptools import Extension
except ImportError:
    from distutils.core import setup
    from distutils.extension import Extension

from Cython.Distutils import build_ext
import numpy as np

ext_modules = [Extension("CyBlackP",sources=["CyBlackP.pyx"],
              extra_compile_args=['/Ot', '/openmp', '/favor:INTEL64', '/EHsc', '/GA'],
              language='c++')]

setup(
    name= 'Generic model class',
    cmdclass = {'build_ext': build_ext},
    include_dirs = [np.get_include()],
    ext_modules = ext_modules)

Then from a command prompt, type: python setup.py build_ext --inplace --compiler=msvc to build.

Any help on getting access to this function is appreciated, not sure why I can't seem to locate it after compiling. I can import CyBlackP or from CyBlackP import * but I can't get to the actual function to calculate the option values.

Here is a realistic NumPy test script to use if you want to test this Cython function:

BlackPnL = np.zeros(10000000)
Black_S=np.random.randint(200, 10000, 10000000)*0.01
Black_Texpiry=np.random.randint(1,500,10000000)*0.01
Black_strike=np.random.randint(1,100,10000000)*0.1
Black_volatility=np.random.rand(10000000)*1.2
Black_IR=np.random.rand(10000000)*0.1
Black_callput=np.sign(np.random.randn(10000000))
Black_callput=Black_callput.astype(np.int64)
Community
  • 1
  • 1
Matt
  • 2,602
  • 13
  • 36
  • 2
    *"You can't directly go from NumPy arrays into the function as `nogil` only works on pure C/C++ functions"* - you can pass in existing numpy arrays either as typed memoryviews (e.g. `double[:] array`) or using the old "`np.ndarray[...] array`" syntax without acquiring the GIL, you just can't instantiate them or access their Python methods within a `nogil` block. – ali_m May 12 '16 at 22:22
  • So even though the function converts the NumPy arrays into C++ vectors before the C++ only function with `nogil` processes them, you're saying the cpdef function won't be exposed at all to Python? The compiler won't compile a `nogil` block if it touches a Python object whereas this compiles. – Matt May 12 '16 at 22:46
  • 1
    A NumPy array is essentially a Python wrapper around a memory buffer that can be exposed directly to C/C++ functions - there's no conversion needed. Take a look at the [documentation here](http://docs.cython.org/src/userguide/memoryviews.html). I'm not sure if this is directly related to the problem you're describing, but getting rid of all of those unnecessary helper functions would greatly simplify your code and make debugging it easier. – ali_m May 12 '16 at 23:00
  • Alright upvoted your comments, attempting to try memory views on my strided arrays after some reading... – Matt May 12 '16 at 23:39
  • 1
    you cannot `import` a `cdef` function in Python. Expose it as a `def` function. `cdef` functions can only be `cimport`ed from Cython. – dashesy May 13 '16 at 05:47
  • @dashesy The function he's referring to is `cpdef`ed, so at first glance it ought to be importable from Python. – ali_m May 13 '16 at 13:02
  • Yes it is `cpdef`'d but can't import it... tried using `dir` on it but all I can see is: `CyBlackP.__spec__ Out[40]: ModuleSpec(name='CyBlackP', loader=None, origin='namespace', submodule_search_locations=_NamespacePath(['C:\\Users\\slezm0\\Documents\\Python Scripts\\CyBlackP']))` – Matt May 13 '16 at 13:16
  • @ali_m got it working then tried to implement memory views per your suggestion. Much cleaner code as you mentioned (and way faster) although the return value for BlackPnL shows: - do you know how to convert this back to a numpy array? – Matt May 13 '16 at 21:44
  • You can call `np.asarray(...)` on the output – ali_m May 13 '16 at 21:49
  • Thanks @ali_m for the suggestions! I just figured that one out as well :) – Matt May 13 '16 at 21:50

1 Answers1

0

Okay I figured out what was wrong using dependency walker http://www.dependencywalker.com/ on the CyBlackP.cp35-win_amd64.pyd file generated by Cython. It showed that 2 DLLs were not found: msvcp140_app.dll and vcomp140_app.dll which are just the x64 versions of MSVC OpenMP and CRT C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\x64\ Microsoft.VC140.OpenMP\vcomp140.dll and C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\x64\Microsoft.VC14 0.CRT\msvcp140.dll renamed with _app inserted, and copied to the \CyBlackP\ project directory. I also updated my setup.py like this which gets rid of the annoying import statement (now just from CyBlackP import CyBlackP):

try:
    from setuptools import setup
    from setuptools import Extension
except ImportError:
    from distutils.core import setup
    from distutils.extension import Extension

from Cython.Distutils import build_ext
import numpy as np
import os

module = 'CyBlackP'

ext_modules = [Extension(module, sources=[module + ".pyx"],
              extra_compile_args=['/Ot', '/favor:INTEL64', '/EHsc', '/GA', '/openmp'],
              language='c++')]

setup(
    name = module,
    cmdclass = {'build_ext': build_ext},
    include_dirs = [np.get_include(), os.path.join(np.get_include(), 'numpy')],
    ext_modules = ext_modules)
Matt
  • 2,602
  • 13
  • 36
  • `%timeit CyBlackP(BlackPnL, Black_S, Black_Texpiry, Black_strike, Black_volatility, Black_IR, Black_callput) 1 loop, best of 3: 2.88 s per loop` - verified in the task manager this is going to 4 cores as stated in the code. – Matt May 13 '16 at 19:03
  • Ok so the problem was `name= 'Generic model class'` because that is the name of the module to import from. Missing dlls should not change importing logic, they should result in exceptions (or crashes) but nothing to do with import. – dashesy May 13 '16 at 20:58
  • Not so sure as that name worked fine using the serial version of the code... and you didn't have to call it either. – Matt May 13 '16 at 21:07
  • Actually after converting my code to memoryviews per @ali_m suggestion, the module hid itself in another layer down again. To fix I deleted the `name = module` from above and it imported fine. – Matt May 13 '16 at 23:26
  • BTW the code with memoryviews only takes ~0.65s so a huge improvement over the original 2.88s. – Matt May 14 '16 at 02:45