Generalized Universal Function in numpy

Question

I'm trying make a generalized ufunc using numpy API. The inputs are one (n x m) matrix and a scalar, and outputs are two matrix ((n x p) and (p x m)). But I don't knowing how to do it. Someone could help me? In init function I'm using PyUFunc_FromFuncAndDataAndSignature function with signature:

"(n,m),()->(n,p),(p,m)"

I can read the inputs (matrix and scalar), but I wanted to use the scalar input like the dimension p in signature. Is it possible?

Here a example code that just print the inputs:

#include "Python.h"
#include "math.h"
#include "numpy/ndarraytypes.h"
#include "numpy/ufuncobject.h"

static PyMethodDef nmfMethods[] = {
        {NULL, NULL, 0, NULL}
};


static void double_nmf(char **args, npy_intp *dimensions,
                            npy_intp* steps, void* data)
{
    npy_intp i, j, 
             n = dimensions[1], //dimensions of input matrix
             m = dimensions[2]; //

    printf("scalar: %d\n",*(int*)args[1]); // input scalar

    // just print input matrix
    printf("Input matrix:\n");
    for(i=0;i<n;i++){
        for(j=0;j<m;j++){
            printf("%.1f ",*(double*)(args[0]+8*(i*m+j)));
        }
    printf("\n");
    }
    return;

}

static PyUFuncGenericFunction nmf_functions[] = { double_nmf };
static void * nmf_data[] = { (void *)NULL };
static char nmf_signatures[] = { PyArray_DOUBLE, PyArray_INT, PyArray_DOUBLE, PyArray_DOUBLE };
char *nmf_signature = "(n,m),()->(n,p),(p,m)";

PyMODINIT_FUNC initnmf(void)
{
    PyObject *m, *d, *version, *nmf;

    m = Py_InitModule("nmf", nmfMethods);
    if (m == NULL) {
        return;
    }

    import_array();
    import_umath();
    d = PyModule_GetDict(m);
    version = PyString_FromString("0.1");
    PyDict_SetItemString(d, "__version__", version);
    Py_DECREF(version);

    nmf = PyUFunc_FromFuncAndDataAndSignature(nmf_functions, nmf_data, nmf_signatures, 1,
                                    2, 2, PyUFunc_None, "nmf",
                                    "", 0, nmf_signature);
    PyDict_SetItemString(d, "nmf", nmf);
    Py_DECREF(nmf);
}

This code compiles and works. The python script is here:

#/usr/bin/python

import numpy as np
import nmf

x = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]])
y,z = nmf.nmf(x,2)
print "Shapes of outputs: ", y.shape, z.shape

And the terminal output is:

scalar: 2
Input matrix:
1.0 2.0 3.0 4.0 5.0 
6.0 7.0 8.0 9.0 10.0 
11.0 12.0 13.0 14.0 15.0 
16.0 17.0 18.0 19.0 20.0 
Shapes of outputs:  (4, 1) (1, 5)

My doubt is how use the scalar input (2 in the case) like dimension p of outputs matrices. In example p = 1, and I don't set it.

Please show a, minimal, example of original array, scalar, and final array(s). — wwii, Oct 09 '14 at 18:57
Unfortunately no, it is not possible. THe way the signature parser of gufuncs work, your output arrays will always have `p == 1`. If you want any other value, the only way to achieve it is to preallocate the output arrays, and pass them in using the `out` keyword argument of your gufunc. NumPy 1.10 will probably bring additional functionality that may enable what you are trying to do. — Jaime, Oct 09 '14 at 19:13
Are you saying that you want `p` to be determined by the *value* of the scalar input? If so then you a gufunc is not what you want -- just make a regular function. The idea of a gufunc is that it can broadcast over the arguments, but suppose I want to pass in two different scalars [3, 4] -- the output shapes can't be (2,n,3),(2,3,m) and (2,n,4),(2,4,m) simultaneously. (The 2's come from broadcasting over the input vector.) — Nathaniel J. Smith, Oct 09 '14 at 19:57

Jaime · Accepted Answer · 2014-10-10T14:04:51.057

There is no way to set a dimension in a gufunc, other than by providing an array of a certain size. The 1 is the value that all dimensions are initialized to internally, and you shouldn't rely on that not changing. My personal view on that is that an undefined dimension should raise an error.

The only thing you can do to set p, is to create an empty array with the right shape and pass it in as an output array. To achieve it, you would redefine your nmf to have signature "(m,n)->(m,p),(p,n)" and wrap it with some Python similar to:

def nmf_wrap(x, p):
    x = np.asarray(x)
    assert x.ndim >= 2
    shape = x.shape[:-2]
    m, n = x.shape[-2:]
    out1 = np.empty(shape+(m, p), dtype=x.dtype)
    out2 = np.empty(shape+(p, n), dtype=x.dtype)
    return nmf.nmf(x, out1, out2)

There has been some ongoing discussion on extending the functionality provided by gufuncs signatures on the numpy-dev mailing list recently. What you describe has some similarity with what I call there "computed dimensions." If you would like to see something implemented in numpy 1.10, it would be great if you could explain your use case in more detail on that list: we don't know of many (any?) gufunc coders out in the wild!

fabincarmo · Answer 2 · 2014-10-10T18:40:38.377

Thanks @jaime for your answer, helped me a lot. I made the changes that you suggested and it works. Here the C example code. It just copies some input matrix elements to outputs.

#include "Python.h"
#include "numpy/ndarraytypes.h"
#include "numpy/ufuncobject.h"

static PyMethodDef nmfMethods[] = {
        {NULL, NULL, 0, NULL}
};


static void double_nmf(char **args, npy_intp *dimensions,
                            npy_intp* steps, void* data)
{
    npy_intp i, j,
            n = dimensions[1],
            m = dimensions[2],
            p = dimensions[3];
    char *in = args[0], *out1 = args[1], *out2 = args[2];

    for(i=0; i<n; i++){
        for(j=0; j<p; j++){
            *(double*)(out1 + 8*(j + p*i)) = *(double*)(in + 8*(j + m*i));
        }
    }
    for(i=0; i<p; i++){
        for(j=0; j<m; j++){
            *(double*)(out2 + 8*(j + m*i)) = *(double*)(in + 8*(j + m*i));
        }
    }
    return;

}

static PyUFuncGenericFunction nmf_functions[] = { double_nmf };
static void * nmf_data[] = { (void *)NULL };
static char nmf_signatures[] = { PyArray_DOUBLE, PyArray_DOUBLE, PyArray_DOUBLE };
char *nmf_signature = "(n,m)->(n,p),(p,m)";

PyMODINIT_FUNC initnmf(void)
{
    PyObject *m, *d, *version, *nmf;

    m = Py_InitModule("nmf", nmfMethods);
    if (m == NULL) {
        return;
    }

    import_array();
    import_umath();
    d = PyModule_GetDict(m);
    version = PyString_FromString("0.1");
    PyDict_SetItemString(d, "__version__", version);
    Py_DECREF(version);

    nmf = PyUFunc_FromFuncAndDataAndSignature(nmf_functions, nmf_data, nmf_signatures, 1,
                                    1, 2, PyUFunc_None, "nmf",
                                    "", 0, nmf_signature);
    PyDict_SetItemString(d, "nmf", nmf);
    Py_DECREF(nmf);
}

Here the Python example code and the terminal output.

#/usr/bin/python

import numpy as np
import nmf

def nmf_wrap(x,p):
    x = np.asarray(x)
    assert x.ndim >=2
    shape = x.shape[-2:]
    n,m = shape[-2:]
    out1 = np.empty((n, p), dtype=x.dtype)
    out2 = np.empty((p, m), dtype=x.dtype)
    return nmf.nmf(x, out1, out2)

x = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]])
y,z = nmf_wrap(x,2)
print 'Input:\n', x
print 'Output 1:\n', y
print 'Output 2:\n', z

Input:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]]
Output 1:
[[ 1  2]
 [ 6  7]
 [11 12]
 [16 17]]
Output 2:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

Now I can continue to program.

I'm also writing the non-ufunc, using Python/C API (no numpy), as @nathaniel-j-smith suggested (that's what I got it.). I hope to have some results and compare the two approaches briefly.

You should edit your question with this, not answer it with the extra details. — Jaime, Oct 09 '14 at 23:08

Generalized Universal Function in numpy

2 Answers2