I wish to use the scikit-learn Python machine learning library inside PostgreSQL plpython3u functions. The easiest way to install scikit-learn (along with its prerequisites NumPy and SciPy) is to install Anaconda.

Anaconda comes with Python 3.5 built in. However, the EnterpriseDB installer for PostgreSQL 9.5 installs a PostgreSQL build that requires Python 3.3 and does not use the Python that ships with Anaconda.

What is the best way to enable scikit-learn inside plpython3u PostgreSQL functions?

a) Can I force PostgreSQL plpython3u to work with Python 3.5?

b) Can I force Anaconda to use Python 3.3 instead of Python 3.5?

c) Is there any other solution to enable scikit-learn in PostgreSQL?

zlatko
  • I am not sure if it can be set as the default behaviour, but you can create an [environment](http://conda.pydata.org/docs/using/envs.html) that uses Python 3.3 with `$ conda create -n envname python=3.3` (assuming a *nix OS; on Windows it might differ). – m00am Jun 05 '16 at 10:00
  • Thanks m00am, this seems to work; I was able to create the plpythonu language. However, I still get an error ("No module named numpy") when executing this function: `CREATE FUNCTION kmeans(x float[], y float[]) RETURNS int[] AS $$ from numpy import array from scipy.cluster.vq import vq, kmeans, whiten features = array(zip(x, y)) whitened = whiten(features) book = array((whitened[0], whitened[2])) codebook, distortion = kmeans(whitened, book) code, dist = vq(whitened, codebook) return list(code) $$ LANGUAGE plpythonu;` – zlatko Jun 06 '16 at 12:52

1 Answer

You need to install scikit-learn against the Python 3.3 distribution provided by EnterpriseDB's LanguagePack installer.

You can get the LanguagePack from the StackBuilder GUI installer, and the post-installation setup instructions can be found here.

Then you need to install NumPy, SciPy and scikit-learn with the pip command provided by the LanguagePack Python.
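As a rough sketch of that last step, the script below checks which of the three libraries the running interpreter can import and prints the pip command for that same interpreter. It assumes you run it with the LanguagePack Python itself and that pip is available there as a module; the helper names `pip_install_command` and `missing_modules` are made up for this illustration and are not part of any installer.

```python
import importlib
import sys

# Distribution names as pip knows them (scikit-learn is imported as "sklearn").
PACKAGES = ["numpy", "scipy", "scikit-learn"]
IMPORT_NAMES = ["numpy", "scipy", "sklearn"]


def pip_install_command(python_exe, packages):
    """Build a command that runs pip under one specific interpreter."""
    return [python_exe, "-m", "pip", "install"] + list(packages)


def missing_modules(module_names):
    """Return the module names that the running interpreter cannot import."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Show which interpreter is actually running, so a mismatch with the
    # one PostgreSQL loads is easy to spot.
    print("Interpreter:", sys.executable)
    print("Version:", sys.version.split()[0])

    absent = missing_modules(IMPORT_NAMES)
    if absent:
        print("Not importable:", ", ".join(absent))
        # Only print the command; run it yourself once it looks right.
        print("Suggested:", " ".join(pip_install_command(sys.executable, PACKAGES)))
    else:
        print("NumPy, SciPy and scikit-learn are all importable here.")
```

Running pip as `python -m pip` rather than as a bare `pip` ties the installation to the interpreter you launched, which helps avoid installing the packages into Anaconda's Python by accident.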

C.C. Hsu