Azure ML Studio throws the error below while consuming the pickle file from a custom Python model. The same pickle file works fine in the local Python environment, but not in Azure ML Studio.
Error 0085: The following error occurred during script evaluation, please view the output log for more information:

---------- Start of error message from Python interpreter ----------
Caught exception while executing function: Traceback (most recent call last):
  File "C:\server\invokepy.py", line 199, in batch
    odfs = mod.azureml_main(*idfs)
  File "C:\temp\b1cb10c870d842b9afcf8bb8037155a1.py", line 49, in azureml_main
    return DATA, model.predict_proba(DATA)
  File "C:\pyhome\lib\site-packages\sklearn\ensemble\forest.py", line 540, in predict_proba
    n_jobs, _, _ = _partition_estimators(self.n_estimators, self.n_jobs)
  File "C:\pyhome\lib\site-packages\sklearn\ensemble\base.py", line 101, in _partition_estimators
    n_jobs = min(_get_n_jobs(n_jobs), n_estimators)
  File "C:\pyhome\lib\site-packages\sklearn\utils\__init__.py", line 456, in _get_n_jobs
    if n_jobs < 0:
TypeError: unorderable types: NoneType() < int()
Process returned with non-zero exit code 1
---------- End of error message from Python interpreter ----------
Is anything missing? The pickle file works fine in the local environment.
# The script MUST contain a function named azureml_main
# which is the entry point for this module.
# imports up here can be used to
import os
import sys
import pickle
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
def azureml_main(DATA=None, dataframe2=None):
    # Execution logic goes here
    # print('Input pandas.DataFrame #1:\r\n\r\n{0}'.format(DATA))

    # If a zip file is connected to the third input port,
    # it is unzipped under ".\Script Bundle". This directory is added
    # to sys.path. Therefore, if your zip file contains a Python file
    # mymodule.py you can import it using:
    # import mymodule
    sys.path.append(r".\Script Bundle\MyLocalModel.zip")
    sys.path.insert(0, r".\Script Bundle")
    model = pickle.load(open(r".\Script Bundle\MyLocalModel.pkl", "rb"))

    # Return value must be a sequence of pandas.DataFrame
    return DATA, pd.DataFrame(model.predict_proba(DATA), columns=['p0', 'p1'])
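The TypeError suggests the unpickled estimator's n_jobs came back as None, which can happen when a pickle crosses sklearn versions: unpickling restores the instance `__dict__` directly and never re-runs `__init__`, so an attribute the consuming version expects may be missing. A minimal stdlib-only sketch of that failure mode (the `Estimator` class here is a stand-in, not sklearn's), plus a defensive patch after loading:

```python
import pickle

class Estimator(object):
    """Stand-in for a model whose attribute set differs across library versions."""
    def __init__(self):
        self.n_estimators = 550
        # n_jobs is deliberately absent here, mimicking a pickle produced
        # by a library version whose __init__ set different attributes.

est = pickle.loads(pickle.dumps(Estimator()))

# Unpickling restores __dict__ directly and never re-runs __init__, so an
# attribute the consuming code expects can simply be missing:
print(getattr(est, "n_jobs", None))  # None is what `n_jobs < 0` then trips over

# Defensive patch after loading the model:
if getattr(est, "n_jobs", None) is None:
    est.n_jobs = 1
```

In the Azure ML script, the analogous workaround would be setting `model.n_jobs = 1` right after `pickle.load`, though the robust fix is pickling and unpickling under matching sklearn versions.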
The custom Python model needs to be consumed in Azure ML Studio and deployed as a web service, producing the same outputs as the local model.
Update 1 (April 17):

The Python version (2.7.11) is the same locally and in Azure ML Studio, but the sklearn version differs: 0.18.x locally versus 0.15.x in Azure ML Studio. The import path of train_test_split changed between these versions, as the code below shows:
## from sklearn.model_selection import train_test_split  ## works only with 0.18.x
import sklearn
from sklearn.cross_validation import train_test_split  ## works only with 0.15.x
print('sklearn version {0}'.format(sklearn.__version__))
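Until the versions match, the moved import can be handled without hard-coding one location. A small sketch (the helper name is mine, not an sklearn API) that maps a version string to the module providing train_test_split:

```python
def tts_module(sklearn_version):
    """Return the module path providing train_test_split for the given
    sklearn version string: 'sklearn.model_selection' from 0.18 onwards,
    'sklearn.cross_validation' before that."""
    major, minor = (int(part) for part in sklearn_version.split(".")[:2])
    if (major, minor) >= (0, 18):
        return "sklearn.model_selection"
    return "sklearn.cross_validation"
```

With importlib, `getattr(importlib.import_module(tts_module(sklearn.__version__)), "train_test_split")` then works under either version; a plain try/except ImportError around the two import statements achieves the same effect.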
1) How do I update the sklearn package to the latest version in Azure ML Studio? The other way around is to downgrade my local sklearn; I will experiment with that.
2) As another exercise, I created a model in Azure ML Studio using the Multiclass Decision Forest (MDF) algorithm, while the local model used the RandomForestClassifier (RFC) algorithm. The two outputs are entirely different and do not match.
Below is the code in the local environment, with sklearn version 0.18.x, using the RFC algorithm:

from sklearn.ensemble import RandomForestClassifier
## Random Forest Classifier
rfc = RandomForestClassifier(n_estimators=550, max_depth=6, max_features=30, random_state=0)
rfc.fit(X_train,y_train)
print (rfc)
## Accuracy test
accuracy = rfc.score(X_test1,y_test1)
print ("Accuracy is {}".format(accuracy))
3) I reproduced the local Python code in an Azure ML Studio Execute Python Script module running the lower sklearn version (0.15.x), which produced the same outputs as the local run, except for a very few test-data rows. Now, how do I train the model from the Python script as an untrained-model input to the Train Model component? Or can I write the pickle file into a DataSet and consume it as a custom model?
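On the last question: one workable pattern for moving a pickled model through ports that only carry DataFrames is to base64-encode the pickle bytes into a single string cell. A stdlib-only sketch of the round trip (function names are mine; wrapping the string in a one-cell pandas DataFrame is omitted):

```python
import base64
import pickle

def model_to_cell(model):
    """Serialize a fitted model to a base64 text value that survives a string column."""
    return base64.b64encode(pickle.dumps(model)).decode("ascii")

def model_from_cell(cell):
    """Recover the model from the base64 text value."""
    return pickle.loads(base64.b64decode(cell.encode("ascii")))

# Round trip with a stand-in "model" (a plain dict instead of a fitted estimator):
params = {"n_estimators": 550, "max_depth": 6}
cell = model_to_cell(params)
restored = model_from_cell(cell)
```

The decoded pickle must still load cleanly under Azure ML Studio's sklearn 0.15.x, so training and pickling inside an Execute Python Script module (rather than importing a locally produced pickle) sidesteps the version mismatch entirely.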
Your valuable inputs are much appreciated.