I am using auto-sklearn in Python.
The code works fine, but when I set n_jobs=-1 it raises this error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
/usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 12 leaked semaphore objects to clean up at shutdown
I googled the error and found this issue:
https://github.com/automl/auto-sklearn/issues/996
The solution suggested there is that wrapping the code in
if __name__ == '__main__':
should fix the problem.
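For reference, this is my understanding of that idiom as a minimal sketch that uses only the standard library's multiprocessing module. The names square and main are hypothetical and mine, not taken from the linked issue, and auto-sklearn may start its workers differently internally.

```python
import multiprocessing


def square(x):
    """Work done in each worker; defined at module level so it can be pickled."""
    return x * x


def main():
    # Creating the pool starts the child processes, so this should only be
    # reached through the guarded entry point below.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(square, range(4))


if __name__ == '__main__':
    # With a non-fork start method each child re-imports this module;
    # the guard keeps the children from calling main() themselves.
    print(main())  # [0, 1, 4, 9]
```

As far as I can tell, the point is that nothing that starts processes may run at import time; only definitions belong at module level.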
I did that, but I still get the same error.
Am I using it the wrong way?
Can someone advise whether I have placed that line correctly and how I should use it?
Here is my code:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
import pyodbc
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
import datetime
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_predict
#import winsound
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import AdaBoostClassifier
import time
import autosklearn.classification
if __name__ == '__main__':
    df = pd.read_csv("c:\\my.csv")
    # Assumption: 'Code' is the name of the target column
    y = df['Code']
    X = df.drop('Code', axis=1, errors='ignore')
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    mdl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=60*5,
        per_run_time_limit=30*1,
        n_jobs=-1,
        memory_limit=1024 * 10,
        initial_configurations_via_metalearning=0,
        smac_scenario_args={'runcount_limit': 50},
    )
    mdl.fit(X_train, y_train)
    y_pred = mdl.predict(X_test)