I have developed a Python script that reads an Excel file and trains a model using scikit-learn's GridSearchCV with the n_jobs parameter:
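from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

def create_table():
    # Grid search over tuned_parameters, fitting candidates in 7 parallel jobs
    my_model = GridSearchCV(GradientBoostingRegressor(), tuned_parameters, cv=5, scoring='neg_mean_absolute_error', n_jobs=7)
    my_model.fit(x_data, y_data)
    return my_model.predict(new_x_data)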
This works perfectly when I run the script directly. But when I trigger it from a button click in a Dash app, I get this warning:
Multiprocessing backed parallel loops cannot be nested below threads, setting n_jobs=1
My Dash app is like this:
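# Imports assume Dash >= 2.0; on older releases use `import dash_html_components as html`
import dash
import pandas as pd
from dash import html
from dash.dependencies import Input, Output

def generate_html_table(dataframe, max_rows=50):
    return html.Table(
        # Header
        [html.Tr([html.Th(col) for col in dataframe.columns])] +
        # Body
        [html.Tr([html.Td(dataframe.index[i])] + [html.Td(dataframe.iloc[i][col]) for col in dataframe.columns]) for i in range(min(len(dataframe), max_rows))]
    )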
app = dash.Dash()
app.layout = html.Div([
    html.H1(children='Region Forecast',
            style={'textAlign': 'center'}),
    html.Button(id='submit-button', n_clicks=0, children='Submit',
                style={'margin': 'auto',
                       'display': 'block'}),
    html.Table(id='output-table', children=generate_html_table(pd.DataFrame()))
])
@app.callback(Output('output-table', 'children'),
              [Input('submit-button', 'n_clicks')])
def reactive_compute(n_clicks):
    print('inside reactive compute')
    my_table = create_table()  # the training function defined above
    return generate_html_table(my_table)
if __name__ == '__main__':
    app.run_server(debug=True)
I've seen this question, but it doesn't help me because I don't handle the multiprocessing myself (the scikit-learn function does): Multiprocessing backed parallel loops cannot be nested below threads
The app only has to run locally; I am not planning to deploy it on a web server.
Can I keep the parallel model fitting when it is triggered from the Dash app? If so, how should I best approach this?
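One possible direction, sketched under assumptions rather than confirmed: the warning suggests the fit is being invoked from a non-main thread (my assumption is that the development server runs callbacks in worker threads), so the training could perhaps be launched in a separate Python process, where it would run in that process's own main thread. The snippet below uses only the standard library to check that premise. `CHILD_CODE`, `run_in_fresh_process` and `callback_like_worker` are hypothetical names, and the child script is a stand-in that does not call scikit-learn.

```python
import subprocess
import sys
import threading

# Stand-in for the training script (hypothetical): it only reports whether its
# code runs in the main thread of its own process. In a real setup this would
# be the script that builds and fits GridSearchCV(..., n_jobs=7).
CHILD_CODE = (
    "import threading;"
    "print(threading.current_thread() is threading.main_thread())"
)


def run_in_fresh_process():
    """Run CHILD_CODE in a new Python interpreter and return its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


def callback_like_worker(results):
    # Mimics a callback that the web server executes in a non-main thread.
    results["caller_is_main_thread"] = (
        threading.current_thread() is threading.main_thread()
    )
    # The child interpreter has its own main thread, independent of the caller.
    results["child_is_main_thread"] = run_in_fresh_process()


results = {}
worker = threading.Thread(target=callback_like_worker, args=(results,))
worker.start()
worker.join()
print(results)
```

If this premise holds for the real app, the callback would start the training script the same way and read the predictions back, for example from a file the script writes. Whether this actually avoids the n_jobs=1 fallback depends on the installed joblib version, which I have not verified.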