Questions tagged [joblib]

Joblib is a set of tools to provide lightweight pipelining in Python.

https://joblib.readthedocs.io/en/latest/

715 questions
0
votes
1 answer

Is it possible to get the list of features/variables used in a model after saving the model file using joblib.dump?

I built a scikit-learn KMeans model and dumped it using joblib.dump. Now I want to test it with a new set of data, but I can't recall which features were used to build it. Could anyone help with which model attribute/function would…
bioinformatician
  • 364
  • 1
  • 12
  • 27
0
votes
2 answers

Error when using sklearn model loaded by joblib. TypeError: Cannot cast array data from dtype('O') to dtype('int64') according to the rule 'safe'

I created a VotingClassifier() object using sklearn. Later, I saved it to a voting_predictor.pkl file using joblib. It loads successfully, but when I try to predict some data with voting_predictor.predict(X_test) I get the following…
0
votes
1 answer

How to calculate the evaluation metrics using a persisted model in scikit-learn

I am running model persistence using the built-in joblib support. I am able to save the model and now I would like to test the probability and evaluate the outcome of a new project. from sklearn.tree import DecisionTreeClassifier from sklearn import…
Mohamed Abdillah
  • 379
  • 1
  • 3
  • 16
0
votes
1 answer

Widely used multi-threading or concurrent processing for PyQt5?

My question is about the usage of threads in a PyQt5 application. I am fairly new to the GUI world; I am an embedded guy. I'm having a hard time bundling my Python 3 application on Windows that uses Joblib to achieve parallelism. I am doing read and…
0
votes
1 answer

Printing a Parallel Function's Outputs in True Order w/Python

Looking to print everything in order, for a Python parallelized script. Note the c3 is printed prior to the b2 -- out of order. Any way to make the below function with a wait feature? If you rerun, sometimes the print order is correct for shorter…
Bob Hopez
  • 773
  • 4
  • 10
  • 28
0
votes
1 answer

XGBoost too large for pickle/joblib

I'm having difficulty loading an XGBoost regression with both pickle and joblib. One difficulty could be the fact that I am writing the pickle/joblib on a Windows desktop, but I am trying to load on a MacBook Pro. I attempted to use this solution…
0
votes
1 answer

Using joblib to parallelize a for loop, get error: 'A task has failed to un-serialize'

I am using joblib to parallelize a for loop over my own function. from joblib import Parallel, delayed from my_function import my_case_study result = Parallel(n_jobs=4)(delayed(my_case_study)(i) for i in range(100)) So my_case_study is the only…
Vickyyy
  • 197
  • 1
  • 9
0
votes
0 answers

How to achieve GPU parallelism using TensorFlow?

I am writing a GPU-based string matching program using TensorFlow's edit distance features. By knowing the matching portion, I will extract the details and then store them in a datatable, which will eventually be saved as a CSV file. Here are the…
Sooraj
  • 514
  • 4
  • 20
0
votes
1 answer

Storing TfIdf model and then loading it to test the new dataset

I'm trying to store the TfIdf vectorizer/model (I don't know whether that is the right word) obtained after training on the dataset, and then load the stored model to fit the new dataset. The model is stored and loaded using pickle. I have stored the…
0
votes
0 answers

Can't pickle dataclass with lambda default

If I have a simple dataclass with a mutable default, as far as I know I need to use default_factory. However, this makes the class unpickleable because of the lambda: @dataclass class ExperimentConfig: features: List[str] = field( …
gozzilli
  • 8,089
  • 11
  • 56
  • 87
0
votes
0 answers

Recursive issue on non-recursive script - Parallel Processing

I'm having trouble with parallel processing in a news scraping script. I have the following script that reads a Google News RSS page and processes each of the links returned. news_list is a BeautifulSoup element which contains information on the 10…
ebravo
  • 55
  • 1
  • 8
0
votes
1 answer

In a PyQt5 application, is it possible to run sklearn with parallel jobs without freezing?

Is it possible to run, in a Qt application, without freezing the GUI, let's say an sklearn grid search that uses several parallel jobs (n_jobs > 1)? The problem is that joblib, which is used for parallelizing sklearn code, cannot run multiprocess into a…
beesleep
  • 1,322
  • 8
  • 18
0
votes
1 answer

Reverse engineer scikit-learn serialized model

I am trying to understand the security implications of serializing a fitted scikit-learn/keras model (using pickle/joblib etc.). Specifically, if I work on data that I don't want to be revealed, would there be any way for someone to reverse engineer…
sbnukala
  • 33
  • 3
0
votes
1 answer

How to composite tasks in dask-distributed

I am trying to run a joblib parallel loop inside a threaded dask-distributed cluster (see below for the reason), but I can't get any speedup due to the GIL. Here's an example: def task(x): """ Sample single-process task that takes between 2 and…
A32167
  • 26
  • 2
0
votes
1 answer

How to save 2 sklearn models in one file

This page shows methods to save a model using either pickle: >>> import pickle >>> s = pickle.dumps(clf) >>> clf2 = pickle.loads(s) or joblib: >>> from sklearn.externals import joblib >>> joblib.dump(clf, 'filename.joblib') >>> clf =…
rnso
  • 23,686
  • 25
  • 112
  • 234