Python module providing a bridge between Scikit-Learn’s Machine Learning methods and pandas-style DataFrames
Questions tagged [sklearn-pandas]
1336 questions
0
votes
1 answer
Inverse Transform Predicted Results
I have a training data CSV with three columns (two for data and a third for targets) and I successfully predicted the target column for my test CSV. The problem is I need to inverse transform the results back to strings for further analysis. Below…

jchristensen912
- 3
- 1
- 3
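For the inverse-transform question above, one common approach is to keep the fitted encoder and call its `inverse_transform` on the predictions. The following is a minimal sketch, assuming the string targets were encoded with scikit-learn's `LabelEncoder`. The toy data and the choice of `DecisionTreeClassifier` are illustrative assumptions, not taken from the question.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: two feature columns and a string target column.
X_train = np.array([[0, 0], [0, 1], [5, 5], [5, 6]])
y_train = np.array(["cat", "cat", "dog", "dog"])

# Encode the string targets as integers and keep the fitted encoder.
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y_train)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_encoded)

# Predict integer codes, then map them back to the original strings.
X_test = np.array([[0, 0], [5, 5]])
predicted_labels = encoder.inverse_transform(model.predict(X_test))
```

The same encoder instance that was fitted on the training targets has to be reused; a freshly fitted encoder may assign different integer codes.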
0
votes
1 answer
Visualize Sparse Input from SKlearn Kmeans with MatplotLib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
cc_tfid = TfidfVectorizer().fit_transform(cc_corpus)
cc_km = KMeans(n_clusters = 3, init = 'k-means++', max_iter = 99, n_init = 4, verbose = False…

SarahJessica
- 473
- 1
- 7
- 18
0
votes
0 answers
Same number of outliers in LOF
I am running the LOF algorithm on around 100k 2D points. Each time I run it with a different n_neighbors parameter, I get the same number of outliers: always 10% of the points. Is this how this algorithm is…

ayush gupta
- 607
- 1
- 6
- 14
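For the LOF question above, one plausible explanation is the `contamination` parameter of `LocalOutlierFactor`, which fixes the fraction of points flagged as outliers regardless of `n_neighbors`. The sketch below sets `contamination=0.1` explicitly to reproduce the effect. Whether the asker's scikit-learn version used 0.1 as the default is an assumption, and the random data is purely illustrative.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Hypothetical data: 200 random 2D points.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))

# Count the points labelled -1 (outlier) for several neighbourhood sizes.
outlier_counts = {}
for k in (5, 20, 50):
    lof = LocalOutlierFactor(n_neighbors=k, contamination=0.1)
    labels = lof.fit_predict(X)
    outlier_counts[k] = int((labels == -1).sum())
```

With a fixed `contamination`, changing `n_neighbors` changes which points are flagged, not how many. Comparing `lof.negative_outlier_factor_` across runs shows the underlying scores.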
0
votes
0 answers
from sklearn.decomposition import PCA gives an error
I tried using "from sklearn.decomposition import PCA" in my program on Windows with Python 2.7, but it failed with this error:
Traceback (most recent call last):
File "", line 1, in
from sklearn.decomposition…

Nomad
- 1
- 3
0
votes
1 answer
How to match and merge the pandas Dataframe with the list?
I have a simple pandas DataFrame and a list, as follows:
import pandas as pd
frame = pd.DataFrame({'a' : ['the cat is blue', 'the sky is green', 'the dog is black']})
mylist =['cat blue', 'sky green', 'dog black']
how to find the match…

Syed Jameer
- 1
- 3
0
votes
0 answers
T-SNE memory error
I am running t-SNE on a dataset that has 314k records. I took one text column from the dataset and converted it into a bag of words. When I run it, I get a memory error. Could anyone help me solve it?
from sklearn.manifold…

merklexy
- 69
- 2
- 7
0
votes
1 answer
sklearn.model_selection fails to load DLL
I'm trying to work through a tensorflow example which utilises sklearn and keep getting a DLL load error. I've cut down the code to the bare minimum in order to debug:
import sklearn
print(sklearn.__version__)
from…

cmacdona101
- 175
- 1
- 2
- 11
0
votes
1 answer
Getting dimension mismatch error when I try to predict with naive bayes / Python
I've created a sentiment script that uses Naive Bayes to classify reviews. I trained and tested my model and saved it in a pickle object. Now I would like to run predictions on a new dataset, but I always get the following error message:
raise…

Nika
- 145
- 1
- 13
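For the dimension-mismatch question above, a frequent cause is fitting a new vectorizer on the new data, which produces a different number of feature columns than the model was trained on. The error text in the excerpt is truncated, so this is only a sketch under that assumption. It uses `CountVectorizer` and `MultinomialNB` with made-up review texts, none of which come from the question.

```python
import pickle
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training reviews and labels.
train_texts = ["good great film", "bad awful film", "great fun", "awful boring"]
train_labels = ["pos", "neg", "pos", "neg"]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
model = MultinomialNB().fit(X_train, train_labels)

# Pickle the fitted vectorizer together with the model, then reload both.
blob = pickle.dumps((vectorizer, model))
loaded_vectorizer, loaded_model = pickle.loads(blob)

# Use transform (not fit_transform) so new data maps onto the training vocabulary.
new_texts = ["great film with unseen words", "boring"]
X_new = loaded_vectorizer.transform(new_texts)
predictions = loaded_model.predict(X_new)
```

Words that were not seen during training are simply ignored by `transform`, so the column count stays equal to the training matrix.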
0
votes
2 answers
How to find and add frequency column for ID?
I am a beginner at python, so bear with me!
My dataset is from excel and I was curious how to find and add a frequency column for my ID.
I first performed the groupby function for ID and date by doing:
dfcount = dfxyz.groupby(["ID", "Date"])
and…

Mark White
- 1
- 1
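For the frequency-column question above, one way to attach a per-ID count to every row is `groupby(...).transform("count")`. This is a minimal sketch: the frame name `dfxyz` and the `ID` and `Date` columns follow the excerpt, while the sample rows and the `Frequency` column name are assumptions.

```python
import pandas as pd

# Hypothetical rows standing in for the Excel data.
dfxyz = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3, 3],
    "Date": ["2019-01-01", "2019-01-02", "2019-01-01",
             "2019-01-01", "2019-01-01", "2019-01-03"],
})

# transform returns one value per original row, so it can be assigned directly.
dfxyz["Frequency"] = dfxyz.groupby("ID")["ID"].transform("count")
```

Grouping by `["ID", "Date"]` instead would count occurrences of each ID and date pair rather than each ID.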
0
votes
1 answer
Confused about sklearn’s implementation of OSVM
I have recently started experimenting with OneClassSVM (using sklearn) for unsupervised learning, and I followed
this example.
I apologize for the silly questions, but I’m a bit confused about two things:
Should I train my svm on both…

mousa alsulaimi
- 316
- 1
- 14
0
votes
2 answers
Text field concatenation in sklearn pipeline
I have a multi-line JSON dataset that contains multiple fields that may or may not exist and can contain textual data as a string, a list of strings, or a more complicated mapping (list of dicts),
e.g.:
{"yvalue":1.0,"field1":"Some text",…

Tom Lous
- 2,819
- 2
- 25
- 46
0
votes
1 answer
UnicodeDecodeError in Python Classification Arabic Datasets
I have Arabic datasets for classification using Python; two directories (negative and positive) in a Twitter directory.
I want to use Python classes to classify the data. When I run the attached code, this error occurs:
File…

Khalid
- 37
- 1
- 8
0
votes
1 answer
create training validation split using sklearn
I have a training set consisting of X and Y. X has shape (4000, 32, 1) and Y has shape (4000, 1).
I would like to create a training/validation set based on a split. Here is what I have been trying to do:
from sklearn.model_selection import…

user785099
- 5,323
- 10
- 44
- 62
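For the training/validation question above, `train_test_split` splits along the first axis, so it can be applied to arrays with the shapes given in the excerpt. This is a minimal sketch: the zero-filled arrays, the 80/20 ratio, and the `random_state` value are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the shapes given in the question.
X = np.zeros((4000, 32, 1))
Y = np.zeros((4000, 1))

# Hold out 20% of the samples for validation (ratio is an assumption).
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y, test_size=0.2, random_state=42
)
```

Passing X and Y in the same call keeps the rows of both arrays aligned after shuffling.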
0
votes
1 answer
Write custom transformer in sklearn which returns .predict of estimator in .transform
We have a custom transformer
class EstimatorTransformer(base.BaseEstimator, base.TransformerMixin):
    def __init__(self, estimator):
        self.estimator = estimator
    def fit(self, X, y):
        self = self.estimator.fit(X,y)
        …

Rudrani Angira
- 956
- 2
- 14
- 28
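For the custom-transformer question above, one possible shape for such a class is sketched below. It is not the asker's full code, which is truncated in the excerpt. The sketch assumes `fit` should return the transformer itself rather than rebinding `self`, stores the fitted model in a hypothetical `estimator_` attribute, and uses `LinearRegression` with toy data only for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.linear_model import LinearRegression


class EstimatorTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        # Fit a clone so the estimator passed in is left untouched,
        # and return the transformer itself rather than rebinding `self`.
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        # Expose the predictions as a single 2-D feature column.
        return np.asarray(self.estimator_.predict(X)).reshape(-1, 1)


# Toy usage: predictions of a linear model become the transformed feature.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 2.0, 4.0, 6.0])
features = EstimatorTransformer(LinearRegression()).fit_transform(X, y)
```

Returning a 2-D array from `transform` lets the class be combined with other transformers, for example inside a `FeatureUnion`.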
0
votes
1 answer
Naive Bayes classifier - empty vocabulary
I am trying to use Naive Bayes to detect humor in texts. I have this code taken from here but I have some errors and I don't know how to resolve them because I am pretty new to Machine Learning and these algorithms. My train data contains…

Mr. Wizard
- 1,093
- 1
- 12
- 19