Questions tagged [sklearn-pandas]

Python module providing a bridge between Scikit-Learn’s Machine Learning methods and pandas-style DataFrames

1336 questions
-1
votes
1 answer

Evaluating logistic regression using cross validation and ROC

I am trying to evaluate logistic regression using the AUROC curve and cross-validate my scores. When I don't cross-validate I have no issues, but I really want to use cross-validation to help decrease bias in my method. Anyway, below is the code…
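
A minimal sketch of one way to approach the question above, not the asker's code (which is truncated in the excerpt). It assumes synthetic data from `make_classification` and a default 5-fold split; the names `X`, `y`, `model`, and `auc_scores` are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data standing in for the real data set (assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegression(max_iter=1000)

# scoring="roc_auc" computes the area under the ROC curve on each held-out fold.
auc_scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
mean_auc = auc_scores.mean()
print(auc_scores, mean_auc)
```

Reporting the per-fold scores alongside the mean gives a sense of how much the estimate varies between folds.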
-1
votes
1 answer

read word from each row in a dataframe

I want to find the word "risk" in every row of a dataframe. If a row has the word "risk" in it, the dataframe should get a new column containing 1, otherwise 0. How can I achieve this?
chetan parmar
  • 73
  • 1
  • 7
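
One possible approach to the question above, as a minimal sketch. It assumes the text lives in a single hypothetical column named `text` and that a case-insensitive whole-word match is wanted; neither detail is stated in the question.

```python
import pandas as pd

# Hypothetical sample data; the column name "text" is an assumption.
df = pd.DataFrame({"text": ["High risk of failure", "All clear", "Risky move", None]})

# Word-boundary regex so "Risky" does not count; na=False treats missing text as no match.
df["has_risk"] = df["text"].str.contains(r"\brisk\b", case=False, na=False).astype(int)
print(df)
```

Dropping the `\b` anchors would instead flag any row containing the substring "risk".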
-1
votes
1 answer

Function does not finish executing in `hist` function only on second time

In a Python DataFrame, I'm trying to generate a histogram. It gets generated the first time the function is called. However, when the create_histogram function is called a second time, it gets stuck at h = df.hist(bins=3, column="amount"). When I say…
Temp O'rary
  • 5,366
  • 13
  • 49
  • 109
-1
votes
1 answer

Python 3 Cosine Nearest Neighbor Format

I am working on some data mining self-learning from a free online resource I found. Basically, I have a CSV file with a bunch of names, movie titles, and the rating each person gave. I'm trying to get the K-Nearest Neighbor from it using a cosine metric…
-1
votes
2 answers

encoder gives value error when I call function on the data frame

I am trying to one-hot encode one column of my data frame, and the remaining columns are label encoded. I am using the code below: def OneHotEncoder(repair,field): oe=preprocessing.OneHotEncoder() oe.fit(repair[field]) …
sayo
  • 207
  • 4
  • 18
-1
votes
1 answer

word count in graphlab vs sklearn

Is there any function in pandas or sklearn, like "graphlab.text_analytics.count_words" in graphlab-create, to count the words of every row and add a new word-count column to a CSV data sheet?
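
A minimal sketch of one way to get per-row word counts with pandas and the standard library. It is not a drop-in equivalent of the GraphLab function; the column name `text` and the simple whitespace tokenization are assumptions.

```python
from collections import Counter

import pandas as pd

# Hypothetical data; in practice this might come from pd.read_csv(...).
df = pd.DataFrame({"text": ["the cat sat on the mat", "hello world"]})

# Dictionary of word -> count for each row (lowercased, split on whitespace).
df["word_count"] = df["text"].apply(lambda t: dict(Counter(t.lower().split())))

# Total number of words per row.
df["n_words"] = df["text"].str.split().str.len()
print(df)
```

For a bag-of-words matrix across all rows rather than a per-row dictionary, scikit-learn's `CountVectorizer` is the more usual tool.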
-1
votes
1 answer

Working with the sklearn Boston Housing Dataset: Trying to create dataframe for coefficients

I've run the following lines of code: import pandas as pd import numpy as np import matplotlib.pyplot as plt import seaborn as sns %matplotlib inline from sklearn.datasets import load_boston boston = load_boston() print(boston.data.shape) from…
Ian M.
  • 1
  • 2
-1
votes
1 answer

Merge different columns Values - Pandas

I have nine columns: 'instlevel1', 'instlevel2', 'instlevel3', 'instlevel4', 'instlevel5', 'instlevel6', 'instlevel7', 'instlevel8', 'instlevel9'. The values in these columns are populated as follows: if the instlevel1 value is 1, all other values are 0, if…
-1
votes
1 answer

OneHotEncoding method is failing in sklearn

I have a data frame, which I will denote df for now, and I obtain an ndarray as follows: X=df.iloc[:,5:].values, which I want to use for a machine learning model. I need to one-hot encode the 12th column of X. Using sklearn, I first label-encoded it as…
Iltl
  • 153
  • 9
-1
votes
1 answer

ShuffleSplit of Sklearn issue

I have a data set named df_noyau_yes and I want to apply a ShuffleSplit to split it into train and test sets to train an autoencoder. The problem is that this function returns indices of the shuffled data. I tried to extract the data of these…
Mari
  • 69
  • 1
  • 8
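
A minimal sketch of turning the indices from `ShuffleSplit` into train and test frames. The stand-in data, the 80/20 split, and the names `train` and `test` are assumptions; only `df_noyau_yes` comes from the question.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import ShuffleSplit

# Stand-in data for df_noyau_yes (assumption: any DataFrame works the same way).
df_noyau_yes = pd.DataFrame(np.arange(40).reshape(10, 4), columns=list("abcd"))

splitter = ShuffleSplit(n_splits=1, test_size=0.2, random_state=0)

# split() yields (train_indices, test_indices) pairs of positional indices.
train_idx, test_idx = next(splitter.split(df_noyau_yes))

# The indices are positions, so .iloc is the matching accessor.
train = df_noyau_yes.iloc[train_idx]
test = df_noyau_yes.iloc[test_idx]
print(train.shape, test.shape)
```

When only a single split is needed, `train_test_split` from the same module returns the two frames directly.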
-1
votes
1 answer

Why is my y_pred model only close to zero?

I am new to Python and also learning machine learning. I got a Titanic data set and am trying to predict who survived and who did not. But my code seems to have an issue with y_pred, as none of the values is close to 1 or above 1. Find attached…
Banky
  • 1
  • 1
-1
votes
1 answer

How to binary encode two mixed features?

I have a dataset looking like this one: import pandas as pd pd.DataFrame({"A": [2, 2, 1, 0, 5, 3, 0, 4, 5], "B": [1, 0, 0, 0, 1, 1, 1, 0, 0]}) A B 0 2 1 1 2 0 2 1 0 3 0 0 4 5 1 5 3 1 6 0 1 7 4 0 (I know that A is between 0 and…
stellasia
  • 5,372
  • 4
  • 23
  • 43
-1
votes
1 answer

Python sklearn df issue - Field Cady sample code issue

I'm working through Field Cady's "The Data Science Handbook", with sample code here: https://github.com/field-cady/the_data_science_handbook/blob/master/chapter08_classifiers/example.py I get a syntax error from line 23 of this code, namely: File…
justdata
  • 153
  • 1
  • 7
-1
votes
1 answer

Python Sklearn Predicting values on an unseen data set

I have a set of football data in a database that I am trying to predict values for. import MySQLdb import pandas as pd from sklearn.feature_selection import RFE from sqlalchemy import create_engine import mysql.connector from matplotlib import…
-1
votes
2 answers

Pandas: how to update values with counts greater than x

I have a pandas column that contains a lot of strings that appear fewer than 5 times. I do not want to remove these values; however, I do want to replace them with a placeholder string called "pruned". What is the best way to do this? df=…
Ari
  • 563
  • 2
  • 17
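
A minimal sketch of one way to do the replacement described in the question above. The column name `label` and the sample values are assumptions; the threshold of 5 and the placeholder "pruned" come from the question.

```python
import pandas as pd

# Hypothetical data: "a" appears 6 times, "b" 5 times, "c" twice, "d" once.
df = pd.DataFrame({"label": ["a"] * 6 + ["b"] * 5 + ["c"] * 2 + ["d"]})

# Map every row to the number of times its value occurs in the column.
counts = df["label"].map(df["label"].value_counts())

# Keep values seen at least 5 times; replace the rest with the placeholder.
df["label"] = df["label"].where(counts >= 5, "pruned")
print(df["label"].value_counts())
```

Computing the counts once and mapping them back avoids looping over the rare values one by one.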