Highest Voted 'sklearn-pandas' Questions

0

votes

0 answers

Repetition of raw dataset after clustering

from sklearn.feature_extraction.text import TfidfVectorizer tfidf_vectorizer = TfidfVectorizer(max_df=0.08, max_features=200, min_df=0.02, stop_words='english', use_idf=True,…

asked Jun 17 '16 at 10:27

Jeet Dadhich

71
1
1
6

0

votes

1 answer

How can I organize data using Pandas?

I'm a newbie at Python. I'm trying to organize a CSV file into a readable grid. When I converted my Excel file to CSV, the output became garbled, a mess of commas and scattered values. I tried list, but it still didn't organize the data the way I…

python csv numpy pandas sklearn-pandas

asked May 25 '16 at 17:16

dabberson567

43
2
2
11

0

votes

1 answer

Number of features of the model must match the input

For some reason the features of this dataset are being interpreted as rows: "Model n_features is 16 and input n_features is 18189", where 18189 is the number of rows and 16 is the correct number of features. The suspect code is here: for var in cat_cols: …

python numpy scikit-learn sklearn-pandas

asked Apr 19 '16 at 15:30

AaronS

23
5

0

votes

1 answer

Performing PCA on a dataframe with Python with sklearn

I have a sample input file that has many rows of all variants, and columns represent the number of components. A01_01 A01_02 A01_03 A01_04 A01_05 A01_06 A01_07 A01_08 A01_09 A01_10 A01_11 A01_12 A01_13 A01_14 A01_15 A01_16 A01_17 …

python r for-loop pca sklearn-pandas

asked Mar 27 '16 at 16:33

user5927494

129
1
10

0

votes

1 answer

Imputer with different types of values

Can the Imputer in sklearn deal with different types of data? For example, string and numeric values are both represented as ?, and when applying the Imputer it works with only one strategy.

python scikit-learn sklearn-pandas

asked Dec 04 '15 at 12:44

shermanv

13
5
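A minimal sketch of one way to apply a different strategy per column type. It assumes a recent scikit-learn, where the old Imputer class has been replaced by SimpleImputer, and it assumes the ? marker denotes a missing value. The column names and values are hypothetical, not taken from the question.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer

# Hypothetical data: "?" marks a missing value in both columns.
df = pd.DataFrame({
    "age": ["25", "?", "35"],
    "city": ["Paris", "Paris", "?"],
})

# Turn the "?" marker into NaN, then make the numeric column numeric.
df = df.replace("?", np.nan)
df["age"] = pd.to_numeric(df["age"])

# One imputation strategy per column group.
imputer = ColumnTransformer([
    ("num", SimpleImputer(strategy="mean"), ["age"]),
    ("cat", SimpleImputer(strategy="most_frequent"), ["city"]),
])

# Output columns follow the order of the transformers above.
filled = pd.DataFrame(imputer.fit_transform(df), columns=["age", "city"])
print(filled)
```

The numeric column is filled with its mean and the string column with its most frequent value, so neither strategy has to handle both types.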

-1

votes

2 answers

Sklearn Random Forest: determine the name of features ascertained by parameter grid for model fit and prediction

New to ML here and trying my hand at fitting a model using Random Forest. Here is my simplified code: X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.15, random_state=42) model = RandomForestRegressor() …

machine-learning scikit-learn random-forest sklearn-pandas grid-search

asked Jul 06 '23 at 04:45

Sinha

431
1
5
12

-1

votes

1 answer

Error when trying to fit a dataset. (python)

I am trying to fit a sklearn linear regression model with many points from a pandas dataframe. This is the program: features =["floors", "waterfront", "lat", "bedrooms", "sqft_basement", "view", "bathrooms", "sqft_living15", "sqft_above", "grade",…

python pandas dataframe scikit-learn sklearn-pandas

asked May 18 '23 at 04:28

Legofan35664

1
2

-1

votes

1 answer

ValueError: Input contains NaN, infinity or a value too large for dtype('float64') when using randomizedSearch

I am trying to use RandomizedSearchCV from sklearn on an MLPRegressor model, and I have scaled the data using StandardScaler. The code for the model is presented below. When I try to run the code I get this error from the…

python machine-learning sklearn-pandas mlp randomized-algorithm

asked May 10 '23 at 11:12

user17637519

31
5

-1

votes

1 answer

Do we need to exclude OneHotEncoded columns while standardizing or normalizing using MinMaxScaler() or StandardScaler()?

This is the final cleaned DataFrame (df2) before Standardizing my code: scaler=StandardScaler() df2[list(df2.columns)]=scaler.fit_transform(df2[list(df2.columns)]) df2 This returns a DataFrame after Standardizing every column including dummies and…

python-3.x machine-learning sklearn-pandas data-preprocessing standardization

asked Apr 06 '23 at 18:42

SAJEER AR

3
2
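Whether dummy columns should be scaled is a modelling choice that depends on the estimator. If one decides to leave them untouched, a sketch along these lines scales only the numeric columns. The frame contents and the numeric_cols list are hypothetical stand-ins for the df2 in the question.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for df2: one numeric column, two one-hot columns.
df2 = pd.DataFrame({
    "income": [30.0, 50.0, 70.0],
    "city_Paris": [1, 0, 1],
    "city_Rome": [0, 1, 0],
})

numeric_cols = ["income"]  # assumed list of columns to standardize
other_cols = [c for c in df2.columns if c not in numeric_cols]

# Scale the numeric columns; pass every other column through unchanged.
scaler = ColumnTransformer(
    [("scale", StandardScaler(), numeric_cols)],
    remainder="passthrough",
)

# Transformed columns come first, followed by the passthrough columns.
scaled = pd.DataFrame(
    scaler.fit_transform(df2),
    columns=numeric_cols + other_cols,
)
print(scaled)
```

The one-hot columns keep their 0/1 values, while the numeric column ends up with zero mean and unit variance.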

-1

votes

1 answer

How to implement regularization

My task was to implement model parameter tuning using stochastic gradient descent. Below is my function implementation code. However, I would like to add some regularization. def gradient(X, y, w, batch, alpha): gradients = [] error =…

python machine-learning sklearn-pandas regularized

asked Mar 17 '23 at 18:07

villerpa

1
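The asker's function is truncated, so the following is not their code. It is a minimal pure-Python sketch of one common option, an L2 (ridge) penalty added to the mini-batch gradient of a mean squared error loss. The function names, the lam parameter and the toy data are assumptions made for illustration.

```python
import random


def l2_regularized_gradient(X, y, w, batch, lam):
    """Gradient of mini-batch MSE plus lam * sum(w_j ** 2).

    X: list of feature rows, y: list of targets, w: list of weights,
    batch: list of row indices, lam: L2 penalty strength (hypothetical name).
    """
    grad = [0.0] * len(w)
    for i in batch:
        prediction = sum(wj * xj for wj, xj in zip(w, X[i]))
        error = prediction - y[i]
        for j, xj in enumerate(X[i]):
            grad[j] += 2.0 * error * xj / len(batch)
    # Derivative of the penalty lam * w_j ** 2 is 2 * lam * w_j.
    return [g + 2.0 * lam * wj for g, wj in zip(grad, w)]


def sgd(X, y, lam=0.0, learning_rate=0.05, epochs=200, batch_size=2, seed=0):
    """Plain mini-batch SGD using the regularized gradient above."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    indices = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad = l2_regularized_gradient(X, y, w, batch, lam)
            w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w


# Toy data following y = 2 * x exactly.
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
w_plain = sgd(X, y, lam=0.0)
w_ridge = sgd(X, y, lam=1.0)
print(w_plain, w_ridge)
```

With lam set to 0 the update reduces to ordinary SGD; a positive lam pulls the weights toward zero, which is the shrinkage effect regularization is meant to provide.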

-1

votes

1 answer

Polynomial Features Error: X has 10 features, but PolynomialFeatures is expecting 9 features as input

Today I'm modeling a dataframe using PolynomialFeatures from sklearn, but I keep encountering this error: ValueError: X has 10 features, but PolynomialFeatures is expecting 9 features as input. It comes from the line where I generate the new data frame…

python pandas scikit-learn data-science sklearn-pandas

asked Jan 22 '23 at 00:18

Daniel Martinez

3
1

-1

votes

1 answer

RuntimeWarning: invalid value encountered in divide in ML By Sklearn in Python

After I run my project, this warning is shown and I don't know what I am doing wrong: :\Users\Alir\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\utils\extmath.py:1047: RuntimeWarning: invalid value encountered in divide updated_mean =…

python-3.x machine-learning sklearn-pandas

asked Jan 13 '23 at 05:20

Alir Riazi

1

-1

votes

1 answer

What is an acceptable enough difference between the accuracy of the Train_set and Test_set?

I am working on a Data Science project which is a model to predict whether the imports are Fake or not. I have a training database on which one of my models is achieving up to 92-93% accuracy but on 51% of the test database, it is achieving only…

database scikit-learn data-science sklearn-pandas

asked Jun 05 '22 at 07:51

user202004

151
8

-1

votes

1 answer

Sklearn can't convert string to float

I'm using Sklearn as a machine learning tool, but every time I run my code, it gives this error: Traceback (most recent call last): File "C:\Users\FakeUserMadeUp\Desktop\Python\Machine Learning\MachineLearning.py", line 12, in …

python csv artificial-intelligence sklearn-pandas

asked May 08 '22 at 11:55

Mr. MAD

1
4

-1

votes

1 answer

Pandas groupby -- get output value based on max value of another column

I have the following dataframe: df = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'], 'Habitat':['Jungle', 'Jungle', 'Sky', 'Sky'], …

python pandas sklearn-pandas

asked Sep 29 '21 at 14:54

DumbCoder

233
2
9
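The dataframe in the question is truncated, so the sketch below fills in two hypothetical columns (Speed and Tag) to illustrate one common pattern: use groupby with idxmax to find the row holding each group's maximum, then read another column from those rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
    "Habitat": ["Jungle", "Jungle", "Sky", "Sky"],
    "Speed": [380, 370, 24, 26],      # hypothetical column
    "Tag": ["f1", "f2", "p1", "p2"],  # hypothetical column
})

# Index label of the row with the largest Speed within each Animal group.
max_rows = df.groupby("Animal")["Speed"].idxmax()

# Look up the value of another column on exactly those rows.
best = df.loc[max_rows, ["Animal", "Tag"]]
print(best)
```

Because idxmax returns index labels, this assumes the frame's index is unique; calling reset_index first is a simple way to guarantee that.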
