Questions tagged [kaggle]

Relating to the following Kaggle data science categories: Competitions, Datasets, Kernels, Learn, or Kaggle's API.

1115 questions
5
votes
1 answer

Is there any alternative way to download kaggle competition data in Colab?

I am trying to use Google Colab for Kaggle competitions. Everything went fine until I tried to download the data, when I got a 403 - Forbidden error. I am able to download other competitions' data such as us-consumer-finance-complaints but not…
sargupta
  • 953
  • 13
  • 25
5
votes
1 answer

How to prevent Azure ML Studio from converting a feature column to DateTime while importing a dataset

I’m having some issues trying to load a dataset in Azure ML Studio, a dataset containing a column that looks like a DateTime, but is in fact a string. Azure ML Studio converts the values to DateTimes internally, and no amount of wrangling seems to…
Vlad Iliescu
  • 8,074
  • 5
  • 27
  • 23
5
votes
1 answer

How to order seaborn pointplot

Here's code from a Kaggle Titanic competition kernel: grid = sns.FacetGrid(train_df, row='Embarked', size=2.2, aspect=1.6) grid.map(sns.pointplot, 'Pclass', 'Survived', 'Sex', palette='deep') grid.add_legend() It produces the wrong plot, the one with…
sdd
  • 721
  • 9
  • 23
5
votes
1 answer

Python .loc confusion

I am doing a Kaggle tutorial for Titanic using the Datacamp platform. I understand the use of .loc within Pandas - to select values by row using column labels... My confusion comes from the fact that in the Datacamp tutorial, we want to locate all…
fashioncoder
  • 79
  • 2
  • 8
5
votes
1 answer

Import Kaggle csv from download url to pandas DataFrame

I've been trying different methods to import the SpaceX missions csv file on Kaggle directly into a pandas DataFrame, without any success. I'd need to send requests to login. This is what I have so far: import requests import pandas as pd from io…
Hadrien
  • 145
  • 3
  • 10
5
votes
3 answers

Group by a column and sort by another column in R

I am examining the IMDb movie dataset on Kaggle with R. Here is a minimal repro dataset: > movies <- data.frame(movie = as.factor(c("Movie 1", "Movie 2", "Movie 3", "Movie 4")), director = as.factor(c("Dir 1", "Dir 2", "Dir 1", "Dir 3")),…
Anand
  • 3,690
  • 4
  • 33
  • 64
5
votes
1 answer

Error in eval(expr, envir, enclos) : could not find function "eval"

I am working on the Kaggle Digit Recognizer problem. When I tried the given code, I got this error: Error in eval(expr, envir, enclos) : could not find function "eval" library(ggplot2) library(proto) library(readr) train <-…
5
votes
1 answer

R: Kaggle Titanic Dataset Random Forest NAs introduced by coercion

I'm currently practicing R on Kaggle using the Titanic data set. I am using the Random Forest algorithm. Below is the code: fit <- randomForest(as.factor(Survived) ~ Pclass + Sex + Age_Bucket + Embarked + Age_Bucket + Fare_Bucket +…
John Smith
  • 2,448
  • 7
  • 54
  • 78
5
votes
2 answers

SKLearn - Principal Component Analysis leads to horrible results in knn predictions

By adding PCA to the algorithm, I'm working to improve the 96.5% sklearn kNN prediction score for the Kaggle digit recognition tutorial, yet the new kNN predictions based on PCA output are horrible, around 23%. Below is the full code, and I would appreciate it if you point…
kannbaba
  • 135
  • 2
  • 7
5
votes
2 answers

How to Vectorize this R code Using Plyr, Apply, or Similar?

I wrote the following R code that identifies duplicate files in a directory. How can one vectorize the for-loop using the plyr package (or similar)? I would like to achieve a more idiomatic R solution than the one I came up with. library("digest")…
goplayer
  • 53
  • 4
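
The question above asks for an idiomatic R solution; its excerpt is truncated, so the asker's actual code is not reproduced here. As a language-neutral illustration of the underlying idea (group files by a content digest, then keep the groups with more than one member), here is a minimal Python sketch. The function names `file_digest` and `find_duplicates` are hypothetical, and SHA-256 is an assumption, since the excerpt does not say which digest the asker used.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(directory):
    """Group the regular files directly inside `directory` by content digest.

    Returns a list of lists of file names; each inner list holds files
    whose contents are byte-for-byte identical (same digest).
    """
    groups = defaultdict(list)
    # Sort for a deterministic order of names within each group.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            groups[file_digest(path)].append(path.name)
    # Keep only digests shared by two or more files.
    return [names for names in groups.values() if len(names) > 1]
```

Grouping by digest replaces the pairwise comparison loop with a single pass over the files, which is the same shape a split-apply-combine approach would take in R.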
5
votes
3 answers

Resolving PyDev Unresolved imports regarding numpy & sklearn

I've tried nearly everything I can find to resolve these unresolved imports. Here is what I am trying to import: from sklearn.ensemble import RandomForestClassifier from numpy import genfromtxt, savetxt In Eclipse on Mac OS X Lion running PyDev I get the…
Zaheer
  • 2,794
  • 5
  • 28
  • 33
4
votes
2 answers

Getting the message 'Cleanup called...' repeatedly while training a model on kaggle. How can we get rid of this? (CNN model using Keras)

model.compile(optimizer='adam',loss='categorical_crossentropy', metrics=['accuracy']) history = model.fit(train_data,epochs = 1,validation_data = test_data,verbose=1, callbacks =[earlystopping, csv_logger]) 9/87606 [..............................]…
4
votes
2 answers

How to install kaggle_datasets?

I'm trying to follow the Kaggle Monet CycleGAN Tutorial and in the first block of code where we are importing the libraries, one of them is kaggle_datasets. I have pip installed Kaggle, but when I try to import kaggle_datasets I get the…
Conweezy
  • 105
  • 5
  • 15
4
votes
1 answer

Overriding my company's proprietary pypi repository for a specific package (kaggle)

I want to install the kaggle package on my employer's laptop, but it does not exist in the proprietary pypi mirror they have configured. How do I bypass their pypi repo for the default one?
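
One possible approach, sketched under assumptions: pip's `--index-url` option given on the command line takes precedence over an index configured in pip's config file or environment, so a single install can be pointed at the public index. The helper name `build_pip_command` is hypothetical, and whether the employer's network or policy allows reaching pypi.org is not something the question states.

```python
import sys

# Public PyPI "simple" index, used here in place of the configured company mirror.
PUBLIC_INDEX = "https://pypi.org/simple"


def build_pip_command(package, index_url=PUBLIC_INDEX):
    """Return an argv list that installs `package` from `index_url` for this one call."""
    # Running pip via the current interpreter avoids picking up a different pip on PATH.
    return [sys.executable, "-m", "pip", "install", "--index-url", index_url, package]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(cmd, check=True) to actually perform the install.
    print(" ".join(build_pip_command("kaggle")))
```

Because the option applies only to that invocation, the configured mirror stays in effect for every other package.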
4
votes
2 answers

PyCaret methods on GPU/TPU

When I run best_model = compare_models(), there is a huge load on CPU memory, while my GPU is unutilized. How do I run setup() or compare_models() on the GPU? Is there a built-in method in PyCaret?
LITDataScience
  • 380
  • 5
  • 14