Questions tagged [kaggle]

Relating to Competitions, Datasets, Kernels, Learn, or Kaggle's API.

1115 questions
0
votes
2 answers

logits and labels must have the same first dimension

I am trying to create a recipe generator on Kaggle using TensorFlow and an LSTM, but I am totally stuck on something related to dimensions. Can someone point me in the right…
Pablo Castilla
  • 2,723
  • 2
  • 28
  • 33
0
votes
1 answer

Predict the output based on multiple input variables using XGBoost in Python

I am new to XGBoost and am trying to do the following: predict the output variable from the input variables, and find out which input variables correlate most strongly with the output variable. I am not able to get…
user3827728
  • 109
  • 1
  • 1
  • 9
0
votes
1 answer

How to test a logistic regression model in R?

I'm developing a CTR prediction model for the Kaggle competition (link). I've read in the first 100,000 lines of data from the training set, then further split this into train/test sets at 80/20 by ad_data <- read.csv("train", header = TRUE,…
0
votes
2 answers

How to display dataframes?

I am doing the Titanic problem in Kaggle and I have problems displaying the dataframe: import pandas as pd import numpy as np titanic = pd.read_csv("input/train.csv") titanic.head() This should display the train.csv but it doesn't. Do you know…
Nacho
  • 35
  • 5
0
votes
1 answer

Creating Dataframe from a json file

I want to create a proper data frame reading from a json file. I am able to view the created data frame properly, but dplyr function group_by does not work on it. It is probably because when I do the str() of the data frame created it gives every…
Arpit Goel
  • 163
  • 1
  • 16
0
votes
1 answer

Poor results with Keras inbuilt VGG16 model on cats vs. dogs dataset

I am trying to apply the built-in VGG16 Keras model to the Kaggle Cats vs Dogs dataset. However, I get 52% accuracy, which is barely better than random chance. Any idea why this would be the case?
Vlad
  • 55
  • 1
  • 14
0
votes
0 answers

Log loss score in training & validation very different from score on Kaggle test set

I'm able to get a log loss score as low as 0.24 in training and 0.38 in validation, but once I submit my predictions to Kaggle to score on the test set, the loss is way off (sometimes as high as 4, and almost never below 0.69). Any ideas what could…
Anas
  • 866
  • 1
  • 13
  • 23
0
votes
1 answer

Python sklearn kaggle/titanic tutorial fails on the last feature scale

I was in the process of working through this tutorial: http://ahmedbesbes.com/how-to-score-08134-in-titanic-kaggle-challenge.html And it went with no problems, until I got to the last section of the middle section: As you can see, the features range…
Rich
  • 1,103
  • 1
  • 15
  • 36
0
votes
3 answers

Decision Trees with SKlearn and Visualization

I'm working on the Kaggle Titanic data set and trying to understand decision trees better. I've worked with linear regressions a good bit but never with decision trees. I'm trying to create a visualization in Python for my tree. Something isn't working…
Josh Dautel
  • 143
  • 2
  • 16
0
votes
1 answer

Jupyter IPYNB Problems

I'm trying to familiarize myself with the Jupyter Notebook and am running into issues. When I run the following code in a normal .py file in my PyCharm IDE it runs perfectly; however, if I run it in my notebook the [*] never disappears, meaning it…
Josh Dautel
  • 143
  • 2
  • 16
0
votes
2 answers

Very slow performance on fitting Keras model in Windows 10

I have installed Anaconda and Python 3.5 on Windows 10. When I try a sample model, the first epoch takes so long that I never get to the second epoch! Here is my model: def larger_model(): # create model model = Sequential() …
Amir Pournasserian
  • 1,600
  • 5
  • 22
  • 46
0
votes
0 answers

Keras - Char 74k Character Recognition - CNN

I followed this blog post on character recognition using a CNN: http://ankivil.com/kaggle-first-steps-with-julia-chars74k-first-place-using-convolutional-neural-networks The only change I made was dim_ordering="th" latest keras…
siva
  • 1,429
  • 3
  • 26
  • 47
0
votes
1 answer

How to run .sqlite files?

I have downloaded a file called "database.sqlite" from "https://www.kaggle.com/benhamner/d/kaggle/college-scorecard/exploring-the-us-college-scorecard-data". I have to connect the database to R, but I don't know how to run the file or create database…
0
votes
1 answer

Kernel gets busy when using nltk.download

I am using Jupyter Notebook to practice this problem on Kaggle: https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-1-for-beginners-bag-of-words. When I use the following code import nltk nltk.download() # Download text data sets,…
nitinvijay23
  • 1,781
  • 3
  • 13
  • 11
0
votes
2 answers

Python 3 Comparison String from Kaggle Dataset CSV-data: Error 'string index out of range python'

I'm currently doing a project for university in which I need to evaluate a dataset from Kaggle. My problem is pretty simple, but I just couldn't figure it out by researching: How can I make a comparison if the salary…
Andy89
  • 21
  • 4