Questions tagged [kaggle]

Relating to Competitions, Datasets, Kernels, Learn, or Kaggle's API.

1115 questions
0
votes
1 answer

Plot proportion from Dataset

I am trying to plot the proportions of the age distribution for the Titanic data from Kaggle. age_distribution_died=…
0
votes
1 answer

OutOfRangeError while doing logistic regression

I have created a repo, https://github.com/joyjeni/tensorflowexample.git. I'm trying to do logistic regression on the Kaggle Titanic dataset, and I get the error below: OutOfRangeError: RandomShuffleQueue '_398_shuffle_batch_16/random_shuffle_queue' is closed and has…
Rehoboth
  • 43
  • 2
  • 6
0
votes
2 answers

How to efficiently extract numbers from text in a data.table column in R

I'm just learning R for data science, and used these few lines to extract numbers from data (using data.table): library(stringr) library(data.table) prods[, weights := str_extract(NombreProducto, "([0-9]+)[kgKG]+")] prods[, weights :=…
wordsforthewise
  • 13,746
  • 5
  • 87
  • 117
0
votes
1 answer

My rows are mismatched in my SVM scripting code for Kaggle

I am reviewing my e1071 code for SVM for the Kaggle Titanic data. Last I knew, this part of it was working, but now I'm getting a rather strange error. When I try to build my data.frame so I can submit to Kaggle, it seems my prediction is the size…
hlyates
  • 1,279
  • 3
  • 22
  • 44
0
votes
0 answers

Alternative syntax for "for loop" in python

While learning Python basics from Kaggle scripts, I encountered this code: NL_1 = [elem.split("\n") for elem in Name_list] where Name_list is a list of character objects: the names of people in the Titanic dataset. I know what it does, but I still feel…
cure
  • 425
  • 2
  • 12
0
votes
1 answer

R: Create a list of function calls

I'm trying to understand a little more about R and came across this really good script here on Kaggle: https://www.kaggle.com/msjgriffiths/d/kaggle/sf-salaries/explore-sf-salary-data/code I'm a beginner in R and I'm struggling to understand a…
Simon
  • 19,658
  • 27
  • 149
  • 217
0
votes
1 answer

Getting HTML elements via XPath in bash

I was trying to parse a page (Kaggle Competitions) with xpath on MacOS as described in another SO question: curl…
Anton Tarasenko
  • 8,099
  • 11
  • 66
  • 91
0
votes
1 answer

ValueError: array length does not match index length

I am practicing for contests like Kaggle. I have been trying to use XGBoost and to familiarize myself with third-party Python libraries like pandas and numpy. I have been reviewing scripts from this particular competition called the…
Pavan Vasan
  • 391
  • 1
  • 9
  • 28
0
votes
2 answers

How do you work on an AWS machine in kaggle?

I want to work on an AWS machine for a Kaggle competition. On my own PC I have Anaconda and PyCharm installed. How do I set them up on an AWS machine? Do I need to install the tools each time I log in to the AWS machine? What is the…
0
votes
2 answers

Python tfidf returning same values regardless of idf

I am trying to build a small program that calculates tf-idf in Python. There are two very nice tutorials which I have used (I have code from here and another function from Kaggle): import nltk import string import os from bs4 import * import…
Peter
  • 355
  • 1
  • 8
  • 23
0
votes
1 answer

Python 3.x - Merge pandas data frames

I am using Python for the Titanic disaster competition on Kaggle. The dataset (df) contains 3 attributes corresponding to each passenger: 'Gender' (1/0), 'Age' and 'Pclass' (1/2/3). I want to obtain the median age corresponding to each Gender-Pclass…
Rohan Bapat
  • 343
  • 2
  • 4
  • 17
0
votes
1 answer

run r script using docker kaggle image

I am trying to reproduce the results of an R script on my local Windows machine (that is, the results it gave on the Kaggle server). Someone suggested using Docker images to run the R script locally. I have installed Docker and finished the…
user3664020
  • 2,980
  • 6
  • 24
  • 45
0
votes
1 answer

Turning hexadecimal representation of code segment back to binary

The malware samples provided by Microsoft in the Kaggle challenge (https://www.kaggle.com/c/malware-classification/data) contain a hexadecimal representation of the code segment. An example: 00401000 00 00 80 40 40 28 00 1C 02 42 00 C4 00 20 04…
user1734905
  • 333
  • 3
  • 14
0
votes
1 answer

Scikit-learn TruncatedSVD documentation

I plan to use sklearn.decomposition.TruncatedSVD to perform LSA for a Kaggle competition. I know the math behind SVD and LSA, but I'm confused by scikit-learn's user guide, so I'm not sure how to actually apply TruncatedSVD. In the doc, it states…
howard
  • 255
  • 1
  • 4
  • 12
0
votes
2 answers

Python 3.+, Scipy Stats Mode function gives Type Error unorderable types: str() > float()

I am trying to solve the Kaggle Titanic disaster problem, specifically using mode/mean/median to impute missing values. Here is a peek at my data set: Parch Ticket Fare Cabin Embarked 0 0 A/5 21171 7.2500 NaN …
aks
  • 8,796
  • 11
  • 50
  • 78