
I have a data set of 62,000 font images (the characters 0-9, A-Z and a-z), with 1,000 images per character. I have created a CSV file with 62,000 rows, each holding one image's normalized pixel values and its label. I don't know how to split this CSV file into training, validation and testing data sets so that I can get better accuracy.

  • Work on your formatting. Also, perhaps you want to use https://keras.io/datasets/ or http://scikit-learn.org/stable/tutorial/basic/tutorial.html#introduction? Many machine-learning platforms offer easy loading of MNIST. – TheLaurens Mar 21 '17 at 14:29
  • Those datasets are for digits only. I want to train on a character dataset. I have the dataset, but I am confused about how to split it simply into training, validation and testing sets. My CSV file contains 1,000 images of 'A', 1,000 images of 'B', and so on. – vikas sonwani Mar 22 '17 at 05:49

1 Answer


You can use scikit-learn's train_test_split to split the features and labels together.

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

X, y = your.data, your.target  # input your own data here
# Pass both X and y so that features and labels are split in the same way
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

Also, read this sklearn tutorial
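Since the question asks for training, validation and testing sets, here is a minimal sketch that calls train_test_split twice and uses stratify so every character keeps the same share in each part. The helper name split_train_val_test, the column name "label", the file name fonts.csv and the 80/10/10 proportions are all assumptions for illustration; a small random DataFrame stands in for the real CSV.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def split_train_val_test(df, label_col="label", val_size=0.1, test_size=0.1, seed=0):
    """Split a DataFrame into stratified train/validation/test parts.

    val_size and test_size are fractions of the full data set.
    The helper name and the 'label' column name are hypothetical.
    """
    X = df.drop(columns=[label_col]).to_numpy()
    y = df[label_col].to_numpy()

    # Convert fractions to absolute row counts to avoid rounding surprises.
    n_total = len(df)
    n_test = round(test_size * n_total)
    n_val = round(val_size * n_total)

    # First carve off the test set, keeping class proportions equal.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=n_test, random_state=seed, stratify=y
    )
    # Then split the remainder into training and validation sets.
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=n_val, random_state=seed, stratify=y_rest
    )
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


# Stand-in for something like pd.read_csv("fonts.csv") (hypothetical file name):
# 3 classes x 100 rows, 16 pixel columns of random values.
rng = np.random.default_rng(0)
labels = np.repeat(["A", "B", "c"], 100)
pixels = rng.random((labels.size, 16))
df = pd.DataFrame(pixels, columns=[f"px{i}" for i in range(16)])
df["label"] = labels

train, val, test = split_train_val_test(df)
print(train[0].shape, val[0].shape, test[0].shape)
```

With the real file, the same call on the 62,000-row DataFrame should give roughly 49,600 training rows and 6,200 rows each for validation and testing, with every character represented equally in all three.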

Avinash Hindupur