I have a dataset of 62,000 font images (0-9, A-Z and a-z), with 1,000 images per character. I have created a CSV file with 62,000 rows containing the normalized pixel values and labels of the images. I don't know how to split this CSV file into training, validation and testing datasets so that I can get better accuracy.

vikas sonwani
-
Work on your formatting. Also, perhaps you want to use https://keras.io/datasets/ or http://scikit-learn.org/stable/tutorial/basic/tutorial.html#introduction? Many machine-learning platforms offer easy loading of MNIST. – TheLaurens Mar 21 '17 at 14:29
-
Those datasets are for digits only. I want to train on a character dataset. I have the dataset, but I am confused about how to split it simply into training, validation and testing sets. My CSV file contains 1000 images of 'A', 1000 images of 'B', and so on. – vikas sonwani Mar 22 '17 at 05:49
1 Answer
You can use scikit-learn's train_test_split.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
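X, y = your.data, your.target  # input your own data here (features and labels)
# Pass y too so features and labels stay aligned; stratify=y keeps class proportions the same in both subsets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)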
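Since the question also asks for a validation set, one way to get all three subsets is to call train_test_split twice: once to set aside the test data, then again on the remainder. This is only a minimal sketch. The helper name split_train_val_test, the 70/10/20 ratios and the synthetic data standing in for the CSV are assumptions for illustration, not something from the original post.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_train_val_test(X, y, val_size=0.1, test_size=0.2, seed=0):
    """Split X, y into stratified train/validation/test subsets (hypothetical helper)."""
    # First carve off the test set, keeping class proportions intact.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # val_size is a fraction of the whole dataset, so rescale it
    # relative to what remains after the test set is removed.
    rel_val = val_size / (1.0 - test_size)
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=rel_val, random_state=seed, stratify=y_rest
    )
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


# Synthetic stand-in for the CSV (assumption): 62 classes, 100 samples each, 64 pixels per sample.
rng = np.random.default_rng(0)
n_classes, per_class, n_pixels = 62, 100, 64
X = rng.random((n_classes * per_class, n_pixels))
y = np.repeat(np.arange(n_classes), per_class)

(train_X, train_y), (val_X, val_y), (test_X, test_y) = split_train_val_test(X, y)
print(train_X.shape, val_X.shape, test_X.shape)
```

With your real data you would build X and y from the CSV instead (for example with pandas.read_csv, taking the pixel columns as X and the label column as y). Stratifying both splits should keep every character represented in each subset.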
Also, read this sklearn tutorial

Avinash Hindupur