How can I convert a png to a dataframe for python?

Question

I trained a model for the Kaggle Digit Recognizer competition (https://www.kaggle.com/c/digit-recognizer/data). The input data is a CSV file. Each row in the file represents an image that is 28 pixels high and 28 pixels wide, for 784 pixels in total. The model is ready to use, but how can I create test data for it? If I have an image of a digit, how can I convert it to a 28 by 28 pixel array?

I tried the code below, but it renders the image background in yellow. The PNG has a white background, so I don't understand why it shows as yellow.

import numpy as np
import cv2 
import csv 
import matplotlib.pyplot as plt

img = cv2.imread('./test.png', 0) # load grayscale image. Shape (28,28)

flattened = img.flatten() # flatten the image, new shape (784,)
row = flattened.reshape(28,28)

plt.imshow(row)
plt.show()

Please provide a sample of the CSV. — gmds, Apr 29 '19 at 06:56

Tim · Accepted Answer · 2019-04-30T06:31:10.030

I prepared a small example that hopefully gives you an idea of how to do this:

I am using this image as an example:

Full script:

import numpy as np
import cv2 
import csv 

img = cv2.imread('./1.png', 0) # load grayscale image. Shape (28,28)

flattened = img.flatten() # flatten the image, new shape (784,)

# insert the label at index 0 (here the label is 0); new shape (785,)
flattened = np.insert(flattened, 0, 0)

# create column names: "label" followed by pixel0 ... pixel783 (785 names)
column_names = ["label"] + ["pixel" + str(x) for x in range(784)]

# write to csv (newline='' avoids blank lines between rows on Windows)
with open('custom_test.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter=';')
    writer.writerow(column_names) # header row
    writer.writerow(flattened) # image row
    # optional: add additional image rows

Now you have the same column layout (label followed by pixel0 to pixel783) as in your example.

custom_test.csv output (shortened):

label;pixel0;pixel1;pixel2;pixel3;pixel4;pixel5;pixel6;pixel7;pixel ...
0;0;0;0;0;0;0;0;0;0;0;0....
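Since the question title asks for a DataFrame, here is a rough sketch of going from the flattened pixels straight to pandas without the CSV detour. It assumes pandas is installed; the zero-filled `pixels` array is only a stand-in for `img.flatten()`, and the label value 0 is a placeholder just like in the script.

```python
import numpy as np
import pandas as pd

# Stand-in for img.flatten(): 784 grayscale values
pixels = np.zeros(784, dtype=np.uint8)

# One row, 784 pixel columns named like the Kaggle file
columns = [f"pixel{i}" for i in range(784)]
df = pd.DataFrame(pixels.reshape(1, -1), columns=columns)

# Optional: put a label column in front (placeholder value 0)
df.insert(0, "label", 0)

# Feature matrix without the label, shape (1, 784)
X = df.drop(columns="label").to_numpy()
print(df.shape, X.shape)
```

To load the file written by the script instead, `pd.read_csv('custom_test.csv', sep=';')` should give the same layout; `sep` has to match the `;` delimiter used by the writer.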

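EDIT: To visualize the flattened image with matplotlib, you have to specify a colormap. Without one, `imshow` uses its default colormap, which is why the white background shows up as yellow:

import matplotlib.pyplot as plt

row = flattened[-784:].reshape(28,28) # last 784 values are the pixels (skips the label if one was inserted)
plt.imshow(row, cmap='gray') # inverse grayscale is possible with: cmap='gray_r'
plt.show()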
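To see where the yellow comes from, the sketch below compares matplotlib's default colormap with `gray` on a synthetic image. It assumes an unmodified matplotlib configuration, where the default colormap is `viridis`; the 28x28 array is made up for illustration and is not the image from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in: white background (255) with a black vertical stroke (0)
img = np.full((28, 28), 255, dtype=np.uint8)
img[4:24, 13:15] = 0

fig, (ax_default, ax_gray) = plt.subplots(1, 2)
ax_default.imshow(img)  # default colormap: highest values are drawn in yellow
ax_default.set_title("default cmap")
ax_gray.imshow(img, cmap="gray", vmin=0, vmax=255)  # 255 is drawn in white
ax_gray.set_title("cmap='gray'")
# plt.show()  # uncomment when running interactively

# The color each colormap assigns to the top of its range (RGBA tuples)
top_of_viridis = plt.get_cmap("viridis")(1.0)
top_of_gray = plt.get_cmap("gray")(1.0)
print(top_of_viridis, top_of_gray)
plt.close(fig)
```

The pixel data itself is unchanged in both cases; only the mapping from values to display colors differs.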

I tried your code and plotted the flattened array with `imshow`, which gives me an image with a yellow background. The original image has a white background, so why does it change to yellow? — Joey Yi Zhao, Apr 30 '19 at 01:24
@ZhaoYi I updated my answer. Please upvote/accept my answer if it helped you. Thanks! — Tim, Apr 30 '19 at 06:31
