I am new to Python and am having trouble with a pandas DataFrame. I have a loop that, under certain conditions, outputs the name of a person/folder on each iteration. I want each name to be added to the DataFrame as soon as it is produced, but the DataFrame ends up with only a single row holding the output of the last iteration; the outputs of all previous iterations get overwritten. Below is the code I am using. I hope you can see the problem and help.

from scipy.spatial import distance
import csv
import dlib
import os
import numpy as np
import cv2
import pandas as pd
from skimage import  io
import face_recognition
from PIL import Image
with open("Data/train.csv","r") as facefeatures2:
    reader=csv.reader(facefeatures2)
    featureslist2=[]
    for row in reader:
        if len(row) != 0:
            featureslist2= featureslist2 +[row]

facefeatures2.close()
float_int2=[]
results=[]
for f2 in range(0,len(featureslist2)):
    float_int2 = float_int2 +[[float(str) for str in subarray] for subarray in [featureslist2[f2]]]
    csv2 = np.vstack(float_int2)
faces_folder_path = "Data/newcropped"
list = os.listdir(faces_folder_path) # dir is your directory path
number_files = len(list)
print (number_files)

writer = pd.ExcelWriter('pandas_name11.xlsx', engine='xlsxwriter')
for loop in range(0,number_files):
    print("iteration ="+str(loop+1))
    unknown_image = face_recognition.load_image_file(faces_folder_path + "/" + str(loop+1)+".jpg")
    cv2.imshow("test",unknown_image)
    cv2.waitKey(0)
    #### --------------exception handling-----------####
    try:
        unknown_face_encoding = face_recognition.face_encodings(unknown_image)[0]

    except  IndexError:
        print("--->image is not detectable")
        pass
        # ...........................#
    results = face_recognition.compare_faces(csv2, unknown_face_encoding)
    chunks=[results[x:x + 12] for x in range(0, len(results),12)] # splits "results" list into sublists of size 12
    dirpath = "Data/eachperson"
    fname = []
    fname = [f for f in sorted(os.listdir(dirpath))]
    counter = 0
    index=0
    for c in range (0,len(chunks)):
        if 'True' in str(chunks[c]):
            counter=counter+1
            index=c
            df = pd.DataFrame({'names': [fname[index]]})
            df.to_excel(writer, sheet_name='Sheet1')
    if counter !=1 or counter ==0 :
           print("student is not present :(")
    else:
        print(str(fname[index])+" is present!!!")
writer.save()
cs95

1 Answer

Why don't you initialise a list of DataFrames? Keep appending to the list inside the loop, and only at the end concatenate everything into one big DataFrame and write it out once. Every .to_excel call in your loop writes to the same place in the same sheet, so each call overwrites what the previous one wrote; calling it inside a loop is not a good idea unless you explicitly append (for example, by giving each write a different start row). But again, that is inefficient.

Try something like this:

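df_list = []
for loop in range(0, number_files):
    ...

    for c in range(0, len(chunks)):
        if 'True' in str(chunks[c]):
            ...

            # collect one single-row DataFrame per match instead of writing it out
            df_list.append(pd.DataFrame({'names': [fname[index]]}))

# write once, after the loop has finished
writer = pd.ExcelWriter('pandas_name11.xlsx', engine='xlsxwriter')
pd.concat(df_list).reset_index(drop=True).to_excel(writer, sheet_name='Sheet1')
writer.close()  # saves the file; older pandas versions also accept writer.save()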

If you, instead, want to rewrite on each iteration, you can also consider taking a look at this.
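
For anyone who wants to try the collect-then-concatenate pattern without the face-recognition setup, here is a minimal standalone sketch. The `detections` list is a hypothetical stand-in for the matching step in the question (one inner list of matched names per image), and the empty-list guard is my own addition, since `pd.concat` raises a `ValueError` when given nothing to concatenate.

```python
import pandas as pd

# Hypothetical stand-in for the per-image matching in the question:
# each inner list holds the names matched for one image (possibly none).
detections = [["alice"], [], ["bob"]]

df_list = []
for matches in detections:
    for name in matches:
        # Collect one single-row DataFrame per match; nothing is written yet.
        df_list.append(pd.DataFrame({"names": [name]}))

# Merge once, after the loop. The guard covers the case of no matches at all.
if df_list:
    result = pd.concat(df_list, ignore_index=True)
else:
    result = pd.DataFrame(columns=["names"])

print(result)
```

Writing the file is then a single call after the loop, for example `result.to_excel('pandas_name11.xlsx', sheet_name='Sheet1', index=False)`, which needs an Excel engine such as xlsxwriter or openpyxl installed.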

cs95
  • Thank you so much, it works, but there is always a column in the DataFrame that is 0 all the way down! Can you please tell me why? – hamza yahya Aug 21 '17 at 08:08
  • @hamzayahya Cannot say without seeing your data. If it isn't a big problem, just open it in MS Excel and delete. Anyway, if it helps, you can mark this answer accepted. Cheers. – cs95 Aug 21 '17 at 08:09
  • Brother, it just looks odd, that's why I am asking. It actually looks like [ 0 john present ], then in the next row it's like [ 0 pattrick present ]. I hope you understand ... – hamza yahya Aug 21 '17 at 08:15
  • Also, is there any way for the code to automatically generate the date and time in each row against each name (i.e. when it's created)? – hamza yahya Aug 21 '17 at 08:18
  • @hamzayahya Try this: `pd.concat(df_list).reset_index(drop=True).to_excel(writer, sheet_name='Sheet1')` – cs95 Aug 21 '17 at 08:21
  • Yes brother, it works, you're a savage ... cheers ... One last question: is there any way for this code to automatically generate the date and time in each row (i.e. next to each name) as an additional column? – hamza yahya Aug 21 '17 at 08:28
  • @hamzayahya I would recommend you open another question, because that requires more code than I am willing to answer in comments :) Please mark accepted if it helped, and open a new question. Explain what is your current input and what is your expected output. I'd love to help, and it helps the community too. – cs95 Aug 21 '17 at 08:30
  • If you don't mind, please answer it here. As you know from my code above, this is what I am using to generate the names and status columns: df_list.append(pd.DataFrame({'names': [fname[index]] , 'status': ['present']})). Now I want the date and time in it as well, next to these. – hamza yahya Aug 21 '17 at 08:36
  • @hamzayahya The problem is I can't understand what you want. You will need to explain it in more detail, and I would rather you did that in another question, that's it. – cs95 Aug 21 '17 at 08:37
  • How would you get the date and time? Where would it come from? – cs95 Aug 21 '17 at 08:37
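
On the two follow-ups in the comments, a rough sketch rather than a confirmed answer. The repeated 0 is most likely the index: every single-row DataFrame has index 0, and `pd.concat` keeps those labels unless told otherwise. For the date and time, the thread never settles where it should come from, so the code below assumes it is simply the moment the name is recorded, taken from `datetime.now()`. The `matched_names` list and the `timestamp` column name are hypothetical.

```python
from datetime import datetime

import pandas as pd

# Hypothetical matches; in the question these come from fname[index].
matched_names = ["john", "pattrick"]

# Each single-row frame carries index 0, so a plain concat repeats it.
frames = [pd.DataFrame({"names": [name]}) for name in matched_names]
stacked = pd.concat(frames)
print(stacked.index.tolist())  # [0, 0]

# ignore_index=True (or reset_index(drop=True)) renumbers the rows.
clean = pd.concat(frames, ignore_index=True)
print(clean.index.tolist())  # [0, 1]

# Assumption: the wanted date/time is when the name is recorded.
rows = []
for name in matched_names:
    rows.append({"names": name, "status": "present", "timestamp": datetime.now()})

attendance = pd.DataFrame(rows)
print(attendance)
```

Building the frame from a list of dicts also avoids creating one DataFrame per row. Passing `index=False` to `to_excel` drops the index column from the spreadsheet entirely.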