I have 1500 PNG images, each with an associated .txt file that holds the multi-label values (1 to 7 values) for what is identified in the photo. The problem is that the labels are in no fixed order from file to file: one file could have 1 value and another all 7, in any arrangement. I don't know how to pull the information out, order it, and then one-hot-encode it accordingly. I need help, please. I get the following error using le.fit() because of it:

ValueError: y contains previously unseen labels: ['label2\nlabel7\nlabel1', 'label2\nlabel1', 'label2\nlabel1\nlabel6',....

What I think needs to happen is:

  1. I create a dictionary (Dict={1:'label1',2:'label2',3:'label3'})
  2. I look in the folder that has the .txt files
  3. I match up what is in the files with the dictionary values and put the number into a list. So I will have a list of 1500 lists.
  4. Somehow I need to order those lists and put zeros in the missing number spots

That would give me the target values of each image one-hot-encoded.

Thanks in advance

Jennifer Crosby

1 Answer

This is what I ended up doing, after a lot of wonderful help.

Dict = {'label1': 0,
        'label2': 1,
        'label3': 2,
        'label4': 3,
        'label5': 4,
        'label6': 5,
        'label7': 6}

labels_arr = [['label1', 'label5', 'label4'], ['label1', 'label4', 'label3'], 
             ['label1', 'label3'], ['label1'], ['label1', 'label4', 'label3'], 
             ['label1', 'label3', 'label4'], 
             ['label1', 'label2', 'label3', 'label4', 'label5', 'label6', 'label7']]

nums_arr = []                      # one row of seven 1's and 0's per image
for labels in labels_arr:          # loop through the list of label lists
    nums_arr_i = []                # row for the current image
    for key in Dict:               # keys come out in insertion order: label1..label7
        if key in labels:          # 1 if the label is present for this image, else 0
            nums_arr_i.append(1)
        else:
            nums_arr_i.append(0)
    nums_arr.append(nums_arr_i)    # add the finished row to the result
print('nums_arr= ', nums_arr)

# End Result
nums_arr=  [[1, 0, 0, 1, 1, 0, 0], [1, 0, 1, 1, 0, 0, 0], [1, 0, 1, 0, 0, 0, 0], 
           [1, 0, 0, 0, 0, 0, 0], [1, 0, 1, 1, 0, 0, 0], [1, 0, 1, 1, 0, 0, 0], 
           [1, 1, 1, 1, 1, 1, 1]]
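
The example above hardcodes labels_arr. To build the rows straight from the .txt files, something like the following might work. It is only a sketch, and it assumes each file holds one label per line (which is what the '\n' in the error message suggests) and that the file name without .txt identifies the image. The helper names read_labels, encode and encode_folder, and the folder name in the usage note, are made up for illustration.

```python
from pathlib import Path

# Same label-to-column mapping as the Dict above: label1 -> 0, ..., label7 -> 6.
LABEL_TO_INDEX = {f"label{i}": i - 1 for i in range(1, 8)}


def read_labels(txt_path):
    """Return the non-empty lines of one label file, stripped of whitespace.

    Assumes one label per line.
    """
    text = Path(txt_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def encode(labels, label_to_index=LABEL_TO_INDEX):
    """Turn a list of label names (any order) into a fixed-length row of 1's and 0's."""
    row = [0] * len(label_to_index)
    for label in labels:
        # A KeyError here means the file contains a label missing from the mapping.
        row[label_to_index[label]] = 1
    return row


def encode_folder(folder):
    """Map each .txt file stem (assumed to match the image name) to its encoded row."""
    return {
        path.stem: encode(read_labels(path))
        for path in sorted(Path(folder).glob("*.txt"))
    }


if __name__ == "__main__":
    # The order of labels within a file does not matter.
    print(encode(["label2", "label7", "label1"]))  # [1, 1, 0, 0, 0, 0, 1]
```

Calling encode_folder("labels") (with "labels" standing in for wherever the .txt files live) would then return a dictionary from image name to its row, so each target stays tied to its image instead of relying on list position.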
Jennifer Crosby