I'm playing with the iris dataset from sklearn.datasets. I want to generate a list similar to iris_dataset['target'], but containing the class name instead of the class index. This is how I did it:

import numpy as np
from sklearn.datasets import load_iris

iris_dataset = load_iris()
y = iris_dataset.target
print("Iris target: \n {}".format(iris_dataset.target))

# Distinct class indices, in sorted order
unique_y = np.unique(y)
# Placeholder list with one entry per sample
class_seq = [''] * y.shape[0]

# For every sample, find the matching class index and store its name
for i in range(y.shape[0]):
    for (yy, tn) in zip(unique_y, iris_dataset.target_names):
        if y[i] == yy:
            class_seq[i] = tn

print("Class sequence: \n {}".format(class_seq))

But I would like to do this without looping through all of the elements of y. Is there a better way?

The goal is to use this list in a pandas.plotting.radviz plot so that it has a proper legend:

pd.plotting.radviz(iris_DataFrame,'class_seq',color=['blue','red','green'])

Later on, I would like to do the same for any other dataset.

Kostia
  • Actually, it's probably better to use a mask, as given in the second answer, because it's easier to work with np.arrays than with lists. My mistake in formulating the question. – Kostia Dec 10 '18 at 20:29

2 Answers

You can do it by looping over the class names instead of over the samples. iris_dataset.target_names has only 3 entries, so this should be a lot faster for large y arrays.

class_seq = np.empty(y.shape, dtype=iris_dataset.target_names.dtype)

for i in range(iris_dataset.target_names.size):
    mask = y == i
    class_seq[mask] = iris_dataset.target_names[i]

If you want to have class_seq as a list: class_seq = list(class_seq)
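
As a possible variation on the mask approach (not part of the original answer), NumPy integer-array indexing can remove the explicit loop entirely. This is a minimal sketch that assumes the labels in y are the integers 0..n_classes-1, so they can be used directly as positions into target_names, which is the case for the iris dataset:

```python
import numpy as np
from sklearn.datasets import load_iris

iris_dataset = load_iris()
y = iris_dataset.target

# Assumption: y holds integer labels 0..n_classes-1, so each label is a
# valid position in target_names. Indexing the name array with the label
# array returns one name per sample, with no Python-level loop.
class_seq = iris_dataset.target_names[y]

print(class_seq[:3])
```

The result is a NumPy array of strings with the same shape as y. If the labels of another dataset are not consecutive integers starting at 0, this shortcut would not apply as written.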

JE_Muc

You can do it with a list comprehension.

class_seq = [ iris_dataset.target_names[i] for i in iris_dataset.target]

or by using map

class_seq = list(map(lambda x : iris_dataset.target_names[x], iris_dataset.target))
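
To connect this back to the radviz call in the question, here is a rough sketch of how the DataFrame might be assembled. The question does not show how iris_DataFrame was built, so the construction below (feature columns taken from feature_names, plus a 'class_seq' column) is an assumption:

```python
import pandas as pd
from sklearn.datasets import load_iris

iris_dataset = load_iris()

# Assumption: one column per feature, named after feature_names
iris_DataFrame = pd.DataFrame(iris_dataset.data,
                              columns=iris_dataset.feature_names)

# Class names per sample, using the list comprehension from this answer
iris_DataFrame['class_seq'] = [iris_dataset.target_names[i]
                               for i in iris_dataset.target]

print(iris_DataFrame['class_seq'].value_counts())
```

With matplotlib installed, this frame can then be passed to pd.plotting.radviz(iris_DataFrame, 'class_seq', color=['blue','red','green']) as in the question. The same pattern should carry over to other scikit-learn datasets that expose data, feature_names, target and target_names.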

Sach