
I am trying to plot a dendrogram to cluster data, but this error is stopping me. My data is here: "https://assets.datacamp.com/production/repositories/655/datasets/2a1f3ab7bcc76eef1b8e1eb29afbd54c4ebf86f2/eurovision-2016.csv"

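I first loaded the data and chose the columns to work with:

import pandas as pd
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

df_euro = pd.read_csv("https://assets.datacamp.com/production/repositories/655/datasets/2a1f3ab7bcc76eef1b8e1eb29afbd54c4ebf86f2/eurovision-2016.csv")

target_col = df_euro["To country"]
feat = df_euro[["Jury A", "Jury B", "Jury C", "Jury D", "Jury E"]]

# Convert them into ndarrays
x = feat.to_numpy(dtype='float32')
y = target_col.to_numpy()

# Calculate the linkage: mergings
mergings = linkage(x, method='complete')
# Plot the dendrogram
dendrogram(
    mergings,
    labels=y,
    leaf_rotation=90,
    leaf_font_size=6
)
plt.show()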

But I'm getting an error that I can't understand. I googled it and checked that both have matching shapes, (1066, 5) and (1066,). There are no NA values in either the features or target_col.

I know the issue is with labels, but I couldn't find a way to solve it. Any help will be really appreciated :)

Edit: Here is the entire traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-113-7fffdc847e5e> in <module>
      4 mergings = linkage(feat, method = 'complete')
      5 # Plot the dendrogram
----> 6 dendrogram(
      7     mergings,
      8     labels = target_col,

C:\ProgramData\Anaconda3\lib\site-packages\scipy\cluster\hierarchy.py in dendrogram(Z, p, truncate_mode, color_threshold, get_leaves, orientation, labels, count_sort, distance_sort, show_leaf_counts, no_plot, no_labels, leaf_font_size, leaf_rotation, leaf_label_func, show_contracted, link_color_func, ax, above_threshold_color)
   3275                          "'bottom', or 'right'")
   3276 
-> 3277     if labels and Z.shape[0] + 1 != len(labels):
   3278         raise ValueError("Dimensions of Z and labels must be consistent.")
   3279 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1476 
   1477     def __nonzero__(self):
-> 1478         raise ValueError(
   1479             f"The truth value of a {type(self).__name__} is ambiguous. "
   1480             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Sam.H

2 Answers


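The problem is that dendrogram evaluates the labels keyword argument in a boolean context (the `if labels and ...` line in the traceback), so labels must be an object whose truth value is well defined and tells whether it contains any items, just like a list. A pandas Series raises the ValueError shown in the question instead. So the only change you need to make is to convert the labels to a list when passing the argument: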

dendrogram(
    mergings,
    labels = list(y),
    leaf_rotation = 90,
    leaf_font_size = 6
)

All the other lines can stay the same.
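
To see why the conversion matters, here is a minimal sketch of the truth-value check in isolation. The country names and the helper `truthiness` are made up for illustration and are not part of SciPy or of the question's data; the snippet only shows how `bool()` behaves on each container type.

```python
import numpy as np
import pandas as pd


def truthiness(obj):
    """Return bool(obj), or the error text if the truth value is ambiguous."""
    try:
        return bool(obj)
    except ValueError as exc:
        return f"ValueError: {exc}"


# Hypothetical labels standing in for the "To country" column.
labels_series = pd.Series(["Albania", "Armenia", "Australia"])
labels_array = labels_series.to_numpy()
labels_list = labels_series.tolist()

series_result = truthiness(labels_series)  # ambiguous for a Series
array_result = truthiness(labels_array)    # ambiguous for a multi-element array
list_result = truthiness(labels_list)      # a non-empty list is simply truthy

print("Series:", series_result)
print("ndarray:", array_result)
print("list:", list_result)
```

Only the list gives a plain boolean, which is what a check written as `if labels and ...` relies on.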

Roy Cohen
  • It is the same: dendrogram's "labels" needs (y) as a list, so you are right, and you can convert it outside or inside the dendrogram call. The other changes to the code were just me making it more understandable. – Sam.H Dec 25 '20 at 16:13
  • @Sam.H I only posted this answer because I thought your answer was a bit confusing and lacked an explanation. – Roy Cohen Dec 25 '20 at 22:11

In case someone else runs into the same issue: converting the labels to a list makes it work.

samples = df_euro.iloc[:, 2:7].values[:42]
country_names = list(df_euro.iloc[:, 1].values[:42])

# Calculate the linkage
mergings = linkage(samples, method='single')

# Plot the dendrogram on the axes created here
# (dendrogram returns a dict, so don't assign it to fig)
fig, ax = plt.subplots(figsize=(15, 10))
dendrogram(mergings, labels=country_names, ax=ax)
plt.show()
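
For anyone who wants to try the fix without downloading the CSV, here is a small self-contained sketch. It assumes synthetic random scores and invented country names in place of the real Eurovision data, and it passes `no_plot=True` so that it runs without a display; the returned dictionary's `"ivl"` entry holds the leaf labels in plotting order.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, linkage

# Synthetic stand-in for the jury columns: 6 samples, 5 features.
rng = np.random.default_rng(0)
samples = rng.random((6, 5))

# Hypothetical labels held in a Series, as in the question.
names = pd.Series(["Albania", "Armenia", "Australia",
                   "Austria", "Belgium", "Bulgaria"])

mergings = linkage(samples, method="complete")

# Convert the Series to a plain list before handing it to dendrogram.
result = dendrogram(mergings, labels=names.tolist(), no_plot=True)

# "ivl" lists the leaf labels from left to right.
leaf_labels = result["ivl"]
print(leaf_labels)
```

Dropping `no_plot=True` and adding `plt.show()` gives the actual figure.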
Sam.H
  • in OP's code, `linkage(x, method = 'complete')`. But in your code, `linkage(samples, method='single')`. Wouldn't that change the result? – Roy Cohen Dec 24 '20 at 20:32
  • @RoyCohen you're right, it changes the result of the dendrogram. As you know, "method" just selects how the distance between clusters is calculated, but my error came from "labels", not from "method" (I just tried different methods to see if the error would change). I changed the names to more meaningful ones while I was debugging because it makes the error easier to spot. – Sam.H Dec 24 '20 at 21:03
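
To illustrate the point from the comments that `method` changes the result, here is a toy sketch on four made-up one-dimensional points (not the Eurovision data). It compares the height of the final merge under `'single'` and `'complete'` linkage.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Four made-up points on a line.
points = np.array([[0.0], [1.0], [3.0], [7.0]])

Z_single = linkage(points, method="single")      # nearest-member distance
Z_complete = linkage(points, method="complete")  # farthest-member distance

# Column 2 of the linkage matrix is the distance at which each merge happens.
single_top = Z_single[-1, 2]
complete_top = Z_complete[-1, 2]

print("final merge height (single):", single_top)
print("final merge height (complete):", complete_top)
```

The last cluster {0, 1, 3} joins the point 7 at distance 4 with single linkage (7 minus 3) and at distance 7 with complete linkage (7 minus 0), so the two dendrograms differ even though the labels are handled identically.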