I am drawing this scatterplot in Python, and I would like the dots drawn in a different (contrasting) colour for each label. Each label has multiple points.

It seems like something that could be passed to annotate, but I haven't been able to find out how:

for i, label in enumerate(labels):
    x, y = low_dim_embs[i, :]
    plt.scatter(x, y)
    plt.annotate(label,
                 xy=(x, y),
                 xytext=(5, 2),
                 textcoords='offset points',
                 ha='right',
                 va='bottom')

I can replace the scatter call above with:

plt.scatter(x, y, color=mycolors)

This gives me manually specified colours, but I have to list one for every entry (and each label repeats many times). Is there an automatic way?

My dataset looks like this:

x,y,label
1,2,label1
1,3,label1
2,-1,label1
4,1,label2
5,1,label2
...

All points belonging to a given label should have the same colour (I would probably also need the labels in a legend).

dorien

1 Answer

To give points with the same label the same colour, build a list that assigns one integer to each unique label in your data, then pass that list to scatter as c together with a colormap (the mapping is explained in this answer):

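import matplotlib.pyplot as plt

fig = plt.figure()

# x values, y values, and one label per point
data = [[1, 1.5, 3, 2.4, 5], [2, 4.1, 2.4, 1, 3], ["apple", "banana", "grape", "apple", "banana"]]

# Map each unique label to an integer once (sorted so the mapping is stable between runs)
label_to_int = {label: idx for idx, label in enumerate(sorted(set(data[2])))}
colors = [label_to_int[label] for label in data[2]]

# Points that share a label share an integer, so the colormap gives them the same colour
plt.scatter(data[0], data[1], c=colors, cmap="plasma")

for i in range(len(data[0])):
    plt.annotate(str(data[2][i]),
                 xy=(data[0][i], data[1][i]),
                 xytext=(5, 2),  # offset from xy, in points
                 textcoords='offset points',
                 ha='right',
                 va='bottom')

plt.show()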
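
Since the question also asks for a legend, another option may be to call scatter once per label, so that each label picks up its own colour and its own legend entry. The following is only a minimal sketch, assuming matplotlib 2.0 or later (where successive scatter calls step through the default colour cycle). The helper names group_by_label and scatter_by_label are hypothetical, not part of matplotlib, and the sample points are the rows shown in the question.

```python
from collections import defaultdict


def group_by_label(xs, ys, labels):
    """Collect the x and y values of all points that share a label."""
    groups = defaultdict(lambda: ([], []))
    for x, y, label in zip(xs, ys, labels):
        groups[label][0].append(x)
        groups[label][1].append(y)
    return dict(groups)


def scatter_by_label(xs, ys, labels):
    """Draw one scatter call per label: one colour and one legend entry each."""
    # Imported here so that group_by_label can be used without matplotlib
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, (gx, gy) in group_by_label(xs, ys, labels).items():
        # Each scatter call takes the next colour from the default colour cycle
        ax.scatter(gx, gy, label=label)
    ax.legend()
    return fig, ax


# Sample rows from the question's dataset
xs = [1, 1, 2, 4, 5]
ys = [2, 3, -1, 1, 1]
labels = ["label1", "label1", "label1", "label2", "label2"]
groups = group_by_label(xs, ys, labels)
```

Calling fig, ax = scatter_by_label(xs, ys, labels) and then fig.savefig("scatter.png") (or plt.show() in an interactive session) should render the plot. The default colour cycle holds only a limited number of colours, so with many labels the colours repeat and an explicit colormap is probably the better choice.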

Vinícius Figueiredo
  • That gives me an error: c = tuple(map(float, c)) TypeError: 'numpy.int64' object is not iterable. Also, would this really make that if 10 entries have the same label they get the same colour? – dorien May 10 '17 at 02:59
  • @dorien No, it wouldn't meet that requirement, I wasn't aware of that before your edit, sorry. I'll try to update my answer. – Vinícius Figueiredo May 10 '17 at 03:05