I am currently plotting a scatterplot based on two columns of data. However, I would like to color the datapoints based on a class label that I have in a third column.

The labels in my third column are either 1,2 or 3. How would I color the scatter plot points based on the values in this third column?

plt.scatter(waterUsage['duration'],waterUsage['water_amount'])
plt.xlabel('Duration (seconds)')
plt.ylabel('Water (gallons)')
Gary

2 Answers

The scatter function accepts a sequence of numbers for its c argument and maps each value to a color. You can also pick a colormap with cmap if you want (but you don't have to):

plt.scatter(waterUsage['duration'], waterUsage['water_amount'],
            c=waterUsage['third_column'], cmap=plt.cm.autumn)
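
Since the labels are only 1, 2 or 3, a discrete colormap with a legend may be easier to read than a continuous one. Below is a minimal sketch, not the only way to do it: the sample data is a hypothetical stand-in for waterUsage (the real table is not shown in the question), the three colors are arbitrary picks, and legend_elements() assumes Matplotlib 3.1 or newer.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Hypothetical stand-in data; the real waterUsage table is not shown in the question.
waterUsage = {
    "duration": [30, 45, 60, 90, 120, 150],
    "water_amount": [1.0, 1.4, 2.1, 2.9, 4.2, 5.0],
    "third_column": [1, 1, 2, 2, 3, 3],
}

# One fixed color per class label (1, 2, 3), in that order. Colors are arbitrary.
cmap = ListedColormap(["tab:blue", "tab:orange", "tab:green"])

fig, ax = plt.subplots()
points = ax.scatter(
    waterUsage["duration"],
    waterUsage["water_amount"],
    c=waterUsage["third_column"],
    cmap=cmap,
    vmin=0.5,
    vmax=3.5,  # each integer label falls in the middle of its own color band
)
ax.set_xlabel("Duration (seconds)")
ax.set_ylabel("Water (gallons)")

# legend_elements() builds one legend handle per distinct value passed to c.
handles, labels = points.legend_elements()
ax.legend(handles, labels, title="Class")
```

Call plt.show() or fig.savefig(...) afterwards to see the result. Setting vmin and vmax explicitly keeps the label-to-color mapping stable even if one of the classes happens to be missing from the data.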
DYZ

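Add another entry, "color", to your dictionary, with one color name per data point:

def addcolor(b):
    # Fixed color for each class label; labels are assumed to be 1, 2, or 3.
    palette = {1: 'rosybrown', 2: 'papayawhip', 3: 'chartreuse'}
    a = b.copy()  # copy so the caller's data is not modified in place
    # Build the whole 'color' entry at once; it does not need to exist beforehand.
    a['color'] = [palette[label] for label in a['third_column']]
    return a

waterUsage = addcolor(waterUsage)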

plt.scatter(waterUsage['duration'], 
            waterUsage['water_amount'],
            c=waterUsage['color'])

Matplotlib accepts grayscale, RGB, hex, and HTML (CSS) named colors:

http://matplotlib.org/api/colors_api.html

HTML color list, by group:

https://www.w3schools.com/colors/colors_groups.asp
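
To check whether a particular value will be understood as a color before passing it to scatter, the helpers in matplotlib.colors can be used. This is a small sketch; the sample values below are illustrative choices, one for each format mentioned above.

```python
from matplotlib.colors import is_color_like, to_hex, to_rgba

# One illustrative value per format named above.
candidates = {
    "grayscale string": "0.5",
    "RGB tuple": (1.0, 0.0, 0.0),
    "hex string": "#7fff00",
    "HTML/CSS name": "chartreuse",
}

for kind, spec in candidates.items():
    # is_color_like returns True when Matplotlib can interpret the value as a color;
    # to_hex shows the value Matplotlib resolves it to.
    print(kind, spec, is_color_like(spec), to_hex(spec))

# A named color resolves to the same RGBA tuple as its hex equivalent.
same = to_rgba("chartreuse") == to_rgba("#7fff00")
```

Any value for which is_color_like returns True can be used as an entry in the list passed to c.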

litepresence