
I was looking for a way to colour the points on a scatter plot according to the value of a 'third' column. I could not find a ready-made solution, so I wrote my own and want to share it in case someone finds it useful :) If this is not the right place to post it, I apologise; please remove it.

Let's assume there is a data frame 'scatterData' like the one below:

     lad2014_name   Male    Female  Result
0    ABERDEEN CITY  95154   97421   -21.78
1    ABERDEENSHIRE  101875  105141  -13.10
2    ADUR           24047   26574   -16.16
3    ALLERDALE      38346   40192   -44.56
...
499  AMBER VALLEY   48720   51502   -3.56

I want to plot Male against Female on a scatter plot, but I also want to show whether 'Result' was negative or positive by changing the colour of the marker. So I did this:

from matplotlib import pyplot as plt

def resultColour(z):
    # Red for negative (or zero) results, blue for positive ones
    colour = '#e31a1c'
    if z > 0:
        colour = '#1f78b4'
    return colour

# Plotting the scatter plot
plt.figure(figsize=(12, 10))

# Draw one marker per row, coloured by the sign of 'Result'
for index, row in scatterData.iterrows():
    x = row.Male
    y = row.Female
    z = row.Result
    t = resultColour(z)
    plt.scatter(x, y, c=t, s=85)

plt.xlabel('X axis label', fontsize=15)
plt.ylabel('Y axis label', fontsize=15)
plt.title('Plot title', fontsize=18)

plt.show()

This produces the scatter plot below:

(image: scatter plot)
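
The per-row loop above could probably be collapsed into a single plt.scatter call by computing all the colours up front. The following is only a minimal sketch: the data frame is a small made-up stand-in that reuses the column names from the question (the values are not real), and np.where is swapped in for resultColour.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Made-up stand-in for scatterData; the values are for illustration only.
scatterData = pd.DataFrame({
    'lad2014_name': ['A', 'B', 'C', 'D'],
    'Male': [100, 200, 300, 400],
    'Female': [110, 190, 320, 380],
    'Result': [-21.78, 5.40, -16.16, 12.30],
})

# One colour per row: blue where Result > 0, red otherwise
# (the same rule as resultColour, applied to the whole column at once).
colours = np.where(scatterData['Result'] > 0, '#1f78b4', '#e31a1c')

plt.figure(figsize=(12, 10))
plt.scatter(scatterData['Male'], scatterData['Female'], c=colours, s=85)
plt.xlabel('X axis label', fontsize=15)
plt.ylabel('Y axis label', fontsize=15)
plt.title('Plot title', fontsize=18)
```

Call plt.show() (or save the figure) afterwards to see the result. Drawing every point in one call also avoids creating a separate collection per row, which should matter once the frame has more than a few hundred rows.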

Sylwek

1 Answer

You can actually provide an array of values for the c keyword argument in plt.scatter. They will be mapped to marker colors according to a colormap that you can set with the cmap keyword.

Example:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import cm

np.random.seed(0)
df = pd.DataFrame({'x':np.random.randint(0,10,size=20),
                   'y':np.random.randint(0,10,size=20),
                   'z':np.random.uniform(-1,1,size=20)})

plt.figure(figsize=(5,5))
plt.scatter(df.x, df.y, c = df.z>0, cmap = cm.RdYlGn)

Produces

(image: a colored scatterplot)

The cm module has a large selection of colormaps. If you need your own exact colors then you could use a listed colormap like this:

from matplotlib.colors import ListedColormap

cmap = ListedColormap(['#e31a1c','#1f78b4'])
plt.scatter(df.x, df.y, c = df.z>0, cmap = cmap)
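
One caveat worth checking: the colour scale is normally fitted to the range of the values passed in c, so if every value has the same sign there is only one distinct value, and as far as I can tell matplotlib then maps all points to the first colour in the list. A minimal sketch of one way around this, assuming a small made-up data frame in which every z is positive, is to pin the scale with the vmin and vmax arguments of plt.scatter:

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.colors import ListedColormap

# Made-up data for illustration: every z value is positive.
df = pd.DataFrame({'x': [1, 2, 3, 4],
                   'y': [4, 3, 2, 1],
                   'z': [0.2, 0.5, 0.1, 0.9]})

# Index 0 is used for "not positive", index 1 for "positive".
cmap = ListedColormap(['#e31a1c', '#1f78b4'])

# Turn the boolean mask into 0/1 so the values line up with the colour indices.
is_positive = (df.z > 0).astype(int)

plt.figure(figsize=(5, 5))
# vmin/vmax fix the scale to [0, 1] regardless of which values are present.
sc = plt.scatter(df.x, df.y, c=is_positive, cmap=cmap, vmin=0, vmax=1)
```

With the scale pinned, 0 always maps to '#e31a1c' and 1 to '#1f78b4', so the colours keep their meaning even when only one sign occurs in the data.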
gereleth
  • Thanks. That indeed produces the same outcome as my solution, yet it is much simpler and more elegant. – Sylwek Mar 26 '17 at 17:49