I'm using Pandas to make a scatter plot. My data look like this:



         Locations                              Lenovo Global Region  Primary Function     Subsidiaries       Apr_2015_to_Mar_2016_[kWh]  Apr_2015_to_Mar_2016_[MT]  color  MT/kWh
    263  United Kingdom - Hook                  EMEA                  Large Office (OFL)   Lenovo             7.561727e+04                129.438515                 r      0.001712
    202  South Africa - Johannesburg/Bryanston  EMEA                  Small Office (OSL)   Lenovo             1.013746e+05                93.872885                  r      0.000926
    232  India - Chennai Factory                Asia Pacific          Manufacturing (MFG)  Motorola Mobility  3.163600e+05                271.933953                 g      0.000860
    159  India - Pondicherry                    Asia Pacific          Manufacturing (MFG)  Lenovo             1.074016e+06                907.869649                 g      0.000845
    242  Australia - Chatswood                  Asia Pacific          Large Office (OFL)   Lenovo             3.001254e+05                239.500093                 g      0.000798

First I define a function that assigns a different color to each region:

def colorpoint(row):
    # Green for Asia Pacific, red for EMEA, blue for every other region
    if row['Lenovo Global Region'] == 'Asia Pacific':
        return 'g'
    elif row['Lenovo Global Region'] == 'EMEA':
        return 'r'
    else:
        return 'b'

test3['color'] = test3.apply(colorpoint, axis=1)

Then I define the scatter points that I want to plot:

y=test3['Apr_2015_to_Mar_2016_[MT]']
x=test3['Apr_2015_to_Mar_2016_[kWh]']
T = test3['color']
area= (y/x)*500000
xmax=1.1*max(test3['Apr_2015_to_Mar_2016_[kWh]'])
ymax=1.1*max(test3['Apr_2015_to_Mar_2016_[MT]'])

Then I plot the figure:

fig1 = plt.figure(figsize=(16,9), dpi=300)
ax = plt.subplot(111)

plot=plt.scatter(x,y,alpha=0.6,c=T,s=area)
ax.grid(True)
ax.set_xlim([0,xmax])
ax.set_ylim([0,ymax])
ax.set_xlabel('Apr 2015 to Mar 2016 [kWh]')
ax.set_ylabel('Apr 2015 to Mar 2016 [MT]')
ax.set_title('Total Elec consumption [kWh] VS CO2 emission [MT]')

Finally I try to add a legend. I want it to show which color corresponds to which "Lenovo Global Region", but it's not working: the legend only shows one region, "America Groups", as a blue dot.

legend=test3['Lenovo Global Region']
plt.legend(legend,loc=4)

Thanks if you have ideas!!

Yang

3 Answers

Maybe this is what you need in the last line:

plt.legend(legend.values,loc=4)
Gerges

You can try this:

# Note: 'rest' only matches rows whose region is literally 'rest'; put your actual region names here.
for region, color in [('Asia Pacific', 'green'), ('EMEA', 'red'), ('rest', 'blue')]:
    subset = test3.loc[test3['Lenovo Global Region'] == region]
    x = subset['Apr_2015_to_Mar_2016_[kWh]']
    y = subset['Apr_2015_to_Mar_2016_[MT]']
    area = (y / x) * 500000
    plt.scatter(x, y, alpha=0.6, c=color, s=area, label=region)
plt.legend()
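
If you don't want to type out every region name, one possible variation is to let groupby supply them. This is only a sketch: the small DataFrame is made-up data standing in for test3 (only the column names and region names come from the question), and region_colors is a hypothetical dict that mirrors colorpoint, with every unlisted region falling back to blue.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for test3; only the column names come from the question.
test3 = pd.DataFrame({
    'Lenovo Global Region': ['EMEA', 'EMEA', 'Asia Pacific', 'Asia Pacific', 'America Groups'],
    'Apr_2015_to_Mar_2016_[kWh]': [7.5e4, 1.0e5, 3.2e5, 1.1e6, 5.0e5],
    'Apr_2015_to_Mar_2016_[MT]': [129.0, 94.0, 272.0, 908.0, 400.0],
})

# Hypothetical mapping that mirrors colorpoint: unlisted regions become blue.
region_colors = {'Asia Pacific': 'g', 'EMEA': 'r'}

fig, ax = plt.subplots()
# One scatter call per region, so each region gets its own legend entry.
for region, group in test3.groupby('Lenovo Global Region'):
    x = group['Apr_2015_to_Mar_2016_[kWh]']
    y = group['Apr_2015_to_Mar_2016_[MT]']
    ax.scatter(x, y, alpha=0.6, s=(y / x) * 500000,
               c=region_colors.get(region, 'b'), label=region)
legend = ax.legend(loc=4)
```

With more than one "other" region you would get several blue entries, one per region, which may or may not be what you want.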
megamind

This might be the same as another question, "Matplotlib adding legend based on existing color series". From the answer there:

"You can create the legend handles using an empty plot with the color based on the colormap and normalization of the scatter plot."

sc = plt.scatter(df['x'], df['y'], s=size, c=df['colors'], edgecolors='none')

lp = lambda i: plt.plot([],color=sc.cmap(sc.norm(i)), ms=np.sqrt(size), mec="none",
                        label="Feature {:g}".format(i), ls="", marker="o")[0]
handles = [lp(i) for i in np.unique(df["colors"])]
plt.legend(handles=handles)
plt.show()

So to adapt their code to yours, assign the result of your plt.scatter call to a variable like sc, and replace every df['colors'] with T.
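
One caveat, as far as I can tell: sc.cmap(sc.norm(i)) expects numeric color values, whereas your T holds color letters ('g', 'r', 'b'). In that case the same empty-handle idea can probably be done by passing the letters straight to the handles. A minimal sketch, with made-up points, and with color_labels as a hypothetical dict ('Other regions' is just a placeholder name):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Made-up points; T plays the role of test3['color'] from the question.
x = [7.5e4, 1.0e5, 3.2e5, 5.0e5]
y = [129.0, 94.0, 272.0, 400.0]
T = ['r', 'r', 'g', 'b']

# Hypothetical label for each color letter, mirroring colorpoint in the question.
color_labels = {'g': 'Asia Pacific', 'r': 'EMEA', 'b': 'Other regions'}

fig, ax = plt.subplots()
ax.scatter(x, y, c=T, alpha=0.6)

# One empty line per color: it draws nothing on the axes,
# but gives the legend a marker of the right color to display.
handles = [Line2D([], [], linestyle='', marker='o', color=color, label=label)
           for color, label in color_labels.items()]
legend = ax.legend(handles=handles, loc=4)
```

This keeps your single scatter call as it is and only builds the legend separately.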

user2415706