I am trying to make a scatter plot of lists x and y and color each point according to the corresponding entry in list z:

plt.scatter(x,y,c=z)

z is a list of strings with many repeats. For example:

z = ['Adelie', 'Adelie', 'Adelie', 'Adelie', 'Gentoo', 'Gentoo', 'Gentoo', 'Gentoo']

I believe a good way to do this would be to create a list that maps each string value in z to an integer, like so:

newz = [1, 1, 1, 1, 2, 2, 2, 2]

Then I believe I could achieve my goal with this code:

plt.scatter(x,y,c=newz)

What is the best way to create newz, and is this a good way to color my scatter plot points?

Dom
  • 13
  • 2

1 Answer

I figured it out courtesy of this post: Map unique strings to integers in Python
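
As a minimal sketch of that idea, applied to the example z from the question (the name codes is only illustrative, not something from the linked post):

```python
z = ['Adelie', 'Adelie', 'Adelie', 'Adelie', 'Gentoo', 'Gentoo', 'Gentoo', 'Gentoo']

# Give each distinct string an integer code, starting at 1, in sorted order.
codes = {label: i for i, label in enumerate(sorted(set(z)), start=1)}

# Translate every entry of z into its integer code.
newz = [codes[label] for label in z]

print(codes)  # {'Adelie': 1, 'Gentoo': 2}
print(newz)   # [1, 1, 1, 1, 2, 2, 2, 2]
```

Sorting the unique values keeps the mapping stable between runs, since the iteration order of a set of strings is not guaranteed.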

My code:

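import matplotlib.pyplot as plt

# csv2dict is a helper that is not shown here; the loop below treats
# its result as a list with one dict per CSV row.
plist = csv2dict("penguins.csv")
x = []  # bill length list
y = []  # bill depth list
z = []  # species list
for count, pdict in enumerate(plist):
    try:
        # Parse the whole row before appending, so x, y and z stay the
        # same length when one of the values is bad.
        length = float(pdict["bill_length_mm"])
        depth = float(pdict["bill_depth_mm"])
        species = pdict["species"]
    except (KeyError, TypeError, ValueError):
        print(f"row {count+1} had a bad value")
        print(pdict)
        continue
    x.append(length)
    y.append(depth)
    z.append(species)

# Map each unique species name to an integer, starting at 1.
newz = {species: i for i, species in enumerate(sorted(set(z)), start=1)}

plt.scatter(x, y, c=[newz[d] for d in z])
plt.show()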
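
On the second part of the question (whether integer codes are a good way to color the points): they work, but the plot then has no legend saying which color is which species. One possible alternative, sketched here under the assumption that x, y and z are equal-length lists as above, is to group the points by label and call scatter once per group. The helper names group_by_label and plot_by_label are made up for this illustration.

```python
from collections import defaultdict


def group_by_label(x, y, z):
    """Return {label: (xs, ys)} so each label can be plotted separately."""
    groups = defaultdict(lambda: ([], []))
    for xi, yi, label in zip(x, y, z):
        groups[label][0].append(xi)
        groups[label][1].append(yi)
    return dict(groups)


def plot_by_label(x, y, z):
    """Draw one scatter series per label so matplotlib can build a legend."""
    # Imported here so the grouping helper above has no dependencies.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, (xs, ys) in sorted(group_by_label(x, y, z).items()):
        # Each call takes the next color from the default color cycle.
        ax.scatter(xs, ys, label=label)
    ax.set_xlabel("bill_length_mm")
    ax.set_ylabel("bill_depth_mm")
    ax.legend(title="species")
    plt.show()
```

Calling plot_by_label(x, y, z) with the lists built above should then show one color per species along with a legend.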
Dom