
I'm trying to write a simple program that reads in a CSV with various datasets (all of the same length) and automatically plots them all (as a Pandas DataFrame scatter plot) on the same figure. My current code does this well, but all the marker colors are the same (blue). I'd like to figure out how to apply a colormap so that, if I later have much larger data sets (say, 100+ different X-Y pairings), each series is automatically given its own color as it is plotted. Eventually, I would like this to be a quick and easy method to run from the command line. I did not have luck reading the documentation or Stack Exchange; hopefully this is not a duplicate!

I've tried the recommendations from these posts:

1) Setting different color for each series in scatter plot on matplotlib

2) https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.scatter.html

3) https://matplotlib.org/users/colormaps.html

However, the first one essentially grouped the data points according to their position on the x-axis and made those groups of data the same color (not what I want, each series of data is roughly a linearly increasing function). The second and third links seemed to have worked, but I don't like the colormap choices (e.g. "viridis", many colors are too similar and it's hard to distinguish data points).

This is a simplified version of my code so far (took out other lines that automatically named axes, etc. to make it easier to read). I've also removed any attempts I've made to specify a colormap, for more of a blank canvas feel:

''' Importing multiple scatter data and plotting '''

import pandas as pd
import matplotlib.pyplot as plt

### Data file path (please enter Dataframe however you like)
path = r'/Users/.../test_data.csv'

### Read in data CSV
data = pd.read_csv(path)

### List of headers
header_list = list(data)

### Set data type to float so modified data frame can be plotted
data = data.astype(float)

### X-axis limits
xmin = 1e-4
xmax = 3e-3

## Create subplots to be plotted together after loop
fig, ax = plt.subplots()

### Since there are multiple X-axes (every other column), this loop only plots every other x-y column pair

for i in range(0, len(header_list), 2):
    # Columns i and i + 1 form one x-y pair
    dfplot = data.plot.scatter(x=header_list[i], y=header_list[i + 1], ax=ax)
    dfplot.set_xlim(xmin, xmax)  # Setting limits on X axis

plt.show()

The dataset can be found at the Google Drive link below. Thanks for your help!

https://drive.google.com/drive/folders/1DSEs8D7lIDUW4NIPBl2qW2EZiZxslGyM?usp=sharing

  • Possible duplicate of [How to pick a new color for each plotted line within a figure in matplotlib?](https://stackoverflow.com/questions/4971269/how-to-pick-a-new-color-for-each-plotted-line-within-a-figure-in-matplotlib) – Valentino Sep 05 '19 at 17:07
  • I believe I tried something similar to this (see the first link I posted in my description), but it did not color each series appropriately. I think it could work but it's not immediately clear to me how to implement it with the loop I've written. Admittedly, I don't fully understand how to properly call the cm.rainbow attribute. – tmoney5150 Sep 05 '19 at 18:00
  • I would suggest you spend more time trying to understand the linked post. It's pretty much exactly what you need, even with 3 possible realizations. – ImportanceOfBeingErnest Sep 05 '19 at 18:52
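
For reference, here is a minimal sketch of how the idea from the linked post might be applied to the every-other-column layout in the question: sample one color per x-y pair from a colormap and pass it to each `plot.scatter` call. The helper names `series_colors` and `plot_xy_pairs` and the default `rainbow` colormap are assumptions made for illustration; they do not come from the question or the linked post. Colors are converted to hex strings so that matplotlib does not mistake a 4-value RGBA tuple for per-point data.

```python
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def series_colors(n_series, cmap_name="rainbow"):
    """Return n_series hex color strings spaced evenly across a colormap.

    Hypothetical helper; "rainbow" is only an example choice of colormap.
    """
    cmap = plt.get_cmap(cmap_name)
    return [mcolors.to_hex(cmap(v)) for v in np.linspace(0.0, 1.0, n_series)]


def plot_xy_pairs(data, ax, cmap_name="rainbow"):
    """Scatter-plot column pairs (0, 1), (2, 3), ... each in its own color.

    Hypothetical helper; assumes the columns alternate x, y, x, y, ...
    as described in the question. Returns the list of colors used.
    """
    headers = list(data)
    n_pairs = len(headers) // 2
    colors = series_colors(n_pairs, cmap_name)
    for k in range(n_pairs):
        x_col, y_col = headers[2 * k], headers[2 * k + 1]
        # One color string per call, so every point in the series matches
        data.plot.scatter(x=x_col, y=y_col, ax=ax, color=colors[k])
    return colors
```

With the variables from the question, this would be called as `plot_xy_pairs(data, ax)` in place of the loop, followed by `ax.set_xlim(xmin, xmax)` and `plt.show()`. With 100+ series, neighboring colors from any continuous colormap will still look similar, so swapping `cmap_name` only changes which hues are used, not how far apart adjacent series are.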
