I'm trying to create a simple plot with lines colored according to a factor variable, using plotnine 0.6.0 in Python 3.7.4.

import pandas as pd
import plotnine as pn
import datetime

# data
df = pd.DataFrame(
{'name': ('Eric', 'Eric', 'Eric', 'Eric', 'Eric', 'Eric', 'Nico', 'Nico',
   'Nico', 'Nico', 'Nico', 'Nico', 'Sanne', 'Sanne', 'Sanne', 'Sanne',
   'Sanne', 'Sanne'),
 'date': (datetime.date(2013, 8, 15), datetime.date(2013, 8, 15),
   datetime.date(2013, 8, 15), datetime.date(2013, 8, 16),
   datetime.date(2013, 8, 16), datetime.date(2013, 8, 16),
   datetime.date(2013, 8, 15), datetime.date(2013, 8, 15),
   datetime.date(2013, 8, 15), datetime.date(2013, 8, 16),
   datetime.date(2013, 8, 16), datetime.date(2013, 8, 16),
   datetime.date(2013, 8, 15), datetime.date(2013, 8, 15),
   datetime.date(2013, 8, 15), datetime.date(2013, 8, 16),
   datetime.date(2013, 8, 16), datetime.date(2013, 8, 16)),
 'altitude': ( 71,  68,  68,  92,  95, 104, 382, 197, 206, 157, 156, 157,  55,
    54,  55,  65,  62,  73)
 })

# summarize the data by date
summ = df.groupby(['name', 'date']).altitude.mean().reset_index(name = 'altitude')

# plot the data by "name"
pn.ggplot(mapping = pn.aes(x = 'date',
                      y = 'altitude',
                      color = 'name'),
     data = summ) +\
pn.geom_line()

This code creates the background that I expect: [image]

But it emits this warning:

C:\Anaconda3\lib\site-packages\plotnine\geoms\geom_path.py:83: 
PlotnineWarning: geom_path: Each group consist of only one observation. 
Do you need to adjust the group aesthetic?
"group aesthetic?", PlotnineWarning)

If I remove the color mapping,

pn.ggplot(mapping = pn.aes(x = 'date',
                      y = 'altitude'),
     data = summ) +\
pn.geom_line()

I get:

[image]

I know my problem is related to this, but I don't want 1 line. I want a different line for each name.

filups21
    By using the standard library `datetime`, you get an x-axis that is discrete, i.e. `df['date'].dtype` is `object`. The grouping is determined by *all* the discrete mappings, unless `group` is mapped directly. You can convert to the pandas date representation with `df['date'] = pd.to_datetime(df['date'])`, or you can set `group='name'`. – has2k1 Mar 29 '20 at 14:36

1 Answer

Well, in the several hours I wasted on this, I stumbled on the solution: I must map both the `color` aesthetic and the `group` aesthetic to `name`. (Note: this is not necessary when using ggplot2 in R, but it is necessary here in plotnine because the `date` column holds standard library `datetime.date` objects, which plotnine treats as discrete.)

pn.ggplot(mapping = pn.aes(x = 'date',
                      y = 'altitude',
                      color='name',
                      group='name'),
     data = summ) +\
pn.geom_line()

[image]
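The comment on the question points to an alternative that avoids the explicit `group` mapping: convert the `date` column to pandas datetimes so the x-axis is no longer discrete. Below is a minimal sketch of that idea, using a four-row subset of the question's data. The helper name `plot_by_name` is hypothetical, and I have only reasoned through the plotting part from the comment rather than re-running it on plotnine 0.6.0.

```python
import datetime
import pandas as pd

# A subset of the question's rows (one observation per name and date)
df = pd.DataFrame({
    'name': ['Eric', 'Eric', 'Nico', 'Nico'],
    'date': [datetime.date(2013, 8, 15), datetime.date(2013, 8, 16),
             datetime.date(2013, 8, 15), datetime.date(2013, 8, 16)],
    'altitude': [71, 92, 382, 157],
})

# datetime.date values are stored as generic Python objects,
# which is why the x-axis is treated as discrete
print(df['date'].dtype)  # object

# Convert to the pandas datetime representation before summarizing
df['date'] = pd.to_datetime(df['date'])
summ = df.groupby(['name', 'date']).altitude.mean().reset_index(name='altitude')
print(summ['date'].dtype)  # now a datetime64 dtype


def plot_by_name(data):
    """Hypothetical helper: the original plot call, without group='name'."""
    # Imported here so the dtype check above runs even without plotnine
    import plotnine as pn
    return (pn.ggplot(data=data,
                      mapping=pn.aes(x='date', y='altitude', color='name'))
            + pn.geom_line())
```

Calling `plot_by_name(summ)` should then give one line per name, since `name` is the only discrete mapping left. Setting `group='name'` as above remains the more explicit option if the column has to stay as `datetime.date`.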
