I have a df in this form:

{'content': {175: nan,
  176: nan,
  177: 'Address not Found',
  178: 'Delivery delayed-transport issues',
  179: nan,
  180: 'Parcel returned',
  181: 'Parcel lost in mail',
  182: 'Parcel received',
  183: 'Return requested',
  184: 'Repeat order placed'},
 'sales': {175: 7.0,
  176: 7.0,
  177: 9.0,
  178: 13.0,
  179: 11.0,
  180: 9.0,
  181: 19.0,
  182: 14.0,
  183: 9.0,
  184: 9.0},
 'order_date': {175: Timestamp('2019-08-28 16:30:00'),
  176: Timestamp('2019-08-30 11:55:53'),
  177: Timestamp('2019-09-06 14:51:14'),
  178: Timestamp('2019-09-06 15:03:22'),
  179: Timestamp('2019-09-06 15:46:11'),
  180: Timestamp('2019-09-06 16:08:03'),
  181: Timestamp('2019-09-06 17:13:01'),
  182: Timestamp('2019-09-16 21:38:29'),
  183: Timestamp('2019-09-25 12:35:29'),
  184: Timestamp('2019-09-25 22:22:51')}}

This is in reference to this question: here

I want to plot a line chart that uses the content column for both color and symbol. However, when I do this:

fig = px.line(df, x='order_date', y='sales', color='content', symbol='content',
              color_discrete_sequence=px.colors.qualitative.Pastel,
              markers=True, line_shape='hvh')

I get a separate line for each content value, and some of them show up as isolated dots that are not connected to anything. I am not sure why this happens. I tried replacing the NaN values with None, but the problem remains.

Any help with this would be greatly appreciated.

  • The lines aren't connected because specifying `color` or `symbol` in plotly express makes plotly split the data into separate traces, and these traces have no connection to each other. I think I have an idea for a possible workaround and will post it if I figure something out. – Derek O Apr 25 '23 at 03:00

1 Answer

When you use px.line in plotly express, specifying a color or symbol causes plotly to split your data by each unique color (or symbol) value and plot each group as a separate trace. Lines are only drawn between points within the same trace.

Imagine your data looks like {'time': [1,2,3,4,5], 'values': [10,20,30,40,50], 'color': ['a','b','c','a','b']}. When you define fig = px.line(df, x='time', y='values', color='color'), plotly will draw one line between the two 'a' points, another between the two 'b' points, and render the 'c' point as a single isolated point. What you want instead is a single line connecting all of your points in order by time.
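
To make the grouping concrete, here is a minimal plain-Python sketch of the toy data from the paragraph above. It only mimics the idea of splitting points by color value; it is not plotly's actual implementation, and the names rows and traces are just for illustration.

```python
from collections import defaultdict

# Toy data from the paragraph above as (time, value, color) rows.
# This is an illustration only, not the asker's DataFrame.
rows = [
    (1, 10, "a"),
    (2, 20, "b"),
    (3, 30, "c"),
    (4, 40, "a"),
    (5, 50, "b"),
]

# Group the points by color value, similar in spirit to how each
# unique color ends up in its own trace.
traces = defaultdict(list)
for time, value, color in rows:
    traces[color].append((time, value))

# A group with a single point has nothing to connect to,
# so it can only show up as an isolated marker.
for color, points in traces.items():
    kind = "line" if len(points) > 1 else "isolated point"
    print(color, points, kind)
```

Groups 'a' and 'b' each contain two points, so each can be drawn as a line, while 'c' contains only one point. This is the same situation as the content values in the question, most of which occur exactly once.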

One possible workaround would be to create fig1 where we use px.line to draw only the line with no markers. Then create fig2 where we use px.scatter to draw only the markers (with the colors and symbols). Then combine the data from both figures together into fig3.

Here is an example:

import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

Timestamp = pd.Timestamp
nan = float("nan")

data = {'content': {175: nan,
  176: nan,
  177: 'Address not Found',
  178: 'Delivery delayed-transport issues',
  179: nan,
  180: 'Parcel returned',
  181: 'Parcel lost in mail',
  182: 'Parcel received',
  183: 'Return requested',
  184: 'Repeat order placed'},
 'sales': {175: 7.0,
  176: 7.0,
  177: 9.0,
  178: 13.0,
  179: 11.0,
  180: 9.0,
  181: 19.0,
  182: 14.0,
  183: 9.0,
  184: 9.0},
 'order_date': {175: Timestamp('2019-08-28 16:30:00'),
  176: Timestamp('2019-08-30 11:55:53'),
  177: Timestamp('2019-09-06 14:51:14'),
  178: Timestamp('2019-09-06 15:03:22'),
  179: Timestamp('2019-09-06 15:46:11'),
  180: Timestamp('2019-09-06 16:08:03'),
  181: Timestamp('2019-09-06 17:13:01'),
  182: Timestamp('2019-09-16 21:38:29'),
  183: Timestamp('2019-09-25 12:35:29'),
  184: Timestamp('2019-09-25 22:22:51')}}

df = pd.DataFrame(data=data)
# sort_values returns a new DataFrame, so assign the result back
df = df.sort_values(by='order_date')

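# fig1: a single grey line through all points, with no color/symbol grouping
fig1 = px.line(df, x='order_date', y='sales').update_traces(line_color='lightgrey')
# fig2: markers only, colored and shaped by the content column
fig2 = px.scatter(df, x='order_date', y='sales', color='content', symbol='content',
                  color_discrete_sequence=px.colors.qualitative.Pastel)
# fig3: combine the traces from both figures into one figure
fig3 = go.Figure(data=fig1.data + fig2.data)
fig3.show()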

Derek O