7

I'm using plotly express timeline to produce a Gantt chart following this example: https://medium.com/dev-genius/gantt-charts-in-python-with-plotly-e7213f932f1e

It automatically sets the x-axis to use dates, but I'd actually like to use plain integers (e.g. Project Kick-Off +1, Project Kick-Off +6, etc.).

Is there a way to make a timeline plot NOT use dates for the x-axis?

If I try using integers it'll treat them like milliseconds.

Rob

5 Answers

12

The answer:

Yes, it's possible! Just give integers as start and end "dates", calculate the difference between them (delta), and make these changes to your fig:

fig.layout.xaxis.type = 'linear'
fig.data[0].x = df.delta.tolist()


The details:

There actually is a way to achieve this, although the docs state that:

The px.timeline function by default sets the X-axis to be of type=date, so it can be configured like any time-series chart.

Every other piece of functionality in px.timeline() therefore seems to revolve around that fact. But if you ignore that and use integers as the values for Start and Finish, you can tweak a few attributes to get what you want. You just need to calculate the difference between each Start and Finish, for example like this:

df = pd.DataFrame([
    dict(Task="Job A", Start=1, Finish=4),
    dict(Task="Job B", Start=2, Finish=6),
    dict(Task="Job C", Start=3, Finish=10)
])
df['delta'] = df['Finish'] - df['Start']

And then tweak a little further:

fig.layout.xaxis.type = 'linear'
fig.data[0].x = df.delta.tolist()

Complete code:

import plotly.express as px
import pandas as pd

df = pd.DataFrame([
    dict(Task="Job A", Start=1, Finish=4),
    dict(Task="Job B", Start=2, Finish=6),
    dict(Task="Job C", Start=3, Finish=10)
])
df['delta'] = df['Finish'] - df['Start']

fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task")
fig.update_yaxes(autorange="reversed")

# Switch the x-axis from dates to plain numbers
fig.layout.xaxis.type = 'linear'
# Bar lengths are the durations; without a color argument there is only one trace
fig.data[0].x = df.delta.tolist()
fig.show()
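
Once the axis is linear, the ticks are bare integers. To get labels like "Project Kick-Off +1" as asked in the question, one option is to build the tick positions and their text yourself. The sketch below is only an illustration: `kickoff_ticks` and its parameters are hypothetical names, not part of plotly.

```python
def kickoff_ticks(start, stop, step=1, label="Project Kick-Off"):
    """Return (tickvals, ticktext) for integer offsets from start to stop, inclusive.

    Hypothetical helper: offset 0 is labelled with the bare label,
    every other offset gets a signed suffix such as "+2" or "-1".
    """
    tickvals = list(range(start, stop + 1, step))
    ticktext = [label if v == 0 else f"{label} {v:+d}" for v in tickvals]
    return tickvals, ticktext


tickvals, ticktext = kickoff_ticks(0, 10, step=2)
print(tickvals)
print(ticktext)
```

The two lists could then be handed to the figure with `fig.update_xaxes(tickmode="array", tickvals=tickvals, ticktext=ticktext)`.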
vestland
7

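I think this is a much simpler solution than overriding process_dataframe_timeline when you have to specify a color. Loop over the traces and give each one the deltas of its own rows:

for d in fig.data:
    # rows whose color value matches this trace's name
    filt = df['color'] == d.name
    d.x = df[filt]['delta'].tolist()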
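
The same lookup can be prepared up front with a groupby, which may be easier to inspect before touching the figure. This is a minimal sketch under a few assumptions: the column names (`color`, `delta`) follow the examples in this thread, each trace's `name` is the string form of its color value, and the bars inside a trace keep the DataFrame's row order. `deltas_by_trace` is a hypothetical helper name.

```python
import pandas as pd


def deltas_by_trace(df, group_col, delta_col="delta"):
    """Map each group label (as a string) to its list of durations, in row order."""
    return {
        str(label): group[delta_col].tolist()
        for label, group in df.groupby(group_col, sort=False)
    }


df = pd.DataFrame([
    dict(Task="Job A", Start=1, Finish=4, color="1"),
    dict(Task="Job B", Start=2, Finish=6, color="2"),
    dict(Task="Job C", Start=3, Finish=10, color="1"),
])
df["delta"] = df["Finish"] - df["Start"]

lookup = deltas_by_trace(df, "color")
print(lookup)

# With a figure built as in the answers here, each trace would then get its own list:
# for trace in fig.data:
#     trace.x = lookup[trace.name]
```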
JounghooLee
  • I am sorry, but you are not answering the question. – godidier Aug 19 '21 at 09:39
  • FYI, this is an additional comment on vestland's answer. If you try it out yourself, you will find it is much simpler than overwriting the process_dataframe_timeline function as in amos's answer. – JounghooLee Aug 22 '21 at 23:58
  • This is indeed a useful comment, as it simplifies the solution a lot! The problem in the initial solution is `fig.data[0].x`: the 0 index fixes only the first series of bars, and the rest of them stay invisible. The couple of lines mentioned here fixed this issue for me without patching plotly code. – Fedor Chervinskii Nov 04 '21 at 17:57
3

I tried the other answer listed here, but it doesn't work if I specify a color. In that case fig.data holds multiple Bar objects, and I don't think they contain the data necessary to assign all the deltas. However, I did find that I could monkeypatch the plotly code so that it no longer converts the values to time objects, and then I get the correct result:

import plotly.express as px
import pandas as pd

def my_process_dataframe_timeline(args):
    """
    Massage input for bar traces for px.timeline()
    """
    print("my method")
    args["is_timeline"] = True
    if args["x_start"] is None or args["x_end"] is None:
        raise ValueError("Both x_start and x_end are required")

    x_start = args["data_frame"][args["x_start"]]
    x_end = args["data_frame"][args["x_end"]]

    # note that we are not adding any columns to the data frame here, so no risk of overwrite
    args["data_frame"][args["x_end"]] = (x_end - x_start)
    args["x"] = args["x_end"]
    del args["x_end"]
    args["base"] = args["x_start"]
    del args["x_start"]
    return args
px._core.process_dataframe_timeline = my_process_dataframe_timeline

df = pd.DataFrame([
    dict(Task="Job A", Start=1, Finish=4, color="1"),
    dict(Task="Job B", Start=2, Finish=6, color="2"),
    dict(Task="Job C", Start=3, Finish=10, color="1")
])
df['delta'] = df['Finish'] - df['Start']

fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="color")
fig.update_yaxes(autorange="reversed") 

fig.layout.xaxis.type = 'linear'
fig.show()

Monkeypatching a private plotly function is obviously not desirable. It would be nice to get official support for this.
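
If you do go down this road, it may help to keep the patch temporary so the rest of the program still sees the original function. The sketch below uses the standard library's `unittest.mock.patch.object` on a stand-in object. `fake_module`, `process` and `replacement` are made-up names for illustration; in the code above the target would presumably be `px._core` and the attribute `"process_dataframe_timeline"`.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for a module that holds the function we want to swap out temporarily
fake_module = SimpleNamespace(process=lambda args: "original")


def replacement(args):
    """Hypothetical replacement used only inside the with-block."""
    return "patched"


# The attribute is replaced on entry and restored on exit, even if an error is raised
with mock.patch.object(fake_module, "process", replacement):
    inside = fake_module.process({})

outside = fake_module.process({})
print(inside, outside)
```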

amos
0

To illustrate how the answer from JounghooLee works, here is an example:

import plotly.express as px
import pandas as pd

df = pd.DataFrame([
    dict(Task="Job A", Start=1, Finish=10),
    dict(Task="Job B", Start=3, Finish=6),
    dict(Task="Job C", Start=3, Finish=10)
])

df['delta'] = df['Finish'] - df['Start']

fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Task")


print(fig.data[0].x)
print(fig.data[1].x)
print(fig.data[2].x)

a = [True, False, False]
b = [False, True, False]
c = [False, False, True]

aSeries = pd.Series(a)
bSeries = pd.Series(b)
cSeries = pd.Series(c)

fig.layout.xaxis.type = 'linear'

# the dataFrame df is filtered by the pandas series, to add the specific delta to each of the bars
fig.data[0].x = df[aSeries]['delta'].tolist()
fig.data[1].x = df[bSeries]['delta'].tolist()
fig.data[2].x = df[cSeries]['delta'].tolist()

print(fig.data[0].x)
print(fig.data[1].x)
print(fig.data[2].x)


fig.show()

The prints give the following output:

[0.]
[0.]
[0.]
(9,)
(3,)
(7,)

and fig.data contains three Bar traces, one per task.

If no color is assigned, this does not work, because fig.data then contains just one Bar trace that holds all three bars.

The print(fig.data[0].x) returns

[0. 0. 0.]

in this case.

See also this https://stackoverflow.com/a/71141827/7447940

0

Another option is to use a plotly bar plot. The base argument indicates where each bar starts, and the x value is the duration of the task:

import plotly.express as px
import pandas as pd

df = pd.DataFrame([
    dict(task="Job A", start=1, end=4),
    dict(task="Job B", start=2, end=6),
    dict(task="Job C", start=3, end=10)
])
# duration of each task; used as the bar length
df['delta'] = df['end'] - df['start']

fig = px.bar(
    df,
    base="start",   # where each bar begins
    x="delta",      # how long each bar is
    y="task",
    orientation='h'
)

fig.update_yaxes(autorange="reversed")
fig.show()


The barplot also accepts the color argument and correctly groups the tasks by the column you indicate, for example:

df = pd.DataFrame([
    dict(task="Job A", start=1, end=4, color="A"),
    dict(task="Job B", start=2, end=6, color="B"),
    dict(task="Job C", start=3, end=10, color="A")
])
df['delta'] = df['end'] - df['start']

fig = px.bar(
    df,
    base="start",
    x="delta",
    y="task",
    color="color",
    orientation='h'
)

fig.update_yaxes(autorange="reversed")
fig.show()


symduk