
I want to plot a chart that looks like a ridgeline/joyplot but, instead of displaying density, displays the volume of occurrences over time (like a time-series stacked area chart).

How can I plot the volume of each genre, instead of the density, while keeping the look and feel of the ridgeline/joyplot? I am struggling to do it with subplots, because I need to fill the area below the line and the line should be smooth.

I have the following sample database, code and plot:

Sample database (to run the code, copy and paste it to a .csv file):

movie_genre date_released numerical_date
Horror 16/02/2019 43512
Horror 12/02/2019 43508
Animation 21/02/2019 43517
Horror 14/01/2019 43479
Horror 17/02/2019 43513
Comedy 24/02/2019 43520
Animation 15/01/2019 43480
Comedy 25/02/2019 43521
Horror 03/01/2019 43468
Comedy 25/02/2019 43521
Comedy 15/02/2019 43511
Horror 08/02/2019 43504
Horror 16/02/2019 43512
Documentary 20/02/2019 43516
Comedy 05/02/2019 43501
Drama 03/02/2019 43499
Comedy 19/01/2019 43484
Animation 13/02/2019 43509
Horror 16/01/2019 43481
Animation 03/02/2019 43499
Comedy 14/01/2019 43479
Comedy 20/01/2019 43485
Drama 10/01/2019 43475
Animation 28/01/2019 43493
Animation 29/01/2019 43494
Drama 21/02/2019 43517
Horror 18/01/2019 43483
Comedy 17/02/2019 43513
Documentary 30/01/2019 43495
Documentary 12/02/2019 43508
Documentary 04/01/2019 43469
Drama 27/02/2019 43523
Comedy 15/01/2019 43480
Animation 14/02/2019 43510
Drama 03/02/2019 43499
Drama 17/02/2019 43513
Documentary 22/02/2019 43518
Horror 25/01/2019 43490
Horror 21/01/2019 43486
Documentary 25/01/2019 43490
Documentary 12/02/2019 43508
Comedy 13/02/2019 43509
Drama 13/02/2019 43509
Drama 05/01/2019 43470
Horror 15/02/2019 43511
Drama 31/01/2019 43496
Comedy 26/02/2019 43522
Horror 27/02/2019 43523
Documentary 25/02/2019 43521
Drama 18/01/2019 43483
Horror 22/02/2019 43518
Documentary 10/02/2019 43506
Drama 25/01/2019 43490
Documentary 01/01/2019 43466
Documentary 21/01/2019 43486
Documentary 10/01/2019 43475
Drama 27/02/2019 43523
Drama 26/01/2019 43491
Animation 17/01/2019 43482

Code:

import joypy
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from matplotlib import cm
import matplotlib.dates as mdates

%matplotlib inline

df = pd.read_csv(r'C:\Users\YOUR_USER\Downloads\movies.csv', delimiter=';') #directory where you saved the table above

df['date_released'] = pd.to_datetime(df['date_released'], format='%d/%m/%Y')

fig, ax = joypy.joyplot(df, 
                        by = 'movie_genre', 
                        column='numerical_date', 
                        colormap= cm.autumn,
                        overlap=1,
                        figsize = (13,8),
                        x_range=[43466, 43525],
                        xrot=90,
                        alpha=0.7)

print(ax[-1].get_xticks())

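# numerical_date looks like an Excel-style serial (43466 = 01/01/2019), so day zero is 1899-12-30, not 1900-01-01
new_dates = [mdates.num2date(x + mdates.date2num(np.datetime64('1899-12-30'))) for x in ax[-1].get_xticks()]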
new_dates = [t.strftime('%d.%m.%Y').lower() for t in new_dates]
ax[-1].set_xticklabels(new_dates)
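
A quick sanity check of the serial-to-date mapping used for the tick labels. This is only a sketch, and it assumes that numerical_date is an Excel-style serial day count (the question does not say so explicitly). Under that assumption the sample rows are consistent with a day zero of 1899-12-30. The names EXCEL_EPOCH and serial_to_date are hypothetical helpers introduced for illustration.

```python
from datetime import date, timedelta

# Assumption: serials follow the Excel 1900 date system, whose effective
# day zero is 1899-12-30 for any date after February 1900.
EXCEL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial):
    """Convert an Excel-style serial day count to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)


# Two rows from the sample table: 16/02/2019 <-> 43512 and 03/01/2019 <-> 43468
print(serial_to_date(43512).strftime("%d/%m/%Y"))  # 16/02/2019
print(serial_to_date(43468).strftime("%d/%m/%Y"))  # 03/01/2019
```

If the labels on the plot are shifted by a couple of days relative to date_released, the epoch used in the conversion is the first thing to check.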

Plot:

[Image: the joyplot produced by the code above]

M_B
  • Maybe include an example of what you want this to look like? Your description of what you want is not clear to me. – Jody Klymak Jul 22 '22 at 23:37
  • Hi Jody! I want it to look exactly like the picture of the Joyplot in the question. However, by definition, the Joyplot plots the **density** of each line category throughout the columns (in this case, time). For my purposes, visualizing density can be misleading, as I want to **compare how each category behaved over time**. – M_B Jul 23 '22 at 08:47
  • Thus, I need it to plot the **actual number of occurrences**, just like a simple time-series chart would, but with the smooth line, filled area below the line, and stacked-categories look, since that is the best way I could think of to show my data. Perhaps area subplots could do it... but I could not work it out in a nice way! – M_B Jul 23 '22 at 08:47
  • This is still not well-defined. Your data is just raw dates of occurrence by genre. What do you mean by "actual number of occurrences"? The total? Occurrences per time block, i.e. per month or per year? If it's the latter, you need to somehow calculate the total per month before you plot. Once you have that, you can just use ax.fill_between. – Jody Klymak Jul 23 '22 at 14:53
  • **Per month.** I'm presenting the above table in "plain" format, as it was used to plot the Joyplot. But yeah, perhaps I'll need to sum it, it'll depend on the solution. – M_B Jul 23 '22 at 21:01
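
The approach suggested in the comments (count occurrences per time block, then draw each genre with ax.fill_between) could be sketched roughly as below. This is a minimal illustration under stated assumptions, not a definitive solution. The helpers counts_per_period, smooth and ridgeline_counts are hypothetical names. The smoothing is a plain Gaussian moving average chosen for simplicity, so the curve is an approximation of the counts. The default block is weekly because the sample only spans two months (pass freq="MS" for month-start blocks). The inline rows are a subset copied from the sample table.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# A few rows copied from the sample table in the question.
rows = [
    ("Horror", "16/02/2019"), ("Horror", "12/02/2019"), ("Animation", "21/02/2019"),
    ("Horror", "14/01/2019"), ("Comedy", "24/02/2019"), ("Animation", "15/01/2019"),
    ("Comedy", "25/02/2019"), ("Horror", "03/01/2019"), ("Comedy", "15/02/2019"),
    ("Drama", "03/02/2019"), ("Comedy", "19/01/2019"), ("Drama", "10/01/2019"),
]
df = pd.DataFrame(rows, columns=["movie_genre", "date_released"])
df["date_released"] = pd.to_datetime(df["date_released"], format="%d/%m/%Y")


def counts_per_period(df, freq="W"):
    """Count releases per genre and time block; blocks with no releases become 0."""
    counts = (
        df.groupby([pd.Grouper(key="date_released", freq=freq), "movie_genre"])
        .size()
        .unstack(fill_value=0)
    )
    return counts.asfreq(freq, fill_value=0)


def smooth(x, y, points_per_step=20, sigma_steps=0.6):
    """Upsample (x, y) linearly, then apply a Gaussian moving average."""
    x_fine = np.linspace(x[0], x[-1], (len(x) - 1) * points_per_step + 1)
    y_fine = np.interp(x_fine, x, y)
    sigma = sigma_steps * points_per_step
    half = int(3 * sigma)
    kernel = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma) ** 2)
    kernel /= kernel.sum()
    # Pad with edge values so the curve does not drop artificially at the ends.
    y_pad = np.pad(y_fine, half, mode="edge")
    return x_fine, np.convolve(y_pad, kernel, mode="valid")


def ridgeline_counts(counts, overlap=0.4, figsize=(13, 8)):
    """One overlapping subplot per genre, with a shared y scale so volumes are comparable."""
    x = mdates.date2num(counts.index.to_pydatetime())
    colors = plt.cm.autumn(np.linspace(0, 1, counts.shape[1]))
    fig, axes = plt.subplots(nrows=counts.shape[1], sharex=True, sharey=True,
                             figsize=figsize, squeeze=False)
    for ax, genre, color in zip(axes[:, 0], counts.columns, colors):
        x_fine, y_fine = smooth(x, counts[genre].to_numpy(dtype=float))
        ax.fill_between(x_fine, y_fine, color=color, alpha=0.7)
        ax.plot(x_fine, y_fine, color="black", linewidth=1)
        ax.patch.set_alpha(0)  # transparent background so rows can overlap
        for side in ("top", "right", "left"):
            ax.spines[side].set_visible(False)
        ax.set_yticks([])
        ax.set_ylim(bottom=0)
        ax.set_ylabel(genre, rotation=0, ha="right", va="center")
    last = axes[-1, 0]
    last.xaxis.set_major_formatter(mdates.DateFormatter("%d.%m.%Y"))
    last.tick_params(axis="x", rotation=90)
    fig.subplots_adjust(hspace=-overlap)  # negative spacing gives the ridgeline overlap
    return fig


counts = counts_per_period(df, freq="W")
fig = ridgeline_counts(counts)
```

Sharing the y axis across rows is the design choice that makes the areas comparable between genres, which is the point of plotting counts rather than densities. In a script, call plt.show() or fig.savefig(...) afterwards to display or save the figure.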
