I'm trying to create a plot with multiple categorical axes. But instead of putting gene and db on the x-axis and mutations on the y-axis, HoloViews plots db on the y-axis and gene on the x-axis.

How can I get a multi-categorical plot out of this?

mutated_positions = hv.Scatter(totaldf,
              ['gene', 'db'], 'mutations', xrotation=45).opts(size=10, color='#024bc2', line_color='#002869', jitter=0.2, alpha=0.5)

The current plot looks like this: https://i.stack.imgur.com/jje5h.jpg I'm trying to get the axes to look like this: https://i.stack.imgur.com/GBziX.jpg with mutations on the y-axis.

The dataframe I'm using looks like this:

      gene       db  mutations
0     IGHV1-3  G1K_CL2          6
1    IGHV1-58  G1K_CL2          2
2    IGHV1-58  G1K_CL2          3
3     IGHV1-8  G1K_CL2          2
4    IGHV3-16  G1K_CL2          3
..        ...      ...        ...
141  IGHV4-61  G1K_CL3         11
142  IGHV4-61  G1K_CL3         12
143  IGHV4-61  G1K_CL3         10
144  IGHV4-61  G1K_CL3         13
145  IGHV7-81  G1K_CL3          4
Sander van den Oord
Moopsish

1 Answer

The code below is one way of putting mutations on the y-axis and db on the x-axis, with gene separating the panels.
It creates an NdLayout, which means a separate plot is created for every gene.

# import libraries
import pandas as pd
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')

# create dataframe
data = [
    ['IGHV4-61', 'G1K_CL2', 11],
    ['IGHV4-61', 'G1K_CL3', 12],
    ['IGHV4-61', 'G1K_CL3', 10],
    ['IGHV7-81', 'G1K_CL2', 13],
    ['IGHV7-81', 'G1K_CL3',  4],
]
df = pd.DataFrame(data, columns=['gene', 'db', 'mutations'])

# create layout plot with mutations on the y-axis
layout_plot = hv.Dataset(df).to.scatter('db', 'mutations').layout('gene')

# make plot look nicer
layout_plot = layout_plot.opts(opts.Scatter(size=10, ylim=(0, 15), width=250))

# show structure of holoviews layout plot
print(layout_plot)

# show plot in Jupyter
layout_plot

The structure of the plot looks as follows:

:NdLayout [gene]
   :Scatter [db] (mutations)

The resulting plot looks like this:
[Image: NdLayout for gene, db and mutations]

As an alternative, you could use the hvplot library, which is built on top of HoloViews; it will give you the same result as above. It works much like pandas plotting: use the arguments by='gene' and subplots=True to create the NdLayout.

# import libraries
import holoviews as hv
import hvplot.pandas  # registers the .hvplot accessor on pandas DataFrames
hv.extension('bokeh')

# create layout plot with hvplot
layout_plot = df.hvplot(
    kind='scatter',
    x='db',
    y='mutations',
    by='gene',
    subplots=True,  # creates a layout
    size=100,  # marker size
    ylim=(0, 15), 
    width=250,  # width of plot
)

# show structure of holoviews layout plot
print(layout_plot)

# show plot in Jupyter
layout_plot
Sander van den Oord
Thank you for your answer but it's not exactly what I'm looking for. I'm trying to visualize it in one plot. The end goal is to add it on top of a boxplot which is formatted in the same way. – Moopsish Sep 23 '19 at 09:25