I've reached a point where I don't know what else to do.

I have this DataFrame:

dff8df      !!!0 WWSF   VPIP   VPIP   VPIP
0         52.0  38.64  38.64  38.64
1         62.0  50.00  50.00  50.00
2         73.0  56.25  56.25  56.25
3         99.0  23.08  23.08  23.08
4         30.0  41.67  41.67  41.67
..         ...    ...    ...    ...
540       50.0  18.75  18.75  18.75
541       99.0  26.32  26.32  26.32
542       50.0  28.57  28.57  28.57
543       83.0  14.29  14.29  14.29
544       57.0  38.89  38.89  38.89

[545 rows x 4 columns]

and this code:

print("dff8df",dff8)

    figure = px.scatter(data_frame=dff8,
                   x=dff8[:,drop1],
                   y=dff8[:,drop2],
                   color=dff8[:,drop4],
                   size=dff8[:,drop3],
                   trendline="ols",
                   trendline_color_override="red",
                   title="%s vs %s"%(drop1, drop2),
                   )

and it gives this error with plotly in python:

pandas.errors.InvalidIndexError: (slice(None, None, None), 'VPIP')

How can I solve this problem?

Derek O

1 Answer

A number of areas to consider:

  • This is a pandas error, not a plotly error.
  • You have not defined drop1 etc., so I can only assume you mean them to be column names. Hence I have defined them as the first four columns of your sample data.
  • Your sample data looks problematic: it has multiple columns named VPIP. This is the data frame created from read_csv():
dff8df  !!!0   WWSF   VPIP  VPIP.1  VPIP.2
0         52  38.64  38.64   38.64     nan
1         62  50     50      50        nan
2         73  56.25  56.25   56.25     nan
3         99  23.08  23.08   23.08     nan
  • dff8[:,drop1] is invalid pandas code to select a column. This should be dff8.loc[:, drop1]. Additionally, why slice in this way when you can just pass column names to plotly express?
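To make the indexing point concrete, here is a minimal sketch using pandas only. The small frame and the `col` variable are stand-ins (assumptions, not the asker's real data); `col` plays the role of `drop1` and is assumed to hold a column name. The exact exception class can differ between pandas versions, so the sketch only records its name.

```python
import pandas as pd

# Small stand-in frame; column names and values are borrowed from the question.
df = pd.DataFrame({"WWSF": [52.0, 62.0, 73.0], "VPIP": [38.64, 50.00, 56.25]})
col = "VPIP"  # stands in for drop1, assumed to be a column name

# df[:, col] hands the tuple (slice(None), "VPIP") to pandas as ONE column key,
# which is not a valid key for plain [] indexing on a DataFrame.
try:
    df[:, col]
    error_name = None
except Exception as exc:  # InvalidIndexError in the question's traceback
    error_name = type(exc).__name__
print(error_name)

# Either of these selects the column as a Series.
by_label = df[col]
by_loc = df.loc[:, col]
print(by_label.equals(by_loc))
```

Row/column slicing with a comma is what `.loc` (label based) and `.iloc` (position based) are for; plain `[]` on a DataFrame expects a column key.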

working code

import io
import pandas as pd
import plotly.express as px

dff8 = pd.read_csv(
    io.StringIO(
        """dff8df      !!!0 WWSF   VPIP   VPIP   VPIP
0         52.0  38.64  38.64  38.64
1         62.0  50.00  50.00  50.00
2         73.0  56.25  56.25  56.25
3         99.0  23.08  23.08  23.08
4         30.0  41.67  41.67  41.67
540       50.0  18.75  18.75  18.75
541       99.0  26.32  26.32  26.32
542       50.0  28.57  28.57  28.57
543       83.0  14.29  14.29  14.29
544       57.0  38.89  38.89  38.89"""
    ),
    sep=r"\s+",
    index_col=0,
)

drop1, drop2, drop3, drop4 = dff8.columns[0:4].tolist()

figure = px.scatter(
    data_frame=dff8,
    x=dff8.loc[:, drop1],
    y=dff8.loc[:, drop2],
    color=dff8.loc[:, drop4],
    size=dff8.loc[:, drop3],
    trendline="ols",
    trendline_color_override="red",
    title="%s vs %s" % (drop1, drop2),
)


figure
Rob Raymond
  • Hi Rob, I'm doing another graph in the same dashboard with only one variable, but it gives me KeyError: 0. Is it because of drop1 = dfff3.columns[0:1].tolist()? Is there something wrong with this? – boxertrain Sep 12 '22 at 17:27
  • What is `drop1`? An argument to a **callback** that is the value of a dropdown? As I stated in the answer, it's not clear why you are slicing your dataframe in this unusual way rather than just passing column names (instead of **pandas** series) to `px.scatter()`. I would expect `px.scatter(dfff3, y=drop1)` to work, be far simpler to code, and avoid coding errors... The reason `drop1, drop2, drop3, drop4 = dff8.columns[0:4].tolist()` is in the answer is that the question didn't specify what they are, so I had to initialise them to something... – Rob Raymond Sep 12 '22 at 17:35
  • the drop is an argument to a callback – boxertrain Sep 12 '22 at 17:40
  • Then just go with the simple, normal way to use **plotly express**. Use all the convenience capabilities of the high-level API – Rob Raymond Sep 12 '22 at 18:27
  • yes .... I changed the comment coz I did that after. anyway thanks – boxertrain Sep 12 '22 at 20:29