I suspect this is a simple coding issue, but I cannot figure out where I'm going wrong:

I'm running plotly in a Jupyter notebook and want the bubble sizes in a scatter plot to correspond to one of the features.

I include 5 features:

  1. Dimension A (x),
  2. Dimension B (y),
  3. Municipality (the bubbles),
  4. Province (colour),
  5. Municipal Size.

There are several municipalities per province; the code below nicely assigns the same colour to municipalities of the same province.

figure={'data': [
                go.Scatter(
                    x=df[df['Province'] == i]['Dimension A'],
                    y=df[df['Province'] == i]['Dimension B'],
                    text=df[df['Province'] == i]['Municipalities'],
                    mode='markers',
                    opacity=0.7,
                    marker={
                        'size': df['Size'],
                        'line': {'width': 0.5, 'color': 'white'}
                    },
                    name=i
                ) for i in df.Province.unique()
            ],
            'layout': go.Layout(               
                xaxis={'title': 'Dimension A'},
                yaxis={'title': 'Dimension B'},
                height=700,
                width= 1200,
                margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
                legend={'x': 0, 'y': 1},
                hovermode='closest',
                autosize = False,
            )     
}
iplot(figure)

However, when I try to tie the size of the bubbles to the "Size" feature, it goes weird. The first municipality's bubble does get the right size, but that same value is then also applied to the first municipality of every other province (in the order they appear in the df). Maddening. Wisdom massively appreciated.

Here is a snippet of the data:

Municipalities,Province,Dimension B,Dimension A,Size
!Kheis,NC,0.79,1.39,4.22
//Khara Hais,NC,0.61,4.75,4.97
Abaqulusi,KZN,0.44,4.43,5.32
Aganang,LP,0.55,2.56,5.12
Albert Luthuli,MP,0.66,3.56,5.27
Amahlathi,EC,0.20,3.33,5.09
Ba-Phalaborwa,LP,0.22,4.33,5.18
Baviaans,EC,0.10,1.95,4.25
Beaufort West,WC,0.68,3.04,4.70
Bela-Bela,LP,0.60,3.87,4.82
Bergrivier,WC,0.53,2.77,4.79
Bitou,WC,0.68,5.00,4.69
Blouberg,LP,0.37,4.34,5.21
Blue Crane Route,EC,0.70,2.20,4.56
Breede Valley,WC,0.88,4.67,5.22
Buffalo City,EC,0.67,6.57,5.88
Bushbuckridge,MP,0.87,5.14,5.73
Camdeboo,EC,0.10,1.61,4.71
Cape Agulhas,WC,0.42,3.00,4.52
Cederberg,WC,0.73,2.94,4.70
City of Cape Town,WC,0.73,7.74,6.57
City of Johannes,GA,0.65,7.81,6.65
RandomForestRanger
  • Please provide an excerpt of your DataFrame in order to make your problem [reproducible](https://stackoverflow.com/help/minimal-reproducible-example). Thanks. – sentence Jun 10 '19 at 10:16
  • Thanks, @sentence. I'm not sure the data I provided above is an improvement? I think it's something with the loop... – RandomForestRanger Jun 10 '19 at 11:27
  • Have you tried `'size': df[df['Province'] == i]['Size']` inside the marker dictionary? – sentence Jun 10 '19 at 12:39

1 Answer

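Marker sizes are matched to the points of a trace by position, so passing the whole `df['Size']` column gives every province's trace the sizes from the first rows of the DataFrame. The sizes need the same province filter as x, y and text. Change the size key in the marker dictionary from: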

'size': df['Size'],

to:

'size': df[df['Province'] == i]['Size']
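
To see why the unfiltered column misbehaves, here is a minimal sketch using five rows from the question's data. It assumes that a trace with n points uses the first n entries of the size array by position. The variable names (`csv_text`, `lp`, `wrong`, `right`) are only illustrative.

```python
import io

import pandas as pd

# Five rows copied from the data snippet in the question.
csv_text = """Municipalities,Province,Dimension B,Dimension A,Size
!Kheis,NC,0.79,1.39,4.22
//Khara Hais,NC,0.61,4.75,4.97
Abaqulusi,KZN,0.44,4.43,5.32
Aganang,LP,0.55,2.56,5.12
Ba-Phalaborwa,LP,0.22,4.33,5.18
"""
df = pd.read_csv(io.StringIO(csv_text))

# Rows that end up in the 'LP' trace.
lp = df[df['Province'] == 'LP']

# Unfiltered column: the two LP points would take the first two sizes
# of the whole DataFrame, which belong to the NC municipalities.
wrong = df['Size'].tolist()[:len(lp)]

# Filtered column: sizes line up with the LP rows.
right = lp['Size'].tolist()

print(wrong)  # [4.22, 4.97]
print(right)  # [5.12, 5.18]
```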

Each bubble is then sized by the `Size` value of its own municipality.

sentence