Here is the code I am running. It creates a bar plot, but I would like to group together values within $5 of each other into a single bar. The bar graph currently shows all 50 values as individual bars, which makes the data nearly unreadable. Is a histogram a better option? Also, bdf holds the bids and adf holds the asks.

import gdax
import pandas as pd
import matplotlib.pyplot as plt

b = 'bids'
a = 'asks'
public_client = gdax.PublicClient()
o = public_client.get_product_order_book('BTC-USD', level=2)

# Each order book entry has three fields; the third is not needed here.
bdf = pd.DataFrame(o[b], columns=['price', 'size', 'null'], dtype='float')
# The asks come from o[a] (the original read o[b] twice).
adf = pd.DataFrame(o[a], columns=['price', 'size', 'null'], dtype='float')

del bdf['null']
del adf['null']
bdf.plot.bar(x='price', y='size')
plt.show()
pause = input('pause')

Here is an example of the data I receive as a DataFrame object.

       price       size
0   11390.99  13.686618
1   11389.40   0.002000
2   11389.00   0.090700
3   11386.53   0.060000
4   11385.26   0.010000
5   11385.20   0.453700
6   11381.33   0.006257
7   11380.06   0.011100
8   11380.00   0.001000
9   11378.61   0.729421
10  11378.60   0.159554
11  11375.00   0.012971
12  11374.00   0.297197
13  11373.82   0.005000
14  11373.72   0.661006
15  11373.39   0.001758
16  11373.00   1.000000
17  11370.00   0.082399
18  11367.22   1.002000
19  11366.90   0.010000
20  11364.67   1.000000
21  11364.65   6.900000
22  11364.37   0.002000
23  11361.23   0.250000
24  11361.22   0.058760
25  11360.89   0.001760
26  11360.00   0.026000
27  11358.82   0.900000
28  11358.30   0.020000
29  11355.83   0.002000
30  11355.15   1.000000
31  11354.72   8.900000
32  11354.41   0.250000
33  11353.00   0.002000
34  11352.88   1.313130
35  11352.19   0.510000
36  11350.00   1.650228
37  11349.90   0.477500
38  11348.41   0.001762
39  11347.43   0.900000
40  11347.18   0.874096
41  11345.42   7.800000
42  11343.21   1.700000
43  11343.02   0.001754
44  11341.73   0.900000
45  11341.62   0.002000
46  11341.00   0.024900
47  11340.00   0.400830
48  11339.77   0.002946
49  11337.00   0.050000

Is pandas the best way to manipulate this data?

2 Answers

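I'm not sure I understand correctly, but if you want to sum the bid sizes in $5 price steps, here is how you can do it (df is a frame with the price and size columns shown in the question; each group is labelled by the lower edge of its $5 bucket):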

> df["size"].groupby((df["price"]//5)*5).sum()
price
11335.0     0.052946
11340.0     3.029484
11345.0    10.053358
11350.0    12.625358
11355.0     1.922000
11360.0     8.238520
11365.0     1.012000
11370.0     2.047360
11375.0     0.901946
11380.0     0.018357
11385.0     0.616400
11390.0    13.686618
Name: size, dtype: float64
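
For reference, the one-liner above can be wrapped into a small helper. This is only a sketch: the function name bucket_sizes and the step parameter are my own, and the sample rows are the first five rows of the data in the question.

```python
import pandas as pd

def bucket_sizes(df, step=5.0):
    """Sum 'size' per price bucket of width `step`.

    Each bucket is labelled by its lower edge, e.g. 11385.0 covers
    prices from 11385.00 up to (but not including) 11390.00.
    """
    lower_edge = (df["price"] // step) * step
    return df["size"].groupby(lower_edge).sum()

# First five rows of the sample data from the question.
sample = pd.DataFrame({
    "price": [11390.99, 11389.40, 11389.00, 11386.53, 11385.26],
    "size": [13.686618, 0.002000, 0.090700, 0.060000, 0.010000],
})

grouped = bucket_sizes(sample)
print(grouped)
```

The result is an ordinary Series, so grouped.plot.bar() followed by plt.show() should give one bar per $5 bucket instead of one bar per price.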
Marat

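You can use pd.cut here. Note that bins=3 splits the whole price range into three equal-width intervals, so change the bins argument if you want a different grouping: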

df['bin']=pd.cut(df.price,bins=3)
df.groupby('bin')['size'].sum().plot(kind='bar')
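
If fixed $5 bins are wanted rather than a fixed number of bins, pd.cut also accepts explicit bin edges. The following is a minimal sketch under that assumption: the helper name bucket_with_cut and the way the edges are built are mine, not part of the answer above, and the sample rows are the first five rows of the data in the question.

```python
import numpy as np
import pandas as pd

def bucket_with_cut(df, step=5.0):
    """Sum 'size' over half-open price intervals [edge, edge + step)."""
    # Round the price range outward to multiples of `step` so that every
    # price falls strictly inside the outermost edges.
    lo = np.floor(df["price"].min() / step) * step
    hi = np.floor(df["price"].max() / step) * step + step
    edges = np.arange(lo, hi + step / 2, step)
    bins = pd.cut(df["price"], bins=edges, right=False)
    # observed=False keeps empty intervals, which then show up with a sum of 0.
    return df.groupby(bins, observed=False)["size"].sum()

# First five rows of the sample data from the question.
sample = pd.DataFrame({
    "price": [11390.99, 11389.40, 11389.00, 11386.53, 11385.26],
    "size": [13.686618, 0.002000, 0.090700, 0.060000, 0.010000],
})

binned = bucket_with_cut(sample)
print(binned)
```

As before, binned.plot(kind='bar') should draw one bar per interval; the interval labels make the bucket boundaries explicit on the x axis.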

BENY