As you didn't present your input file, I prepared it on my own:
timestamp;bids
1579082401;[123.12, 300]
1579082461;[135.40, 220]
1579082736;[130.76, 20]
1579082801;[123.12, 180]
To read it I used:
orderbook = pd.read_csv('Input.csv', sep=';')
orderbook.timestamp = pd.to_datetime(orderbook.timestamp, unit='s')
Its content is:
timestamp bids
0 2020-01-15 10:00:01 [123.12, 300]
1 2020-01-15 10:01:13 [135.40, 220]
2 2020-01-15 10:05:36 [130.76, 20]
3 2020-01-15 10:06:41 [123.12, 180]
Now:
- timestamp has been converted to native pandasonic type of datetime,
- but bids is of object type (actually, a string).
and, as I suppose, this is the same when read from your input file.
And now the main task: The first step is to extract both numbers from bids,
convert them to float and int and save in respective columns:
orderbook = orderbook.join(orderbook.bids.str.extract(
r'\[(?P<bidprice>\d+\.\d+), (?P<bidvolume>\d+)]'))
orderbook.bidprice = orderbook.bidprice.astype(float)
orderbook.bidvolume = orderbook.bidvolume.astype(int)
Now orderbook contains:
timestamp bids bidprice bidvolume
0 2020-01-15 10:00:01 [123.12, 300] 123.12 300
1 2020-01-15 10:01:01 [135.40, 220] 135.40 220
2 2020-01-15 10:05:36 [130.76, 20] 130.76 20
3 2020-01-15 10:06:41 [123.12, 180] 123.12 180
and you can generate e.g. a scatter plot, calling:
orderbook.plot.scatter('bidprice', 'bidvolume');
or other plotting function.
Another possibility
Or maybe your orderbook_data is a dictionary? Something like:
orderbook_data = {
'timestamp': [1579082401, 1579082461, 1579082736, 1579082801],
'bids': [[123.12, 300], [135.40, 220], [130.76, 20], [123.12, 180]] }
In this case, when you create a DataFrame from it, the column types
are initially:
- timestamp - int64,
- bids - also object, but this time each cell contains a plain
pythonic list.
Then you can also convert timestamp column to datetime just like
above.
But to split bids (a column of lists) into 2 separate columns,
you should run:
orderbook[['bidprice', 'bidvolume']] = pd.DataFrame(orderbook.bids.tolist())
Then you have 2 new columns with respective components of the
source column and you can create your graphics jus like above.