Let's say I have data like this:

[
 {'time': 1626459705, 'price': 278.989978},
 {'time': 1626459695, 'price': 279.437975}
]

Note: This is just sample data I created myself. In reality there may be any number of transactions per minute, so the data will vary from minute to minute.

How can I convert it into OHLC candlestick data for, say, 1, 3, or 5 minutes using Python, without any external library like Pandas? Is there an easy way to do it?

Thanks in Advance

Rakesh Poddar
  • An "OHLC" chart is usually day by day. Otherwise, you don't really have "open" and "close". That looks like 10-second intervals, so you could certainly gather up "start", "max", "min", and "end" for an interval that's a multiple of 10 seconds. Graphing it would require an external library, of course. – Tim Roberts Dec 23 '21 at 04:07
  • I have just included a sample of the data. In reality there is no fixed 10-second or 3-second interval. There may be 3, 4, or more transactions in 1 minute. – Rakesh Poddar Dec 23 '21 at 04:16
  • Why can't we do with Pure Python? – Rakesh Poddar Dec 23 '21 at 04:18
  • Because it is pointless. Arithmetic is easy. Graphics is hard -- really hard. It would be dumb to waste your time writing a new graphics library for every project. Other people have done the hard work already. The code is there, tested and working, and available for you to use. Use it. You should focus on your problem, not on solving already solved problems. – Tim Roberts Dec 23 '21 at 04:26
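The "gather up start, max, min, and end" idea from the comments can be sketched in pure Python by grouping trades into buckets keyed by the interval's start time. This is a minimal sketch, not the answer's method: `bucket_ohlc` is a hypothetical helper name, and it assumes each trade is a dict with integer `time` (Unix seconds) and `price` keys, in any order.

```python
def bucket_ohlc(trades, interval):
    """Group trades into candles keyed by the interval's start time.

    Hypothetical helper: `trades` is a list of {'time', 'price'} dicts in
    any order, `interval` is the candle width in seconds.
    """
    candles = {}
    # Sort oldest-first so "open" and "close" follow trade order.
    for trade in sorted(trades, key=lambda t: t['time']):
        start = trade['time'] // interval * interval  # round down to bucket start
        price = trade['price']
        candle = candles.get(start)
        if candle is None:
            # First trade in this bucket sets all four values.
            candles[start] = {'time': start, 'open': price, 'high': price,
                              'low': price, 'close': price}
        else:
            candle['high'] = max(candle['high'], price)
            candle['low'] = min(candle['low'], price)
            candle['close'] = price
    return [candles[start] for start in sorted(candles)]


trades = [
    {'time': 1626459705, 'price': 278.989978},
    {'time': 1626459695, 'price': 279.437975},
]
for candle in bucket_ohlc(trades, 60):
    print(candle)
```

Passing 60, 180, or 300 as `interval` gives 1, 3, or 5 minute candles. Intervals with no trades simply produce no candle in this sketch.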

1 Answer

Here is code that generates random data and creates an OHLC table.

import random
import pprint

# Generate random walk data..

base = 1626459705
price = 278.989978
data = []
for i in range(600):
    data.append( {'time':base+10*i, 'price':price} )
    price += random.random() * 3 - 1.5
print(data)

# Produce 3 minute intervals.

interval = 180

# start time, open, high, low, close
rec = None
ohlc = []
for row in data:
    if rec is None or row['time'] >= rec[0]+interval:
        # This sample starts a new interval, so emit the finished one first.
        if rec is not None:
            ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )
        rec = [ row['time'] ] + [ row['price'] ] * 4
    else:
        rec[2] = max(rec[2],row['price'])
        rec[3] = min(rec[3],row['price'])
        rec[4] = row['price']

# Emit the last, possibly partial, interval.
if rec is not None:
    ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )

pprint.pprint(ohlc)

FOLLOWUP

OK, here's one that works with your data. I just copied that file to "mydata.json" (and removed the first "data ="). Note that this prints the output on actual 3-minute intervals, rather than basing it on each line of the input.

import pprint
import json
import time

# Produce 3 minute intervals.

with open('mydata.json') as f:
    data = json.load(f)
# The file lists the newest trade first; process oldest first.
data.reverse()

interval = 180
base = data[0]['time'] // interval * interval

# start time, open, high, low, close
rec = [ base, data[0]['price'], data[0]['price'], data[0]['price'], 0 ]

ohlc = []

i = 0
while i < len(data):
    row = data[i]

    # If this sample is beyond the 3 minutes:
    if row['time'] >= rec[0]+interval:
        ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )
        rec[0] += interval
        rec[1] = rec[2] = rec[3] = rec[4]
    else:
        rec[2] = max(rec[2],row['price'])
        rec[3] = min(rec[3],row['price'])
        rec[4] = row['price']
        i += 1

# Emit the final, possibly partial, interval.
ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )

for row in ohlc:
    row['ctime'] = time.ctime(row['time'])
    print( "%(ctime)s: %(open)12f %(high)12f %(low)12f %(close)12f" % row )

Sample output:

Wed Dec 22 22:27:00 2021:   454.427421   454.427421   454.427421   454.427421
Wed Dec 22 22:30:00 2021:   454.427421   454.427421   454.427421   454.427421
Wed Dec 22 22:33:00 2021:   454.427421   454.427421   454.427421   454.427421
Wed Dec 22 22:36:00 2021:   454.427421   457.058452   453.411757   453.411757
Wed Dec 22 22:39:00 2021:   453.411757   455.199204   452.589304   455.199204
Wed Dec 22 22:42:00 2021:   455.199204   455.199204   455.199204   455.199204
Wed Dec 22 22:45:00 2021:   455.199204   455.199204   455.199204   455.199204
Wed Dec 22 22:48:00 2021:   455.199204   455.768577   455.199204   455.768577
Wed Dec 22 22:51:00 2021:   455.768577   455.768577   455.768577   455.768577
Wed Dec 22 22:54:00 2021:   455.768577   455.768577   452.348469   454.374116
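
The flat rows in the sample output (open = high = low = close) are what an interval looks like when the previous close is simply carried forward. A minimal sketch of that gap-filling step on its own, assuming candles have already been built and their times are multiples of the interval; `fill_gaps` is a hypothetical helper, not part of the code above:

```python
def fill_gaps(candles, interval):
    """Insert flat candles for intervals that contain no trades.

    Hypothetical helper: `candles` is a time-sorted list of dicts with
    'time', 'open', 'high', 'low', 'close', whose times are multiples of
    `interval` (seconds).
    """
    filled = []
    for candle in candles:
        if filled:
            prev_close = filled[-1]['close']
            t = filled[-1]['time'] + interval
            # Every missing interval repeats the previous close.
            while t < candle['time']:
                filled.append({'time': t, 'open': prev_close, 'high': prev_close,
                               'low': prev_close, 'close': prev_close})
                t += interval
        filled.append(candle)
    return filled


# Made-up candles with a two-interval gap between them.
candles = [
    {'time': 0, 'open': 10.0, 'high': 12.0, 'low': 9.0, 'close': 11.0},
    {'time': 540, 'open': 13.0, 'high': 13.0, 'low': 13.0, 'close': 13.0},
]
for candle in fill_gaps(candles, 180):
    print(candle)
```

Unlike the loop above, this sketch leaves the open of a candle that does contain trades untouched rather than setting it to the previous close.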
Tim Roberts
  • Thanks, I will check it out right now :) – Rakesh Poddar Dec 23 '21 at 09:00
  • Hey.. It's not working with my data. Check data here : https://pastebin.com/mk7LyLi8 – Rakesh Poddar Dec 23 '21 at 10:44
  • It is not showing the OHLC data correctly for gaps between trades, which are sometimes 3 or 4 minutes or more. For these intervals it should show a flat dash instead, meaning O = H = L = C. Currently, it's merging data. – Rakesh Poddar Dec 23 '21 at 20:53
  • Hi, sorry I could not check the follow-up before. Stack Overflow did not notify me. I just saw it today and it's working perfectly. Thanks a lot :) – Rakesh Poddar Jan 01 '22 at 12:20
  • Just one question. Why are you dividing base by 180*180 here `base = data[0]['time'] // interval * interval` ? – Rakesh Poddar Jan 02 '22 at 07:04
  • I'm not. Those are done in left-to-right order, so it does an integer divide by 180, then a multiply by 180. The net effect is to round down to the nearest multiple of 180. So 400 // 180 = 2, x 180 = 360. – Tim Roberts Jan 02 '22 at 07:34
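
The rounding described in the last comment can be checked directly. A small sketch, where `floor_to_interval` is a hypothetical name and the timestamp is the one from the question's sample data:

```python
def floor_to_interval(timestamp, interval):
    # // and * have the same precedence and are evaluated left to right,
    # so this is (timestamp // interval) * interval.
    return timestamp // interval * interval


print(400 // 180 * 180)      # 360: divide first, then multiply
print(400 // (180 * 180))    # 0: what dividing by 180*180 would give

# Start of the 1, 3, and 5 minute intervals containing one timestamp.
for minutes in (1, 3, 5):
    print(minutes, floor_to_interval(1626459705, minutes * 60))
```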