I'd like to compress some trade data down into one-minute intervals. I was thinking of looping over time: pull the data for every one-minute range (e.g. 09:30:00-09:30:59) between 09:30:00 and 04:00:00, use it for some math, and then save the result in another pandas DataFrame. I have no clue how to do this, and no idea what to Google for a time loop.

# References
from time import *
import urllib.request as web
import pandas as pd
import os

forToday = 'http://netfonds.no/quotes/tradedump.php?csv_format=csv&paper='

def pullToday(exchange,stock):
    dateToday = strftime("%Y-%m-%d", localtime())
    fileName=('data/'+exchange+'/'+stock+'/'+dateToday+'.txt')
    try:
        if not os.path.isdir(os.path.dirname(fileName)):
            os.makedirs(os.path.dirname(fileName))
    except OSError:
        print("Something went very wrong. Review the dir creation section")

    pageBuffer=web.urlopen(forToday+stock+'.'+exchange)
    pageData=pd.read_csv(pageBuffer,usecols=['time','price','quantity'])
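    # Parse the timestamps in one vectorized call (pd.datetime was removed in pandas 2.0),
    # then subtract 06:00 of today's date, as the original per-row loop did.
    pageData['time'] = pd.to_datetime(pageData['time'], format='%Y%m%dT%H%M%S')
    pageData['time'] -= pd.to_datetime(dateToday + ' 06:00')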

    print(pageData)

    dataFile = open(fileName,'w')
    dataFile.write('#Format: Timestamp;Volume;Low;High;Median\n')
    dataFile.close()
    pageData.to_csv(fileName,index=False,sep=';',mode='a',header=False)

def getList(fileName):
    stockList = []
    with open(fileName+'.txt', 'r') as listFile:  # close the file once it has been read
        file = listFile.read()
    fileByLines = file.split('\n')
    for eachLine in fileByLines:
        if '#' not in eachLine:
            lineByValues = eachLine.split('.')
            stockList.append(lineByValues)
    return stockList

start_time = time()

stockList = getList('stocks')
#for eachEntry in stockList:
#    pullToday(eachEntry[0],eachEntry[1])

pullToday('O','AAPL')

delay=str(round((time()-start_time)))
print('Finished in ' + delay)

What I would like to do:

for eachMinute in pageData:
    avgPriceSum = 0
    minuteVolume = 0
    for eachTrade in eachMinute:
        avgPriceSum += quantityOfTrade * priceOfTrade
        minuteVolume += quantityOfTrade

    avgPriceSum /= minuteVolume  # quantity-weighted average price for this minute
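
The loop above amounts to a quantity-weighted average price per minute. A minimal sketch of one way to get that in pandas without an explicit time loop, assuming the `time` column already holds datetimes. The sample `trades` frame and the `notional` and `avgPrice` column names are made up for illustration:

```python
import pandas as pd

# Hypothetical sample trades; the real data would be the downloaded pageData.
trades = pd.DataFrame({
    "time": pd.to_datetime([
        "2014-10-07 09:30:05", "2014-10-07 09:30:40",
        "2014-10-07 09:31:10", "2014-10-07 09:31:50",
    ]),
    "price": [100.0, 102.0, 101.0, 103.0],
    "quantity": [100, 300, 200, 200],
})

# price * quantity for each trade, the term summed in the inner loop.
trades["notional"] = trades["price"] * trades["quantity"]

# Truncate each timestamp to its minute, then sum within each minute.
minute = trades["time"].dt.floor("min")
per_minute = trades.groupby(minute)[["notional", "quantity"]].sum()

# Weighted average price = sum(price * quantity) / sum(quantity).
per_minute["avgPrice"] = per_minute["notional"] / per_minute["quantity"]
print(per_minute)
```

Grouping on the truncated timestamp plays the role of the outer `for eachMinute` loop, and the two sums replace the inner accumulation.
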
Samuel
  • I think you are looking for [resampling](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.resample.html); see the [docs](http://pandas.pydata.org/pandas-docs/stable/timeseries.html#up-and-downsampling) for further examples. – EdChum Oct 07 '14 at 19:52
  • Would I be able to use the quantity as the frequency list for prices rather than time if I used resampling? – Samuel Oct 07 '14 at 19:58
  • You're going to have to explain a bit better. Also, post raw data that can be copied and pasted rather than an image, and post just the code that you want to run on the df rather than the whole script that downloads, parses, and cleans it. – EdChum Oct 07 '14 at 20:01
  • Thing is, I don't know how to begin writing it. What I want to do is find the average price for each minute interval, using the quantity column as a frequency list during those intervals. – Samuel Oct 07 '14 at 20:09
  • Added clarification to my main post. – Samuel Oct 07 '14 at 20:21
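
On the question of using quantity as weights with resampling: `resample` only bins rows by time, so one possible approach is to resample `price * quantity` and `quantity` separately and divide the sums. The sketch below is an illustration under assumptions, not a tested answer for the Netfonds data: the sample `trades` values are invented, the column names `Volume`, `Low`, `High`, and `Median` are borrowed from the file header in the question, and `WeightedAvg` is a hypothetical name.

```python
import pandas as pd

# Hypothetical sample trades, indexed by trade time (resample needs a time index).
trades = pd.DataFrame(
    {"price": [100.0, 102.0, 101.0, 103.0], "quantity": [100, 300, 200, 200]},
    index=pd.to_datetime([
        "2014-10-07 09:30:05", "2014-10-07 09:30:40",
        "2014-10-07 09:31:10", "2014-10-07 09:31:50",
    ]),
)

# One row per minute, with columns named after the header in the question.
bars = pd.DataFrame({
    "Volume": trades["quantity"].resample("1min").sum(),
    "Low": trades["price"].resample("1min").min(),
    "High": trades["price"].resample("1min").max(),
    "Median": trades["price"].resample("1min").median(),
})

# Quantity-weighted average: resample price * quantity, then divide by volume.
weighted_sum = (trades["price"] * trades["quantity"]).resample("1min").sum()
bars["WeightedAvg"] = weighted_sum / bars["Volume"]
print(bars)
```

Minutes with no trades still appear as rows, with a volume of 0 and NaN for the price columns, so they may need to be dropped or filled before saving.
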

0 Answers