0

I am using the code below to pull data from Google Finance. The timestamp is in POSIX form, so it is converted into data time. When I try to filter it based on a time criteria (14:35:00), it returns an empty table. I suspect it has to do with the POSIX/ datetime conversion, but have no idea how to resolve it.

def get_intraday_data(symbol, interval_seconds=301, num_days=10):
    # Specify URL string based on function inputs.
    url_string = 'http://www.google.com/finance/getprices?q={0}'.format(symbol.upper())
    url_string += "&i={0}&p={1}d&f=d,o,h,l,c,v".format(interval_seconds,num_days)

    # Request the text, and split by each line
    r = requests.get(url_string).text.split()

    # Split each line by a comma, starting at the 8th line
    r = [line.split(',') for line in r[7:]]

    # Save data in Pandas DataFrame
    df = pd.DataFrame(r, columns=['Datetime','Close','High','Low','Open','Volume'])

    # Convert UNIX to Datetime format
    df['Datetime'] = df['Datetime'].apply(lambda x: datetime.datetime.fromtimestamp(int(x[1:])))

    #Seperate Date and Time
    df['Time'],df['Date']= df['Datetime'].apply(lambda x:x.time()), df['Datetime'].apply(lambda x:x.date()) 

    #Convert 'Close','High','Low','Open', deleting 'Volume'

''''df['Close'] = df['Close'].astype('float64')
df['High'] = df['High'].astype('float64')
df['Low'] = df['Low'].astype('float64')
df['Open'] = df['Open'].astype('float64')'''
    del df['Volume']
    del df['Datetime']

    df[['Close','High','Low','Open']]  = df[['Close','High','Low','Open']].astype('float64')

   # Calculating %Change and Range 

    df['%pct'] = (df['Close'] - df['Open'])/df['Open']
    df['Range'] = df['High'] - df['Low']

    #Sort Columns


    return df

I have stored the results of this function as NAS

NAS = get_intraday_data('IXIC', interval_seconds=301, num_days= 100)

The Filter Criteria is:

NAS[NAS['Time'] == '14:35:00']

I will appreciate assistance on this.

Scott Boston
  • 147,308
  • 15
  • 139
  • 187
0xsegfault
  • 2,899
  • 6
  • 28
  • 58

2 Answers2

1

You can use this

NAS.query('Datetime.dt.hour==14 and Datetime.dt.minute==35 and Datetime.dt.second==0')

Edit: Applied dt on datetime series instead of time series

raw_data = {'Datetime': ['2015-05-01T14:35:00', '2016-07-04T02:26:00', '2013-02-01T04:41:00']}
df = pd.DataFrame(raw_data, columns = ['Datetime'])
df["Datetime"] = pd.to_datetime(df["Datetime"])
df['Time'],df['Date']= df['Datetime'].apply(lambda x:x.time()), df['Datetime'].apply(lambda x:x.date()) 
df = df.set_index(df["Datetime"])
df['hour']=df['Datetime'].dt.hour
df['minute']=df['Datetime'].dt.minute
df['second']=df['Datetime'].dt.second
df.query('Datetime.dt.hour==14 and Datetime.dt.minute==35 and Datetime.dt.second==0')
Saurabh
  • 7,525
  • 4
  • 45
  • 46
1

I see that you are converting timestamp to datetime incorrectly. You are calling datetime twice.

Replace

df['Datetime'] = df['Datetime'].apply(lambda x: datetime.datetime.fromtimestamp(int(x[1:])))

with

df['Datetime'] = df['Datetime'].apply(lambda x: datetime.fromtimestamp(int(x[1:])))

In 2nd part of your question:

NAS = get_intraday_data('IXIC', interval_seconds=301, num_days= 100)

NAS[NAS['Time'] == '14:35:00']

You care comparing instance of datetime.time with string, which is not correct. Try

NAS[NAS['Time'] == datetime.strptime('14:35:00', '%H:%M:%S').time()]

it should work as expected.

Update:

Running script with suggested changes will display data as:

Close High Low Open Time Date %pct
60 5162.448 5165.124 5162.448 5165.057 14:35:00 2016-07-29 -0.000505
138 5181.768 5183.184 5181.193 5181.404 14:35:00 2016-08-01 0.000070
216 5130.514 5131.933 5130.434 5131.893 14:35:00 2016-08-02 -0.000269
294 5146.608 5146.608 5143.827 5144.788 14:35:00 2016-08-03 0.000354
372 5163.854 5164.154 5162.997 5164.021 14:35:00 2016-08-04 -0.000032
450 5221.624 5221.911 5220.658 5220.789 14:35:00 2016-08-05 0.000160
528 5204.111 5204.240 5202.476 5202.865 14:35:00 2016-08-08 0.000239
. . . 3648 5282.999 5283.017 5279.008 5279.340 14:35:00 2016-10-04 0.000693
3726 5324.450 5325.375 5323.628 5324.129 14:35:00 2016-10-05 0.000060
3804 5310.945 5311.454 5310.194 5310.558 14:35:00 2016-10-06 0.000073
3882 5295.064 5295.080 5292.184 5292.327 14:35:00 2016-10-07 0.000517

Saleem
  • 8,728
  • 2
  • 20
  • 34
  • Thanks for helping. This method doesnt work. The problem is what the next line which strips the datetime into a date and a time. The method I have used converts it from an object64 date time into time. I think the solution revolves around converting it back into an object 64. Unfortunately the time object cannot be converted directly into an object 64. A possible solution might be to convert this back into a string , and them a date time. Seems convoluted. There must be a more direct method @Saleem – 0xsegfault Oct 08 '16 at 14:01
  • 1
    @Data_Kid what do you mean it's not working? Have you gave it a try? I'm able to filter and display dataframe for give time – Saleem Oct 08 '16 at 15:17
  • that's strange. When I took out one date time it threw out an attribute error. I have found a work around that works..will post it later as I have run into other issues I am trying to resolve now – 0xsegfault Oct 08 '16 at 15:20
  • I have just seen the edit. It works like a charm!! thanks! – 0xsegfault Oct 08 '16 at 15:24
  • 1
    @Data_Kid, asking for upvote is not a good idea. Please see http://meta.stackoverflow.com/questions/323847/is-it-okay-to-ask-for-upvotes-in-a-question – Saleem Oct 08 '16 at 15:32
  • very new to the platform. Thanks for the heads up – 0xsegfault Oct 08 '16 at 16:34