
I need help understanding the output of this code. Why am I getting NaN instead of a float value? Please suggest the necessary amendments:

import matplotlib.pyplot as plt
from scipy import stats
import pandas as pd
import fix_yahoo_finance as fyf
from pandas_datareader import data as pdr
import numpy as np
fyf.pdr_override()
p=pdr.get_data_yahoo('IBM',start ='2009-01-01',end ='2013-01-01')
p.to_csv('YF_IBM_2009_2013.csv')
print(p.head())
ret = (p.Close[1:]-p.Close[:-1])/p.Close[1:]
print ('ticker=','IBM','W-test, and P-value')
print (stats.shapiro(ret))

And the output is:

ret = (p.Close[1:]-p.Close[:-1])/p.Close[1:]

print ('ticker=','IBM','W-test, and P-value')

print (stats.shapiro(ret))

ticker= IBM W-test, and P-value

(nan, 1.0)
halfer

1 Answer


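There is a small issue with your code. When you subtract two pandas Series directly, pandas aligns them on their index labels, not on their positions. Below is the output for

p.Close[1:]

(Screenshot of the output: a Series of closing prices indexed by date.)

Because the index comes along with the values, p.Close[1:] - p.Close[:-1] subtracts each date's price from itself (giving 0) and produces NaN for the first and last dates, which appear in only one of the two slices. Those NaN values are why stats.shapiro returns nan. To select only the values from a pandas Series, use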

p.Close[1:].values

so the ret = line becomes

ret = ((p.Close[1:].values-p.Close[:-1].values)/(p.Close[1:].values))

This should do what you're looking for. Comment if anything else is needed.
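
To see the alignment effect without downloading anything, here is a minimal sketch on a small made-up Series. The prices, the dates, and the name close are assumptions standing in for p.Close, and .iloc is used only to make the positional slicing explicit. It also shows shift() as one possible alternative that keeps the date index instead of dropping to raw arrays.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices standing in for p.Close (values are made up).
close = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.date_range("2009-01-02", periods=4, freq="D"),
)

# Series arithmetic aligns on index labels: every shared date is subtracted
# from itself, and dates present in only one slice become NaN.
aligned = (close.iloc[1:] - close.iloc[:-1]) / close.iloc[1:]
print(aligned)

# Dropping to NumPy arrays makes the arithmetic positional, as in the fix above.
positional = (close.iloc[1:].values - close.iloc[:-1].values) / close.iloc[1:].values
print(positional)

# Alternative that keeps the dates: shift the Series by one row so that
# "yesterday's" price sits on today's label, then drop the leading NaN.
shifted = ((close - close.shift(1)) / close).dropna()
print(shifted)
```

The first print shows NaN at both ends and zeros in between, while the other two give the same three returns. Passing either of those to stats.shapiro avoids the NaN input.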

gauravtolani