0

I'm trying to fetch weekly EOD data in python using yfinance module. I wanna use the same format of the excel code given below. gives me weekly closes (Friday)

=GOOGLEFINANCE("NIFTY_50", "close", DATE(2021,2,15), DATE(2022,5,16), "weekly")

Date Column from Excel output

2/19/2021 15:30:00
2/26/2021 15:30:00
3/5/2021 15:30:00
3/12/2021 15:30:00
3/19/2021 15:30:00
3/26/2021 15:30:00
4/1/2021 15:30:00
4/9/2021 15:30:00
4/16/2021 15:30:00
4/23/2021 15:30:00
4/30/2021 15:30:00
5/7/2021 15:30:00
5/14/2021 15:30:00
5/21/2021 15:30:00
5/28/2021 15:30:00
6/4/2021 15:30:00
6/11/2021 15:30:00
6/18/2021 15:30:00
6/25/2021 15:30:00
7/2/2021 15:30:00
7/9/2021 15:30:00
7/16/2021 15:30:00
7/23/2021 15:30:00
7/30/2021 15:30:00
8/6/2021 15:30:00
8/13/2021 15:30:00
8/20/2021 15:30:00
8/27/2021 15:30:00
9/3/2021 15:30:00
9/9/2021 15:30:00
9/17/2021 15:30:00
9/24/2021 15:30:00
10/1/2021 15:30:00
10/8/2021 15:30:00
10/14/2021 15:30:00
10/22/2021 15:30:00
10/29/2021 15:30:00
11/12/2021 15:30:00
11/18/2021 15:30:00
11/26/2021 15:30:00
12/3/2021 15:30:00
12/10/2021 15:30:00
12/17/2021 15:30:00
12/24/2021 15:30:00
12/31/2021 15:30:00
1/7/2022 15:30:00
1/14/2022 15:30:00
1/21/2022 15:30:00
1/28/2022 15:30:00
2/4/2022 15:30:00
2/11/2022 15:30:00
2/18/2022 15:30:00
2022-02-25
3/4/2022 15:30:00
3/11/2022 15:30:00
3/17/2022 15:30:00
3/25/2022 15:30:00
4/1/2022 15:30:00
4/8/2022 15:30:00
4/13/2022 15:30:00
4/22/2022 15:30:00
4/29/2022 15:30:00
5/6/2022 15:30:00
5/13/2022 15:30:00

I have tried to reproduce the same in python

attempt

periods=pd.date_range(start='2021-2-19',periods=67,freq='W')

start = periods[0].strftime('%Y-%m-%d')
end = periods[-1].strftime('%Y-%m-%d')


#y=x+pd.offsets.WeekOfMonth(week=0,weekday=4) #week0=first week ; week3=last week
#periods = pd.date_range(end=datetime.datetime.today(), periods=60, freq='M')
start = periods[0].strftime('%Y-%m-%d')
end = periods[-1].strftime('%Y-%m-%d')
#print(start, end)
print(start,end)



with open('/Users/ifinder/Coding/Projects/DEMO/Stock_market/Nifty_500_candlestick-screener/datasets/symbols.csv') as f:  
        for line in f:
            if "," not in line:
                continue
            symbol = line.split(",")[0]

            df = yf.download(symbol, start=start, end=end,retry_count=3,interval="1wk") 
            #print(df)

Problem : Yfinance interval"1wk" defaulted to Monday where as Google Finance dataset is based on Friday closing data. So I am looking for some help to display Friday weekly closing prices instead of Monday.

Additional Condition : I need the row count to be exactly as in the excel sheet and needs to be populated every week.

Start= -65th friday closing price (eg: 2/19/2021 15:30:00)

End= Current week's friday close (eg: 5/13/2022 15:30:00)

So on a typical Sat night , when I run the program, I'll be able to see the last 65 weeks Closing prices(FRIDAY) as an output.

Thanks!

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
Harry
  • 13
  • 4
  • "I need the row count to be exactly as in the excel sheet and needs to be populated every week." What currently happens when you use your code? How is that different from the requirement? – Karl Knechtel May 15 '22 at 02:51
  • By specifying Friday of the week as the frequency in the `pd.date_range()` function, a time series index for every Friday can be obtained. `pd.date_range(start='2021-2-19', periods=67, freq='W-FRI')` – r-beginners May 15 '22 at 03:54
  • Yes, I have tried that. Problem is with this line. df = yf.download(symbol, start=start, end=end,retry_count=3,interval="1wk") Yahoo finance weekly interval default setting is Monday. So even if I chose W-Fri in the PD.datarange, it gets nullified when it comes to yf.download line. – Harry May 15 '22 at 08:40
  • @KarlKnechtel Here's the python Output! Still prints the Monday values. – Harry May 15 '22 at 09:09
  • Try running for daily data within the `df = yf.download()`, then filter for index in periods, i.e. `df = df[df.index.isin(periods)]`. – Rawson May 15 '22 at 14:51
  • @Rawson Yes it works. Thanks a Lot! One of the issues that I noticed is that if Friday is holiday, that row gets skipped and it makes dataframe incomplete.What I need is to assign the latest data available as weekend closing data. For Example, If friday is holiday,thursday's data will be the weekly closing data. If friday and thursday are holidays, Wed data will replace. like that! Is it possibile ? Thanks for your help – Harry May 16 '22 at 12:42

1 Answers1

0

Whilst running daily data within df = yf.download(), then filtering for index in periods, i.e. df = df[df.index.isin(periods)] solved the "Friday's only" problem, you experienced that there were missing weeks.

To solve this you could add an additional line:

df = yf.download("AAPL", start=start, end=end, retry_count=3, interval="1d") 
# resample your data for the Friday, taking the last data point in the
df = df.resample("W-FRI").last()

This will return only Friday dates, and when there is no Friday it will still use the Friday date, but with the last data point in the week (so Thursdays data, for example).

If, however, you want the date of the last data point in the week, you could instead use a .groupby() by week, then take the .tail(1):

df = df.groupby(pd.Grouper(freq="W")).tail(1)

You shouldn't need to sort these, as the data will already be sorted from yf. But! If you need to, then you can create a function as the answer does here.

Rawson
  • 2,637
  • 1
  • 5
  • 14