
I got Amazon stock data from Yahoo Finance, but some days are missing and have no data.

I was wondering if there is a way, using Google Colab, to "create" those missing days and fill them in with the average of the adjacent days.

I want to use the closest days with data, before and after the missing days, to create data for the missing days.

I have looked at many answers on Stack Overflow, but I can't find a specific answer to my problem. The code that seems the closest is:

    ws = Amazon.worksheet('Amazon')
    idx = pd.date_range(start = '05-15-1997', end = '07-05-2019')
    Amazon_df = get_as_dataframe(ws)
    AMZ = pd.DataFrame(Amazon_df)
    AMZ.index = pd.DatetimeIndex(AMZ.index)
    AMZ = AMZ.reindex(idx, fill_value=np.nan)

The problem with this command is that I will have to manually add the missing days and with Amazon stock, this will take a really long time.

I can't seem to figure out how to solve this problem. A link to the spreadsheet is https://docs.google.com/spreadsheets/d/1fLicjjVRTchd8ps6aiVsGfP1GVFfvJN2rgfoYxxSHZk/edit?usp=sharing

I want to sort out this data so I can graph it without random "missing" days. I would like to fill the missing days with the average of the values from the days before and after that actually have data.

UserX
You should extract a few rows from the full dataset, including some missing days, and explain **and** show how you would like to fill the missing values. – Serge Ballesta Jul 11 '19 at 15:36

1 Answer

I think you could use simulation to fill the missing values. I have a function rts_clean(), but it is written in R (GeoRTS package). It is based on the STL-loess decomposition (trend, seasonality and noise), which admits missing values: first decompose the time series into those three components, then simulate values for the noise using its distribution (estimated from the available data). You then obtain something like this:

Example: https://github.com/InstitutoInvestigacionesEconomicasPUCE/geortsBeta/blob/master/man/figures/rearme_img1.png

Code: https://github.com/InstitutoInvestigacionesEconomicasPUCE/geortsBeta/blob/master/R/rts_clean.R
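
If you only need the simpler approach described in the question (the average of the closest days with data before and after each gap), here is a minimal pandas sketch. It is not the rts_clean() method above. It assumes the frame already has a DatetimeIndex and numeric columns; the function name fill_missing_days and the toy "Close" values are made up for illustration.

```python
import pandas as pd

def fill_missing_days(df: pd.DataFrame) -> pd.DataFrame:
    """Add a row for every calendar day and fill each gap with the mean of
    the nearest earlier and nearest later day that have data."""
    df = df.sort_index()
    # Every calendar day between the first and last observation.
    full_idx = pd.date_range(df.index.min(), df.index.max(), freq="D")
    expanded = df.reindex(full_idx)  # missing days become NaN rows
    # ffill carries the previous known value forward, bfill carries the next
    # known value backward; their mean is the average of the two neighbours.
    # On days that already have data both equal the original value.
    return (expanded.ffill() + expanded.bfill()) / 2

# Toy data (hypothetical values): 3 and 4 July are missing.
prices = pd.DataFrame(
    {"Close": [10.0, 12.0, 18.0]},
    index=pd.to_datetime(["2019-07-01", "2019-07-02", "2019-07-05"]),
)
filled = fill_missing_days(prices)
print(filled)
```

Every day inside a gap gets the same value this way. If you would rather have a straight line between the two neighbours, expanded.interpolate(method="time") is the usual alternative.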