1

I'm quite new to Python and have mostly targeted learning the language exactly to automate some processes and update / populate excel spreadsheets with realtime data. Is there a way (e.g. through openpyxl) to update specific cells with data that's extracted through python packages such as pandas or web scraping through BeautifulSoup ?

I already have the necessary code to extract the data-series that I need for my project in Python but am stuck entirely on how to link this data to excel.

import pandas as pd
import pandas_datareader.data as web
import datetime as dt

start = dt.datetime(2000,1,1)
end = dt.datetime.today()

tickers = [
    "PYPL",
    "BABA",
    "SAP"
]

    df = web.DataReader (tickers, 'yahoo', start, end)

print (df.tail(365)['Adj Close'])
phk31
  • 11
  • 1
  • 2
  • 1
    If you're actually needing *real-time* data, you will need to implement an [RTD server](https://github.com/SublimeText/Pywin32/blob/master/lib/x32/win32com/demos/excelRTDServer.py) that's accessible through COM, which can then be called from the RTD worksheet function in Excel ([more on RTD server](https://learn.microsoft.com/en-us/office/troubleshoot/excel/set-up-realtimedata-function)). Alternatively, you'll need a socket connection to the external server data, and you could pump those changes into a CSV, database, etc., and create a Connection QueryTable in Excel, etc. – David Zemens Aug 09 '19 at 14:18
  • Thanks a lot. I have to be honest, that's way beyond where my programming level is at. And its more about having the automation of daily price updates in the excel sheet and creating a one/two year rolling price window. For that I now just used a loop and the pandas excel export method. The only things im left wondering is whether there is a way to only extract the Adj Close to the excel sheet and if I can numerate the Ticker symbols in python so that they are also numerated in the updating excel. – phk31 Aug 09 '19 at 14:46
  • well you can index the dataframe as you've done in your print statement, and just send the necessary data to excel rather than the entire dataframe. Something like `pd.DataFrame(df.tail(365)['Adj Close']).to_excel(filename)`. – David Zemens Aug 09 '19 at 15:00
  • Worked great. Thanks a lot. Just for curiosity, if I wanted both Open and Adj Close, is there a syntax format I have to use in order to specify that because I tried `df.tail(365)['Adj Close','Open'].to_excel(filename)` as well as `df.tail(365)['Adj Close']['Open'].to_excel(filename)` and both don't seem to work – phk31 Aug 09 '19 at 15:45
  • You would need to use: `df.tail(365)[['Adj Close', 'Open']]`. – David Zemens Aug 09 '19 at 16:40

4 Answers4

1

One option is to run your python script run on a schedule and output to .csv or another format that Excel can link to. This option allows the data to be updated whenever the python script is executed.

Setup:

  1. Output your dataframe to csv/database or other Excel readable format
  2. Setup your python file to run on a schedule (either by scheduling, or a loop with a delay)
  3. Create a data connection from Excel to your python outputted file/database
  4. Build pivot tables based on table in Excel
  5. Refresh data connection/pivot tables in Excel to get the new data
chucklukowski
  • 1,996
  • 2
  • 13
  • 13
  • Thanks! I got what you mean, but for point 2., I don't have a task scheduler so would have to refer to loops in order to schedule the .py file. Any advice regarding that? – phk31 Aug 09 '19 at 14:04
  • For the loops, use time.sleep(num_seconds) inside of your loop. For scheduling, depends on operating system. On windows, Task Scheduler. – chucklukowski Aug 09 '19 at 14:10
0

Pandas has a method to export a Dataframe to Excel. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html

filename = "output.xlsx"
df.to_excel(filename)
jdpy19
  • 375
  • 1
  • 13
  • Thanks a lot. Would you know if there is a way to only have a rolling window of df.tail(365) as the output for e.g. only the Adj Close or the Open of the given ticker. Also would this script update the same excel file for the 1 year rolling window? – phk31 Aug 09 '19 at 13:58
0

One easy solution is using xlwings library

import xlwings as xw
..

xw.Book(file_path).sheets['Sheet_name'].range('A1').value = df 

this would print out your df to cell A1 of an excel file, via COM - which means it actually writes the values while file is open.

Hope this is helpful

Lorenzo Bassetti
  • 795
  • 10
  • 15
0

(Appreciate that this is an old question). Real time data in Excel is possible with xlOil. xlOil allows you to very easily define an Excel RTD (real time data) function in python. Excel's RTD functions operate outside the normal calc cycle and can push data onto a sheet.

Your example could be written as:

import xloil, datetime as dt, asyncio
import pandas_datareader.data as web

start = dt.datetime(2000,1,1)

@xloil.func
async def pyGetTickers(names:list, fetch_every_secs=10):
    while True:
        yield web.DataReader(
            names, 'yahoo', start, dt.datetime.now())
        await asyncio.sleep(fetch_every_secs)
 

Which would appear as a worksheet function pyGetTickers.

stevecu
  • 336
  • 2
  • 7