
I am loading several Excel files from a folder into Power Query and transforming them into an appropriate format. We have one Excel file for every week, so there are 52 files in total. Power Query/Excel currently needs roughly 5 minutes to refresh these connections. Once a week is finished, its file is no longer edited. So my objective is for Power Query to refresh only the current week. The files have titles of the form: "Department Marketg (1)"
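
For context, the import is essentially the standard folder-combine pattern, roughly like this (a minimal sketch, not my exact code; the folder path is a placeholder and every file is assumed to have the same layout):

let
    // All 52 weekly workbooks in the folder (path is a placeholder)
    Source = Folder.Files("C:\Reports\Weekly"),
    // Read the first sheet/table of each workbook; true = promote first row to headers
    Tables = Table.AddColumn(Source, "Data", each Excel.Workbook([Content], true){0}[Data]),
    // Stack all weeks into one table; the per-file transforms run before this step
    Combined = Table.Combine(Tables[Data])
in
    Combined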

Does anyone have an idea how to improve this connection?

Best regards, Joshua

  • Without knowing what is involved in the "transforming", there would be too much guess-work involved. Also, given the number of files that you are transforming, how do you know that 5 minutes is unreasonable? Can you provide more details on your code? – Sam Jan 17 '19 at 13:36
  • The folder includes 52 files. The transformation is very basic: just removing some rows and columns, filtering, and replacing values. The refresh for just a single table takes maybe 10 seconds. –  Jan 17 '19 at 13:43
  • 52 * 10 seconds is over 8 minutes, so 5 minutes seems like a reasonable time for refreshing all of them. Unfortunately, I don't think there's a good way to do an incremental refresh in Excel. – Alexis Olson Jan 17 '19 at 16:07
  • I could imagine a setup where you set up an initial load of 52 weeks of data as a local table in the file and convert it to a self-linked query (where the source and output are the same table). Then you set up a query to your external data for only the two most recent weeks, have that appended to your self-linked query, have it control for duplicates/updated data, and then have it also drop off data older than a 52-week period if you want (a sketch of this pattern follows these comments). But uh... I can't say this is a "good" way to do it. – Wedge Jan 17 '19 at 17:40
  • @joshua1990, you may receive better/more relevant suggestions if you post your `M`/query code. Maybe your code can be optimised (10 seconds per file for just some filtering and replacing seems a lot on a modern computer, but I guess it depends on the size of the file/contents too). If you have formulas that are linked to the table you're loading (especially if your output table has a lot of rows), it may be worth "breaking" the formulas before the refresh, then restoring them after the refresh (which indicates whether your query is slow, or whether it's just the loading to the sheet that's the bottleneck). – chillin Jan 17 '19 at 21:37
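
As a rough illustration of Wedge's self-linked-query suggestion: re-import only the two most recent weeks, drop the archived copies of those weeks so the updated data wins, and optionally trim anything older than 52 weeks. This is only a sketch; "Archive", "RecentWeeks", and the [Week]/[WeekStart] columns are assumed names, not anything from the question:

let
    // Data already loaded to the sheet (the self-linked table; "Archive" is an assumed name)
    Archive = Excel.CurrentWorkbook(){[Name="Archive"]}[Content],
    // Separate query that imports only the two most recent weekly files ("RecentWeeks" is an assumed name)
    Recent = RecentWeeks,
    // Drop archived rows for the weeks being re-imported, so the fresh copies win
    KeptArchive = Table.SelectRows(Archive, each not List.Contains(Recent[Week], [Week])),
    Combined = Table.Combine({KeptArchive, Recent}),
    // Optionally keep only the trailing 52 weeks (assumes a [WeekStart] date column)
    Trimmed = Table.SelectRows(Combined, each [WeekStart] >= Date.AddWeeks(Date.From(DateTime.LocalNow()), -52))
in
    Trimmed

Loading Trimmed back to the "Archive" table closes the loop and makes the query self-linked.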

1 Answer


You can filter the files by their date of modification and then append only this new data to your existing data, so you avoid reloading it from all of the files on every refresh.

Filtering code:

let
    // All files in the source folder ("FILEPATH" is a placeholder)
    Source = Folder.Files("FILEPATH"),
    // Timestamp of the previous refresh, read from a one-cell table named "LastRefresh" in this workbook
    lastrefresh = Excel.CurrentWorkbook(){[Name="LastRefresh"]}[Content]{0}[LastRefresh],
    // Keep only files modified since that timestamp (i.e. the current week's file)
    #"Filtered Rows" = Table.SelectRows(Source, each [Date modified] > lastrefresh)
in
    #"Filtered Rows"
will1329