0

Hello all you power query wizards,

I have a similar question to this question: Timeseries with overlapping timeframes, using just the most recent in Excel Power Query, except my column isn't just a date column, but instead a date/time column. I am bringing together a directory of files that look like this and have overlapping times but I only want to keep the newer data instead of combining them together:

List A
List A

List B
List B

Does anyone have a strategy to accomplish this goal or is this something I should do outside of Power Query, such as python?

Many thanks in advance for any insight you can provide!


let
    Source = Folder.Files("C:\Users\xxxx\OneDrive\Documents\Atom Projects\10MinOrtho\2. Orthometric\2021-06\10MinOrthos"),
    #"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (2)", each #"Transform File (2)"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (2)"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (2)", Table.ColumnNames(#"Transform File (2)"(#"Sample File (2)"))),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Column1", type date}, {"Column2", type time}, {"Column3", type number}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Source.Name"}),
    #"Merged Date and Time" = Table.CombineColumns(#"Removed Columns", {"Column1", "Column2"}, (columns) => List.First(columns) & List.Last(columns), "Merged"),
    #"Sorted Rows" = Table.Sort(#"Merged Date and Time",{{"Merged", Order.Ascending}})
in
    #"Sorted Rows"

  • It is unclear why you highlighted the ones you did, which set you want to retain, and why – horseyride Jul 09 '21 at 12:42
  • My apologies, I have a series of 25+ files that have similar overlap to the two above. I am looking to combine them all into two columns one that is datetime and one that is the float value. I would like to retain the overlapping data from the second files(which are newer in time) and strip out the overlapping data from the first file. Then, I need it to iterate through all the files in a folder completing the same process for all files. – ForeverLostAtSea Jul 10 '21 at 01:48
  • I've shown how to handle the overlapping dare issue you asked about. You should be able to repeat that for each file you download. – Ron Rosenfeld Jul 10 '21 at 02:12
  • Hey Ron, thanks for the quick response! This is obviously a dumb question, but although I have a decent understanding of VBA, Power Query is super new to me, but anyway, where does this bit of code go? I have added the rest of my code that I am using in my question. Do I need to set up variables? Bring these in two files at a time? Again, I apologize, but this is a learning process. – ForeverLostAtSea Jul 10 '21 at 06:12
  • I do not get notified of comments you make to your own question. If you have a question about my answer, please post it as a comment to that answer. My answer was in response to the information you posted, showing two lists, and, as you asked, demonstrating a strategy for handling the problem. I suppose you could conceptualize your real problem as comparing the data you have already downloaded with the file coming in. So I would cycle through your files in date order, and delete the relevant lines from your previous data before merging the new. – Ron Rosenfeld Jul 10 '21 at 10:11

1 Answers1

0

You don't describe exactly what you want to do with the overlapped times.

I suggest

  • remove the entries from List A that are in the overlap region with List B.
  • This can be done with a simple filter based on the first time listed in List B
    • I have assumed that List B is in date/time sorted order. If not a minor code change will be required
  • Then append the two lists

M Code

let
    Source = Excel.CurrentWorkbook(){[Name="ListA"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"Value", type number}}),
    Source2 = Excel.CurrentWorkbook(){[Name="ListB"]}[Content],
    #"Changed Type2" = Table.TransformColumnTypes(Source2,{{"Date/Time", type datetime}, {"Value", type number}}),

//overlap starts at the first date from the second list
overlapStart = #"Changed Type2"[#"Date/Time"]{0},

//Filter list A to end before start time in List B
    filteredA = Table.SelectRows(#"Changed Type", each [#"Date/Time"] < overlapStart),

//now combine the two lists
    combLists = Table.Combine({filteredA,#"Changed Type2"})

in
    combLists

Lists A & B
enter image description here

Combined
enter image description here

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60