0

What I am trying to achieve is to obtain "matches/pairs" from two tables. One (source 1)is data table with Date/Time and Pressure value columns and the other (source 2) is like Date/Time and Info value Columns. Second table has so called "pairs" , start and stop in certain time. I want to get exact matches when is found in source 1 or approximate match when is not exact as in source 1 (seconds can be a problem).

Lets say you are matching/lookup two tables, give me everything that falls between for instance 15.01.2022 06:00:00 and 15.01.2022 09:15:29.

Where I have a problem is more likely exact match and seconds. It is skipping or cant find any pair if the seconds are not matching. So my question is how to make if not seconds then lookup for next availablee match, can be a minute too as long as they are in the given range (start stop instances). That is a reason I am getting this Expression error. Or is there a way to skip that error and proceed with Query??

Link to download the data:

https://docs.google.com/spreadsheets/d/1Jv5j7htAaEFktN0ntwOZCV9jesF43tEP/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true

On the code below is what I am trying to do:

    let

//Be sure to change the table names in the Source= and Source2= lines to be the actual table names from your workbook
    Source = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
       
//get start/stop times table
    Source2 = Excel.CurrentWorkbook(){[Name="Log_Original"]}[Content],
    typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
    #"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),

#"Added Custom" = Table.AddColumn(#"Added Index", "NextLineStart", each if Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) = true 
                and Text.Contains(#"Added Index"[Info]{[Index]+1},"start",Comparer.OrdinalIgnoreCase) = true 
            then "delete" 
            else null),
    #"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([NextLineStart] = null)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Index", "NextLineStart"}),

//create a list of all the relevant start/stop times
     filterTimes = List.Combine(
                        List.Generate(
                            ()=> [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{0}, 
                                                        Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{1}-#"Removed Columns1"[#"Date/Time"]{0})+1,
                                                        #duration(0,0,0,1)), IDX = 0],
                            each [IDX] < Table.RowCount(#"Removed Columns1"),
                            each [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{[IDX]+2}, 
                                                        Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{[IDX]+3}-#"Removed Columns1"[#"Date/Time"]{[IDX]+2})+1,
                                                        #duration(0,0,0,1)), IDX = [IDX]+2],
                            each [times]
                                    )
                            ),

//filter the table using the list
filterTimesCol = Table.FromList(filterTimes,Splitter.SplitByNothing()),
filteredTable = Table.Join(#"Changed Type","Date/Time",filterTimesCol,"Column1",JoinKind.Inner),
    #"Removed Columns" = Table.RemoveColumns(filteredTable,{"Column1"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each DateTime.ToText([#"Date/Time"],"dd-MMM-yy")),
    #"Filtered Rows2" = Table.SelectRows(#"Added Custom1", each [#"Date/Time"] > #datetime(2019, 01, 01, 0, 0, 0)),
    #"Sorted Rows" = Table.Sort(#"Filtered Rows2",{{"Date/Time", Order.Ascending}})
in
    #"Sorted Rows"
MmVv
  • 553
  • 6
  • 22

1 Answers1

0

I set up the below to return a sorted table with all results between the start and ending date/times. You can then select the first or middle or bottom row of each table if you want from this point. Its hard to tell from your question if you are looking for the value closest to the start value, closest to the end value or something inbetween. You can wrap my Table.Sort with a Table.FirstN or Table.LastN to pick up the first or last row.

I left most of your starting code alone

let Source = Table.Buffer(T1),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = T2,
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
// shift Info up one row for comparison
    shiftedList = List.RemoveFirstN(  #"Added Index"[Info],1),
    custom1 = Table.ToColumns(  #"Added Index") & {shiftedList},
    custom2 = Table.FromColumns(custom1,Table.ColumnNames(  #"Added Index") & {"NextInfo"}),
#"Filtered Rows2" = Table.SelectRows(custom2, each not (Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) and Text.Contains([NextInfo],"start", Comparer.OrdinalIgnoreCase))),
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Type", each if Text.Contains(Text.Lower([Info]),"start") then "start" else if Text.Contains(Text.Lower([Info]),"finished") then "finished" else null),
 #"Removed Columns2" = Table.RemoveColumns(#"Added Custom3",{"Info", "NextInfo"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns2", "Custom", each if [Type]="start" then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Type]), "Type", "Date/Time"),
#"Added Custom2" = Table.AddColumn(#"Pivoted Column","Table",(i)=>Table.Sort(Table.SelectRows(T1, each [#"Date/Time"]>=i[start] and [#"Date/Time"]<=i[finished]),{{"Date/Time", Order.Ascending}}) , type table )
in #"Added Custom2"

enter image description here

horseyride
  • 17,007
  • 2
  • 11
  • 22
  • Oh my man! You did a good approach! What I realised up till now, the loading time is horrible :D to check the table content as your second screenshot is taking minutes.. To explain it shortly on example: START: 06.08.2022 11:05:09 /// FINISHED: 06.08.2022 11:37:10 means that I wanna have all the pressures values within that time! Exactly as they are in second table where those pairs are. So you did actually good thing for a first step. I have two more questions but in second message cuz it goes over the limit. – MmVv Feb 08 '22 at 13:49
  • When you asked about which value I need, well from the start like 5th minute and like previous value to the end, closest to the end as you wrote. So I can manage that with First/LastN functions? – MmVv Feb 08 '22 at 13:51
  • edited code so it hopefully runs faster. Added a Table.Buffer and replaced the code to find the next row value with a more efficient version. As for the other part, still not what value you are looking to extract. Perhaps give a table of specific examples for the start/stop combinations with explanation – horseyride Feb 08 '22 at 14:15
  • Sorry for delay answer. I was checking queries..and also edited link where you can check new WB especially on sheet 2 for explanation and example what I wanna try. – MmVv Feb 08 '22 at 15:12
  • If you want the 5th row add a column = Table.AddColumn(#"Added Custom2", "Custom.1", each [Table]{4}) or change Order.Ascending}}) , type table ) to Order.Ascending}}){4} , type table ) – horseyride Feb 08 '22 at 15:25
  • Thanks for the input. I will give it a try a bit later now. I think you solved my issue, at least I am getting values from the query that I didnt manage to get anything but error. I will test it with new data sets tomorrow and will officially respond and accept your solution. – MmVv Feb 08 '22 at 15:34
  • Question, can I use table.buffer more times within one query? It is taking quite long to load it actually, even though there is like 1-2k of data. But it is taking from data from.folder -> convert -> then loading. – MmVv Feb 09 '22 at 08:55
  • Well, something is "wrong" in your code, or better, some steps or functions are simply taking long to do the action. I am "expanding table" already 20min, it is loading and loading...with my steps it is taking 2-3 mins. Still I am talking about 2k of data. Everything untill expanding is fine but after, and also once is in Excel sheet then still the sam thing, taking simply super long. – MmVv Feb 09 '22 at 11:51
  • Ok, well, then hopefully someone has some better code for you – horseyride Feb 09 '22 at 12:03
  • Btw, can you try why query is not functioning now with new data set? I uploaded the link and with your code. Something with "null and errors" inside. I tried workaround but at the end I am having error as last cell... – MmVv Feb 10 '22 at 12:59
  • Sorry you are having difficulties. Hopefully someone else has code that runs faster with your preferred data set – horseyride Feb 10 '22 at 14:00
  • I do have, right. But if you can give your tip what gone wrong would be nice. I am looking for optimal solution. – MmVv Feb 10 '22 at 14:04