-1

I have a data set which consists of Date/Time, Pressure and Custom Column. This represents pressure over time data, where I wanna know my starting point (after 5 minutes) and ending point of -before last value (row) within one month. To help you a bit out, usually the measurements are taking roughly 30-40 mins what you can see on this example down. So it means the amount of data can vary.

The Time column is calculated using:

=([@[Date/Time]]-I5)*1440+L5

This data set represents whole data and all the months with values, and I need separated (filtered) months with these starting/ending points as on the screenshot. I used Power Query a lot to play with data, but maybe there is another method to obtain those values...and make them dynamic when possible for future data.

enter image description here

I will also upload my dummy workbook with whole data set (all the months), filter table with months if needed for your infos and test.

https://docs.google.com/spreadsheets/d/1LGl-eri6ewCni2NJ2wGeoYIf-40KO2Lr/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true

MmVv
  • 553
  • 6
  • 22
  • What do you mean by *-before last value (row) within one month*? The value you selected is 2 before the last value of the month. For every five minutes, you can change your formula `L5: =MOD(INT(([@[Date/Time]]-$I$4)*1440),5)` Fill down and filter on `0` – Ron Rosenfeld Jan 19 '22 at 12:50
  • @RonRosenfeld Not every 5 minutes but after 5th minute I need a value..as you can see in Time column there is progressive time values that are going till 40 minutes and only once you have 5,0 minute (5th minute) but for one month. I need that "5th minute" for every month. And I dont wanna last value in the month but one value before - that means "before last..." – MmVv Jan 19 '22 at 13:03
  • But the last value in Jan is `1/2/2020 13:03`. The next-to-last is `1/2/2020 13:02`. But you are selecting `1/2/2020 13:01` which is **not** the next-to-last. And just to confirm, you only want that very **first** five minute reading in each month – Ron Rosenfeld Jan 19 '22 at 13:30
  • @RonRosenfeld Ah yeah, my bad, I misstyped it...you are right, it 13:02... Exactly, I need that 5th minute value (just that value not all the reading within 5 minutes), and next-to-last value as you stated, but per month... – MmVv Jan 19 '22 at 13:44
  • What do you expect to return if there is more than one sample during the fifth minute? – Ron Rosenfeld Jan 19 '22 at 15:52
  • @RonRosenfeld Hmm so far that cant be happening, because the equation (Time column) from my post is always calculating different time, so I am sure it is gonna be one value..but in an opposite case just take the first value, usually the values (pressure) is mostly constant for some period of time. – MmVv Jan 19 '22 at 16:00

1 Answers1

1

In Power Query:

Based on your shared workbook and what you have written, it seems that for any given month, you


edit: minor change in algorithm

  • start the minute count after excluding the first entry in the month.
    • If that is a typo/error, just remove the function that removes that first line
  • with that second entry = minute 0, return the first entry in or after minute 5 as well as the next to last entry in the table.
  • Note that I started with just the Date and Pressure columns

Algorithm

  • Add a column of monthYear
  • GroupBy monthYear
    • Custom aggregation to
      • Remove the first and last rows of the table
      • Create a list of durations in minutes of each time compared with the first time in month. This will be a minute + fraction of a minute
      • Add that list as a column to the original table
      • Determine the first entry in or after the fifth minute
      • Determine the last entry
      • Filter the month subtable to return those two entries.

If you want to see the result for just a given month, you can filter the result in the resultant Excel table.

M Code
please read the comments and examine the Applied Steps to better understand the algorithm

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),

//add month/year column for grouping
    #"Added Custom" = Table.AddColumn(#"Changed Type", "month Year", 
        each Number.ToText(Date.Month([#"Date/Time"]),"00") & Number.ToText(Date.Year([#"Date/Time"]),"0000")),
    #"Grouped Rows" = Table.Group(#"Added Custom", {"month Year"}, {
        //elapsed minutes column
        {"Elapsed Minutes", (x)=> let

        //remove first and last rows from table  
            t=Table.RemoveColumns(Table.RemoveFirstN(Table.RemoveLastN(x)),"month Year"),

        //add a column with the elapsed minutes
            TableToFilter = Table.FromColumns(
                Table.ToColumns(t) 
                & {List.Generate(
                    ()=>[em=null, idx=0],
                    each [idx]< Table.RowCount(t),
                    each [em=Duration.TotalMinutes(t[#"Date/Time"]{[idx]+1} - t[#"Date/Time"]{0}), idx=[idx]+1],
                    each [em])}, type table[#"Date/Time"=datetime, #"P7 [mbar]"=number, elapsed=number]),

        //filter for last entry (which would be next to last in the month
            maxMinute = List.Max(TableToFilter[elapsed]),

        //filter for first entry in the 5th minute
            fifthMinute = List.Select(TableToFilter[elapsed], each Number.IntegerDivide(_,1)>=5){0},

        //select the 5th minute and the last row
            FilteredTable = Table.SelectRows(TableToFilter, each [elapsed]=fifthMinute or [elapsed]=maxMinute)
            in FilteredTable,type table[#"Date/Time"=datetime, #"P7 [mbar]"=number, elapsed=number]}
            }),

    //remove uneeded column and expand the others
    #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"month Year"}),
    #"Expanded Elapsed Minutes" = Table.ExpandTableColumn(#"Removed Columns", "Elapsed Minutes", {"Date/Time", "P7 [mbar]"}, {"Date/Time", "P7 [mbar]"})
in
    #"Expanded Elapsed Minutes"

Results from your shared workbook data
enter image description here

In Office/Excel 365

Filter Column (eg for January 2020)

E4: 1/1/2020
E5: 1/1/2020

Results

F4 (date/time 5th minute): =IF(COUNTIFS(Table1[Date/Time],">="&E4,Table1[Date/Time],"<" & EDATE(E4,1))=0,"",
LET(x,FILTER(Table1[Date/Time],(Table1[Date/Time]>=E4)*(Table1[Date/Time]<EDATE(E4,1))),
          y, (x-INDEX(x,2))*1440,
          z, XMATCH(5,y,1),
          INDEX(x,z,1)))

G4: (Pressure 5th minute): =IF(F4="","",
LET(x,FILTER(Table1,(Table1[Date/Time]>=E4)*(Table1[Date/Time]<EDATE(E4,1))),
          y, (INDEX(x,0,1)-INDEX(x,2,1))*1440,
          z, XMATCH(5,y,1),
          INDEX(x,z,2)))

F5: (Date next to last): =IF(COUNTIFS(Table1[Date/Time],">="&E5,Table1[Date/Time],"<" & EDATE(E5,1))=0,"",
LET(x,FILTER(Table1[Date/Time],(Table1[Date/Time]>=E5)*(Table1[Date/Time]<EDATE(E5,1))),
          INDEX(x,COUNT(x)-1)))

G5: (Pressure next to last):=IF(F5="","",
LET(x,FILTER(Table1,(Table1[Date/Time]>=E5)*(Table1[Date/Time]<EDATE(E5,1))),
          INDEX(x,COUNT(INDEX(x,0,1))-1,2)))

enter image description here

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
  • Wow...didnt even know this was possible in PQ :/ look my example works as a charm, but I refreshed the link (added sheet 2) with the same workbook, where it gives me an error even tough there are pairs... can you have a peek? The functions I will test right now. Huge thanks! – MmVv Jan 20 '22 at 09:01
  • @Mirza The error is because during June 2021, there is NO entry that occurs at the relevant time. Perhaps I misunderstood something. The second data is collected at `6/10/2021 13:22:43` Fifth minute would be something in the range of `13:22:43 - 13:27:42` Since there is no data with that time stamp, the routine errors. What would you like to occur? – Ron Rosenfeld Jan 20 '22 at 12:38
  • Hmm, but the routine should go for another values that got "matches"? I mean it stopps on February 2021..but there is way go to November 2021...can it be done for all? As you can see in my dummy workbook where the "filter table" is, there you have June data.. – MmVv Jan 20 '22 at 12:53
  • @Mirza I relied on your description of what you wanted, and your data in the first two columns of the worksheets. I thought you wanted an entry in the fifth minute after the start of data collection. If you want the first entry after five minutes of data collection, even if it is not during the fifth minute, that is an easy change. I have no idea of the relevance of your "filter table to anything". At least for June 2021, the entries do not correspond to any entries in columns A:B on either sheet. – Ron Rosenfeld Jan 20 '22 at 13:16
  • Can you give me a tweak to check it? Hopefully we understood each other. I mean this works from you, but I added some new data and all of a sudden I got an error...it will always be different data, where I need those "pairs". – MmVv Jan 20 '22 at 13:25
  • And btw, with functions its working superbly! – MmVv Jan 20 '22 at 13:25
  • @Mirza To return the first entry **after** five minutes of data collection, starting with zero time being the second entry of the month, you merely need to change the `fifthMinute` comparison from `=5` to `>=5` – Ron Rosenfeld Jan 20 '22 at 13:28
  • Now we are talking. This seems good now, same results as with functions but with my tweak around. I think this is it, will test it today with a bunch new data. Should work. Thanks a lot!!! – MmVv Jan 20 '22 at 13:40
  • Ah its me again, I have noticed that it still has an issues finding >=5 min from the list even tough they are existing. It improved much more but still is missing values...with functions is all good! – MmVv Jan 20 '22 at 15:29
  • @Mirza An example that reproduces your issue please. – Ron Rosenfeld Jan 20 '22 at 18:07
  • The link has been updated. Sheet 3 is newest. – MmVv Jan 20 '22 at 18:35
  • @Mirza And which values are missing? – Ron Rosenfeld Jan 20 '22 at 19:15
  • When you go Query Solution table....there you have some pairs like: 18.02 / 19.02.2021.....27.06 / 28.06.... and so on, it is only written one value, so 5th minute, but not end value or "last". As you can see on left table values, there are actually values inside, but on right table it does show only one value. – MmVv Jan 20 '22 at 19:44
  • @Mirza That type of return is what you specified. *starting point (after 5 minutes) and ending point of -before last value (row) within one **month**.* Both 18.02.21 and 19.02.21 are in the same **month** (February). – Ron Rosenfeld Jan 20 '22 at 20:44
  • ehh, I forgot the pressure can and might be measured more times in a month.... I was thinking that with this Query I will get all the pairs, so if there are more days within a month so show them all....I am sorry for not clearing it 100%, but I didnt know I will bump on this examples... can you tweak it somehow? As I say, functions are totally fine, they are giving 100% correct values and pairs. – MmVv Jan 20 '22 at 20:54
  • @mirza If you are sure that's what you really want, then just group by the **date** instead of the **month/year** – Ron Rosenfeld Jan 20 '22 at 21:41
  • Yeah, I checked the data and upcoming data, it is actually important to have all pairs within month, and not only monthly basis...but all the days too, because the system (reactor) directions can be changed and thats a reason it might give different pressure/days. So thats the only tweak? – MmVv Jan 20 '22 at 21:51
  • @Mirza What happened when you tried that? Did you get the results you now say you want? If not, what is the problem? – Ron Rosenfeld Jan 20 '22 at 23:45
  • Hm not sure if I can solve it properly, it gives me an errors big time. Need to change many things, data types, and I got "nulls". – MmVv Jan 21 '22 at 07:45
  • @Mirza All you need to change is the formula in the added column `month year` to `=Date.From([Date])` Nothing else. – Ron Rosenfeld Jan 21 '22 at 08:25
  • Yeah...so much about my PQ experience :/ it works very good now. I am getting all the infos that I will later use for another data... Thanks a lot one more time! You made a an amazing effort for this. Power Query is precious. – MmVv Jan 21 '22 at 09:35
  • YOu gonna hate me. I tested couple of examples and it is mostly okay, but question: Expression.Error: There weren't enough elements in the enumeration to complete the operation. Details: [List] Is there a way to ignore/remove error within query but not to stop the whole query of loading data? There is an example where I have like 4 values (and it should start from 5th) for one day, and then going to another day where I have enough pairs...but the query stopped because of the error. – MmVv Jan 21 '22 at 13:51
  • Is it possible to omit the errors (Expression Error) within a table after expanding step? They are supressing the query of processing the later data that are not errors? Is there any special function or step to solve it? – MmVv Jan 26 '22 at 13:07
  • @Mirza How to handle that depends on exactly what you want to do in the event that there are less than five minutes of data collection in the day. I believe the error is coming from the `fifthMinute` line. You could wrap that in a `try ... otherwise` and depending on the `otherwise` return either the first and last values, or just the last values. Or, with more complicated testing, you could return nothing for that day. – Ron Rosenfeld Jan 26 '22 at 13:42
  • Hmm that approach I used before, but on Source step (not in this example), but now I am not sure where or how to put it/wrap it, It gives me an token errors that something is missing all the time – MmVv Jan 26 '22 at 15:37
  • @Mirza As I wrote, you put it on the `fifthMinute` line: ` fifthMinute = try List.Select(TableToFilter[elapsed], each Number.IntegerDivide(_,1)>=5){0} otherwise null,` – Ron Rosenfeld Jan 26 '22 at 16:59
  • Seems to work now, I tried on all examples what I have a got infos what I need. Thanks a lot. – MmVv Jan 27 '22 at 06:13
  • Sorry again, but now is FUNCTIONS thing, I realised that G5 calculation (end time or finished in my case) is giving me very wrong values when is matching per day I think it is giving me MAX of whole Month, and this is not what I wanna, but last value -1 for specific day. – MmVv Jan 27 '22 at 15:51
  • 1
    @Mirza As we did in the Power Query script, if you want to use the worksheet functions, you will need to modify the data so as to look only for specific dates, instead of month/years. I will leave you to work out the specifics but suggest you start with how we modified the PQ code. – Ron Rosenfeld Jan 28 '22 at 00:07
  • I will try to find out of the jungle – MmVv Jan 28 '22 at 09:46
  • NAh....so good in Excel Iam not...this what you did with =LET and so on, I never used before.... :/ I was trying to do with =max(if) between dates but I am not getting it properly. – MmVv Jan 30 '22 at 16:14