1

I don't want an answer based on DAX or Excel formulas.

I have two tables both with Id and cumulative data. Both fields are table independent.

I need to create a new table, table3, which is a join between table1 and the cumulative2 field from table2. The join criteria is to find the first cumulative2 value superior to cumulative1, as shown in example:

enter image description here

TylerH
  • 20,799
  • 66
  • 75
  • 101
JoaoSa
  • 117
  • 1
  • 1
  • 6

1 Answers1

1

I don't think you can do this with a "join" per se, but you can definitely do it with a custom column.

let
    Source = Table.FromRows({{1,1000},{2,2000},{3,3000},{4,4000}},{"ID1","Cumulative1"}),
    #"Added Custom" = Table.AddColumn(Source, "Cumulative2", (T1) => List.Min(Table.SelectRows(Table2, each [Cumulative2] >= T1[Cumulative1])[Cumulative2]), type number)
in
    #"Added Custom"

or formatted a bit

let
    Source = Table.FromRows({{1,1000},
                             {2,2000},
                             {3,3000},
                             {4,4000}},
                            {"ID1","Cumulative1"}
             ),
    #"Added Custom" = Table.AddColumn(Source, "Cumulative2",
                          (T1) => List.Min(
                                      Table.SelectRows(Table2, 
                                          each [Cumulative2] >= T1[Cumulative1]
                                      )[Cumulative2]
                                  ),
                          type number
                      )
in
    #"Added Custom"

So in each row in Table1, I take Table2 and select just the rows where

Table2[Cumulative2] >= Table1[Cumulative1]

and then return only the Cumulative2 column from that table (and a single column is a list).

Since I want the first value in that list, I use List.Min.


Edit: It might be a bit cleaner and more efficient with this instead:

(T1) => List.First(List.Select(Table2[Cumulative2], each _ >= T1[Cumulative1])),
Alexis Olson
  • 38,724
  • 7
  • 42
  • 64
  • there's just an efficence problem. It takes about 2 or 3 seconds per line to run the query. Any suggestion? Thanks. – JoaoSa Feb 20 '19 at 15:13
  • 1
    The following might be a bit faster: `(T1) => List.First(List.Select(Table2[Cumulative2], each _ >= T1[Cumulative1]))` – Alexis Olson Feb 20 '19 at 15:58
  • Thank you, but the performance issues still remain, I also tried a Buffer but it didnt get better. – JoaoSa Feb 20 '19 at 17:57
  • How many buckets are in `Table2` in your actual data and how static is that table? If it's a small table that doesn't change much, then there are better options. – Alexis Olson Feb 20 '19 at 18:44
  • Both table are for daily data input, In this moment I just added 30 records for each one but probably by the end of the year it will have 1000 lines. Sorry, I dont quite know what you mean with buckets. Thanks – JoaoSa Feb 21 '19 at 08:31
  • So `Table2` grows significantly too? Can you give any insight into how the tables are related? Do the IDs correspond to anything? What do the cumulative columns add up? – Alexis Olson Feb 21 '19 at 14:50
  • The real case is that I have a material table which has a specific reference for quality (ID) - Table2. With that material you produce some diferent type of finished product - Table1. F.e, yesterday I produced 10un and today 30un of finshed product. I need to know the quality of the material i am using in the finished product. As it has no waist I can search for cumulative weight. Of course I will have another problem because 1un of finished product can have 2 quality type in the material transition (but that is other problem I have to deal with). – JoaoSa Feb 21 '19 at 15:54
  • Sorry, do you have any further suggestion? – JoaoSa Feb 27 '19 at 14:58
  • Why is quality a cumulative value? – Alexis Olson Feb 27 '19 at 15:35
  • Sorry, I didn't make myself clear. Table 2 - Input (index is the raw material used which has a weight. I also have quality parameters in other columns associated with index) Table 1 - Output (index is the finished product which has a weight) For both tables, I can calculate the cumulative values for input/output. With cumulative value I can relate them: first input or raw material will produce n outputs or finished products based on the cumulative value) – JoaoSa Feb 27 '19 at 15:59
  • I'm not quite following. I'd recommend posting a new question where you clarify the context more fully and use more meaning column names. – Alexis Olson Feb 27 '19 at 16:38