0

I have a problem filtering a DataTable on different criteria. I know the first where-clause

where row.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero

is why the third criterion isn't met. Is there any way to change my query to meet all requirements?

  1. DateDifference should be positive.
  2. The smallest DateDifference should be selected.
  3. All InventoryChanges must be in the result. So a negative DateDifference is allowed if there is no positive DateDiff. The smallest negative DateDiff should be selected.

    ArticleNo   Article     Price   PriceSet    InventoryChange DateDifference  StockDifference
            1   Article A   10      01.01.2012  02.01.2012      1               -2
            1   Article A   11      01.06.2012  02.01.2012      -151            -2
            2   Article B   14      01.01.2012  05.01.2012      4               1
            2   Article B   14      01.01.2012  04.10.2012      277             -3
            2   Article B   13      01.06.2012  05.01.2012      -148            1
            2   Article B   13      01.06.2012  04.10.2012      125             -3
            3   Article C   144     01.04.2012  28.02.2012      -33             -1
            3   Article C   124     01.05.2012  28.02.2012      -63             -1
    
    My result:          
            1   Article A   10      01.01.2012  02.01.2012      1               -2
            2   Article B   14      01.01.2012  05.01.2012      4               1
            2   Article B   13      01.06.2012  04.10.2012      125             -3
    
    What I want to have is a table where the last row, where there is no positive DateDifference, is added.
    The row with the smallest DateDifference should be selected:
            1   Article A   10      01.01.2012  02.01.2012      1               -2
            2   Article B   14      01.01.2012  05.01.2012      4               1
            2   Article B   13      01.06.2012  04.10.2012      125             -3
            3   Article C   144     01.04.2012  28.02.2012      -33             -1
    

My query so far:

var query = from row in InventoryChanges.AsEnumerable()
                    where row.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero
                    group row by new
                    {
                        ArticleNo = row.Field<Int32>("ArticleNo"),
                        Article = row.Field<String>("Article"),
                        InventoryChange = row.Field<DateTime>("InventoryChange"),
                        StockDifference = row.Field<Int32>("StockDifference")
                    }
                    into grp
                    select new
                    {
                        ArticleNo = grp.Key.ArticleNo,
                        Article = grp.Key.Article,
                        InventoryChange = grp.Key.InventoryChange,
                        PriceSet = grp.Where(r => r.Field<TimeSpan>("DateDifference") == grp.Select(min => min.Field<TimeSpan>("DateDifference")).Min())
                            .Select(r => r.Field<DateTime>("PriceSet")).FirstOrDefault(),
                        DateDifference = grp.Select(r => r.Field<TimeSpan>("DateDifference")).Min(),
                        StockDifference = grp.Key.StockDifference,
                        Price = grp.Where(r => r.Field<TimeSpan>("DateDifference") == grp.Select(min => min.Field<TimeSpan>("DateDifference")).Min())
                            .Select(r => r.Field<Decimal>("Price")).FirstOrDefault(),
                    };

Any help is appreciated!

DataTable InventoryChanges = new DataTable("InventoryChanges");

            InventoryChanges.Columns.Add("ArticleNo", typeof(Int32));
            InventoryChanges.Columns.Add("Article", typeof(String));
            InventoryChanges.Columns.Add("Price", typeof(Decimal));
            InventoryChanges.Columns.Add("PriceSet", typeof(DateTime));
            InventoryChanges.Columns.Add("InventoryChange", typeof(DateTime));
            InventoryChanges.Columns.Add("DateDifference", typeof(TimeSpan));
            InventoryChanges.Columns.Add("StockDifference", typeof(Int32));

            DataRow dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 1, "Article A", 10, new DateTime(2012, 1, 1), new DateTime(2012, 1, 2), new TimeSpan(1, 0, 0, 0), -2 };
            InventoryChanges.Rows.Add(dr);
            dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 1, "Article A", 11, new DateTime(2012, 6, 1), new DateTime(2012, 1, 2), new TimeSpan(-151, 0, 0, 0), -2 };
            InventoryChanges.Rows.Add(dr);
            dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 2, "Article B", 14, new DateTime(2012, 1, 1), new DateTime(2012, 1, 5), new TimeSpan(4, 0, 0, 0), 1 };
            InventoryChanges.Rows.Add(dr);
            dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 2, "Article B", 14, new DateTime(2012, 1, 1), new DateTime(2012, 10, 4), new TimeSpan(277, 0, 0, 0), -3 };
            InventoryChanges.Rows.Add(dr);
            dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 2, "Article B", 13, new DateTime(2012, 6, 1), new DateTime(2012, 1, 5), new TimeSpan(-148, 0, 0, 0), 1 };
            InventoryChanges.Rows.Add(dr);
            dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 2, "Article B", 13, new DateTime(2012, 6, 1), new DateTime(2012, 10, 4), new TimeSpan(125, 0, 0, 0), -3 };
            InventoryChanges.Rows.Add(dr);
            dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 3, "Article C", 144, new DateTime(2012, 4, 1), new DateTime(2012, 2, 28), new TimeSpan(-33, 0, 0, 0), -1 };
            InventoryChanges.Rows.Add(dr);
            dr = InventoryChanges.NewRow();
            dr.ItemArray = new object[] { 3, "Article C", 124, new DateTime(2012, 5, 1), new DateTime(2012, 2, 28), new TimeSpan(-63, 0, 0, 0), -1 };
            InventoryChanges.Rows.Add(dr);
ekad
  • 14,436
  • 26
  • 44
  • 46
abeldenibus
  • 128
  • 7
  • It would really help if you'd provide the sample data as code if that's viable. `var tbl=new DataTable;tbl.Columns.Add(..);tbl.Rows.Add(...);...` – Tim Schmelter Sep 26 '13 at 10:08
  • The data is read from a .csv-file and the DataTable is created dynamically. But I'll create a DataTable having the same structure. – abeldenibus Sep 26 '13 at 10:13
  • are we not taking only positive DateDifference here ..."where row.Field("DateDifference") >= TimeSpan.Zero" ? – Gopesh Sharma Sep 26 '13 at 10:16
  • That's what I'd like to change. As I said I know that this line is why the third criterion isn't met but I didn't get any closer to the result I want to have than this. – abeldenibus Sep 26 '13 at 10:25

1 Answers1

1

Maybe there are more elegant approaches but this should work:

var query = InventoryChanges.AsEnumerable()
.GroupBy(r => new
{
    ArticleNo = r.Field<Int32>("ArticleNo"),
    Article = r.Field<String>("Article"),
    InventoryChange = r.Field<DateTime>("InventoryChange"),
    StockDifference = r.Field<Int32>("StockDifference")
})
.Select(grp =>
{
    IEnumerable<DataRow> rows = grp;
    bool anyPositiveDateDiff = grp.Any(r => r.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero);
    if (anyPositiveDateDiff)
        rows = grp.Where(r => r.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero);
    var firstRow = rows
        .OrderBy(r => r.Field<TimeSpan>("DateDifference").Duration()).First();        
    return new
    {
        ArticleNo = grp.Key.ArticleNo,
        Article = grp.Key.Article,
        InventoryChange = grp.Key.InventoryChange,
        PriceSet = firstRow.Field<DateTime>("PriceSet"),
        DateDifference = rows.Min(r => r.Field<TimeSpan>("DateDifference")),
        StockDifference = grp.Key.StockDifference,
        Price = firstRow.Field<Decimal>("Price")
    };
});

I'm checking if there are rows in the group with positive timespans at bool anyPositiveDateDiff. Then i replace the rows of the group with the positive timespan rows.

Note also that i have simplified and improved the sub-queries in the select where you create the anonymous types.

Edit This is the result of above query according to your provided sample data:

{ ArticleNo = 2, Article = Article B, InventoryChange = 05.01.2012 00:00:00, PriceSet = 01.01.2012 00:00:00, DateDifference = 4.00:00:00, StockDifference = 1, Price = 14 }
{ ArticleNo = 2, Article = Article B, InventoryChange = 04.10.2012 00:00:00, PriceSet = 01.06.2012 00:00:00, DateDifference = 125.00:00:00, StockDifference = -3, Price = 13 }
{ ArticleNo = 3, Article = Article C, InventoryChange = 28.02.2012 00:00:00, PriceSet = 01.04.2012 00:00:00, DateDifference = -63.00:00:00, StockDifference = -1, Price = 144 }
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • This solution is very close but it does not select the smallest DateDifference when there is no positive one. – abeldenibus Sep 26 '13 at 10:59
  • @abeldenibus: Edited my answer at `var firstRow = ...`. Imho you can simply use `TimeSpan.Duration` as `OrderBy` argument since that it the absolute timespan. – Tim Schmelter Sep 26 '13 at 11:06
  • I just had to change Decimal in 'PriceSet' TimeSpan and changed the selection of DateDifference to `DateDifference = anyPositiveDateDiff ? rows.Min(r => r.Field("DateDifference")) : rows.Max(r => r.Field("DateDifference")),`. You really helped me a lot! It's always nice to learn from others. Thanks for cleaning up my sub-query as well ;) – abeldenibus Sep 26 '13 at 11:22
  • @abeldenibus: I've already changed the type in one of my edits ;) I don't know why you need that conditional operator since my `TimeSpan.Duration` does exactly this. It orders the rows by the absolute timespan and takes the smallest. So if the TimeSpan is positive it takes the smallest and if it's negative it takes the largest. – Tim Schmelter Sep 26 '13 at 11:26
  • I changed it again to `DateDifference = firstRow.Field("DateDifference"),` because in your subquery the .Min() of all rows in grp was selected. Have a look at your result. The DateDifference is -63 where it should be -33. You're correct that the sorting was correct so I wondered why the DateDiff wasn't. Taking the value from the firstRow instead of getting the minimum of the group does the job. Thanks again! – abeldenibus Sep 26 '13 at 11:47