It seems that you’d like to group your data and delete the duplicates from each group (documents that share a ProductIdentifier value but have a lower timestamp). As far as I know, GroupBy is not currently supported in DocumentDB. However, you can load the data, group it on the client via LINQ to get the ProductIdentifier of each group, and then query the documents with the same ProductIdentifier and delete the duplicates.
var query = client.CreateDocumentQuery<MyDoc>(UriFactory.CreateDocumentCollectionUri("testdb", "testcoll"))
    .Where(d => d.ProductIdentifier != "");
List<MyDoc> list1 = query.ToList();
var result = list1.GroupBy(item => new
    {
        ProductIdentifier = item.ProductIdentifier,
        ProductTitle = item.ProductTitle
    })
    .Select(group => new
    {
        ProductIdentifier = group.Key.ProductIdentifier,
        ProductTitle = group.Key.ProductTitle
    });
foreach (var item in result)
{
    var query1 = client.CreateDocumentQuery<MyDoc>(UriFactory.CreateDocumentCollectionUri("testdb", "testcoll"))
        .Where(d => d.ProductIdentifier == item.ProductIdentifier && d.ProductTitle == item.ProductTitle);
    if (query1.Count() > 1)
    {
        //delete duplicates from a group
    }
}
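For the "delete duplicates from a group" step, one possible way to pick which documents to remove is sketched below. It is only an illustration under assumptions: the `Id` and `Timestamp` properties on `MyDoc` are hypothetical (with `Timestamp` standing in for the document's `_ts` value), the `DuplicateFinder` helper is not part of any SDK, and it groups by ProductIdentifier only. It works on documents already loaded into memory and does not call DocumentDB itself.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of MyDoc. Id and Timestamp are assumptions for
// illustration; Timestamp stands in for the document's _ts value.
public class MyDoc
{
    public string Id { get; set; }
    public string ProductIdentifier { get; set; }
    public string ProductTitle { get; set; }
    public long Timestamp { get; set; }
}

// Hypothetical helper, not part of the DocumentDB SDK.
public static class DuplicateFinder
{
    // Returns every document except the newest one in each
    // ProductIdentifier group. If two documents in a group share the
    // highest timestamp, the one that appears first in the input is kept.
    public static List<MyDoc> FindStale(IEnumerable<MyDoc> docs)
    {
        return docs
            .GroupBy(d => d.ProductIdentifier)
            .SelectMany(g => g
                .OrderByDescending(d => d.Timestamp) // newest first
                .Skip(1))                            // keep the newest, return the rest
            .ToList();
    }
}
```

Each document returned by `DuplicateFinder.FindStale(list1)` could then be removed with the SDK's `client.DeleteDocumentAsync(UriFactory.CreateDocumentUri("testdb", "testcoll", doc.Id))`. If the collection is partitioned, that call would presumably also need a `RequestOptions` carrying the partition key.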
Alternatively, as Larry Maccherone said in this thread, documentdb-lumenize is an aggregation library for DocumentDB, written as a stored procedure, that can perform the GroupBy for us.
string configString = @"{
    cubeConfig: {
        groupBy: 'ProductIdentifier',
        field: '_ts',
        f: 'max'
    },
    filterQuery: 'SELECT * FROM c'
}";
Object config = JsonConvert.DeserializeObject<Object>(configString);
dynamic result = await client.ExecuteStoredProcedureAsync<dynamic>(UriFactory.CreateStoredProcedureUri("testdb", "testcoll", "cube"), config);
//get group info from result.Response