I have hundreds of records like these in a MongoDB collection:

{
    "city" : "London",
    "eventTime": 1582866000,
    "createdTime": 1582900145,
    "eventDetails": {
        ...
    }
}

This is the corresponding Event class:

public class Event {
    private String city;
    private long eventTime;
    private long createdTime;
    private EventDetails eventDetails;

    //getters and setters
}

- eventTime (all time values are in local epoch/Unix time) always corresponds to a whole hour
- createdTime is used for version control (the greater the createdTime, the more recent the version of this information)
- I'm not using update/upsert because I have uses for the older information

I want to be able to query something like this:

1) Given a version timestamp, return a list of objects (the entire object) for each city and each eventTime, where createdTime is the closest (floor or ceiling) to that timestamp
2) Same as 1), but for max(createdTime), for each city and each eventTime

Essentially, there will be many duplicates for each city and each eventTime because of multiple versions, and I only want the version that I specify (query 1) or the latest version (query 2).

I'm trying to run these two queries using mongoTemplate in Java (Spring Mongo). I've tried Query, Aggregation, and AggregationOperator.Max, but nothing seems to work:

Aggregation aggregation = Aggregation.newAggregation(
    Aggregation.match(Criteria.where("city").is(city)),
    Aggregation.match(Criteria.where("eventTime").gte(startTime).lte(endTime)),
    Aggregation.group("$$ROOT").max("createdTime").as("createdTime")
);
AggregationResults<Event> groupResults = mongoTemplate.aggregate(aggregation, "collection-name", Event.class);
result = groupResults.getMappedResults();

Sample data and expected outputs: https://script.google.com/d/1JgOFzdVBGiueYpXT8u-R6AJpbX6FxxtkVwq_NpLraZeS19Pxh6zmnATb/edit?usp=sharing

I have also tried these queries in the MongoDB shell without success. The result is similar, but it does not return the object structure that I want:

db.getCollection('collection-name').aggregate([{$group:{_id : "$$ROOT", createdTime : {$max:"$createdTime"}}}])
Valijon
Prady

1 Answer

Max

db.collection.aggregate([
  {
    $group: {
      _id: {
        city: "$city",
        eventTime: "$eventTime"
      },
      max: {
        $max: "$createdTime"
      },
      data: {
        $push: "$$ROOT"
      }
    }
  },
  {
    $project: {
      max_events: {
        $filter: {
          input: "$data",
          cond: {
            $eq: [
              "$max",
              "$$this.createdTime"
            ]
          }
        }
      }
    }
  },
  {
    $unwind: "$max_events"
  },
  {
    $replaceRoot: {
      newRoot: "$max_events"
    }
  }
])

MongoPlayground

MongoTemplate

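import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;

import java.util.Arrays;
import org.bson.Document;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationExpression;
import org.springframework.data.mongodb.core.aggregation.AggregationOperationContext;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;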

Aggregation agg = Aggregation.newAggregation(
    group("city", "eventTime").max("createdTime").as("max").push("$$ROOT").as("data"),
    project().and(new AggregationExpression() {
        @Override
        public Document toDocument(AggregationOperationContext context) {
            return new Document("$filter",
                    new Document("input", "$data")
                         .append("cond", new Document("$eq", Arrays.asList("$max", "$$this.createdTime"))));
        }
    }).as("max_events"),
    unwind("max_events"),
    replaceRoot("max_events")
);

AggregationResults<Event> events = mongoTemplate.aggregate(agg, mongoTemplate.getCollectionName(Event.class), Event.class);

for(Event ev: events) {
    System.out.println(ev);
}
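
To sanity-check what the Max pipeline returns, the same selection can be reproduced in memory. The following is only a minimal plain-Java sketch, not part of the MongoDB solution: `LatestVersionSketch`, `Ev` (a simplified stand-in for Event without eventDetails) and the sample values are all hypothetical. It groups records by (city, eventTime) and keeps the one with the largest createdTime. One difference to be aware of: if several records tie on the maximum createdTime, the `$filter` stage above keeps all of them, while this sketch keeps only the first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LatestVersionSketch {

    // Hypothetical, simplified stand-in for the Event class (eventDetails omitted).
    static final class Ev {
        final String city;
        final long eventTime;
        final long createdTime;

        Ev(String city, long eventTime, long createdTime) {
            this.city = city;
            this.eventTime = eventTime;
            this.createdTime = createdTime;
        }

        @Override
        public String toString() {
            return city + " eventTime=" + eventTime + " createdTime=" + createdTime;
        }
    }

    // For each (city, eventTime) pair, keep the record with the largest createdTime.
    static List<Ev> latestPerCityAndEventTime(List<Ev> events) {
        Map<String, Ev> latest = new LinkedHashMap<>();
        for (Ev e : events) {
            String key = e.city + "|" + e.eventTime;
            // merge() passes (existing, incoming); keep the incoming one only if it is newer.
            latest.merge(key, e, (a, b) -> b.createdTime > a.createdTime ? b : a);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        // Made-up sample values, two versions of the same London event plus two other groups.
        List<Ev> events = Arrays.asList(
            new Ev("London", 1582866000L, 1582900145L),
            new Ev("London", 1582866000L, 1582903745L),
            new Ev("London", 1582869600L, 1582900145L),
            new Ev("Paris", 1582866000L, 1582900145L));
        for (Ev e : latestPerCityAndEventTime(events)) {
            System.out.println(e);
        }
    }
}
```

Comparing this output with the documents returned by mongoTemplate for a small sample is a quick way to confirm the grouping key and the version field are the ones intended.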

Closest

db.collection.aggregate([
  {
    $group: {
      _id: {
        city: "$city",
        eventTime: "$eventTime"
      },
      data: {
        $push: "$$ROOT"
      }
    }
  },
  {
    $project: {
      closest: {
        $reduce: {
          input: "$data",
          initialValue: {
            $arrayElemAt: [
              "$data",
              0
            ]
          },
          in: {
            $cond: [
              {
                $lt: [
                  {
                    $abs: {
                      $subtract: [
                        "$$this.eventTime",
                        "$$this.createdTime"
                      ]
                    }
                  },
                  {
                    $abs: {
                      $subtract: [
                        "$$value.eventTime",
                        "$$value.createdTime"
                      ]
                    }
                  }
                ]
              },
              "$$this",
              "$$value"
            ]
          }
        }
      }
    }
  },
  {
    $unwind: "$closest"
  },
  {
    $replaceRoot: {
      newRoot: "$closest"
    }
  }
])

MongoPlayground

MongoTemplate

Aggregation agg = Aggregation.newAggregation(
    group("city", "eventTime").push("$$ROOT").as("data"),
    project().and(new AggregationExpression() {
        @Override
        public Document toDocument(AggregationOperationContext context) {
            return new Document("$reduce",
                    new Document("input", "$data")
                     .append("initialValue", new Document("$arrayElemAt", Arrays.asList("$data", 0)))
                     .append("in", new Document("$cond", Arrays.asList(
                            new Document("$lt",  Arrays.asList(
                                new Document("$abs", new Document("$subtract", Arrays.asList("$$this.eventTime", "$$this.createdTime"))), 
                                new Document("$abs", new Document("$subtract", Arrays.asList("$$value.eventTime", "$$value.createdTime"))))), 
                            "$$this",
                            "$$value"))));
        }
    }).as("closest"),
    unwind("closest"),
    replaceRoot("closest")
);

AggregationResults<Event> events = mongoTemplate.aggregate(agg, mongoTemplate.getCollectionName(Event.class), Event.class);

for(Event ev: events) {
    System.out.println(ev);
}
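
Note that the Closest pipeline above measures distance between each record's own eventTime and its createdTime. The question also describes a variant where the reference point is a caller-supplied version timestamp. The plain-Java sketch below illustrates both readings in memory; it is an assumption-laden illustration, not the answer's method: `ClosestVersionSketch`, `Ev`, the `reference` parameter and the sample values are hypothetical. In the pipeline itself, the second reading would presumably amount to replacing `"$$this.eventTime"` and `"$$value.eventTime"` with the literal timestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

class ClosestVersionSketch {

    // Hypothetical, simplified stand-in for the Event class (eventDetails omitted).
    static final class Ev {
        final String city;
        final long eventTime;
        final long createdTime;

        Ev(String city, long eventTime, long createdTime) {
            this.city = city;
            this.eventTime = eventTime;
            this.createdTime = createdTime;
        }

        @Override
        public String toString() {
            return city + " eventTime=" + eventTime + " createdTime=" + createdTime;
        }
    }

    // Absolute distance between the reference point for this record and its createdTime.
    static long distance(Ev e, ToLongFunction<Ev> reference) {
        return Math.abs(reference.applyAsLong(e) - e.createdTime);
    }

    // For each (city, eventTime) pair, keep the record whose createdTime is closest
    // to the reference point.
    static List<Ev> closestPerCityAndEventTime(List<Ev> events, ToLongFunction<Ev> reference) {
        Map<String, Ev> closest = new LinkedHashMap<>();
        for (Ev e : events) {
            String key = e.city + "|" + e.eventTime;
            // Strict "<" keeps the earlier record on ties, like $lt in the $reduce stage.
            closest.merge(key, e, (best, cand) ->
                distance(cand, reference) < distance(best, reference) ? cand : best);
        }
        return new ArrayList<>(closest.values());
    }

    public static void main(String[] args) {
        // Made-up sample values: three versions of one London event.
        List<Ev> events = Arrays.asList(
            new Ev("London", 1582866000L, 1582900145L),
            new Ev("London", 1582866000L, 1582903745L),
            new Ev("London", 1582866000L, 1582870000L));

        // Reading 1: mirrors the pipeline above (distance from each record's eventTime).
        System.out.println(closestPerCityAndEventTime(events, e -> e.eventTime));

        // Reading 2: distance from a caller-supplied version timestamp (hypothetical value).
        long versionTimestamp = 1582903000L;
        System.out.println(closestPerCityAndEventTime(events, e -> versionTimestamp));
    }
}
```

Passing the reference point as a function keeps the grouping logic identical for both readings, so only the definition of "closest" changes.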
Valijon
  • I got to run this query, and I think I messed up in my explanation. There is a one-to-many relationship between eventTime and createdTime: each eventTime can have multiple createdTime values, because createdTime is being used as a record version. I want to return a list of Event objects where, for each eventTime in the range provided, it returns the max createdTime for its corresponding eventTime and the closest createdTime for each corresponding eventTime – Prady Mar 02 '20 at 15:22
  • @Prady Please post sample data with the expected result – Valijon Mar 02 '20 at 15:26
  • here is a link with sample data, expected output for MAX and for CLOSEST in each file, I've added the link in the question – Prady Mar 02 '20 at 15:47
  • @Prady checking your samples – Valijon Mar 02 '20 at 19:15
  • @Prady check if this solution meets your requirements https://mongoplayground.net/p/mnnOLV1ZOrf. We can split this query into 2 individual queries – Valijon Mar 02 '20 at 20:56
  • Yes, this is what I'm looking for, thank you so much! Can we split this into two queries and find a way to do it through Spring mongoTemplate? Also, I tried running this on NoSQL Manager and it gave me an "Unrecognized pipeline stage name: 'replaceWith'" error – Prady Mar 02 '20 at 21:10
  • You can change `$replaceWith` (since v4.2) to `$replaceRoot:{newRoot:"$max_events"}` – Valijon Mar 02 '20 at 21:12
  • @Prady Ready. Try to execute these queries – Valijon Mar 02 '20 at 23:36
  • Works well, thank you very much. You've been incredibly helpful! I don't know how long it would've taken me to figure this out myself. Also, if I want to limit the `eventTime` range, do I just add an `Aggregation.match(Criteria.where("eventTime").gte(startTime).lte(endTime))` to your query? – Prady Mar 03 '20 at 14:29
  • @Prady yes. You may add several conditions to match criteria – Valijon Mar 03 '20 at 16:08