Problem: I have a very large number of records, each marked with a timestamp. I loop through all of them to do some processing, and I need to detect when the day has changed.

Right now, on each iteration I'm doing:

    cal.setTimeInMillis(record.time);
    int currentDay = cal.get(Calendar.DAY_OF_WEEK);

Is this as slow as I imagine it is when it's running hundreds of thousands of times? I imagine I'm missing a really simple modulo answer or something.

Edit: Time zone does not matter. The information I'm collecting revolves around a consumable report for someone. "24 hours per report" is the more accurate description, so realistically I don't have to worry about whether that's 5am - 5am or 3pm - 3pm, just that I gather 24 hours' worth of info.

Thanks all

macmeyers50
  • Well, how exact do you want this to behave, and how should it deal with leap years, leap seconds, etc.? A date library takes care of all that, but it will certainly take a few nanoseconds to do so. Have you actually measured the performance in a meaningful way? Is it actually slow? – luk2302 Jan 26 '21 at 17:08
  • It's probably more efficient to convert the end of the day to a timestamp and compare that with the `record.time` – timsmelik Jan 26 '21 at 17:09
  • https://ideone.com/23p0Y6 1M iterations took 0.76s. Doesn't seem slow. I'd be more concerned about the correctness of it (e.g. are you calculating the day in the correct timezone). – Andy Turner Jan 26 '21 at 17:11
  • Added an edit, I don't actually have to worry about time zone really, I just want to split into 24 hour chunks. – macmeyers50 Jan 26 '21 at 17:25
  • What are the limits of your 24-hour chunks? Do you mean to start with the current moment and go forward/backward in 24-hour increments from there? You need to think this through more clearly. Saying “when the day has changed” while also saying “split into 24-hour chunks” is a contradiction. Voting to close as unclear. – Basil Bourque Jan 26 '21 at 18:12
  • I was unclear. I get the records back in order (newest to oldest). The 24 hour chunks could start from the first record, it would not be a big deal to dispose of the last chunk that wouldn't be a perfect 24 hours. – macmeyers50 Jan 26 '21 at 18:32
  • It seems that you are using `Calendar`. I recommend you don’t. That class is poorly designed and long outdated. Instead use `Instant` and other classes from [java.time, the modern Java date and time API](https://docs.oracle.com/javase/tutorial/datetime/). – Ole V.V. Jan 27 '21 at 03:50

1 Answer

After Andy Turner’s time test I am not convinced that you need an optimized solution at all. In any case, timsmelik’s suggestion is pretty straightforward: convert the time when the day changes to a count of milliseconds since the epoch, so that you only need to compare long values. I don’t find that it hurts readability very badly. So here it is in code. I am using and warmly recommending java.time, the modern Java date and time API, if only for the conversion from hours to milliseconds and for printing the results. Even when such a conversion seems trivial, it’s best to leave it to the standard library. The code is more self-explanatory and less error-prone, and it’s easier for readers to convince themselves that it’s correct.

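    // Imports needed: java.time.Duration, java.time.Instant,
    // java.util.ArrayList, java.util.List, java.util.stream.Collectors
    final long twentyfourHoursAsMillis = Duration.ofHours(24).toMillis();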
    
    // Times are already sorted descending (from newest to oldest)
    long[] times = { 1_611_718_370_000L, 1_611_632_000_000L,
                     1_611_631_970_000L, 1_611_459_150_000L };
    List<List<Long>> chunks = new ArrayList<>();
    List<Long> currentChunk  = new ArrayList<>();
    
    // Process first time separately to get started
    currentChunk.add(times[0]);
    long timeOfNextChunk = times[0] - twentyfourHoursAsMillis;
    // Process remaining times
    for (int i = 1; i < times.length; i++) {
        long currentTime = times[i];
        if (currentTime <= timeOfNextChunk) {
            chunks.add(currentChunk);
            currentChunk = new ArrayList<>();
            do {
                timeOfNextChunk -= twentyfourHoursAsMillis;
            } while (currentTime <= timeOfNextChunk);
        }
        
        currentChunk.add(currentTime);
    }
    // Save the last chunk too; its 24-hour window may be incomplete, so discard it if you prefer
    chunks.add(currentChunk);
    
    // Print result
    for (List<Long> chunk : chunks) {
        String chunkAsString = chunk.stream()
                .map(Instant::ofEpochMilli)
                .map(Instant::toString)
                .collect(Collectors.joining(", "));
        System.out.println(chunkAsString);
    }

Output is:

2021-01-27T03:32:50Z, 2021-01-26T03:33:20Z
2021-01-26T03:32:50Z
2021-01-24T03:32:30Z

I am printing Instant objects. They always print in UTC. For your situation you may want to do otherwise if you need to print the times at all.
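
If you do want the times printed in some local time zone, one possible sketch is to convert each `Instant` to a `ZonedDateTime` before formatting. The class name `ZonedPrinting`, the method `formatInZone` and the zone `America/New_York` are assumptions made up for illustration; substitute whatever zone suits your report.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

class ZonedPrinting {
    // Formats an epoch-millisecond timestamp using the UTC offset in effect in the given zone.
    static String formatInZone(long epochMilli, ZoneId zone) {
        return Instant.ofEpochMilli(epochMilli)
                .atZone(zone)
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        // The zone is an assumption for illustration only
        ZoneId zone = ZoneId.of("America/New_York");
        System.out.println(formatInZone(1_611_718_370_000L, zone));
    }
}
```

The formatter includes the offset in the output, so a reader can still tell which moment is meant.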

You should add a check of your assumption that the times come in sorted order.
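
Such a check could look roughly like the following. This is only a sketch: the class name `SortedCheck` and the method `isSortedDescending` are hypothetical, and I am assuming that equal neighbouring timestamps are acceptable.

```java
class SortedCheck {
    // Returns true if no element is greater than the element before it,
    // that is, if the times run from newest to oldest (ties allowed).
    static boolean isSortedDescending(long[] times) {
        for (int i = 1; i < times.length; i++) {
            if (times[i] > times[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] times = { 1_611_718_370_000L, 1_611_632_000_000L,
                         1_611_631_970_000L, 1_611_459_150_000L };
        if (!isSortedDescending(times)) {
            throw new IllegalArgumentException("Times must be sorted from newest to oldest");
        }
        System.out.println("Times are sorted descending");
    }
}
```

The check is a single pass over the array, so it costs little compared to the chunking itself. You could also fold the comparison into the main loop instead of making a separate pass.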

I have taken your word for it and broken the times into chunks of exactly 24 hours. Note that 24 hours may not even mean 5am - 5am. It could mean, for instance, from 5 AM EST on March 13 to 6 AM EDT on March 14, because summer time (DST) has begun in the meantime. If you prefer to split at the same clock hour, the code can be modified to do that.
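
As a rough sketch of that modification: instead of subtracting a fixed number of milliseconds, compute each boundary by subtracting whole calendar days from the first time in a chosen time zone. The names `ClockHourChunks` and `chunkByClockHour` are made up for illustration, and the zone is something you would have to pick. Like the code above, this assumes that the times are sorted from newest to oldest.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.ArrayList;
import java.util.List;

class ClockHourChunks {
    // Splits times (sorted newest to oldest) into chunks whose boundaries fall
    // on the same wall-clock time as the first record, in the given zone.
    static List<List<Long>> chunkByClockHour(long[] times, ZoneId zone) {
        List<List<Long>> chunks = new ArrayList<>();
        if (times.length == 0) {
            return chunks;
        }
        ZonedDateTime start = Instant.ofEpochMilli(times[0]).atZone(zone);
        // Always subtract from the original start so that an adjustment made
        // for a DST gap on one day does not carry over to earlier days.
        int daysBack = 1;
        long timeOfNextChunk = start.minusDays(daysBack).toInstant().toEpochMilli();

        List<Long> currentChunk = new ArrayList<>();
        currentChunk.add(times[0]);
        for (int i = 1; i < times.length; i++) {
            long currentTime = times[i];
            if (currentTime <= timeOfNextChunk) {
                chunks.add(currentChunk);
                currentChunk = new ArrayList<>();
                do {
                    daysBack++;
                    timeOfNextChunk = start.minusDays(daysBack).toInstant().toEpochMilli();
                } while (currentTime <= timeOfNextChunk);
            }
            currentChunk.add(currentTime);
        }
        chunks.add(currentChunk);
        return chunks;
    }
}
```

`ZonedDateTime.minusDays` keeps the local time of day where possible, so a chunk spanning a DST transition covers 23 or 25 hours rather than 24. The comparison inside the loop is still between plain `long` values, so the cost per record should be about the same as in the 24-hour version.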

Ole V.V.