4

We have a huge list that is updated daily.

We receive files several times a day. The same names recur, and each file carries a different timestamp (the exact format may vary, but it includes day, month, year, hours, minutes, and seconds), for example:

ABC_2013-07-25T00:00:00
BBC_2013-07-25T01:00:00
ABC_2013-07-25T02:00:00
BBC_2013-07-25T02:00:00
ABC_2013-07-26T00:00:00
BBC_2013-07-26T01:00:00
BBC_2013-07-26T02:00:00
and so on.

I want to use Java collections to get the file with the latest timestamp for each name on each day. For example:

For the 25th, the latest files are:
ABC_2013-07-25T02:00:00
BBC_2013-07-25T02:00:00

For the 26th, the latest files are:
ABC_2013-07-26T00:00:00
BBC_2013-07-26T02:00:00

How can we achieve this easily using Java collections?

One idea that comes to mind is to use Collections.sort() with a comparator and then find the maximum, but is there a more straightforward way in Java collections that does not require manual comparison?

fatherazrael
  • What about a `Map` with the day as the key and the file as the value? For each file `F` in the list, look up the corresponding file in the map based on `F`'s day, and if either none is found or `F`'s timestamp is later, store `F` in the map under `F`'s day. – Kevin Anderson Oct 31 '19 at 04:03
  • @KevinAnderson: The ArrayList is huge. Also, we get multiple file names in one day, so I am not sure whether we can put two entries with the same date in a map? – fatherazrael Oct 31 '19 at 04:16
  • This is a requirement which would be better handled from a database, not from Java. – Tim Biegeleisen Oct 31 '19 at 04:21
  • @TimBiegeleisen: Oops, I had not thought of GROUP BY ... HAVING. But even so, I have no choice: I have no DB, so I need to handle this scenario elegantly with collections only, as per the requirement. I need to think about it a bit more... a List of Lists, a List of Maps, or something like that. Thanks for the suggestion. – fatherazrael Oct 31 '19 at 04:25
  • Perhaps you should explain the mysterious format of your apparently embedded timestamps. I count differing numbers of digits, so I cannot even guess. – Basil Bourque Oct 31 '19 at 04:34
  • @BasilBourque: I have updated the time format. Sorry about that; my aim was just to show that a timestamp is present. I will try to keep my questions clearer in future. – fatherazrael Oct 31 '19 at 04:36
  • And do those values represent DDMMYYYY'T'HH:MM:SS? If so, say so, by editing your Question. – Basil Bourque Oct 31 '19 at 04:46
  • @BasilBourque: Please assume any date format that consists of day, month, year, hours, minutes, and seconds. An underscore separates the file name from the timestamp. Could you please suggest an approach? I have edited in another timestamp list for you. – fatherazrael Oct 31 '19 at 05:01
  • That’s an important point. Your examples are in the ISO 8601 format, which has the neat property that you can do operations like sorting or finding the maximum without the need to parse the strings. This doesn’t apply to other date/time formats. Besides that, the question is moot. When you have to find the max value a single time only (because the next time, you have different data), no data structure can optimize that. Searching for the max value unoptimized is a linear operation. Filling whatever data structure before looking up the max value can never be better than that. – Holger Oct 31 '19 at 10:27
  • FYI, colons are not allowed in file names in some file systems such as HFS+ on the Mac. Better to use the “basic” variation of ISO 8601 that minimizes the use of delimiters: YYYYMMDD’T’HHMMSSZ. – Basil Bourque Oct 31 '19 at 16:38
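
A minimal sketch of the Map approach suggested in the comments above, assuming every entry has the form NAME_yyyy-MM-ddTHH:mm:ss as in the question. Because that layout is ISO 8601, plain string comparison orders entries chronologically, so no date parsing is needed. The class and method names (`LatestPerDay`, `latestPerDay`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LatestPerDay {
    // For each (day, name) pair, keeps the entry with the greatest timestamp.
    // Assumption: entries look like NAME_yyyy-MM-ddTHH:mm:ss, so comparing the
    // strings compares the timestamps.
    static Map<String, String> latestPerDay(List<String> entries) {
        Map<String, String> latest = new TreeMap<>();
        for (String entry : entries) {
            int sep = entry.lastIndexOf('_');
            String name = entry.substring(0, sep);
            String day = entry.substring(sep + 1, sep + 11); // yyyy-MM-dd
            // merge() stores the entry if the key is new, otherwise keeps the later one.
            latest.merge(day + " " + name, entry, (a, b) -> a.compareTo(b) >= 0 ? a : b);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
                "ABC_2013-07-25T00:00:00", "BBC_2013-07-25T01:00:00",
                "ABC_2013-07-25T02:00:00", "BBC_2013-07-25T02:00:00",
                "ABC_2013-07-26T00:00:00", "BBC_2013-07-26T01:00:00",
                "BBC_2013-07-26T02:00:00");
        latestPerDay(entries).values().forEach(System.out::println);
    }
}
```

The TreeMap is used only so that the result prints in day order; a HashMap would do if order does not matter. Either way the list is scanned once.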

3 Answers

1

I suggest creating a HashMap. Use the leading part of the file name (ABC, BBC, etc.) as the key, and add each entry to a TreeSet that is initialised in descending order, so that the first element of each set is the latest one.

Sample:

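    // Needs java.io.* and java.util.*; the enclosing method must handle IOException.
    // Fileuri is assumed to hold the path of a text file with one entry per line.
    Map<String, TreeSet<String>> map = new HashMap<>();
    try (BufferedReader br = new BufferedReader(new FileReader(new File(Fileuri)))) {
        String line;
        while ((line = br.readLine()) != null) {
            // Key is the name prefix before the underscore, e.g. "ABC".
            String key = line.split("_")[0];
            // Reverse order, so first() returns the latest entry for that prefix.
            map.computeIfAbsent(key, x -> new TreeSet<>(Comparator.reverseOrder())).add(line);
        }
        System.out.println(map);
    }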
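
Note that the snippet above groups by name prefix only, so `first()` of each set is the latest entry overall rather than the latest per day. A hedged variation on the same Map-of-TreeSet idea keys on prefix plus date instead. It assumes the NAME_yyyy-MM-ddTHH:mm:ss layout from the question and reads from an in-memory list rather than a file; the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class LatestPerDayTreeSet {
    // Groups entries under "NAME_yyyy-MM-dd"; each group is sorted newest first.
    static Map<String, TreeSet<String>> groupByNameAndDay(List<String> lines) {
        Map<String, TreeSet<String>> map = new TreeMap<>();
        for (String line : lines) {
            // Assumption: the last 'T' separates date and time,
            // so "ABC_2013-07-25T02:00:00" yields the key "ABC_2013-07-25".
            String key = line.substring(0, line.lastIndexOf('T'));
            map.computeIfAbsent(key, k -> new TreeSet<String>(Comparator.reverseOrder()))
               .add(line);
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ABC_2013-07-25T00:00:00", "BBC_2013-07-25T01:00:00",
                "ABC_2013-07-25T02:00:00", "BBC_2013-07-25T02:00:00",
                "ABC_2013-07-26T00:00:00", "BBC_2013-07-26T01:00:00",
                "BBC_2013-07-26T02:00:00");
        // first() of a reverse-ordered set is the latest entry in that group.
        groupByNameAndDay(lines).forEach((key, set) -> System.out.println(set.first()));
    }
}
```

Keeping every entry in a TreeSet costs more memory than tracking only the current maximum, so this shape is mainly worthwhile when the older entries are needed too.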
0

If your data contains no duplicates (or it is acceptable to drop duplicates), you can try a TreeSet. It implements SortedSet and keeps the collection sorted as elements are inserted. You should also consider performance if the data volume is large, since each insertion costs O(log n).
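
As a rough sketch of how a single TreeSet could answer the question, the example below stores the raw entries in natural (string) order and uses `floor()` to find the last entry at or before the end of a given day. It assumes the NAME_yyyy-MM-ddTHH:mm:ss layout from the question, ignores leap seconds, and uses a made-up helper name (`latest`).

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class TreeSetLookup {
    // Returns the latest entry for the given name and day (yyyy-MM-dd),
    // or null if the set holds no entry for that combination.
    static String latest(NavigableSet<String> sorted, String name, String day) {
        String prefix = name + "_" + day;
        // floor() gives the greatest element <= the argument.
        String candidate = sorted.floor(prefix + "T23:59:59");
        return candidate != null && candidate.startsWith(prefix) ? candidate : null;
    }

    public static void main(String[] args) {
        NavigableSet<String> sorted = new TreeSet<>(Arrays.asList(
                "ABC_2013-07-25T00:00:00", "BBC_2013-07-25T01:00:00",
                "ABC_2013-07-25T02:00:00", "BBC_2013-07-25T02:00:00",
                "ABC_2013-07-26T00:00:00", "BBC_2013-07-26T01:00:00",
                "BBC_2013-07-26T02:00:00"));
        System.out.println(latest(sorted, "ABC", "2013-07-25"));
        System.out.println(latest(sorted, "BBC", "2013-07-26"));
    }
}
```

Building the set is O(n log n), so this pays off only when the same data is queried many times; for a one-off pass, a linear scan is cheaper.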

Trong Hoang
  • As mentioned in the question, the list would be huge. Thanks for the response. Could you please show how to use a TreeSet in the above scenario? – fatherazrael Oct 31 '19 at 08:39
  • First you will have to convert your data into a type (class) that implements `Comparable` (or supply a `Comparator`), with `equals` and `hashCode` consistent with that ordering. Then you just initialize a new TreeSet, `SortedSet<T> tree = new TreeSet<>()`, and add the data to it. Since it sorts the collection at insertion, you are free to call `first()` and `last()` to get the lowest and highest values in the set. – Trong Hoang Oct 31 '19 at 08:48
0

You can use a TreeSet for this with the steps below.

  1. Create a class that contains the file name and the timestamp.
  2. Implement the equals and hashCode methods in that class.
  3. Create a comparator to order instances of that class.
  4. Create an instance of TreeSet and pass it the comparator from the previous step.
  5. Use tailSet() and headSet() to get the portion of the set above a lower bound or below an upper bound.
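
The steps above can be sketched roughly as follows. `FileEntry`, `parseAll`, and `latest` are hypothetical names, and the parsing assumes the NAME_yyyy-MM-ddTHH:mm:ss layout from the question, which `LocalDateTime.parse` accepts.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.SortedSet;
import java.util.TreeSet;

class FileEntrySketch {
    // Step 1: value class holding the file name and its timestamp.
    static final class FileEntry {
        final String name;
        final LocalDateTime timestamp;

        FileEntry(String name, LocalDateTime timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }

        static FileEntry parse(String raw) {
            int sep = raw.lastIndexOf('_');
            return new FileEntry(raw.substring(0, sep), LocalDateTime.parse(raw.substring(sep + 1)));
        }

        // Step 2: equals and hashCode over both fields.
        @Override public boolean equals(Object o) {
            if (!(o instanceof FileEntry)) return false;
            FileEntry other = (FileEntry) o;
            return name.equals(other.name) && timestamp.equals(other.timestamp);
        }

        @Override public int hashCode() {
            return Objects.hash(name, timestamp);
        }
    }

    // Step 3: order by name first, then by timestamp.
    static final Comparator<FileEntry> BY_NAME_THEN_TIME =
            Comparator.comparing((FileEntry e) -> e.name).thenComparing(e -> e.timestamp);

    // Step 4: a TreeSet built with that comparator.
    static SortedSet<FileEntry> parseAll(List<String> raws) {
        SortedSet<FileEntry> set = new TreeSet<>(BY_NAME_THEN_TIME);
        for (String raw : raws) {
            set.add(FileEntry.parse(raw));
        }
        return set;
    }

    // Step 5: tailSet(from) keeps entries >= from, headSet(to) keeps entries < to,
    // so the view holds exactly one name on one day; last() is its latest entry.
    static FileEntry latest(SortedSet<FileEntry> set, String name, LocalDate day) {
        FileEntry from = new FileEntry(name, day.atStartOfDay());
        FileEntry to = new FileEntry(name, day.plusDays(1).atStartOfDay());
        SortedSet<FileEntry> sameDay = set.tailSet(from).headSet(to);
        return sameDay.isEmpty() ? null : sameDay.last();
    }
}
```

For example, `latest(set, "ABC", LocalDate.of(2013, 7, 25))` on the question's sample data returns the ABC entry stamped 02:00 on that day.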
Ashok Prajapati