
I'm working on a tool to count archived files from another program. To do that, I'm using a DirectoryStream and filtering out subdirectories and some files with a simple if-clause (shown below).

For the statistics, I would like to know how many documents were created per hour on average.

I'm not very experienced in working with files and directories, but I guess there is some sort of "getLastModified" method, so I could get the time range from the oldest to the youngest file and then calculate the average number of docs per hour?

Jason Aller
T_Ix
  • As you are counting the files that do not match the if condition, is there a reason you did not invert the if condition? That way you can move the count into the if block and remove the then-empty else block. – Leon Apr 05 '17 at 13:22
  • I just wrote it down like I was thinking ... but good point! thank you – T_Ix Apr 05 '17 at 13:38

1 Answer


Well, a File has a lastModified() method that returns the timestamp of the last modification in milliseconds since the epoch. It returns 0 if the file does not exist or an I/O error occurred. To convert a Path to a File, you can use the toFile() method. With that, it is fairly easy to calculate the files-per-hour average:

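import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

long minTimestamp = Long.MAX_VALUE; // definitely greater than any timestamp you will ever find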
long maxTimestamp = 0;
int count = 0;

try (DirectoryStream<Path> directoryStream = Files.newDirectoryStream(Paths.get("DIRECTORY PATH"))) {
    for(Path path: directoryStream) {
        if (!(Files.isDirectory(path) || path.toString().endsWith("\\databaseinfo.xml") || path.toString().endsWith(".log"))) {
            long lastModified = path.toFile().lastModified();
            if (lastModified > 0L) { // check that no error occurred
                if (lastModified < minTimestamp) minTimestamp = lastModified; // new minimum
                if (maxTimestamp < lastModified) maxTimestamp = lastModified; // new maximum
            }
            count = count + 1;
        }
    }

} catch (IOException e) {
    e.printStackTrace();
}
System.out.println(count);
double filesPerHour = 0;
if (maxTimestamp != minTimestamp) { // avoid division by 0
    filesPerHour = (double) count * 60 * 60 * 1000 / (maxTimestamp - minTimestamp); // 60 * 60 * 1000 = milliseconds in one hour
}
System.out.println(filesPerHour);

Edit: Inverted the if condition to avoid an empty if block with all the code in the else block.
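
If you would rather stay in java.nio instead of converting to File, a minimal sketch along the same lines could look like this. `Files.getLastModifiedTime` throws an IOException on failure instead of returning 0, so the "> 0" check is not needed. The class name `ArchiveStats` and the helper names are made up for illustration, and matching `databaseinfo.xml` by its exact file name (rather than by the end of the full path string) is my assumption about what the filter is meant to do.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the original code.
final class ArchiveStats {

    static final double MILLIS_PER_HOUR = 60 * 60 * 1000;

    /** Average files per hour between two timestamps; 0 if the span is empty. */
    static double filesPerHour(long count, long minMillis, long maxMillis) {
        long spanMillis = maxMillis - minMillis;
        if (spanMillis <= 0) {
            return 0.0; // avoid division by 0
        }
        return count * MILLIS_PER_HOUR / spanMillis;
    }

    /** Assumed filter: skip directories, databaseinfo.xml and *.log files. */
    static boolean isCounted(Path path) {
        String name = path.getFileName().toString();
        return !Files.isDirectory(path)
                && !name.equals("databaseinfo.xml")
                && !name.endsWith(".log");
    }

    /** Scans one directory (not recursive) and returns the files/hour average. */
    static double filesPerHour(Path dir) throws IOException {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        long count = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path path : stream) {
                if (!isCounted(path)) {
                    continue;
                }
                long modified = Files.getLastModifiedTime(path).toMillis();
                min = Math.min(min, modified);
                max = Math.max(max, modified);
                count++;
            }
        }
        // With no counted files, min and max are still the sentinels, so bail out early.
        return count == 0 ? 0.0 : filesPerHour(count, min, max);
    }
}
```

You would call it with something like `ArchiveStats.filesPerHour(Paths.get("DIRECTORY PATH"))`. Comparing the file name only also keeps the filter independent of the path separator.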

Leon
  • I guess this one comes close to solving my problem. The "filesPerHour = (double) ... " assignment is missing in the last if-statement. – T_Ix Apr 05 '17 at 13:51
  • The output is now -> 8.730948492447396, and this won't fit my situation, as mentioned in my comment to freedevs. I got around 1483 documents created on the same day... The first document was created at 09:35 and the last at 11:27. The current result would fit files/minute, because several files were created within the same minute. – T_Ix Apr 05 '17 at 13:54
  • Can you please check if `minTimestamp`, `maxTimestamp` and `count` are correct or just post them? Also, 1483 as file count is without counting `.log` files and the `databaseinfo.xml`? – Leon Apr 05 '17 at 13:59
  • I just printed to console .... `System.out.println(minTimestamp/1000/60/60); System.out.println(maxTimestamp/1000/60/60); System.out.println(maxTimestamp/1000/60/60 - minTimestamp/1000/60/60);` – T_Ix Apr 05 '17 at 14:09
  • 407335 407505 170 – T_Ix Apr 05 '17 at 14:09
  • This makes sense: 170 hours are 7 days and 2 hours (2 hours is the period the files were created in), hence we are looking at files created over 8 days. I suspect the count is 1483. So we have 1483 files created in 170 hours, meaning an average of 1483/170 (which is ~8.72352941176) files per hour, which closely matches the result you get. – Leon Apr 05 '17 at 14:17
  • :D yes, you're right... I just looked at my test files created on March 20th, and just one file was from March 27th. ... Sorry! – T_Ix Apr 05 '17 at 14:34
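
As the comments show, a single late file stretches the min/max range and drags the average down. One possible way around that, which is a different metric from the one in the answer, is to divide by the number of hours in which at least one file was actually written. The sketch below assumes you have already collected the lastModified values into a list; `ActiveHourStats` and `filesPerActiveHour` are hypothetical names, and hours are bucketed in UTC clock hours.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: averages over hours that contain at least one file.
final class ActiveHourStats {

    static final long MILLIS_PER_HOUR = 60L * 60L * 1000L;

    static double filesPerActiveHour(List<Long> timestamps) {
        Set<Long> activeHours = new HashSet<>();
        for (long millis : timestamps) {
            // Same integer division as the debug output in the comments: one bucket per hour.
            activeHours.add(Math.floorDiv(millis, MILLIS_PER_HOUR));
        }
        if (activeHours.isEmpty()) {
            return 0.0; // no files, nothing to average
        }
        return (double) timestamps.size() / activeHours.size();
    }
}
```

With this metric, a lone file a week later adds just one more hour bucket instead of roughly 168 empty hours. Whether that is the number you want depends on what "per hour on average" should mean for your statistics.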