I want to collect statistics on the user count of my website: the current user count and the user count for each of the previous 10 days. Here is a test I ran:

RrdDef rrdDef = new RrdDef("test", 60*60*24);
rrdDef.setStartTime(Util.getTime() - 60*60*24*11);
rrdDef.addDatasource(Stats.USER_COUNT.name(), DsType.GAUGE, 60*60*24, 0.0, Double.NaN);
rrdDef.addArchive(ConsolFun.LAST, 0.5, 1, 10);
RrdDb rrdDb = new RrdDb(rrdDef);
rrdDb.close();

rrdDb = new RrdDb("test");
Calendar cal = Calendar.getInstance();
cal.add(Calendar.DATE, -7);
rrdDb.createSample().setAndUpdate(String.format("%d:%d", (Util.getTimestamp(cal)), 1));
cal.add(Calendar.DATE, 1);
rrdDb.createSample().setAndUpdate(String.format("%d:%d", (Util.getTimestamp(cal)), 2));
cal.add(Calendar.DATE, 1);
rrdDb.createSample().setAndUpdate(String.format("%d:%d", (Util.getTimestamp(cal)), 3));
cal.add(Calendar.DATE, 1);
rrdDb.createSample().setAndUpdate(String.format("%d:%d", (Util.getTimestamp(cal)), 4));
cal.add(Calendar.DATE, 1);
rrdDb.createSample().setAndUpdate(String.format("%d:%d", (Util.getTimestamp(cal)), 5));
cal.add(Calendar.DATE, 1);
rrdDb.createSample().setAndUpdate(String.format("%d:%d", (Util.getTimestamp(cal)), 6));
cal.add(Calendar.DATE, 1);
rrdDb.createSample().setAndUpdate(String.format("%d:%d", (Util.getTimestamp(cal)), 7));
rrdDb.close();

rrdDb = new RrdDb("test");
FetchRequest fetchRequest = rrdDb.createFetchRequest(ConsolFun.LAST, Util.getTime() - 60*60*24*7, Util.getTime());
FetchData fetchData = fetchRequest.fetchData();
System.out.println(fetchData.dump());
rrdDb.close();

Here is the output:

1404345600:  NaN  
1404432000:  +2.0000000000E00  
1404518400:  +2.4654861111E00  
1404604800:  +3.4654861111E00  
1404691200:  +4.4654861111E00  
1404777600:  +5.4654861111E00  
1404864000:  +6.4654861111E00  
1404950400:  NaN  
1405036800:  NaN  

Here is what I was expecting:

1404345600:  NaN  
1404432000:  +1.0000000000E00  
1404518400:  +2.0000000000E00  
1404604800:  +3.0000000000E00  
1404691200:  +4.0000000000E00 
1404777600:  +5.0000000000E00 
1404864000:  +6.0000000000E00 
1404950400:  +7.0000000000E00 
1405036800:  NaN  

Where am I wrong?

Jérôme Herry

1 Answer

You are falling afoul of Data Normalisation.

A LAST-type RRA holds the value of the last PDP (primary data point) among those that make up each CDP (consolidated data point), but you have overlooked two things.

First, since your RRA is defined so that 1 CDP = 1 PDP (the steps argument of addArchive is 1), no consolidation takes place at all: LAST, MAX, MIN and AVERAGE all return the same value when given a single PDP to consolidate.

Second, your incoming data do not arrive on an interval boundary, so they are normalised to fit the boundaries.
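
The fractional values in the question's output can be reproduced by hand. The following is a minimal sketch, not rrd4j's actual code: it assumes that a GAUGE value applies to the time since the previous update and that each one-day PDP is the time-weighted average of the values covering it. The class and method names are hypothetical, and the 46182-second offset (12:49:42 after midnight UTC) is inferred from the output above rather than stated in the question.

```java
import java.util.Locale;

public class NormalisationSketch {
    // One-day step in seconds, as in the question's RrdDef.
    static final long STEP = 60 * 60 * 24;

    /**
     * Time-weighted average for one step when an update lands
     * offsetSeconds after the start of that step.
     * valueUntilUpdate covers the part of the step before the update,
     * valueAfterUpdate covers the rest (it is supplied by the next update).
     */
    static double normalise(double valueUntilUpdate, double valueAfterUpdate, long offsetSeconds) {
        double fraction = (double) offsetSeconds / STEP;
        return valueUntilUpdate * fraction + valueAfterUpdate * (1.0 - fraction);
    }

    public static void main(String[] args) {
        // Assumed offset of each update into the UTC day (inferred from the output).
        long offset = 46182;
        // The step that contains the update carrying the value 2.
        System.out.println(String.format(Locale.ROOT, "%.10f", normalise(2.0, 3.0, offset)));
        // prints 2.4654861111

        // An update that lands exactly on the boundary leaves the value whole.
        System.out.println(String.format(Locale.ROOT, "%.10f", normalise(2.0, 3.0, 0)));
        // prints 3.0000000000
    }
}
```

Under these assumptions the result matches the 1404518400 row of the output, and the same arithmetic with the pairs (3, 4), (4, 5) and so on gives the remaining rows.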

The internal intervals in RRDtool are always aligned to multiples of the step counted from the Unix epoch, which for a one-day step means midnight GMT (UTC). This matters little when your interval is measured in seconds or minutes, but yours is a whole day. You use the Calendar object to take 'now' as your base date/time and then increment it a day at a time. That base time is not on an interval boundary, so each value ends up split between two adjacent intervals, which causes the fractional values you see when fetching the data. Your time zone is also probably not GMT, so your local midnight is not GMT's midnight.
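
One way to avoid the split is to round each sample's timestamp down to a multiple of the step before updating. The sketch below uses plain arithmetic on epoch seconds and only prints the "timestamp:value" strings that the question passes to setAndUpdate; the class and method names are hypothetical. rrd4j's Util class appears to offer a normalize(timestamp, step) helper for the same purpose, but check that against the version you use.

```java
public class AlignToStep {
    // One-day step in seconds, as in the question's RrdDef.
    static final long STEP = 60 * 60 * 24;

    /** Rounds an epoch timestamp (in seconds) down to the previous step boundary. */
    static long alignToStep(long epochSeconds) {
        return epochSeconds - (epochSeconds % STEP);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L;
        // Start seven whole steps before the most recent UTC midnight.
        long base = alignToStep(now) - 7 * STEP;
        for (int i = 0; i < 7; i++) {
            long timestamp = base + i * STEP;
            // Same "timestamp:value" format as in the question.
            System.out.println(String.format("%d:%d", timestamp, i + 1));
        }
    }
}
```

With every timestamp on a boundary, each value fills exactly one interval and should come back as a whole number. The first sample may still read as NaN, since there is no earlier update within the heartbeat for it to pair with.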

See Alex van den Bogaerdt's tutorial on Data Normalisation for more technical details on how this works.

Steve Shipway