I'm using an RRD to monitor a data source. The RRD frequently stores NaN even though we know data was received, because we also append every received value to a file for testing. When we compare the two, we see the following:
I tried to paste the data as two columns, but the formatting did not survive. In essence, each line below holds two columns of a spreadsheet: the left column is the rrd dump row and the right column is the actual data (as timestamp:value) that arrived at that time.
" <!-- 2017-09-28 06:00:00 UTC / 1506578400 --> <row><v>1.1999200000e+06</v></row>" 1506578412:1202000
" <!-- 2017-09-28 06:05:00 UTC / 1506578700 --> <row><v>1.2538400000e+06</v></row>" 1506578712:1256000
" <!-- 2017-09-28 06:10:00 UTC / 1506579000 --> <row><v>1.2310400000e+06</v></row>" 1506579012:1230000
" <!-- 2017-09-28 06:15:00 UTC / 1506579300 --> <row><v>1.2415200000e+06</v></row>" 1506579312:1242000
" <!-- 2017-09-28 06:20:00 UTC / 1506579600 --> <row><v>1.2304800000e+06</v></row>" 1506579612:1230000
" <!-- 2017-09-28 06:25:00 UTC / 1506579900 --> <row><v>1.2357600000e+06</v></row>" 1506579912:1236000
" <!-- 2017-09-28 06:30:00 UTC / 1506580200 --> <row><v>1.1284800000e+06</v></row>" 1506580212:1124000
" <!-- 2017-09-28 06:35:00 UTC / 1506580500 --> <row><v>1.2238400000e+06</v></row>" 1506580512:1228000
" <!-- 2017-09-28 06:40:00 UTC / 1506580800 --> <row><v>NaN</v></row>" 1506580813:1222000
" <!-- 2017-09-28 06:45:00 UTC / 1506581100 --> <row><v>1.2400000000e+06</v></row>" 1506581112:1240000
" <!-- 2017-09-28 06:50:00 UTC / 1506581400 --> <row><v>1.2284800000e+06</v></row>" 1506581412:1228000
" <!-- 2017-09-28 06:55:00 UTC / 1506581700 --> <row><v>8.9392000000e+05</v></row>" 1506581712:880000
" <!-- 2017-09-28 07:00:00 UTC / 1506582000 --> <row><v>NaN</v></row>" 1506582014:1000000
" <!-- 2017-09-28 07:05:00 UTC / 1506582300 --> <row><v>NaN</v></row>" 1506582315:738000
" <!-- 2017-09-28 07:10:00 UTC / 1506582600 --> <row><v>1.1760000000e+06</v></row>" 1506582613:1176000
" <!-- 2017-09-28 07:15:00 UTC / 1506582900 --> <row><v>1.1874800000e+06</v></row>" 1506582912:1188000
" <!-- 2017-09-28 07:20:00 UTC / 1506583200 --> <row><v>1.2033600000e+06</v></row>" 1506583212:1204000
" <!-- 2017-09-28 07:25:00 UTC / 1506583500 --> <row><v>1.2097600000e+06</v></row>" 1506583512:1210000
" <!-- 2017-09-28 07:30:00 UTC / 1506583800 --> <row><v>1.0717600000e+06</v></row>" 1506583811:1066000
" <!-- 2017-09-28 07:35:00 UTC / 1506584100 --> <row><v>NaN</v></row>" 1506584112:1222000
" <!-- 2017-09-28 07:40:00 UTC / 1506584400 --> <row><v>1.1760000000e+06</v></row>" 1506584412:1176000
" <!-- 2017-09-28 07:45:00 UTC / 1506584700 --> <row><v>1.2048000000e+06</v></row>" 1506584712:1206000
" <!-- 2017-09-28 07:50:00 UTC / 1506585000 --> <row><v>1.0255200000e+06</v></row>" 1506585012:1018000
" <!-- 2017-09-28 07:55:00 UTC / 1506585300 --> <row><v>1.2004000000e+06</v></row>" 1506585312:1208000
" <!-- 2017-09-28 08:00:00 UTC / 1506585600 --> <row><v>1.1676800000e+06</v></row>" 1506585612:1166000
" <!-- 2017-09-28 08:05:00 UTC / 1506585900 --> <row><v>1.2024800000e+06</v></row>" 1506585912:1204000
" <!-- 2017-09-28 08:10:00 UTC / 1506586200 --> <row><v>1.2116800000e+06</v></row>" 1506586212:1212000
" <!-- 2017-09-28 08:15:00 UTC / 1506586500 --> <row><v>NaN</v></row>" 1506586513:886000
" <!-- 2017-09-28 08:20:00 UTC / 1506586800 --> <row><v>1.1940000000e+06</v></row>" 1506586812:1194000
" <!-- 2017-09-28 08:25:00 UTC / 1506587100 --> <row><v>1.1959200000e+06</v></row>" 1506587112:1196000
" <!-- 2017-09-28 08:30:00 UTC / 1506587400 --> <row><v>NaN</v></row>" 1506587413:1206000
" <!-- 2017-09-28 08:35:00 UTC / 1506587700 --> <row><v>1.1440000000e+06</v></row>" 1506587712:1144000
" <!-- 2017-09-28 08:40:00 UTC / 1506588000 --> <row><v>NaN</v></row>" 1506588013:668000
" <!-- 2017-09-28 08:45:00 UTC / 1506588300 --> <row><v>1.2080000000e+06</v></row>" 1506588312:1208000
" <!-- 2017-09-28 08:50:00 UTC / 1506588600 --> <row><v>NaN</v></row>" 1506588613:1156000
" <!-- 2017-09-28 08:55:00 UTC / 1506588900 --> <row><v>1.2080000000e+06</v></row>" 1506588912:1208000
" <!-- 2017-09-28 09:00:00 UTC / 1506589200 --> <row><v>1.1945600000e+06</v></row>" 1506589212:1194000
" <!-- 2017-09-28 09:05:00 UTC / 1506589500 --> <row><v>1.1786400000e+06</v></row>" 1506589512:1178000
" <!-- 2017-09-28 09:10:00 UTC / 1506589800 --> <row><v>1.1396000000e+06</v></row>" 1506589811:1138000
" <!-- 2017-09-28 09:15:00 UTC / 1506590100 --> <row><v>NaN</v></row>" 1506590113:1006000
" <!-- 2017-09-28 09:20:00 UTC / 1506590400 --> <row><v>1.1780000000e+06</v></row>" 1506590412:1178000
" <!-- 2017-09-28 09:25:00 UTC / 1506590700 --> <row><v>1.1799200000e+06</v></row>" 1506590712:1180000
" <!-- 2017-09-28 09:30:00 UTC / 1506591000 --> <row><v>1.1953600000e+06</v></row>" 1506591012:1196000
" <!-- 2017-09-28 09:35:00 UTC / 1506591300 --> <row><v>1.1806400000e+06</v></row>" 1506591312:1180000
" <!-- 2017-09-28 09:40:00 UTC / 1506591600 --> <row><v>1.1588800000e+06</v></row>" 1506591612:1158000
" <!-- 2017-09-28 09:45:00 UTC / 1506591900 --> <row><v>1.2002400000e+06</v></row>" 1506591912:1202000
" <!-- 2017-09-28 09:50:00 UTC / 1506592200 --> <row><v>1.0656800000e+06</v></row>" 1506592212:1060000
" <!-- 2017-09-28 09:55:00 UTC / 1506592500 --> <row><v>1.2078400000e+06</v></row>" 1506592512:1214000
" <!-- 2017-09-28 10:00:00 UTC / 1506592800 --> <row><v>1.1640800000e+06</v></row>" 1506592812:1162000
" <!-- 2017-09-28 10:05:00 UTC / 1506593100 --> <row><v>1.1754400000e+06</v></row>" 1506593112:1176000
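The updates that seem not to be accepted are almost always the ones that arrive slightly off the usual timing: most updates land 12 seconds past the five-minute boundary, while the ones next to a NaN row typically land 13 to 15 seconds past it.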
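To make the timing pattern concrete, here is a minimal Python sketch that measures the gap between consecutive logged updates and lists those that exceed a threshold. The excerpt is copied from the right-hand column above (06:30 to 07:10). The 300 second threshold is taken from the minimal_heartbeat value in the rrdtool info output below. Treating that value as the relevant cutoff is our assumption, not something we have confirmed, and the names `LOGGED`, `HEARTBEAT` and `late_updates` are just illustrative.

```python
# Logged updates copied from the listing above, in "timestamp:value" form.
LOGGED = """\
1506580212:1124000
1506580512:1228000
1506580813:1222000
1506581112:1240000
1506581412:1228000
1506581712:880000
1506582014:1000000
1506582315:738000
1506582613:1176000
"""

# Assumption: compare gaps against minimal_heartbeat from `rrdtool info`.
HEARTBEAT = 300


def late_updates(lines, heartbeat):
    """Return (timestamp, gap) pairs for updates that arrived more than
    `heartbeat` seconds after the previous update."""
    stamps = [int(line.split(":")[0]) for line in lines if line.strip()]
    return [
        (cur, cur - prev)
        for prev, cur in zip(stamps, stamps[1:])
        if cur - prev > heartbeat
    ]


if __name__ == "__main__":
    for ts, gap in late_updates(LOGGED.splitlines(), HEARTBEAT):
        print(ts, gap)
```

For this excerpt the sketch lists the updates at 1506580813, 1506582014 and 1506582315, with gaps of 301, 302 and 301 seconds. These are the updates paired with the NaN rows at 06:40, 07:00 and 07:05, while the update at 1506582613 (a 298 second gap) was stored normally.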
How can we go about widening the acceptance criteria so that all of these datapoints are accepted?
The rrdtool info output for the RRD in question is shown below:
root@ra:/var/www/genie/public_html# rrdtool info /an/data/SI1.rrd
filename = "/an/data/SI1.rrd"
rrd_version = "0003"
step = 300
last_update = 1506594312
header_size = 1000
ds[probe1-temp].index = 0
ds[probe1-temp].type = "GAUGE"
ds[probe1-temp].minimal_heartbeat = 300
ds[probe1-temp].min = 0.0000000000e+00
ds[probe1-temp].max = 5.0000000000e+06
ds[probe1-temp].last_ds = "1226000"
ds[probe1-temp].value = NaN
ds[probe1-temp].unknown_sec = 12
rra[0].cf = "MIN"
rra[0].rows = 1440
rra[0].cur_row = 238
rra[0].pdp_per_row = 12
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = 1.1754400000e+06
rra[0].cdp_prep[0].unknown_datapoints = 2
rra[1].cf = "MAX"
rra[1].rows = 1440
rra[1].cur_row = 1220
rra[1].pdp_per_row = 12
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 1.2140000000e+06
rra[1].cdp_prep[0].unknown_datapoints = 2
rra[2].cf = "AVERAGE"
rra[2].rows = 1440
rra[2].cur_row = 1205
rra[2].pdp_per_row = 1
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = NaN
rra[2].cdp_prep[0].unknown_datapoints = 0
root@ra:#