Error when trying to remove NaN

Question

I'm using the Rrd package for R to import an rrd file, and I want to delete all records that have NaN values.

 head(rra)

                timestamp curr_proc_units entitled_cycles capped_cycles
1480982460 2016-12-05 18:01:00             NaN             NaN           NaN
1480982520 2016-12-05 18:02:00             NaN             NaN           NaN
1480982580 2016-12-05 18:03:00             NaN             NaN           NaN
1480982640 2016-12-05 18:04:00             NaN             NaN           NaN
1480982700 2016-12-05 18:05:00             NaN             NaN           NaN
1480982760 2016-12-05 18:06:00             NaN             NaN           NaN
       uncapped_cycles
1480982460             NaN
1480982520             NaN
1480982580             NaN
1480982640             NaN
1480982700             NaN
1480982760             NaN

The head is all NaN but the rest are not.

#!/usr/bin/env Rscript

# libraries
library(lubridate, quietly = TRUE)
library(plyr, quietly = TRUE)
library(dplyr, quietly = TRUE)
library(chron, quietly = TRUE)
library(ggplot2, quietly = TRUE)
library(Rrd, quietly = TRUE)
library(plyrmr, quietly = TRUE)

rra = importRRD("/kathryn/rdc1vsip8/rdc1vsiphmc3/rdc1vpc1lpr56.rrm", "AVERAGE", 1480982400, 1486598400, 2)

rra$timestamp <- as.POSIXct(as.numeric(rra$timestamp), origin = "1970-01-01")

rra = rra[!is.nan(rra)];

My error is: Error in is.nan(rra) : default method not implemented for type 'list'

So how do I convert my list into something from which I can remove the NaN values?

I'm assuming you want to pass certain columns into is.nan. If there is one column that will work to identify the nan cases then you can just change the is.nan(rra) to is.nan(rra$yourcolumn) but you'll probably want to use row indexing so it should look like `rra <- rra[!is.nan(rra$yourcolumn), ]` (note the comma) — Dason, Feb 24 '17 at 15:25
It's all columns apart from timestamp. Would that be possible? — Kathryn Withers, Feb 24 '17 at 15:27
You could do something like apply(rra[,-1], 1, function(x){any(is.nan(x))}) to get an index for the rows that contain nan values. The "-1" tells it to exclude the first column (which is the timestamp) when applying the function to each row. — Dason, Feb 24 '17 at 15:36
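The row-wise apply() suggested in the comment above can be sketched as follows. This is a minimal example on made-up data: the three-row data frame and the names has_nan and rra_clean are illustrative assumptions, not part of the original file. It also assumes timestamp is the first column and that every other column is numeric.

```r
# Toy stand-in for the imported data (values are invented for illustration)
rra <- data.frame(
  timestamp = as.POSIXct(c(1480982460, 1480982520, 1480982580),
                         origin = "1970-01-01", tz = "UTC"),
  curr_proc_units = c(NaN, 0.5, 0.7),
  entitled_cycles = c(NaN, 1.2, NaN)
)

# For each row, check whether any non-timestamp column is NaN.
# rra[, -1] drops the first column (timestamp) before the check.
has_nan <- apply(rra[, -1], 1, function(x) any(is.nan(x)))

# Keep only the rows without NaN; note the comma for row indexing
rra_clean <- rra[!has_nan, ]
print(rra_clean)
```

Because apply() converts its input to a matrix, this only behaves as intended when the remaining columns are all numeric.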

score 0 · Answer 1 · edited Feb 24 '17 at 16:03

Fixed with @Dason's suggestion above: `rra <- rra[!is.nan(rra$yourcolumn), ]`. After I specified one column, the rows with NaN in the other columns were removed as well. Thank you for the help.
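For anyone hitting the same error, here is a small sketch of why the original call fails and why the single-column version works. The data frame df and the names err, col_mask and kept are made up for illustration. A data frame is a list of columns, and is.nan() has no method for lists, so it has to be given one column (an atomic vector) at a time.

```r
# Invented example data
df <- data.frame(x = c(1, NaN, 3), y = c(NaN, NaN, 6))

# A data frame is a list of columns
print(is.list(df))

# Calling is.nan() on the whole data frame raises an error;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(is.nan(df), error = function(e) conditionMessage(e))
print(err)

# On a single column, is.nan() returns one logical value per row
col_mask <- is.nan(df$x)

# Row indexing: the comma after the mask selects rows and keeps all columns
kept <- df[!col_mask, ]
print(kept)
```

Filtering on one column only removes every NaN row if the NaN values line up across columns, as they do in the head(rra) output in the question. In this toy data the first row survives with a NaN still in y.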

edited Feb 24 '17 at 16:03

Jaap

answered Feb 24 '17 at 15:31

Kathryn Withers

score 0 · Accepted Answer · answered Feb 24 '17 at 15:57

Here's a reproducible version of your dataset.

timestamps <- seq(Sys.time() - 3600, Sys.time(), by = "1 min")
n <- length(timestamps)
rra <- data.frame(
  timestamp = timestamps,
  curr_proc_units = runif(n),
  entitled_cycles = runif(n)
)
rra <- within(
  rra,
  {
    curr_proc_units[sample(n, 10)] <- NaN
    entitled_cycles[sample(n, 10)] <- NaN
  }
)

Here's a solution using dplyr's filter() function.

library(dplyr)
rra %>% 
  filter(
    !is.nan(curr_proc_units),
    !is.nan(entitled_cycles)
  )
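With many measurement columns, naming each one inside filter() gets tedious. Below is a minimal base R sketch that checks every column except timestamp. The toy data and the names value_cols, nan_flags, any_nan and rra_clean are assumptions for illustration, and it presumes all non-timestamp columns are numeric.

```r
# Small invented dataset with NaN values in different rows
rra <- data.frame(
  timestamp = as.POSIXct("2016-12-05 18:01:00", tz = "UTC") + 60 * 0:3,
  curr_proc_units = c(NaN, 0.2, 0.4, 0.6),
  entitled_cycles = c(NaN, 1.0, NaN, 3.0)
)

# Every column except the timestamp
value_cols <- setdiff(names(rra), "timestamp")

# One logical vector per column, TRUE where the value is NaN
nan_flags <- lapply(rra[value_cols], is.nan)

# Combine the per-column flags with OR: TRUE if any column is NaN in that row
any_nan <- Reduce(`|`, nan_flags)

# Keep only the rows that have no NaN in any value column
rra_clean <- rra[!any_nan, ]
print(rra_clean)
```

This drops a row as soon as any one value column is NaN, which matches the behaviour of listing every column in filter().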

answered Feb 24 '17 at 15:57

Richie Cotton

@KathrynWithers If you found this answer useful, click the up arrow next to the score to up vote it. Thanks! – Richie Cotton Feb 24 '17 at 21:06
Okay @RichieCotton, I have voted it up, but because I have less than 15 reputation it doesn't show publicly. – Kathryn Withers Feb 27 '17 at 17:00
