R dcast fill with a sample number

Question

I wonder if there is a way to fill each individual missing value with a random number when using dcast (from reshape2 or data.table). Example:

ID = c('AA', 'AA', 'BB', 'BB', 'CC', 'CC', 'CC', 'DD', 'DD')
Replica = c('H1','H3','H1','H2','H1','H2','H3','H2','H3')
Value = c(1.3, 2.5, 1.4, 3.7, 9.5, 7.4, 7.1, 1.8, 8.4)

example <- data.frame(ID=ID, Replica = Replica, Value = Value)

Doing a simple dcast

dfdc <- dcast(data=example, ID~Replica, value.var = 'Value')

notice how some of the values are missing:

  ID  H1  H2  H3
1 AA 1.3  NA 2.5
2 BB 1.4 3.7  NA
3 CC 9.5 7.4 7.1
4 DD  NA 1.8 8.4

I would like to fill up each of those missing values with random numbers, something like:

dfdc <- dcast(data=example, ID~Replica, value.var = 'Value', fill = sample(1:10, 1))

which gives as a result:

  ID  H1  H2  H3
1 AA 1.3 2.0 2.5
2 BB 1.4 3.7 2.0
3 CC 9.5 7.4 7.1
4 DD 2.0 1.8 8.4

However, all the missing values have been replaced by the same random number (2 in this case).

Would it be possible to apply the function individually to each missing value and, therefore, fill the missing values with different random numbers?

Thanks in advance!

Rich Scriven · Answer 1 · 2017-02-22T03:03:57.650

If you don't mind a warning, you can pass fill = sample(10): each of the three NA cells gets a different random number, and the unused values are dropped. Just make sure the sample is at least as long as the number of NA values you expect; a shorter one is recycled and the numbers repeat.

dcast(example, ID ~ Replica, fill = sample(10))
#   ID   H1  H2  H3
# 1 AA  1.3 4.0 2.5
# 2 BB  1.4 3.7 1.0
# 3 CC  9.5 7.4 7.1
# 4 DD 10.0 1.8 8.4
# Warning message:
# In ordered[is.na(ordered)] <- fill :
#   number of items to replace is not a multiple of replacement length

Of course, you could simply wrap that with suppressWarnings() as well.

suppressWarnings(dcast(example, ID ~ Replica, fill = sample(10)))
#   ID  H1  H2  H3
# 1 AA 1.3 6.0 2.5
# 2 BB 1.4 3.7 5.0
# 3 CC 9.5 7.4 7.1
# 4 DD 9.0 1.8 8.4
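If you would rather not trigger the warning at all, one alternative (a minimal sketch, not something dcast offers directly) is to cast without fill and then overwrite the NA cells with exactly as many draws as there are gaps. The names missing and dfdc below are only illustrative, and replace = TRUE is an assumption that repeated numbers are acceptable.

```r
library(reshape2)

example <- data.frame(
  ID      = c("AA", "AA", "BB", "BB", "CC", "CC", "CC", "DD", "DD"),
  Replica = c("H1", "H3", "H1", "H2", "H1", "H2", "H3", "H2", "H3"),
  Value   = c(1.3, 2.5, 1.4, 3.7, 9.5, 7.4, 7.1, 1.8, 8.4)
)

# Cast without `fill`, so absent ID/Replica combinations come back as NA
dfdc <- dcast(example, ID ~ Replica, value.var = "Value")

# Logical matrix marking the NA cells (the ID column has none)
missing <- is.na(dfdc)

# Draw one number per NA cell, so the replacement length always matches
dfdc[missing] <- sample(1:10, sum(missing), replace = TRUE)

dfdc
```

Because the number of draws is computed from the data, this also works when you do not know in advance how many combinations are missing.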

akrun · Answer 2 · 2017-02-22T02:35:02.937


Here is an option using tidyverse

library(tidyverse)
complete(example, ID, Replica) %>%
    mutate(Value = coalesce(Value, as.numeric(sample(1:10, n(), replace=TRUE))))  %>%       
    spread(Replica, Value)
# A tibble: 4 × 4
#      ID    H1    H2    H3
#* <fctr> <dbl> <dbl> <dbl>
#1     AA   1.3   2.0   2.5
#2     BB   1.4   3.7   1.0
#3     CC   9.5   7.4   7.1
#4     DD   8.0   1.8   8.4
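In newer versions of tidyr, spread() is superseded by pivot_wider(). The same idea can be sketched with it as follows, assuming tidyr 1.0.0 or later is installed. The name filled is only illustrative. Note that sample() draws one number per row here, and coalesce() keeps the draw only where Value is NA.

```r
library(dplyr)
library(tidyr)

example <- data.frame(
  ID      = c("AA", "AA", "BB", "BB", "CC", "CC", "CC", "DD", "DD"),
  Replica = c("H1", "H3", "H1", "H2", "H1", "H2", "H3", "H2", "H3"),
  Value   = c(1.3, 2.5, 1.4, 3.7, 9.5, 7.4, 7.1, 1.8, 8.4)
)

filled <- example %>%
  # Add the absent ID/Replica combinations as rows with Value = NA
  complete(ID, Replica) %>%
  # One draw per row; coalesce() uses it only where Value is NA
  mutate(Value = coalesce(Value, as.numeric(sample(1:10, n(), replace = TRUE)))) %>%
  # Reshape to one column per Replica
  pivot_wider(names_from = Replica, values_from = Value)

filled
```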

edited Feb 22 '17 at 02:35

answered Feb 22 '17 at 02:29
