
I have a large data set with columns Id, Vg, Device, Die, W, L, and others (not relevant to this question). I want to interpolate Vg at a given value of Id, but the interpolation has to be performed separately for each group defined by the Device and Die columns.

My sample data looks like this:

Die     Device      Id      Vg     W   L 
  1    Device1       1       0    10   1  
  1    Device1     1.2     0.1    10   1  
  1    Device1     1.3     0.2    10   1
  1    Device2       1       0    10   2
  1    Device2     1.2     0.1    10   2  
  1    Device2     1.3     0.2    10   2
  1    Device3       1       0    10   3
  1    Device3     1.2     0.1    10   3  
  1    Device3     1.3     0.2    10   3

Each die has 22 unique devices. There are 67 dies, and the same 22 device names appear on every die. Therefore, if I interpolate Vg at Id=1.25, I expect to get 22*67 values of Vg.
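As a quick sanity check on a single group, the three Device1 rows from the sample table above can be interpolated with base R's `approx()`. This is only a sketch on the sample values; the variable names `id`, `vg`, and `vg_at` are made up for illustration.

```r
# Rows for Device1 on Die 1, copied from the sample table
id <- c(1, 1.2, 1.3)
vg <- c(0, 0.1, 0.2)

# Linear interpolation of Vg at Id = 1.25; approx() returns a list
# with elements x (the query points) and y (the interpolated values)
vg_at <- approx(x = id, y = vg, xout = 1.25)$y
print(vg_at)
```

One such value per (Die, Device) group is what the grouped operation should produce.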

Here is the code I am trying:

data_tidy %>%
  group_by(Die, Device) %>%  # Die is numeric, Device is a factor
  mutate(Vt = approx(x = log10(Id), y = Vg, xout = log10(3e-8 * W / L))$y)

This is similar to what is suggested in another answer. The code suggested there is:

df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(time=seq(1,n())) %>%
  mutate(ip.value=approx(time,value,time)$y) %>%
  select(-time)

However, when I execute my code above, I get this error message:

Error: impossible to replicate vector of size 18

beeprogrammer
  • does this help? http://stackoverflow.com/questions/27115589/dplyr-adding-replicated-vector-via-mutate – hrbrmstr Sep 11 '15 at 01:18
  • Thanks for your input. I did check the thread mentioned. However, it seems the first piece of code I posted works fine. It's embarrassing how much time I spent on this; it turned out I was making a few syntax errors and could not narrow down the problem. – beeprogrammer Sep 11 '15 at 19:00
  • That happens to all of us (the one I hate the most is forgetting the `=` in `>=`, leaving `>` and functional but wrong/busted code) – hrbrmstr Sep 11 '15 at 19:16

1 Answer


Here's a data.table solution:

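library(data.table)
# Interpolate Vg at Id = x within each Device/Die group.
# setDT() converts df to a data.table by reference.
f <- function(x) setDT(df)[, approx(Id, Vg, x), by = list(Device, Die)]
f(1.25)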
#     Device Die    x    y
# 1: Device1   1 1.25 0.15
# 2: Device2   1 1.25 0.15
# 3: Device3   1 1.25 0.15

Here, column y holds the interpolated Vg, and column x is the Id value at which it was evaluated.
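For comparison, the same per-group interpolation can be sketched with dplyr by using `summarise()` instead of `mutate()`, which gives one row per (Die, Device) group rather than one per input row. This is a minimal sketch, not the asker's code: `df` is rebuilt here from the sample rows in the question, and the names `Vg_interp` and `vg_by_device` are assumptions.

```r
library(dplyr)

# Stand-in data built from the sample rows in the question
df <- data.frame(
  Die    = 1,
  Device = rep(c("Device1", "Device2", "Device3"), each = 3),
  Id     = rep(c(1, 1.2, 1.3), times = 3),
  Vg     = rep(c(0, 0.1, 0.2), times = 3)
)

# One interpolated Vg per (Die, Device) group at Id = 1.25
vg_by_device <- df %>%
  group_by(Die, Device) %>%
  summarise(Vg_interp = approx(x = Id, y = Vg, xout = 1.25)$y) %>%
  ungroup()

print(vg_by_device)
```

With the full data set this would be expected to return one row per die and device, i.e. the 22*67 values the question asks for, provided Id = 1.25 lies inside the measured Id range of each group (otherwise `approx()` returns NA by default).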

jlhoward
  • I tried the following: `> View(test_data_d2) > library(data.table) > f<-function(x) setDT(test_data_d2)[,approx(Id,Vg,x),by=list(device,die)] > f(1.25)`. However, I end up with another error: "Error in `[.tbl_df`(setDT(test_data_d2), , approx(Id, Vg, x), by = list(device, : unused argument (by = list(device, die))" – beeprogrammer Sep 11 '15 at 18:54
  • Try using `test_data_d2 <- as.data.frame(test_data_d2)` first. I think this is a problem with `tidy`: it produces a `tbl_df` object which is not really a `data.frame` and there are many functions in R, which expect `data.frame`s, that can't handle that format. Just a guess though. – jlhoward Sep 11 '15 at 19:05
  • When I demoed your example I imported it as a true `data.frame` using `read.table(...)`. – jlhoward Sep 11 '15 at 19:06
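The workaround suggested in the comments above can be sketched as follows. The cause of the original error is not confirmed here; this only shows the suggested conversion to a plain `data.frame` before calling `setDT()`. The object `demo_tbl` is a hypothetical stand-in built from the question's sample rows (with lowercase `device` and `die`, as in the comment), not the asker's actual `test_data_d2`.

```r
library(data.table)
library(tibble)

# Hypothetical stand-in for a tbl_df such as test_data_d2
demo_tbl <- as_tibble(data.frame(
  die    = 1,
  device = rep(c("Device1", "Device2"), each = 3),
  Id     = rep(c(1, 1.2, 1.3), times = 2),
  Vg     = rep(c(0, 0.1, 0.2), times = 2)
))

# Drop the tbl_df class first, as suggested, then convert by reference
plain <- as.data.frame(demo_tbl)
setDT(plain)

# Grouped interpolation; approx() returns list(x, y), which become columns
res <- plain[, approx(Id, Vg, xout = 1.25), by = list(device, die)]
print(res)
```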