I have run up against the wall regarding the application of lapply() in a simulation study. The data are designed to help us understand how a standardization formula impacts the outcomes of a proposal ratings exercise.
There are three conditions for raters: no bias, uniform bias (bias increases across raters), and bidirectional bias (bias is balanced positive and negative across raters).
The true value for a proposal is assumed known.
We would like to produce a set replicate datasets within each bias condition so that the datasets would emulate a single proposal evaluation period (a panel). We would then like to replicate panels to simulate having many proposal evaluation periods.
Here is schematic of the data structure:
The data structure looks like this:
p = number of proposals
r = number of raters
n.panels = number of replicate panels
t.reps = list of several replicate panels
three bias conditions: n.bias - no bias
u.bias - uniform bias (raters higher than previous rater)
b.bias - bidirectional bias (balanced up and down bias)
-|
t 1 |..| --> 10*(n.bias(p*r)) + 10*(u.bias(p*r)) + 10*(b.bias(p*r) {panel replication 1}
. 2 |..| --> 10*(n.bias(p*r)) + 10*(u.bias(p*r)) + 10*(b.bias(p*r) {panel replication 2}
r : : : : :
e : : : : :
p n.panels |..| --> 10*(n.bias(p*r)) + 10*(u.bias(p*r)) + 10*(b.bias(p*r) {n. panels replications}
s
_|
The following R code generates data correctly:
########## start of simulation parameters
set.seed(271828)
means <- matrix(c(rep(50,3), rep(60,3), rep(70,4) ), ncol = 1) # matrix of true proposal values
bias.u <- matrix(c(0,2,4,6,8), nrow=1) # unidirectional bias
bias.b <- matrix(c(0,3,-3, 5, -5), nrow=1) # bidirectional bias
ones.u <- matrix(rep(1,ncol(bias.u)), nrow = 1) # number of raters is the number of columns (r)
ones.b <- matrix(rep(1,ncol(bias.b)), nrow = 1)
ones.2 <- matrix(rep(1,nrow(means)), ncol = 1) # number of proposals is the number of rows (p)
true.ratings <- means%*%ones.u # gives matrix of true proposal value for each rater (p*r)
uni.bias <- ones.2%*%bias.u
bid.bias <- ones.2%*%bias.b # gives matrix of true rater bias for each proposal (p*r)
n.val <- nrow(means)*ncol(ones.u)
# true.ratings
# uni.bias
# bid.bias
library(MASS)
#####
##### generating replicate data...
#####
##########-------------------- analyzing mse of adjusted scores across replications
##########-------------------- developing random replicates of panel data
##########----- This means that there are (reps) replications in each of the bias conditions
##########----- to represent a plausible set of ratings in a particular collection
##########----- of panels. So for one proposal cycle (panel) , there are 3 * (reps) * nrow(means)
##########----- number of proposal ratings.
##########-----
##########----- There are (n.panels) replications of the total number of proposal ratings placed in a list
##########----- (t.reps).
n.panels <- 2 # put in the number of replicate panels that should be produced
reps <- 10 # put in the number of times each bias condition should be included in a panel
t.reps <- list()
n.bias <- list()
u.bias <- list()
b.bias <- list()
for (i in 1:n.panels)
{
{
for(j in 1:reps)
n.bias[[j]] <- true.ratings + matrix(round(rnorm(n.val,4,2), digits=0), nrow = nrow(means))
for(j in 1:reps)
u.bias[[j]] <- true.ratings + uni.bias + matrix(round(rnorm(n.val,4,2), digits=0), nrow = nrow(means))
for(j in 1:reps)
b.bias[[j]] <- true.ratings + bid.bias + matrix(round(rnorm(n.val,4,2), digits=0), nrow = nrow(means))
}
t.reps[[i]] <- list(n.bias, u.bias, b.bias)
}
# t.reps
Each element in the list (t.reps) is a random replication of a panel of reviewers for an entire set of proposals.
I would like to apply the following function to "adjust" the scores within a panel using characteristics of the ENTIRE set of proposal scores (across all raters and proposals) to adjust values within a rater. The idea is to correct for any bias one way or another (e.g. being overly harsh or overly easy when rating proposals).
The adjustment should be applied for each of the (reps) datasets.
So, for one panel, there would be 30 replicate datasets (10 for each bias condition) and each replicate dataset would have 10 proposals rated by 5 raters, resulting in 300 proposals total.
So the idea is to have random replications to understand how the adjusted scores compare to the unadjusted scores.
I had tried to use the lapply() function across the lists within the (t.reps) list, and it did not work.
adj.scores <- function(x, tot.dat)
{
t.sd <- sd(array(tot.dat))
t.mn <- mean(array(tot.dat))
ones.t.mn <- diag(1,ncol(x))
p <- nrow(x)
r <- ncol(x)
ones.total <- matrix(1,p,r)
r.sd <- diag(apply(x,2, sd))
r.mn <- diag(apply(x,2, mean))
den.r.sd <- ginv(r.sd)
b.shift <- x%*%den.r.sd
a <- t.mn*ones.t.mn - den.r.sd%*%r.mn
a.shift <- ones.total%*%a
l.x <- b.shift + a.shift
return(l.x)
}
########## I would like to do something like this...
########## apply the function to each element in the list t.reps
dat.1 <- matrix(unlist(t.reps[[1]]), ncol=5)
adj.rep.1 <- lapply(t.reps[[1]], adj.scores, tot.dat = dat.1)
I am open to other methods / workarounds that will allow for rater adjustment within a set of proposal ratings using statistics from the entire set of ratings. There may be some R functionality that I just don't know about or haven't come across.
Also, if anyone can recommend a book for programming data structures like this (in R, Perl, or Python), it would be most appreciated. The texts that I have found thus far do not address these issues in detail.
Many, many thanks in advance.
-Jon