I have a problem very similar to this question: "Populating a data frame in R in a loop". I cannot populate my matrix inside a loop.
library(plyr) #provides ddply
myDF <- read.csv('corpusFiltered.txt.gz', header = TRUE, sep = '\t')
phylum = sort(unique(myDF$PHYLUM))
myDF.mean = ddply(myDF, .(ENVIRONMENT, FILENAME, PHYLUM), summarize, MeanX = mean(X, na.rm=TRUE) )
df_all = myDF.mean[c(4, 3)] #select only the X and Phylum
c_all = unstack(df_all) #restructure dataframe
columnPhylum1 = matrix(ncol=1, nrow=length(phylum))
GET_X = function(dataset)
{
  for (i in 1:length(phylum))
  {
    print(phylum[i])
    columnPhylum1[i,] <- phylum[i] #this does not populate the matrix. still 'NA'
  }
}
GET_X(c_all)
print('')
print(columnPhylum1)
This does not work. The output is:
[1] Actinobacteria
Levels: Actinobacteria Bacteroidetes Chlamydiae Crenarchaeota Deinococcus-Thermus Euryarchaeota Firmicutes Proteobacteria Spirochaetes Tenericutes ***
[1] Bacteroidetes
[1] Chlamydiae
[1] Crenarchaeota
[1] Deinococcus-Thermus
[1] Euryarchaeota
[1] Firmicutes
[1] Proteobacteria
[1] Spirochaetes
[1] Tenericutes
[1] ""
[,1]
[1,] NA
[2,] NA
[3,] NA
[4,] NA
[5,] NA
[6,] NA
[7,] NA
[8,] NA
[9,] NA
[10,] NA
***For brevity, I removed the subsequent "Levels" lines from all but the first phylum (Actinobacteria).
However, if I make a faux matrix...
sig= matrix(ncol=1, nrow=length(phylum))
for (i in 1:length(phylum)){sig[i,]<-i}
print(sig)
This works like a charm.
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
Perhaps I cannot see the forest for the trees. I have checked for obvious things (e.g. correct variable names) and have been unable to find any problems. The only difference I can see is that the first version runs the loop inside a function, while the second runs it at the top level. I don't understand why I am getting different behavior from 'identical' code. Any assistance is greatly appreciated.
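To isolate that one difference, here is a minimal sketch that needs no data file. The phylum vector is a made-up stand-in for my real data, and top_level, in_function, fill_inside and returned are hypothetical names used only for illustration. It relies on one rule of R's scoping: an assignment with <- inside a function creates or modifies a variable local to that function, so a matrix defined outside the function is left untouched.

```r
# Hypothetical stand-in for the real phylum vector (plain character, not a factor).
phylum <- c("Actinobacteria", "Bacteroidetes", "Chlamydiae")

# Version 1: the loop runs at the top level, like the 'sig' example.
top_level <- matrix(ncol = 1, nrow = length(phylum))
for (i in seq_along(phylum)) {
  top_level[i, ] <- phylum[i]
}

# Version 2: the same loop wrapped in a function, like GET_X.
in_function <- matrix(ncol = 1, nrow = length(phylum))
fill_inside <- function() {
  for (i in seq_along(phylum)) {
    # '<-' inside a function assigns to a local copy of in_function,
    # not to the matrix defined outside the function.
    in_function[i, ] <- phylum[i]
  }
  in_function  # return the local copy so it can be inspected
}
returned <- fill_inside()

print(top_level)    # filled at the top level
print(in_function)  # unchanged by the function call
print(returned)     # the local copy that the function filled and returned
```

If this sketch reflects what happens in my script, then GET_X fills a local copy of columnPhylum1 and discards it on exit, which would explain the NA values. Returning the matrix from the function and assigning the result, as done with returned here, looks like one way around it.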