I have a data.frame object called "DATA". This object contains a column called "Point" (DATA$Point). Some values in this column are duplicated, and I would like to write a function that, for each duplicated value, samples only one of the rows that share it, so the result has one row per distinct value of Point.
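To make this concrete, here is a small made-up data set of the same shape. The "Value" column and all the values are invented for illustration, and I'm assuming Point is a factor, since my function calls levels() on it:

```r
# Toy stand-in for DATA: the "Value" column and every value here are
# invented for illustration only. Point is built as a factor because
# the function in this question calls levels() on it.
DATA <- data.frame(
  Point = factor(c("A", "A", "B", "C", "C", "C")),
  Value = c(10, 11, 20, 30, 31, 32)
)

# Count how many rows share each Point value; a count above 1 means duplicates.
counts <- table(DATA$Point)
print(counts)

# The rows whose Point value occurs more than once.
dup.rows <- DATA[DATA$Point %in% names(counts)[counts > 1], ]
print(dup.rows)
```

In this toy data "A" and "C" are duplicated, so the result I'm after would have three rows: one of the two "A" rows, the single "B" row, and one of the three "C" rows.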
I've been trying to do it this way:
sort.song <- function(DATA) {
  Point <- levels(DATA$Point)
  DATA.NEW <- DATA[1:length(Point), ]
  # Ideally DATA.NEW would be an empty data frame with nrow = length(Point) and the same
  # columns as DATA, but I THINK this will work (I don't know how to do it the "ideal" way).
  for (i in 1:dim(DATA)[1]) {  # dim(DATA)[1] is always bigger than length(Point)
    SUBDATA <- DATA[which(DATA$Point %in% Point[i]), ]
    # I need to sample one row of the original data set only among the duplicates of the
    # same value. So if a particular value has no duplicates, move on; otherwise sample
    # one row among those duplicates.
    l <- dim(SUBDATA)[1]
    if (l == 1) {
      DATA.NEW[i, ] <- SUBDATA[l, ]
    } else {
      lc <- sample(1:l, 1)
    }
    DATA.NEW[i, ] <- SUBDATA[lc, ]
  }
  return(DATA.NEW)
}
test <- sort.song(DATA)
But it doesn't work! :( I get the following error message:
Error in `[<-.factor`(`*tmp*`, iseq, value = integer(0)) :
replacement has length zero
It may be a silly question, but I'm out of ideas here (I'm a total R beginner).
Any help would be highly appreciated!