
I am having difficulty with my R code. I am trying to create a new data frame, based on one I already have, in which each duplicated value is multiplied by 1000 and then has a running count added, restarting at 1 for each distinct value. For example, the values in my data frame range from 3869014 to 4524673, and each value occurs multiple times (up to 100). Ex: [3869014, 3869014, 3869014, 3869014, 3869014, 3869014, 3869014, 3869014, 3869015, 3869015, 3869015, 3869015, 3869016, 3869016, 3869016, 3869016, etc...]. What I want is: [3869014001, 3869014002, 3869014003, 3869014004, 3869014005, 3869014006, 3869014007, 3869014008, 3869015001, 3869015002, 3869015003, 3869015004, 3869016001, 3869016002, 3869016003, 3869016004, etc...]

I tried the following code, but it multiplies each number by 1000 and adds 1 regardless of duplicates. It only ever adds 1, rather than adding a running count (ex: 1, 2, 3, 4, etc...). So the output is [3869014001, 3869014001, 3869014001, 3869014001, etc...], which is not what I want. I am somewhat new to looping over R data frames. Thanks for the help.

setwd("F:/TimData/SPAM/Ethiopia")
#clear all variables
rm(list=ls())

#install packages
install.packages(c("spatstat","maptools","lattice","sp","RColorBrewer","splancs","maps", "plyr"))
install.packages(c("rgdal","raster","R.utils","spsurvey", "xlsx", "rJava", "foreign"),dep=TRUE)

#load libraries
library(spatstat); library(maptools); library(lattice); library(sp); 
library(RColorBrewer); library(splancs); library(maps)
library(rgdal); library(raster); library(R.utils); library(spsurvey); library(foreign);
library(rJava)
library(xlsx)
library(plyr)

#creating a custom 1km spatial grid

kmgrid = readGDAL("EthiopiaBuffer1km.tif")

#convert raster to data frame
kmgridx= as.data.frame(kmgrid, row.names=NULL, optional=FALSE, xy=FALSE, na.rm=TRUE)

#specify column containing raster values
x=kmgridx$band1

#setting counter for while statement, based on actual min/max values of raster grid
start = 3869014
finish = 4525673

#setting loop to multiply each duplicate by 1000 and add one, doesn't work

while (start < finish) {
    if (start) {
        for (i in 1:length(x)) {y=(x*1000)+1} 
        start=start +1 }
    }
timpjohns
    Are you sure your example fits your problem description? Why do you install and load a million packages that don't seem relevant to the problem? Try to make the example minimal. And as a hint: you might want to look at `?ave` and `seq_along` in base R. – talat Mar 02 '15 at 20:50
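
The hint in the comment above can be sketched roughly as follows. Note that `x*1000 + 1` in the question is already vectorized, so the loops only recompute the same vector; what is missing is a counter that restarts for each distinct value. This is a minimal sketch, assuming the raster values look like the sample in the question; the names `counter` and `new_df` are made up for illustration.

```r
# Sample values taken from the question (stand-in for kmgridx$band1)
x <- c(rep(3869014, 8), rep(3869015, 4), rep(3869016, 4))

# ave() applies seq_along() within each group of equal values,
# giving 1, 2, 3, ... that restarts whenever the value changes
counter <- ave(x, x, FUN = seq_along)

# Vectorized arithmetic: no explicit loop is needed
y <- x * 1000 + counter

# Hypothetical result data frame holding the old and new values
new_df <- data.frame(band1 = x, id = y)
```

No `while` or `for` loop is involved; the grouping is handled entirely by `ave`.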

3 Answers


This may be what you are looking for.

id<-c(rep(1,5),rep(2,5),rep(3,5))
y<-rnorm(15)
df<-data.frame(id=id,y=y)
seq_along_mult<-function(x){ 

    y<-x*1000+seq_along(x) #creating your new id variable
    return(y)
}

df$number <- with(df, ave(id, id, FUN=seq_along_mult))

    id         y  number
1   1  0.1872768   1001
2   1  1.9137194   1002
3   1 -0.6226594   1003
4   1 -1.0641839   1004
5   1 -0.3422707   1005
6   2 -0.1013222   2001
7   2  0.5783932   2002
8   2  0.8276480   2003
9   2  1.3111752   2004
10  2  0.1783597   2005
11  3  1.7036697   3001
12  3 -0.5759164   3002
13  3 -0.7028795   3003
14  3 -0.2590082   3004
15  3  1.9239665   3005
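
One caveat with the multiply-by-1000 scheme: three digits only leave room for 999 rows per value, so a larger group would collide with the next id. The question mentions at most 100 duplicates, so this should not be a problem there, but a quick check is cheap. The following is a sketch on the sample values from the question, not on the real raster; `group_sizes` and `max_size` are illustrative names.

```r
# Sample values taken from the question
x <- c(rep(3869014, 8), rep(3869015, 4), rep(3869016, 4))

# Number of rows per distinct value
group_sizes <- table(x)
max_size <- max(group_sizes)

# Three digits only hold counters 1..999
stopifnot(max_size < 1000)

# Build the new ids and confirm that none of them repeat
y <- x * 1000 + ave(x, x, FUN = seq_along)
stopifnot(!anyDuplicated(y))
```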
Jason

Here's a version with tapply...

a <- c(3869014, 3869014, 3869014, 3869014, 3869014, 3869014, 3869014, 3869014, 3869015, 3869015, 3869015, 3869015, 3869016, 3869016, 3869016, 3869016)
a <- as.character(a)
aa <- unname(unlist(tapply(a, a, function(x)paste0(x, 1000+(1:length(x))))))
> aa
[1] "38690141001" "38690141002" "38690141003" "38690141004" "38690141005" "38690141006"
[7] "38690141007" "38690141008" "38690151001" "38690151002" "38690151003" "38690151004"
[13] "38690161001" "38690161002" "38690161003" "38690161004"
cory
  • This will create a character vector. Not sure if that's what they want(ed) – talat Mar 02 '15 at 21:11
  • That's easily fixed with as.numeric(). Did you have a better suggestion? – cory Mar 02 '15 at 21:14
  • Nice usage of tapply! I'd like to fix a small part of it. Instead of adding "1001" "1002"..., it should add "001" "002". `aa <- unname(unlist(tapply(a, a, function(x)paste0(x, str_pad(1:length(x), 3, pad = "0")))))` – Enis Mar 02 '15 at 21:37
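
The `str_pad` call in the last comment comes from the stringr package. A base R variant of the same fix might use `sprintf("%03d", ...)` for the zero padding and `as.numeric()` to address the character-vector concern raised earlier. This is only a sketch on the sample values from the question. Note also that `tapply` returns results grouped by sorted value, so the output lines up with the input only when the input is already sorted, as it is here.

```r
# Sample values taken from the question (already sorted)
a <- c(rep(3869014, 8), rep(3869015, 4), rep(3869016, 4))

# Paste a zero-padded, three-digit counter onto each value within its group
aa <- unname(unlist(tapply(a, a, function(x) {
  paste0(x, sprintf("%03d", seq_along(x)))
})))

# Convert back to numeric if a number rather than a string is wanted
aa_num <- as.numeric(aa)
```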

Using dplyr

library(dplyr)
set.seed(1)
df <- data.frame(id = c(rep(1,5), rep(2,5), rep(3,5)), y = rnorm(15))

df %>% group_by(id) %>% mutate(number = (id * 1000) + 1:n())

You get:

#Source: local data frame [15 x 3]
#Groups: id
#
#   id          y number
#1   1 -0.6264538   1001
#2   1  0.1836433   1002
#3   1 -0.8356286   1003
#4   1  1.5952808   1004
#5   1  0.3295078   1005
#6   2 -0.8204684   2001
#7   2  0.4874291   2002
#8   2  0.7383247   2003
#9   2  0.5757814   2004
#10  2 -0.3053884   2005
#11  3  1.5117812   3001
#12  3  0.3898432   3002
#13  3 -0.6212406   3003
#14  3 -2.2146999   3004
#15  3  1.1249309   3005
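
The same pattern should carry over to values like those in the question, including when the rows are not sorted, because `mutate` on a grouped data frame keeps the original row order. A small sketch, assuming dplyr is installed; the four-row data frame is made up for illustration, and `row_number()` is used here in place of `1:n()`.

```r
library(dplyr)

# Made-up, deliberately unsorted sample using values from the question
df <- data.frame(id = c(3869015, 3869014, 3869015, 3869014))

out <- df %>%
  group_by(id) %>%
  # row_number() counts 1, 2, ... within each group
  mutate(number = id * 1000 + row_number()) %>%
  ungroup()
```

Each row keeps its position, and the counter runs in order of appearance within each `id`.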
Steven Beaupré