
I have a large data frame where I would like to append characters to row names based on a condition. I have the following example:

trees <- data.frame(char = c('flower', 'cone', 'flower', 'cone'), number = c(3, 3, 5, 6))
rownames(trees) <- c('birch', 'pine', 'maple', 'redwood')

This is what I'm going for, a 'c' next to pine and redwood:

           char    number
birch      flower  3
pine c     cone    3
maple      flower  5
redwood c  cone    6

I know I can use paste to append characters:

# this gives the output I am looking for
paste(rownames(trees[trees$char == 'cone',]), 'c')

[1] "pine c"    "redwood c"

However, when I try this following line of code, the changes don't appear in my data frame:

rownames(trees[trees$char == 'cone',]) <- paste(rownames(trees[trees$char == 'cone',]), 'c')
user16217248
Danny

3 Answers


trees$char is a vector (one-dimensional), so there is no need for [ , ] when indexing it. That alone is not worth an answer, but it was difficult to explain in a comment, so I posted it here.

By the way, I just realised there is another point to mention: in your code you are setting the row names on the subsetted data.frame rather than on the original data.frame trees, so the change is not reflected in trees. Index the row names of the full data frame instead:

rownames(trees)[trees$char == "cone"] <- paste(rownames(trees)[trees$char == "cone"], "c")
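To make the difference concrete, here is a minimal sketch using the example data from the question. The names `is_cone`, `attempt`, `fixed`, and `unchanged` are only illustrative and do not come from the original post. It compares the assignment from the question with the indexed version above.

```r
# Example data from the question
trees <- data.frame(char = c("flower", "cone", "flower", "cone"),
                    number = c(3, 3, 5, 6))
rownames(trees) <- c("birch", "pine", "maple", "redwood")

# Logical vector with one entry per row (illustrative name)
is_cone <- trees$char == "cone"

# Attempt from the question: the row names are set on the subset,
# so the row names of the full data frame stay as they were
attempt <- trees
rownames(attempt[is_cone, ]) <- paste(rownames(attempt[is_cone, ]), "c")
unchanged <- identical(rownames(attempt), rownames(trees))

# Indexing the row-name vector of the full data frame instead
fixed <- trees
rownames(fixed)[is_cone] <- paste(rownames(fixed)[is_cone], "c")

print(unchanged)
print(rownames(fixed))
```

Computing the condition once and reusing it on both sides also keeps the left-hand and right-hand subsets in sync.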
joel.wilson
  • @Danny, would you mind sharing why this answer wasn't accepted, given that I also answered first? Just curious. – joel.wilson Feb 05 '17 at 19:15
  • Joel, thank you for taking the time to respond to my question today; I should have left a comment earlier. You did point out that I was applying my name changes to a subsetted data set, which was indeed helpful, but overall I have about 32,000 different row names where 'pine' and 'redwood' are. I wasn't sure if there was a quicker way to cat them together after %in%, or if your code required the names to be catted together manually. – Danny Feb 06 '17 at 03:01
  • @Danny Oh, that was my mistake. I had edited my answer to make the change on both the LHS and the RHS, but somehow forgot to apply it to the RHS. I have edited it now, for your reference. – joel.wilson Feb 06 '17 at 06:04

You can use the ifelse function to define the rownames: if the char value is "cone", paste "c" to the end of the current rowname, else use the existing rowname.

rownames(trees) <- ifelse(trees$char == "cone", paste(rownames(trees), "c"), rownames(trees))
MPhD
  • I like this solution because the code is clear to me, but it's been running for 25 minutes now on my real data set. Is this normal? I should have mentioned that the real data frame is 32105 observations with 657 variables, but only one of those variables has 'cone' (amongst three other factor levels). – Danny Feb 05 '17 at 16:36
  • Hi, Danny! I've used ifelse like this on data frames of that size to generate new columns (though never rownames), and it's never taken anywhere near that long. So I'm not sure that's normal...but I'm afraid I don't have an explanation! Still learning here, too! :) – MPhD Feb 06 '17 at 06:01
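For anyone who wants to try this at a similar scale, the following is a rough sketch on synthetic data. The data frame `big`, its factor levels, and the `tree1`, `tree2`, ... row names are invented for illustration; only the row count is taken from the comment above. It builds the new names once with ifelse and once with logical indexing, then checks that both give the same vector. It makes no claim about what caused the long run time on the real data.

```r
# Synthetic stand-in for the real data: 32105 rows, one factor column
# with four levels (level names are assumptions, not from the post)
set.seed(1)
n <- 32105
big <- data.frame(
  char = sample(c("flower", "cone", "spore", "catkin"), n, replace = TRUE),
  number = seq_len(n),
  stringsAsFactors = TRUE
)
rownames(big) <- paste0("tree", seq_len(n))

# Variant 1: ifelse over all rows
by_ifelse <- ifelse(big$char == "cone",
                    paste(rownames(big), "c"),
                    rownames(big))

# Variant 2: change only the matching entries of the name vector
is_cone <- big$char == "cone"
by_index <- rownames(big)
by_index[is_cone] <- paste(by_index[is_cone], "c")

# Both variants should agree before either is assigned back
same_result <- identical(by_ifelse, by_index)
print(same_result)

rownames(big) <- by_index
```

Building the character vector first and assigning it to `rownames(big)` once keeps the data frame itself out of the intermediate steps.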

One option is

library(stringr)
x1 <- str_extract(trees$char, "^c")
row.names(trees) <- trimws(paste(row.names(trees), replace(x1, is.na(x1), "")))
trees
#            char number
#birch     flower      3
#pine c      cone      3
#maple     flower      5
#redwood c   cone      6

Another option is

row.names(trees) <- trimws(paste(row.names(trees), c("", "c")[(trees$char == "cone") + 1]))
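The indexing trick in that one-liner can be unpacked step by step. This is only a sketch on the example data from the question; `idx`, `suffix`, and `new_names` are illustrative names. Adding 1 to the logical condition turns FALSE/TRUE into 1/2, which then picks an element from a two-element lookup vector. This variant puts the space inside the suffix and uses paste0, so rows without a suffix do not pick up a trailing space.

```r
# Example data from the question
trees <- data.frame(char = c("flower", "cone", "flower", "cone"),
                    number = c(3, 3, 5, 6))
rownames(trees) <- c("birch", "pine", "maple", "redwood")

# FALSE + 1 is 1 and TRUE + 1 is 2, so idx selects from a 2-element vector
idx <- (trees$char == "cone") + 1

# Element 1 is the empty suffix, element 2 is " c" (space included)
suffix <- c("", " c")[idx]

# paste0 adds no separator, so non-cone names are left exactly as they were
new_names <- paste0(rownames(trees), suffix)
rownames(trees) <- new_names

print(rownames(trees))
```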
akrun