You can also make a copy before applying your condition. You need to specify the columns so that only those columns are processed. If you have just two columns, you can do it manually like this:
# Create copy
test <- df
# Update specific columns (as.character guards against factor columns)
test$plot1[as.numeric(as.character(test$plot1)) > 1] <- 1
test$plot2[as.numeric(as.character(test$plot2)) > 1] <- 1
test
# species rarness endangered plot1 plot2
# 1 Pinus halepensis F 0 1 0
# 2 Majorana syriaca CC 0 1 1
# 3 Iris palaestina F 0 1 0
# 4 Velezia fasciculata O 6.8 1 0
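A caveat about as.numeric, shown here as a minimal sketch: if the columns are factors (which is what as.data.frame(cbind(...)) produced by default before R 4.0), as.numeric returns the internal level codes rather than the printed values. Converting through as.character first avoids this. The vector f below is a made-up example, not part of the original data.

```r
# Hypothetical factor column, as older R versions would create from cbind()
f <- factor(c("0", "1", "0", "0"))

# as.numeric() on a factor returns the level codes, not the values
as.numeric(f)
# [1] 1 2 1 1

# Going through as.character() recovers the actual numbers
as.numeric(as.character(f))
# [1] 0 1 0 0
```

With level codes, a test such as `> 1` would match the value "1" here, which is why the conversion matters for a threshold.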
Generalisation:
Now suppose you want to process a whole set of columns. You can reuse the previous trick in a function that you apply to every column. I suggest having a look at the apply family of functions. For this task, lapply seems appropriate (see ?lapply).
# Your dataframe
species <- c("Pinus halepensis", "Majorana syriaca", "Iris palaestina", "Velezia fasciculata")
rarness <- c("F", "CC", "F", "O")
endangered <- c(0, 0, 0, 6.8)
plot1 <- c(1, 2, 1, 1)
plot2 <- c(0, 1, 0, 0)
df <- as.data.frame(cbind(species, rarness, endangered, plot1, plot2))
# Extend the dataframe with new random columns for the example
df2 <- data.frame(replicate(10, sample(-5:5, 4, replace = TRUE)))
df[names(df2)] <- df2
df
# species rarness endangered plot1 plot2 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# 1 Pinus halepensis F 0 1 0 -4 -2 4 4 0 5 -1 -5 3 2
# 2 Majorana syriaca CC 0 2 1 5 -3 -2 3 3 -1 0 5 2 4
# 3 Iris palaestina F 0 1 0 -1 2 -2 5 3 2 3 3 -1 -3
# 4 Velezia fasciculata O 6.8 1 0 -5 -3 4 5 5 -4 4 -5 -4 -3
# Create copy
test <- df
# Function to apply to each column
set_threshold <- function(col) {
  # Go through as.character so that factor columns yield their values, not level codes
  col <- as.numeric(as.character(col))
  col[col > 1] <- 1
  col
}
# Select all column names from index 4 onward (drop the first 3)
col_names <- tail(names(test), -3)
col_names
# [1] "plot1" "plot2" "X1" "X2" "X3" "X4" "X5" "X6" "X7" "X8" "X9" "X10"
# Process each column
test[col_names] <- lapply(test[col_names], FUN = set_threshold)
test
# species rarness endangered plot1 plot2 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# 1 Pinus halepensis F 0 1 0 -4 -2 1 1 0 1 -1 -5 1 1
# 2 Majorana syriaca CC 0 1 1 1 -3 -2 1 1 -1 0 1 1 1
# 3 Iris palaestina F 0 1 0 -1 1 -2 1 1 1 1 1 -1 -3
# 4 Velezia fasciculata O 6.8 1 0 -5 -3 1 1 1 -4 1 -5 -4 -3
I use tail with a negative n to select all the column names from index 4 onward (i.e. to drop the first 3 elements); see ?tail.
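To make the negative n concrete, here is a small sketch on a made-up character vector (cols is only for illustration): tail(x, -3) drops the first three elements, which gives the same result as negative indexing with x[-(1:3)].

```r
# Hypothetical column names, for illustration only
cols <- c("species", "rarness", "endangered", "plot1", "plot2")

# A negative n drops the first 3 elements and keeps the rest
tail(cols, -3)
# [1] "plot1" "plot2"

# Equivalent selection with negative indexing
cols[-(1:3)]
# [1] "plot1" "plot2"
```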