I have a data frame df with a few columns, one of which contains text. I want to drop every word in that column that is shorter than 4 characters. The result I expect is expected_df. A reproducible example is given below.

df<-data.frame(client=c("My Name is abcdff","Name is not right","Bangalore is getting hoter","BBa wasa school topper"),serial_numer=c(1:4))


expected_df<-data.frame(client=c("Name abcdff","Name right","Bangalore getting hoter","wasa school topper"),serial_numer=c(1:4))

This is what I have tried so far:

df$client<-as.character(df$client)
df$client[nchar(df$client) > 3]
Yogesh Kumar

1 Answer

We can split each string into words, count the characters in each word, and keep only the words that are at least 4 characters long.

df$client <- sapply(strsplit(as.character(df$client), "\\s+"), function(x) 
                paste0(x[nchar(x) >= 4], collapse = " "))

df
#                   client serial_numer
#1             Name abcdff            1
#2              Name right            2
#3 Bangalore getting hoter            3
#4      wasa school topper            4
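
As a rough sketch, the same idea can be wrapped in a small reusable function. The name drop_short_words and its min_chars argument are made up for illustration and are not part of base R. The sketch also shows why the attempt in the question changes nothing: nchar() applied to the column measures each whole string, not the individual words inside it.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps `client` as
# character on R versions before 4.0 as well.
df <- data.frame(
  client = c("My Name is abcdff", "Name is not right",
             "Bangalore is getting hoter", "BBa wasa school topper"),
  serial_numer = 1:4,
  stringsAsFactors = FALSE
)

# nchar() on the column returns one length per row (the whole string),
# so a "> 3" filter keeps every row and leaves the text untouched.
nchar(df$client)

# Hypothetical helper: keep only the words with at least `min_chars` characters.
drop_short_words <- function(x, min_chars = 4) {
  # Split every string on runs of whitespace; the result is a list of word vectors.
  words <- strsplit(as.character(x), "\\s+")
  # Filter each word vector and glue the survivors back together.
  vapply(words,
         function(w) paste(w[nchar(w) >= min_chars], collapse = " "),
         character(1))
}

df$client <- drop_short_words(df$client)
df$client
```

vapply() is used here instead of sapply() only so that the result is guaranteed to be a character vector; the filtering logic is the same as above.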
Ronak Shah