7

I'm trying to replace exact strings in a column using stringr functions.

This is the dataset I'm trying it on:

data <- data.frame(
  column = c("Value", "Values", "Value", "Values")
)

data

  column
1 Value
2 Values
3 Value
4 Values

I want to replace "Value" with "Values". I tried str_replace(data$column, "Value", "Values"), but this creates the following unwanted replacements:

[1] "Values"  "Valuess" "Values"  "Valuess"

I'd like the output to be:

[1] "Values" "Values" "Values" "Values"
Cettt
MetaPhilosopher
  • Try `str_replace(data$column, "Value\\b", "Values")` and have a look [here](https://stackoverflow.com/questions/7227976/using-grep-in-r-to-find-strings-as-whole-words-but-not-strings-as-part-of-words). – Roman Aug 02 '18 at 07:33
  • As suggested by @Ravinder, please use `sub()` for this task. Stay `base::` where you can. – Andre Elrico Aug 02 '18 at 08:17
  • Can you explain a little bit why one should choose base where one can? – MetaPhilosopher Aug 02 '18 at 13:02
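
The two suggestions in the comments can be combined in base R: `sub()` with a word-boundary pattern. This is only a sketch on the question's data; `fixed` is a name introduced here for illustration, and the column is assumed to be stored as character.

```r
# Sample data from the question, kept as character rather than factor.
data <- data.frame(
  column = c("Value", "Values", "Value", "Values"),
  stringsAsFactors = FALSE
)

# "\\b" is a word boundary: "Value" matches only when no word character
# follows it, so "Values" is left alone instead of becoming "Valuess".
fixed <- sub("Value\\b", "Values", data$column)
fixed
```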

4 Answers

5

Here are a few possibilities using regular expressions:

library(stringr)

x <- c("value", "values")
str_replace(x, "value$", "values")     # method 1
str_replace(x, "value\\b", "values")   # method 2
str_replace(x, "value(?!s)", "values") # method 3

All of the above return the same result:

[1] "values" "values"

A short explanation: the first method looks for 'value' at the end of a string. The symbol $ matches the end of the string.

The second method looks for 'value' followed by a word boundary.

The third method looks for 'value' that is not followed by the symbol 's' (a negative lookahead), so it also matches when nothing follows at all.

You can find a helpful cheat sheet about stringr and regular expressions here. Hope this helps.
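
The three patterns agree on the input above but not in general. Here is a small sketch, assuming stringr is installed; the extra strings "valued" and "value added" are made up here to show where the methods diverge, and `m1` to `m3` are illustrative names.

```r
library(stringr)

# "valued" and "value added" are illustrative inputs, not from the question.
x <- c("value", "values", "valued", "value added")

m1 <- str_replace(x, "value$", "values")      # only at the end of the string
m2 <- str_replace(x, "value\\b", "values")    # only before a word boundary
m3 <- str_replace(x, "value(?!s)", "values")  # anywhere not followed by "s"

m1  # "values" "values" "valued"  "value added"
m2  # "values" "values" "valued"  "values added"
m3  # "values" "values" "valuesd" "values added"
```

Methods 1 and 2 leave longer words such as "valued" untouched, while method 3 rewrites them, so the right choice depends on what else the column may contain.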

Cettt
  • Thank you! Do you have any suggested comprehensive resources to learn more about regular expressions in stringr context? – MetaPhilosopher Aug 02 '18 at 07:55
  • I personally struggled a lot to understand regular expressions. What helped me in the end was the second page of the cheat sheet I posted in the answer plus a couple of hours experimenting with the examples they provide there. – Cettt Aug 02 '18 at 07:57
  • @MetaPhilosopher regex101.com – Andre Elrico Aug 02 '18 at 08:36
  • Use `^value$` for start and end of string. – qwr Apr 25 '23 at 01:12
2

Just a simple string comparison should do the trick.

data[data$column == "Value", "column"] <- "Values"
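
Because this approach tests for equality rather than matching a pattern, only elements that are exactly "Value" change. Here is a minimal sketch on the question's data, assuming the column is stored as character (hence `stringsAsFactors = FALSE`, which is the default from R 4.0.0 onward); `is_exact` is a name introduced here for illustration.

```r
data <- data.frame(
  column = c("Value", "Values", "Value", "Values"),
  stringsAsFactors = FALSE
)

# Logical index: TRUE only where the element is exactly "Value".
is_exact <- data$column == "Value"

# Overwrite just those elements; everything else is left as-is.
data$column[is_exact] <- "Values"
data$column
```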
MSW Data
1

Could you please try the following:

sub("Value[a-z]*", "Values", data$column)

The output will be as follows.

sub("Value[a-z]*", "Values", data$column)
[1] "Values" "Values" "Values" "Values"

Explanation: the following is for explanation purposes only.

sub(             ## base R sub(pattern, replacement, x): replaces the first match in each element of x
"Value[a-z]*",   ## match the string Value followed by zero or more lowercase letters
"Values",        ## replace the whole match with the string Values
data$column)     ## the column of the data frame data to operate on

Where sample data is from:

data <- data.frame(
  column = c("Value", "Values", "Value", "Values")
)
RavinderSingh13
0

data$column <- ifelse(data$column == "Value", "Values", as.character(data$column))

Mumtaj Ali