I would like to remove the characters "(B)" from the code column so that I can then summarise 'stock_need' by code. My data looks like this:

  code   stock_need 
(B)1234    200          
(B)5678    240      
1234       700          
5678       200          
0123       200          

I want it to look like this:

code   stock_need 
1234       200          
5678       240      
1234       700          
5678       200          
0123       200  

How can I remove these "(B)" prefixes? Thanks in advance.

2 Answers

What other patterns does your data have? If the prefix is always "(B)", you can use

sub("\\(B\\)", "", df$code)
#[1] "1234" "5678" "1234" "5678" "0123"

Or, if the parentheses could contain any single uppercase letter, use

sub("\\([A-Z]\\)", "", df$code)

You could also extract only the digits from code:

sub(".*?(\\d+).*", "\\1", df$code)

You might want to wrap the output of sub in as.numeric or as.integer to get numeric/integer output. Note that this conversion drops leading zeros, so "0123" becomes 123.
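As a minimal sketch of the difference between keeping the cleaned codes as text and converting them, the snippet below uses a stand-alone vector (the name `code` is only an illustration, standing in for df$code) built from the values in the question:

```r
# Stand-in for df$code, using the values from the question
code <- c("(B)1234", "(B)5678", "1234", "5678", "0123")

# Remove the literal "(B)" prefix; the result is still a character vector
cleaned <- sub("\\(B\\)", "", code)
print(cleaned)

# Converting to integer turns the codes into numbers,
# which loses the leading zero of "0123"
as_int <- as.integer(cleaned)
print(as_int)
```

If the leading zero is meaningful (for example, if the code is an identifier rather than a quantity), keeping the column as character is probably the safer choice.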


We can also use readr, whose parse_number extracts the first number from each string and returns a numeric vector:

readr::parse_number(df$code)
Ronak Shah

Basically, you need to do two things:

  • remove the unnecessary part of the string
  • convert the string to numeric.

Say, we load your data frame:

df <- read.table(header=TRUE, text="code   stock_need 
(B)1234    200          
(B)5678    240      
1234       700          
5678       200          
0123       200 ")

First, we replace the column "code" with something without the parentheses:

df$code <- gsub("\\(B\\)", "", df$code)

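Explanation: why the odd-looking \\? Parentheses have a special meaning in regular expressions (they delimit a group), and the first argument to gsub is a regular expression. If we wrote just (B), gsub would match only the letter B and leave the parentheses in place. A backslash escapes each parenthesis so that it is matched literally, and the backslash is doubled because a backslash inside an R string literal must itself be escaped.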
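To make the effect of the escaping concrete, here is a small sketch on a single string. The variable names are only illustrative; the third call uses the fixed = TRUE argument of sub/gsub, which treats the pattern as a plain string instead of a regular expression and so needs no escaping:

```r
x <- "(B)1234"

# Unescaped: "(B)" is a group that matches just the letter B,
# so the parentheses are left behind
unescaped <- sub("(B)", "", x)

# Escaped: the parentheses are matched literally
escaped <- sub("\\(B\\)", "", x)

# fixed = TRUE: the pattern is taken literally, no regex involved
literal <- sub("(B)", "", x, fixed = TRUE)

print(c(unescaped, escaped, literal))
```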

Next, we convert it to a numeric vector:

df$code <- as.numeric(df$code)
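Since the question's goal is to summarise stock_need per code, one possible follow-up is sketched below. It repeats the steps above so that it runs on its own, and then uses base R's aggregate with sum; the choice of sum and the name `totals` are assumptions, as the question does not say which summary is wanted:

```r
# Rebuild the example data from the question
df <- read.table(header = TRUE, text = "code stock_need
(B)1234 200
(B)5678 240
1234 700
5678 200
0123 200")

# Clean the code column and convert it to numeric, as above
df$code <- gsub("\\(B\\)", "", df$code)
df$code <- as.numeric(df$code)

# Total stock_need for each distinct code
totals <- aggregate(stock_need ~ code, data = df, FUN = sum)
print(totals)
```

With the codes cleaned, the two 1234 rows and the two 5678 rows each collapse into a single group.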
January