Questions tagged [gsub]

Ruby, Lua, R and Awk methods for performing a global pattern substitution.

Ruby

The String#gsub method replaces all occurrences of a pattern (a regex or a plain string) with another string or the result of a block. gsub returns a copy of the original string with the replacements applied; gsub! performs the replacement in place.
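A minimal sketch of the three behaviors described above: string replacement, block replacement, and the in-place variant. The variable names and sample strings are illustrative only.

```ruby
# gsub returns a new string and leaves the receiver untouched.
original = "first sentence. second sentence."
replaced = original.gsub("sentence", "line")

# With a regex and a block, each match is passed to the block and
# the block's return value is used as the replacement.
capitalized = original.gsub(/\b[a-z]/) { |match| match.upcase }

# gsub! modifies the receiver in place and returns nil when nothing matched.
buffer = "a-b-c".dup          # dup so the example also works with frozen literals
buffer.gsub!("-", "+")
no_match = buffer.gsub!("x", "y")

puts replaced
puts capitalized
puts buffer
puts no_match.inspect
```

Because gsub! returns nil when no substitution happens, chaining further calls onto its result is risky; gsub is the safer choice when the return value is used.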

Lua

In its simplest form, string.gsub replaces all instances of the given pattern with the replacement string passed as an argument. If the replacement is a function rather than a string, the function is called once per match and receives the captured substrings as arguments (or the whole match if the pattern has no captures). If the function returns a string or a number, that value is substituted into the result; if it returns false or nil, the match is left unchanged.

R

The function gsub replaces all occurrences of a pattern (a regex, or a plain string when fixed = TRUE) with another string. The function returns a new object; the original object is not modified. The function is vectorized and hence can be applied to vectors of strings too.

Awk

The function gsub(p, s [, t]) replaces all occurrences of a pattern p (a regex or a plain string) with another string s, either in the named target variable t or in $0 if no target is given. It modifies the target in place and returns the number of substitutions made.

2445 questions
29
votes
5 answers

Remove everything after space in string

I would like to remove everything after a space in a string. For example: "my string is sad" should return "my" I've been trying to figure out how to do this using sub/gsub but have been unsuccessful so far.
user1214864
  • 293
  • 1
  • 3
  • 4
29
votes
3 answers

How to backreference in Ruby regular expression (regex) with gsub when I use grouping?

I would like to patch some text data extracted from web pages. sample: t="First sentence. Second sentence.Third sentence." There is no space after the point at the end of the second sentence. This signals to me that the 3rd sentence was in a separate…
Konstantin
  • 2,983
  • 3
  • 33
  • 55
26
votes
1 answer

gsub speed vs pattern length

I've been using gsub extensively lately, and I noticed that short patterns run faster than long ones, which is not surprising. Here's a fully reproducible code: library(microbenchmark) set.seed(12345) n = 0 rpt = seq(20, 1461, 20) msecFF =…
Alexey Ferapontov
  • 5,029
  • 4
  • 22
  • 39
25
votes
1 answer

R - gsub replacing backslashes

I would like to use gsub to replace every occurrence of a backslash in a string with 2 backslashes. Currently, what I have tried is gsub("\\\\", "\\", x). This doesn't seem to work though. However, if I change the expression to instead replace…
Jon Claus
  • 2,862
  • 4
  • 22
  • 33
22
votes
3 answers

Ruby regex- does gsub store what it matches?

If I use .gsub(/matchthisregex/,"replace_with_this") does gsub store what it matches with the regex somewhere? I'd like to use what it matches in my replacement string. For example something like "replace_with_" + matchedregexstring + "this" in…
Tommy
  • 965
  • 6
  • 13
  • 21
21
votes
2 answers

How to use ruby gsub Regexp with many matches?

I have csv file contents having double quotes inside quoted text test,first,line,"you are a "kind" man",thanks again,second,li,"my "boss" is you",good I need to replace every double quote not preceded or succeeded by a comma by…
Mahmoud Khaled
  • 6,226
  • 6
  • 37
  • 42
21
votes
5 answers

How to determine if there is a match and return true or false in rails?

I want to create a test that returns either true or false for email handling. For now, if the email address starts with r+ then it's true otherwise it's false. This will help our server ignore a lot of the SPAM we are getting hit…
AnApprentice
  • 108,152
  • 195
  • 629
  • 1,012
21
votes
7 answers

replacing the `'` char using awk

I have lines with a single : and a' in them that I want to get rid of. I want to use awk for this. I've tried using: awk '{gsub ( "[:\\']","" ) ; print $0 }' and awk '{gsub ( "[:\']","" ) ; print $0 }' and awk '{gsub ( "[:']","" ) ; print $0…
SIMEL
  • 8,745
  • 28
  • 84
  • 130
21
votes
2 answers

Regex return file name, remove path and file extension

I have a data.frame that contains a text column of file names. I would like to return the file name without the path or the file extension. Typically, my file names have been numbered, but they don't have to be. For…
Docuemada
  • 1,703
  • 2
  • 25
  • 44
20
votes
3 answers

Replace / translate characters in a string

I have a data frame with a character column: df <- data.frame(var1 = c("aabbcdefg", "aabbcdefg")) df # var1 # 1 aabbcdefg # 2 aabbcdefg I want to replace several different individual characters, e.g. from "a" to "h", from "b" to "i" and so…
jrara
  • 16,239
  • 33
  • 89
  • 120
20
votes
1 answer

Using dplyr + gsub on many columns

I'm using dplyr and gsub to remove special characters. I'm trying to translate a code I had with base R. Here's a fake example to resemble my data: library(dplyr) region = c("regi\xf3n de tarapac\xe1","regi\xf3n de tarapac\xe1") provincia =…
pachadotdev
  • 3,345
  • 6
  • 33
  • 60
19
votes
4 answers

Using shorthand character classes inside character classes in R regex

I have defined vec <- "5f 110y, Fast" and gsub("[\\s0-9a-z]+,", "", vec) gives "5f Fast" I would have expected it to give "Fast" since everything before the comma should get matched by the regex. Can anyone explain to me why this is not the…
ThanksABundle
  • 385
  • 1
  • 8
19
votes
4 answers

How to replace the characters in a string

I have a method that I want to use to replace characters in a string: def complexity_level_two replacements = { 'i' => 'eye', 'e' => 'eei', 'a' => 'aya', 'o' => 'oha'} word = "Cocoa!55" word_arr = word.split('') results = [] …
User9123
  • 515
  • 2
  • 6
  • 14
19
votes
4 answers

in R, use gsub to remove all punctuation except period

I am new to R so I hope you can help me. I want to use gsub to remove all punctuation except for periods and minus signs so I can keep decimal points and negative symbols in my data. Example My data frame z has the following data: [,1] [,2] …
18
votes
5 answers

Subset string by counting specific characters

I have the following strings: strings <- c("ABBSDGNHNGA", "AABSDGDRY", "AGNAFG", "GGGDSRTYHG") I want to cut off the string, as soon as the number of occurrences of A, G and N reach a certain value, say 3. In that case, the result should…
Nivel
  • 629
  • 4
  • 12