14

I have a data frame with various columns. Some of the values in some columns contain double quotes, and I want to remove them. For example:

ID    name   value1     value2
"1     x     a,"b,"c     x"
"2     y     d,"r"       z"

I want this to look like this:

ID    name   value1    value2
1     x      a,b,c      x
2     y      d,r        z
Jaap
Anubhav Dikshit

4 Answers

23

I would use lapply to loop over the columns and then replace the " using gsub.

df1[] <- lapply(df1, gsub, pattern='"', replacement='')
df1
#  ID name value1 value2
#1  1    x  a,b,c      x
#2  2    y    d,r      z

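If needed, the column classes can then be changed with type.convert (as.is = TRUE keeps text columns as character rather than converting them to factor):

df1[] <- lapply(df1, type.convert, as.is = TRUE)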
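
As a small check of the conversion step, the sketch below uses a hypothetical two-column data frame (`df_demo` is not part of the original answer) and compares the column classes before and after `type.convert`. It assumes base R only; `fixed = TRUE` is an optional addition that makes `gsub` treat the pattern as a literal string.

```r
# Hypothetical data frame with stray double quotes (illustration only)
df_demo <- data.frame(ID = c("\"1", "\"2"), name = c("x\"", "y"),
                      stringsAsFactors = FALSE)

# Remove the quotes; fixed = TRUE treats '"' as a literal, not a regex
df_demo[] <- lapply(df_demo, gsub, pattern = '"', replacement = "", fixed = TRUE)
classes_before <- sapply(df_demo, class)   # every column is still character

# Re-infer the column types from the cleaned strings
df_demo[] <- lapply(df_demo, type.convert, as.is = TRUE)
classes_after <- sapply(df_demo, class)    # ID becomes integer, name stays character

print(classes_before)
print(classes_after)
```

Without the conversion step, numeric-looking columns such as ID remain character vectors, which matters if they are later used in arithmetic or joins.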

data

df1 <-  structure(list(ID = c("\"1", "\"2"), name = c("x", "y"),
value1 = c("a,\"b,\"c", 
"d,\"r\""), value2 = c("x\"", "z\"")), .Names = c("ID", "name", 
"value1", "value2"), class = "data.frame", row.names = c(NA, -2L))
akrun
  • 5
    +1 for assignment using square brackets (`df[] <-`) to coerce `lapply`'s returned list to a data frame. I never knew that trick. – drammock Oct 01 '15 at 21:12
  • 3
    @drammock Thanks. The `[]` can be used whenever we want to keep the structure of the old dataset for the replaced values. – akrun Oct 02 '15 at 02:10
  • This didn't work for me: `data[] <- lapply(data, gsub, pattern = '$', replacement = '')`... when I type `str(data)` all of the "$" are still present... – zsad512 Jan 20 '18 at 14:27
  • 1
    @zsad512 I don't know about your data. The example data in my post works for me – akrun Jan 20 '18 at 15:41
  • 2
    @zsad512 Downvoting a working answer because your data is different is not good. First you need to understand the difference between a metacharacter i.e. `$` and a nonmetacharacter. Just looking at an answer and then complaining that it is not working and then downvoting is absolutely unfair. – akrun Jan 20 '18 at 15:55
  • Is there an option to do this with dplyr in R? – LDT Feb 01 '21 at 18:03
  • 1
    @LDT you can do `df1 <- df1 %>% mutate(across(everything(), str_remove_all(., '"')))` – akrun Feb 02 '21 at 04:47
  • I had problems running the suggestion of @akrun, error: `"argument is not an atomic vector"` . This worked for me: `df1 <- df1 %>% mutate(across(everything(), str_remove_all, pattern = '"'))` – Smerla Jan 17 '23 at 12:15
  • @Smerla Sorry, I forgot to add the `~` for the lambda, i.e. `df1 %>% mutate(across(everything(), ~ str_remove_all(.x, '"')))` – akrun Jan 17 '23 at 16:50
2

One option would be to use apply() along with the gsub() function to remove all double quotation marks:

df <- data.frame(ID=c("\"1", "\"2"),
                 name=c("x", "y"),
                 value1=c("a,\"b,\"c", "d,\"r\""),
                 value2=c("x\"", "z\""))

df <- data.frame(apply(df, 2, function(x) {
                                  gsub("\"", "", x)   # strip every double quote in the column
                              }))

> df
  ID name value1 value2
1  1    x  a,b,c      x
2  2    y    d,r      z
Tim Biegeleisen
1

To remove $ you have to escape it as \\$, because $ is a regex metacharacter (the end-of-string anchor). Try:

df[] <- lapply(df, gsub, pattern="\\$", replacement="")
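
To illustrate why an unescaped $ leaves the data untouched, here is a minimal sketch on a hypothetical character vector (`prices` is an assumed name, not from the thread). It compares the unescaped pattern, the escaped pattern, and `fixed = TRUE`, which is an alternative that turns off regex interpretation altogether.

```r
# Hypothetical values containing a literal dollar sign (illustration only)
prices <- c("$10", "$2.50")

# Unescaped: `$` is the end-of-string anchor, so only an empty match is replaced
unescaped <- gsub("$", "", prices)

# Escaped: `\\$` in an R string is the regex `\$`, a literal dollar sign
escaped <- gsub("\\$", "", prices)

# Alternative: fixed = TRUE matches the pattern as a plain string
literal <- gsub("$", "", prices, fixed = TRUE)

print(unescaped)
print(escaped)
print(literal)
```

The same `fixed = TRUE` argument can be passed through `lapply`, for example `lapply(df, gsub, pattern = "$", replacement = "", fixed = TRUE)`.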
Martin Gal
JohnBar
0

A dplyr solution (based on the suggestion of @akrun in one of the comments).

df1 <-  structure(list(ID = c("\"1", "\"2"), name = c("x", "y"),
                       value1 = c("a,\"b,\"c", "d,\"r\""),
                       value2 = c("x\"", "z\"")),
                      .Names = c("ID", "name", "value1", "value2"), class = "data.frame", row.names = c(NA, -2L))

library(dplyr)  # provides %>%, mutate(), across() and everything()

df1 <- df1 %>% mutate(across(everything(), stringr::str_remove_all, pattern = '"'))
Smerla