0

We have a bunch of file names of the form:

filename1(412)(4141567).csv, filename2(4214985).csv, filename3(34543).csv, filename4(3456984).csv, filename5(34582).csv, filename6(jrh)(234145).csv

We want to truncate the file names so that we are left with only

filename1(412), filename2, filename3, ..., filename6(jrh)

i.e. cut off everything from the last "(" bracket to the end of the name.

We cannot use substring because the file names are not all the same length. Also, the following code I found:

sub("(.*?)[(].*", "\\1", files)

does not work either as some file names have two sets of brackets.

sgibb
user1836894

2 Answers

1

The last regex you posted is pretty close. Make the first group greedy so it runs up to the last "(" instead of stopping at the first one:

sub('(.*)\\(.*', '\\1', files)
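To see why the greedy version helps, here is a small comparison of the two patterns. It is only a sketch: the vector `files` is assumed to hold a few of the sample names from the question, and the variable names `lazy` and `greedy` are made up for illustration.

```r
# Assumed sample input, taken from the names listed in the question.
files <- c("filename1(412)(4141567).csv",
           "filename2(4214985).csv",
           "filename6(jrh)(234145).csv")

# Lazy group: (.*?) stops at the FIRST "(", so an inner "(412)" is dropped too.
lazy <- sub("(.*?)[(].*", "\\1", files)

# Greedy group: (.*) extends to the LAST "(", so earlier bracket groups survive.
greedy <- sub("(.*)\\(.*", "\\1", files)

print(lazy)
print(greedy)
```

With these inputs the greedy pattern should keep "filename1(412)" and "filename6(jrh)" intact. Note that a name containing no "(" at all is returned unchanged, extension included, because the pattern does not match.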
eddi
0

You could use tools::file_path_sans_ext, which only removes the file extension. Internally it runs:

sub("([^.]+)\\.[[:alnum:]]+$", "\\1", x)
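As a rough illustration, calling the function on two of the question's names shows that the trailing bracket group is still there afterwards. The second step, which strips one final "(...)" group with a separate `sub`, is an assumption on my part about how you might combine the two, not something the function does for you; `x`, `no_ext` and `trimmed` are hypothetical names.

```r
library(tools)

# Assumed sample input from the question.
x <- c("filename1(412)(4141567).csv", "filename2(4214985).csv")

# Step 1: drop only the extension.
no_ext <- file_path_sans_ext(x)

# Step 2 (illustrative): remove a bracket group anchored at the end of the name.
trimmed <- sub("\\([^()]*\\)$", "", no_ext)

print(no_ext)
print(trimmed)
```

Splitting the work this way keeps each pattern short, at the cost of two passes over the vector.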

EDIT:

I missed that you also want to remove the last (...) group:

f <- c("filename1(412)(4141567).csv", "filename2(4214985).csv", "filename3(34543).csv", "filename4(3456984).csv", "filename5(34582).csv", "filename6(jrh)(234145).csv")

sub("([^.]+)\\([^)]*\\)\\.[[:alnum:]]+$", "\\1", f)
#[1] "filename1(412)" "filename2"      "filename3"      "filename4"      "filename5"      "filename6(jrh)"
sgibb