I'm trying to extract a filename and save the dataframe under that same name. The problem is that if the file sits inside a folder whose name contains a similar word, stringr returns that part of the path as well.

filename <- "~folder/testdata/2016/testdata 2016.csv"

If I run this:

library(stringr)
str <- str_trim(stringr::str_extract(filename,"[t](.*)"), "left")

it returns `testdata/2016/testdata 2016.csv` when all I want is `testdata 2016`. Ideally, I would get `testdata2016`.

I've been trying several combinations, but there has to be a simpler way of doing this. If there were a way of reading the path from right to left, starting at `.csv` and stopping at `/`, I wouldn't have this issue.

FilipeTeixeira
  • Look at `?basename` – Benjamin Apr 30 '17 at 10:48
  • You can do this in base R with `gsub(".csv", "", grep(".csv", unlist(strsplit(filename, "/")), value = TRUE))`. Or skip the `strsplit` and `grep` part with `basename`, as @Benjamin suggested. – ulfelder Apr 30 '17 at 10:51
  • This link has an example for removing all of the whitespace under 'Eliminating Whitespace' http://stackoverflow.com/documentation/r/5748/regular-expressions-regex#t=201704301051180929841 – Benjamin Apr 30 '17 at 10:53
  • Works perfectly. Thanks. – FilipeTeixeira Apr 30 '17 at 10:56

2 Answers

You can use either of the approaches below:

library(stringr)
str_replace(str_extract(filename,"\\w*\\s+\\w*(?=\\.)"),"\\s+","")

str_replace_all(basename(filename),"\\s+|\\.csv","")

The second one uses the `basename` approach suggested by Benjamin.

?basename:

basename removes all of the path up to and including the last path separator (if any).

Output:

[1] "testdata2016"
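
To see what each step of the `basename` approach contributes, here is a minimal base R sketch, assuming the `filename` value from the question. The variable names `base` and `clean` are only illustrative, and anchoring the extension with `$` is my own small variation on the pattern above.

```r
filename <- "~folder/testdata/2016/testdata 2016.csv"

# basename() drops everything up to and including the last "/"
base <- basename(filename)
base
# [1] "testdata 2016.csv"

# Remove whitespace and a trailing ".csv" (the "$" anchors it to the end)
clean <- gsub("\\s+|\\.csv$", "", base)
clean
# [1] "testdata2016"
```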
PKumar

Base R already has plenty of help for this (the tools package ships with the default R install):

gsub(" ", "",
  tools::file_path_sans_ext(
    basename("~folder/testdata/2016/testdata 2016.csv")))
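
If several files need the same treatment, the call above could be wrapped in a small helper. This is only a sketch: `clean_name` is a hypothetical name, and the second path is made up for illustration. All three functions are vectorised, so a character vector of paths works as well as a single one.

```r
# Hypothetical helper: file name without directory, extension, or whitespace
clean_name <- function(path) {
  gsub("\\s+", "", tools::file_path_sans_ext(basename(path)))
}

# Example paths (the second one is invented for illustration)
paths <- c("~folder/testdata/2016/testdata 2016.csv",
           "~folder/testdata/2017/testdata 2017.csv")

clean_name(paths)
# [1] "testdata2016" "testdata2017"
```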
hrbrmstr