43

I have a file name like name1.csv and would like to extract two substrings from it: one variable holding the name, name1, and another holding the extension, csv, without the dot.

I have been looking for a function like Java's indexOf that allows this kind of manipulation, but I have not found anything.

Any help?

smci
Layla
  • 12
    Try `tools::file_ext("name1.csv")`. See http://stackoverflow.com/questions/7779037/extract-file-extension-from-file-path – GSee Jan 05 '13 at 16:30

3 Answers

75

Use strsplit:

R> strsplit("name1.csv", "\\.")[[1]]
[1] "name1" "csv"  
R> 

Note that you need to a) escape the dot (it is a regular-expression metacharacter) and b) deal with the fact that strsplit() returns a list, of which typically only the first element is of interest.

A more general solution uses regular expressions, from which you can extract the matches.
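As a rough sketch of that regular-expression approach (the pattern and the variable names fname, stem and ext are illustrative assumptions, not part of the original answer), regexec() combined with regmatches() can return both pieces as capture groups:

```r
fname <- "name1.csv"

# Group 1: everything before the last dot; group 2: everything after it.
m <- regmatches(fname, regexec("^(.*)\\.([^.]*)$", fname))[[1]]

stem <- m[2]  # first capture group
ext  <- m[3]  # second capture group
```

Because the first group is greedy, a name with several dots is split at the last one.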

For the special case of filenames you also have:

R> library(tools)   # unless already loaded, comes with base R
R> file_ext("name1.csv")
[1] "csv"
R> 

and

R> file_path_sans_ext("name1.csv")
[1] "name1"
R> 

as these are such common tasks (cf. basename in the shell, etc.).
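If the file name comes with a directory prefix, the same helpers can be combined with basename(). This is a minimal sketch; the path below is hypothetical and only serves as an illustration:

```r
# Hypothetical path, used only for illustration
path <- "data/raw/name1.csv"

fname <- basename(path)                    # drop the directory part
stem  <- tools::file_path_sans_ext(fname)  # name without extension
ext   <- tools::file_ext(fname)            # extension without the dot
```

Calling the functions as tools::... avoids attaching the package with library().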

Dirk Eddelbuettel
8

Use strsplit():

http://stat.ethz.ch/R-manual/R-devel/library/base/html/strsplit.html

Example:

> strsplit('name1.csv', '[.]')[[1]]
[1] "name1" "csv"  

Note that the second argument is a regular expression, which is why you can't just pass a single dot (it would be interpreted as "any character").
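An alternative that avoids regular expressions altogether is the fixed = TRUE argument of strsplit(), which treats the separator as a literal string. A small sketch (the variable names are only for illustration):

```r
# Literal split: the dot is not interpreted as a regular expression.
parts <- strsplit("name1.csv", ".", fixed = TRUE)[[1]]
stem <- parts[1]
ext  <- parts[2]

# For comparison: an unescaped dot matches every character,
# so only empty strings are left over.
bad <- strsplit("name1.csv", ".")[[1]]
```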

Adam Stelmaszczyk
2

Using a regular expression, you can do this, for example:

regmatches(x='name1.csv',gregexpr('[.]','name1.csv'),invert=TRUE)
[[1]]
[1] "name1" "csv"  
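
The gregexpr() call above works by locating match positions, which is probably the closest analogue to Java's indexOf mentioned in the question. A position-based sketch using regexpr() and substr() might look like this (fname, dot, stem and ext are illustrative names, and the example assumes the name contains exactly one dot):

```r
fname <- "name1.csv"

# Position of the first literal dot; regexpr() returns -1 if there is none.
dot <- as.integer(regexpr(".", fname, fixed = TRUE))

stem <- substr(fname, 1, dot - 1)             # characters before the dot
ext  <- substr(fname, dot + 1, nchar(fname))  # characters after the dot
```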
agstudy