
I am trying to find a function that will extract characters at a certain position within a string. For example, I have a long file name with a date in it, and I want to end up with only the date:

'LT50420331984221PAC00_B7.tif'

and I want only the '1984221' portion. I've come up with a complicated function, but was wondering if there is a more elegant solution.

Ronak Shah
user2632308
  • I don't see how this is going to be universally possible to answer with the info provided. Does the date portion: always start after `n` characters? Always start with `19XX` or `20XX`? Always run for `n` characters? Can you provide any more information that would make this easier to answer? – thelatemail Jul 30 '13 at 01:46
  • Can you add your solution to the OP please? – agstudy Jul 30 '13 at 03:11
  • Everything is explained in the R Programming wikibook: http://en.wikibooks.org/wiki/R_Programming/Text_Processing – PAC Jul 30 '13 at 09:35

2 Answers


If you know the exact position of the date in your string you can use

substr('LT50420331984221PAC00_B7.tif', 10, 16)
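
Since `substr` is vectorised, the same call should work on a whole vector of file names, provided they all share this layout. A minimal sketch, assuming a second, hypothetical file name that follows the same pattern:

```r
# The second file name is made up for illustration; it only mimics the layout
files <- c("LT50420331984221PAC00_B7.tif",
           "LT50420331985207PAC00_B7.tif")

# Characters 10 through 16 hold the 7-character date in this layout
dates <- substr(files, 10, 16)
print(dates)
```

This only holds as long as the date really starts at character 10 in every name.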
alko989

For example:

gsub('(.*)([0-9]{7})[A-Z].*','\\2','LT50420331984221PAC00_B7.tif')
"1984221"

Here I assume that the date is a run of 7 digits immediately followed by a capital letter.
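
Under the same assumption (7 digits followed by a capital letter), one alternative is to extract the match directly with `regexpr` and `regmatches` rather than replacing everything around it. This is only a sketch; the variable names `x`, `m` and `date_str` are illustrative:

```r
x <- "LT50420331984221PAC00_B7.tif"

# Find 7 digits followed by a capital letter; the lookahead (?=[A-Z])
# needs perl = TRUE and keeps the letter out of the match
m <- regexpr("[0-9]{7}(?=[A-Z])", x, perl = TRUE)

# Pull out the matched text
date_str <- regmatches(x, m)
print(date_str)
```

One difference worth noting: when nothing matches, `regmatches` returns an empty character vector, whereas `gsub` returns the input string unchanged.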

agstudy
  • If you are assuming the length of the string, then what is the advantage of using `sub` over `substr`? – Ricardo Saporta Jul 30 '13 at 03:05
  • @RicardoSaporta I am not assuming the length of the string. The length is the length of a date in a certain format. I assume the position of this date. – agstudy Jul 30 '13 at 03:07
  • Yes, we are saying the same things, just using different terms. I was just wondering what you get from using `sub` in this specific context that you cannot get from using `substr`? – Ricardo Saporta Jul 30 '13 at 03:13
  • @RicardoSaporta We are not saying the same thing. But mine is a more stable solution: if the position changes (an incremented index, for example), I will not have to change my code. – agstudy Jul 30 '13 at 03:14
  • You can look, for example, for a string of 4 digits that represents a year between "1970" and "2013" and select from there up to just before a capital letter (like @agstudy). That way it does not matter whether the date consists of 7 or 8 digits. – hvollmeier Jul 30 '13 at 06:40
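
The year-based idea from the last comment could be sketched roughly as follows. The regular expression is an assumption about how to encode "a year between 1970 and 2013", not something given in the thread, and it can pick up a false match if an earlier run of digits happens to look like a year:

```r
x <- "LT50420331984221PAC00_B7.tif"

# A year from 1970 to 2013, then any further digits, stopping just
# before a capital letter (lookahead, so perl = TRUE is required)
year_pat <- "(19[7-9][0-9]|20(0[0-9]|1[0-3]))[0-9]*(?=[A-Z])"

m <- regexpr(year_pat, x, perl = TRUE)
date_str <- regmatches(x, m)
print(date_str)
```

Because the digits after the year are matched with `[0-9]*`, the same pattern would accept a 7-digit or an 8-digit date.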