How to extract substring using regex given only the index?

Question

Is there any way to extract part of a string/sentence, given only the start and end index of the substring? E.g., given "this is an example00001. and so on.", I need to get the substring from position 10 to 15 (i.e., examp) using regex.

Why would you want to use a regex for that? Are you on some platform that doesn't have a substring function in its standard library? — Jim Lewis, Jul 08 '15 at 03:21
@Jim Lewis, you are right... the tool I use takes only regex :( — user3366706, Jul 08 '15 at 06:54

score 2 · Accepted Answer · answered Jul 08 '15 at 03:28

Use a look behind anchored to start.

Using your example of position 10 to 15:

(?<=^.{10}).{5}

If look behind is not supported, use group 1 of:

^.{10}(.{5})
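
To see what the two patterns do: `^.{10}` consumes the first ten characters from the start of the line, and `.{5}` then takes the next five. The look behind only asserts that those ten characters precede the match without including them in it, while the group variant matches all fifteen and exposes the last five as group 1. A minimal sketch in Python (an assumption, since the question does not name the tool; `slice_pattern` is a hypothetical helper, not part of any library):

```python
import re

text = "this is an example00001. and so on."

def slice_pattern(skip: int, length: int) -> str:
    """Build a look-behind pattern: skip `skip` chars, then match `length` chars."""
    return rf"(?<=^.{{{skip}}}).{{{length}}}"

# Look-behind variant: the whole match is the substring.
lookbehind_match = re.search(slice_pattern(10, 5), text).group()

# Group variant: the substring is captured as group 1.
group_match = re.search(r"^.{10}(.{5})", text).group(1)

# Both agree with plain slicing, text[10:15].
print(repr(lookbehind_match), repr(group_match), repr(text[10:15]))
```

Counting from zero, offset 10 in this sentence is the space before "example", so skipping ten characters yields " exam"; `slice_pattern(11, 5)` yields "examp".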

Bohemian


It's working, but could you please explain what's happening, or suggest some good links to understand this? Thanks. – user3366706 Jul 08 '15 at 17:02
Thanks Bohemian. Can you tell me how to extract from the back? For example, I have "GetIndicatorsByAnalysisProcessIDServlet service" and want to extract only "GetIndicatorsByAnalysisProcess". – Nagappa L M Feb 23 '17 at 12:25
@feelgoodandprogramming that should be asked as a new question, and it isn't obvious where to stop - stop at `ID`, stop 17 from end, something else? (but try `^.*(?=ID)`) – Bohemian Feb 23 '17 at 15:09
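
As a rough illustration of the two stopping rules mentioned in that comment, here is a sketch in Python (an assumption; the commenter's tool is not named). The first pattern is the suggested `^.*(?=ID)`; the second, `^.*(?=.{17}$)`, is one possible way to write "stop 17 from end" and is not taken from the thread.

```python
import re

name = "GetIndicatorsByAnalysisProcessIDServlet service"

# Stop at "ID": greedy .* backs off until "ID" follows the match.
before_id = re.search(r"^.*(?=ID)", name).group()

# Stop 17 characters from the end: the look-ahead requires exactly
# 17 characters between the end of the match and the end of the line.
drop_tail = re.search(r"^.*(?=.{17}$)", name).group()

print(before_id)
print(drop_tail)
```

Because `.*` is greedy, the first pattern stops at the last "ID" in the line; for this input there is only one, so both patterns give the same result.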

score 1 · Answer 2 · answered Jul 08 '15 at 06:51

I think you need to start from position 11 to get the match that you want. Here is an example:

$ cat input.txt
This is an example00001. and so on.
$ sed -r 's|(.{10})(.{5})(.*)|\2|' input.txt
 exam
$ sed -r 's|(.{11})(.{5})(.*)|\2|' input.txt
examp

What this does is:

    -r      extended regular expressions (GNU sed; -E is the more portable spelling)
    s       for substitution
    |       for separator
    (.{11}) for the first group of any 11 characters (you might want 10)
    (.{5})  for the second group of any 5 characters
    (.*)    for the rest of the line, so that the substitution discards it
    \2      for replacing with the second group

You might want to use the ^ and $ characters in your regex for start and end of line.
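
The same substitution with explicit anchors can be sketched in Python, used here only as a stand-in for sed (an assumption; `cut` is a hypothetical helper name). It also shows what the trailing `(.*)` contributes: without it, the remainder of the line is left in place.

```python
import re

line = "This is an example00001. and so on."

def cut(line: str, skip: int, length: int) -> str:
    """Replace the whole line with `length` chars found after skipping `skip` chars."""
    pattern = rf"^(.{{{skip}}})(.{{{length}}})(.*)$"
    return re.sub(pattern, r"\2", line)

with_10 = cut(line, 10, 5)
with_11 = cut(line, 11, 5)

# Without the trailing (.*), only the first 16 characters are replaced,
# so the rest of the line survives after the extracted substring.
without_tail = re.sub(r"^(.{11})(.{5})", r"\2", line)

print(repr(with_10), repr(with_11), repr(without_tail))
```

With `^` in place the skip count is always measured from the start of the line, which is what an index-based extraction needs.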

Numbered backreferences/calls are not allowed in the tool I use :( – user3366706 Jul 08 '15 at 17:04
