How to match regexp starting from specific character index in Ruby 1.8?

Question

In Ruby 1.9 I would use String#match(regexp,start_index). I'm sure there must be a (computationally efficient) equivalent in Ruby 1.8, but I can't find it. Do you know what it is?
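
For reference, the Ruby 1.9 behavior being asked about looks roughly like this. The sample string and pattern are made up for illustration.

```ruby
text = "foo bar foo baz"   # made-up sample data

# The optional second argument is the index at which the search begins.
m = text.match(/foo \w+/, 1)

m[0]        # "foo baz" (the first "foo" starts before index 1, so it is skipped)
m.begin(0)  # 8, an offset into the original string, not into a slice
```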

mikej · Answer 1 · 2012-08-29T21:28:08.977

3

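You could start the regexp with `^.{start_index}` (or, more robustly, `\A` plus the `m` option, since in Ruby `^` matches at the start of every line and `.` does not match newlines by default)

or take the substring first before performing the match.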

Alternatively, if you're constrained to Ruby 1.8 but can install your own libraries, you could use Oniguruma (the regex engine that Ruby 1.9 uses).
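
A minimal sketch of the first two suggestions, written to avoid 1.9-only syntax. The helper names `match_with_prefix` and `match_in_substring` are hypothetical, not part of Ruby, and the prefix version assumes the goal is "first match at or after the index" rather than "match exactly at the index".

```ruby
# Hypothetical helper names for illustration; not part of Ruby's core library.

# Suggestion 1: consume the first start_index characters inside the pattern.
# \A anchors at the start of the string and (?m: ) lets . cross newlines;
# the lazy .*? then lets the real pattern match anywhere after the offset.
# Returns [matched_text, offset_in_str] or nil.
def match_with_prefix(str, pattern, start_index)
  m = /\A(?m:.{#{start_index}}.*?)(#{pattern})/.match(str)
  m && [m[1], m.begin(1)]
end

# Suggestion 2: slice off the tail and match against that copy.
# Offsets in the MatchData are relative to the slice, so add start_index back.
# Returns [matched_text, offset_in_str] or nil.
def match_in_substring(str, pattern, start_index)
  tail = str[start_index..-1]
  m = tail && tail.match(pattern)
  m && [m[0], m.begin(0) + start_index]
end

text = "foo bar\nfoo baz"   # made-up sample data
match_with_prefix(text, /foo \w+/, 1)    # ["foo baz", 8]
match_in_substring(text, /foo \w+/, 1)   # ["foo baz", 8]
```

Note that the prefix version wraps the original pattern in an extra group, so any capture groups inside it are shifted by one. The substring version copies the tail of the string on every call.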

edited Aug 29 '12 at 21:28

answered Aug 29 '12 at 21:18


Sure... but notice I asked for a "computationally efficient" alternative. – Alex D Aug 29 '12 at 21:19

@AlexD: What makes you think `^.{n}` isn't computationally efficient? The regex engine probably just offsets into the string and starts working from there. A quick bit of benchmarking suggests that `^.{n}` is slightly faster than the obvious alternative (`s[i..-1].match(re)`). – mu is too short Aug 29 '12 at 22:04
I tried it in `irb` before answering, and it's inefficient. O(n) in the length of the string, and I will be using this to parse large files which may be tens or hundreds of megabytes. Based on the tests I have done, it could take as much as 30 seconds for a single match in such a case. – Alex D Aug 30 '12 at 05:21
@mikej, installing Oniguruma is an interesting idea (I didn't know it was possible), but unfortunately this is for a (somewhat popular) gem, and it's simply not acceptable to tell all Ruby 1.8 users that "you have to install Oniguruma to use this gem". – Alex D Aug 30 '12 at 05:34
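
The timing claims in these comments can be checked with a rough sketch like the one below, using the standard `benchmark` library. The method name `compare_timings` and the test data are assumptions for illustration, and results will vary with Ruby version, string size, and offset.

```ruby
require 'benchmark'

# Hypothetical helper: times one match with each approach and returns
# the elapsed wall-clock seconds for both.
def compare_timings(text, pattern, start_index)
  prefix_time = Benchmark.realtime do
    /\A(?m:.{#{start_index}}.*?)(#{pattern})/.match(text)
  end

  slice_time = Benchmark.realtime do
    text[start_index..-1].match(pattern)
  end

  { :prefix => prefix_time, :substring => slice_time }
end

# Made-up test data: filler text with a single match at the very end.
# The offset is kept modest because very large {n} repeat counts may be
# rejected by the regex engine.
sample  = ("x" * 200_000) + "needle"
timings = compare_timings(sample, /needle/, 50_000)

puts "prefix pattern: #{timings[:prefix]}s"
puts "substring:      #{timings[:substring]}s"
```

Repeating the measurement at several offsets and string sizes would show whether either approach grows linearly with the offset on a given interpreter.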

score 0 · Accepted Answer · answered Sep 07 '12 at 07:14

As far as I can tell, there is no efficient way to match a Regexp against a large string, starting from an arbitrary index, in pure Ruby 1.8.

This seems like a major flaw. I guess the moral of the story is: use Ruby 1.9!

Alex D

