Extract a substring from a string in Ruby using a regular expression

Question

How can I extract a substring from within a string in Ruby?

Example:

String1 = "<name> <substring>"

I want to extract substring from String1 (i.e. everything within the last occurrence of < and >).

Nakilon · Answer 1 · 2020-04-28T02:48:39.127

358

"<name> <substring>"[/.*<([^>]*)/,1]
=> "substring"

There is no need to use scan if we need only one result.
There is no need for a Python-style match either, when Ruby has String#[regexp, capture].

See: http://ruby-doc.org/core/String.html#method-i-5B-5D

Note: str[regexp, capture] → new_str or nil
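
As a rough illustration of why this picks the last bracketed word, and of the nil return value mentioned in the note, here is a minimal sketch. The variable names and the bracket-free input are made up for the example.

```ruby
# Hypothetical variable names; the first input is the string from the question.
s = "<name> <substring>"

# The greedy .* runs to the end of the string and then backtracks to the
# last "<", so the capture group picks up the final bracketed word.
last_word = s[/.*<([^>]*)/, 1]

# Without the leading .* the regex matches at the first "<" instead.
first_word = s[/<([^>]*)/, 1]

# When nothing matches, String#[] returns nil rather than raising.
missing = "no brackets here"[/.*<([^>]*)/, 1]

p last_word   # "substring"
p first_word  # "name"
p missing     # nil
```

Because the result may be nil, callers that expect a string should probably check for it before using the value.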

edited Apr 28 '20 at 02:48

answered Nov 06 '10 at 21:00

Nakilon

40

No need to discredit other perfectly valid (and might I opine, more readable) solutions. – coreyward Nov 06 '10 at 21:07
44

@coreyward, if they are better, please argue why. For example, sepp2k's solution is more flexible, and that's why I wrote `if we need only one result` in my solution. And `match()[]` is slower, because it's two methods instead of one. – Nakilon Nov 06 '10 at 21:10
6

This is the fastest of all the methods presented, but even the slowest method takes only 4.5 microseconds on my machine. I do not care to speculate why this method is faster. In performance, speculation is _useless_. Only measurement counts. – Wayne Conrad Nov 07 '10 at 07:32
9

I find this solution more straightforward and to the point (since I am new to Ruby). Thanks. – Ryan H. Jun 30 '11 at 10:46
@Nakilon Readability can outweigh tiny performance differences when considering the overall success of a product and team, so coreyward made a valid comment. That said, I think `string[regex]` can be just as readable in this scenario, so that's what I used personally. – Nick Feb 09 '17 at 21:01
Do you mind adding which Ruby version this method has been valid since? I want to use this solution, but now I need to go look up which version introduced string[regex] (for all I know it's always been there). Edit: I can only find Ruby API docs > 2.0.0, where this method is still valid, so it's probably fine to use it: https://ruby-doc.org/core-2.0.0/String.html#method-i-5B-5D – Asaf Aug 09 '17 at 16:54
@Asaf, if you ask me, then take a look at answer timestamp -- this code was valid in November 2010. – Nakilon Aug 10 '17 at 11:31
@Nakilon It's not something I noticed before, but of course it must be valid for at least that long. Thanks for the answer to this question which helped me solve my problem. – Asaf Aug 10 '17 at 22:29

sepp2k · Accepted Answer · 2023-07-07T09:05:08.440

150

String1.scan(/<([^>]*)>/).last.first

scan creates an array which, for each <item> in String1, contains the text between the < and the > in a one-element array (because when used with a regex containing capturing groups, scan returns an array of the captures for each match). last gives you the last of those arrays, and first then gives you the string in it.
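
The chain described above can be broken into steps roughly like this. This is only a sketch: the variable names are made up, and the safe-navigation line assumes Ruby 2.3 or later.

```ruby
# Hypothetical variable names; the input is the string from the question.
string1 = "<name> <substring>"

# With a capture group, scan returns one array of captures per match.
all_captures = string1.scan(/<([^>]*)>/)   # [["name"], ["substring"]]

last_captures = all_captures.last          # ["substring"]
result = last_captures.first               # "substring"

# If nothing matches, scan returns [] and .last is nil, so .first would raise.
# Safe navigation (Ruby 2.3+) returns nil in that case instead.
no_match = "plain text".scan(/<([^>]*)>/).last&.first   # nil

p all_captures, result, no_match
```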

edited Jul 07 '23 at 09:05

answered Nov 06 '10 at 20:58

sepp2k

score 26 · Answer 3 · edited Jan 10 '14 at 07:57

26

You can use a regular expression for that pretty easily…

Allowing spaces around the word (but not keeping them):

str.match(/< ?([^>]+?) ?>\Z/)[1]

Or without the spaces allowed:

str.match(/<([^>]+)>\Z/)[1]

edited Jan 10 '14 at 07:57

Sergio Tulentsev

answered Nov 06 '10 at 20:59

coreyward

1

I'm not sure that the last `<>` actually needs to be the last thing in the string. If e.g. the string `foo <bar> baz` is allowed (and supposed to give the result `bar`), this will not work. – sepp2k Nov 06 '10 at 21:03
I just went based on the sample string he provided. – coreyward Nov 06 '10 at 21:06
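
To illustrate the concern about end-anchoring raised in these comments, here is a small sketch comparing the `\Z`-anchored pattern with one unanchored alternative. The input with trailing text and the variable names are made up for the example.

```ruby
# Made-up inputs: one ends with the bracketed word, one has trailing text.
at_end   = "<name> <substring>"
trailing = "<name> <substring> tail"

anchored = /<([^>]+)>\Z/

# match returns nil when the pattern does not match, and calling [1] on nil
# raises NoMethodError, so String#[] is used here to get nil instead.
anchored_at_end   = at_end[anchored, 1]     # "substring"
anchored_trailing = trailing[anchored, 1]   # nil

# One unanchored alternative: let a greedy .* skip ahead to the last "<...>".
unanchored_trailing = trailing[/.*<([^>]+)>/, 1]   # "substring"

p anchored_at_end, anchored_trailing, unanchored_trailing
```

Which behaviour is wanted depends on whether the bracketed part is guaranteed to end the string, which the question does not state.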

score 12 · Answer 4 · answered Aug 30 '13 at 19:04

Here's a slightly more flexible approach using the match method. With this, you can extract more than one string:

s = "<ants> <pants>"
matchdata = s.match(/<([^>]*)> <([^>]*)>/)

# Use 'captures' to get an array of the captures
matchdata.captures   # ["ants","pants"]

# Or use raw indices
matchdata[0]   # whole regex match: "<ants> <pants>"
matchdata[1]   # first capture: "ants"
matchdata[2]   # second capture: "pants"

score 7 · Answer 5 · edited Jun 08 '16 at 15:52

7

A simpler scan would be:

String1.scan(/<(\S+)>/).last
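
Two details of this variant may be worth checking before relying on it. The sketch below uses made-up variable names and a made-up second input: it shows that `.last` on its own yields a one-element array, and that `\S+` can run across adjacent brackets.

```ruby
# Hypothetical variable names; the first input is the string from the question.
string1 = "<name> <substring>"

# .last returns the captures of the final match, which is an array.
as_array  = string1.scan(/<(\S+)>/).last         # ["substring"]
as_string = string1.scan(/<(\S+)>/).last.first   # "substring"

# \S+ is greedy and also matches ">" and "<", so bracketed words with no
# whitespace between them are captured as one run.
adjacent = "<a><b>".scan(/<(\S+)>/)              # [["a><b"]]

p as_array, as_string, adjacent
```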

edited Jun 08 '16 at 15:52

Alan Moore

answered Jun 08 '16 at 15:47

Navid
