Regex to extract last number portion of varying URL

Question

I'm creating a URL parser and have three kinds of URLs. For each one, I would like to extract the number at the end of the URL, increment it by 10, and update the URL. I'm trying to use a regex for the extraction, but I'm new to regex and having trouble.

These are the three URL structures whose last number I'd like to increment:

Increment last number 20 by 10:

http://forums.scamadviser.com/site-feedback-issues-feature-requests/20/

Increment last number 50 by 10:

https://forums.questionablecontent.net/index.php/board,1.50.html

Increment last number 30 by 10:

https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.30/

Why create another wheel? Between Ruby's URI and [Addressable::URI](https://github.com/sporkmonger/addressable) there's a lot of well tested code. — the Tin Man, Jan 12 '17 at 18:48
We'd like to see your attempt toward solving this, rather than write code for you that has nothing to do with what you've tried. — the Tin Man, Jan 12 '17 at 18:51
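
Following up on the comment about Ruby's URI, here is a minimal sketch that parses the URL with the standard library and increments the last digit run in the path only. It assumes the number always lives in the path; the method name `bump_path_number` and the `step` parameter are made up for illustration, and the regex is the one from the accepted answer below.

```ruby
require 'uri'

# Hypothetical helper: increment the last run of digits in the URL path,
# leaving scheme, host, port and query untouched.
def bump_path_number(url, step = 10)
  uri = URI.parse(url)
  # \d+(?!.*\d) matches a digit run that has no digit anywhere after it
  uri.path = uri.path.sub(/\d+(?!.*\d)/) { |digits| (digits.to_i + step).to_s }
  uri.to_s
end

puts bump_path_number('https://forums.questionablecontent.net/index.php/board,1.50.html')
# prints https://forums.questionablecontent.net/index.php/board,1.60.html
```

Working on the parsed path means digits in a query string or port are never touched, which a regex over the whole URL cannot guarantee.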

score 2 · Accepted Answer · answered Jan 12 '17 at 14:56

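The regex \d+(?!.*\d) matches the last run of digits in the string: \d+ matches one or more digits, and the negative lookahead (?!.*\d) rejects the match if any digit appears later. Then use gsub with a block to modify the number and put the result back into the string.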

See this Ruby demo:

strs = [
  'http://forums.scamadviser.com/site-feedback-issues-feature-requests/20/',
  'https://forums.questionablecontent.net/index.php/board,1.50.html',
  'https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.30/'
]
# Replace the last run of digits in each URL with that number plus 10
arr = strs.map { |item| item.gsub(/\d+(?!.*\d)/) { $~[0].to_i + 10 } }
puts arr

Note: $~ is a MatchData object, and using the [0] index we can access the whole match value.

Results:

http://forums.scamadviser.com/site-feedback-issues-feature-requests/30/
https://forums.questionablecontent.net/index.php/board,1.60.html
https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.40/
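
For a single URL, the same idea can be wrapped in a small method. This is only a sketch, not part of the original answer: the name `increment_last_number` and the `step` argument are assumptions, and it uses the block parameter instead of `$~`.

```ruby
# Hypothetical helper: add `step` to the last run of digits in `url`.
def increment_last_number(url, step = 10)
  # The negative lookahead (?!.*\d) rejects any digit run that is followed,
  # anywhere later on the same line, by another digit.
  url.sub(/\d+(?!.*\d)/) { |digits| (digits.to_i + step).to_s }
end

puts increment_last_number('https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.30/')
# prints https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.40/
```

Since at most one run of digits can be "last", `sub` and `gsub` behave the same here for single-line input.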

Wiktor Stribiżew

A small update: if the string can have line breaks, use `/\d+(?!.*\d)/m` (but I suspect it is not the case here). – Wiktor Stribiżew Jan 12 '17 at 15:08
If I wanted to do just one url at a time rather than using this map how would we do this? – Horse Voice Jan 12 '17 at 15:23
It is already in the code above: `item = item.gsub(/\d+(?!.*\d)/) {$~[0].to_i+10}` – Wiktor Stribiżew Jan 12 '17 at 15:49
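
The line-break remark in the first comment can be checked with a short sketch. In Ruby, `.` does not match a newline unless the `m` flag is set, so without it the lookahead only inspects the rest of the current line. The method names and the two-line sample string are invented for illustration.

```ruby
# Hypothetical helper: without /m the lookahead stops at each newline,
# so the last number on every line is incremented.
def bump_last_per_line(text)
  text.gsub(/\d+(?!.*\d)/) { |digits| digits.to_i + 10 }
end

# Hypothetical helper: with /m the dot also matches "\n", so only the
# final number in the whole string is incremented.
def bump_last_overall(text)
  text.gsub(/\d+(?!.*\d)/m) { |digits| digits.to_i + 10 }
end

text = "page/20/\nboard,1.50.html" # made-up two-line input
puts bump_last_per_line(text) # 20 becomes 30 and 50 becomes 60
puts bump_last_overall(text)  # only 50 becomes 60
```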

Mohammad Yusuf · Answer 2 · 2017-01-12T14:49:13.887

1

Try this regex:

\d+(?=\/|\.html)

It matches a number that is immediately followed by / or .html, which is the last number in each of the three URLs above.

Demo: https://regex101.com/r/zqUQlF/1

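To substitute the new value back, match with this regex and rebuild the string from the captured prefix (group 1), number (group 2), and suffix (group 3):

(.*?)(\d+)(\/|\.html)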

Demo: https://regex101.com/r/zqUQlF/2
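
In Ruby, the substitution pattern could be applied roughly as follows. This is a sketch under assumptions: the dot before `html` is escaped so that it matches a literal period, the alternation is folded into a single group, and the method name `bump_before_suffix` is hypothetical.

```ruby
# Hypothetical helper: increment the first number that is directly
# followed by "/" or ".html".
def bump_before_suffix(url, step = 10)
  # $1 = everything before the number, $2 = the number, $3 = "/" or ".html"
  url.sub(/(.*?)(\d+)(\/|\.html)/) { "#{$1}#{$2.to_i + step}#{$3}" }
end

puts bump_before_suffix('https://forums.questionablecontent.net/index.php/board,1.50.html')
# prints https://forums.questionablecontent.net/index.php/board,1.60.html
```

Note that this targets the first number followed by `/` or `.html`, not necessarily the last one, so a URL with an earlier all-numeric path segment would have that segment changed instead.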

edited Jan 12 '17 at 14:49

answered Jan 12 '17 at 14:43

Scott Weaver · Answer 3 · 2017-01-12T15:17:43.760

0

This regex matches only the last whole number in each URL by using a lookahead, which checks what follows without consuming any characters:

\d+(?=\D*$)
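
A short Ruby sketch of this pattern, assuming single-line URLs (in Ruby `$` matches at the end of a line, while `\z` anchors to the end of the string). The method names `last_number` and `bump_last_number` are hypothetical.

```ruby
# Hypothetical helper: return the last whole number in the URL as an
# Integer, or nil when the URL contains no digits.
def last_number(url)
  digits = url[/\d+(?=\D*$)/]
  digits && digits.to_i
end

# Hypothetical helper: replace that number with itself plus `step`.
def bump_last_number(url, step = 10)
  url.sub(/\d+(?=\D*$)/) { |digits| (digits.to_i + step).to_s }
end

p last_number('http://forums.scamadviser.com/site-feedback-issues-feature-requests/20/')
# prints 20
```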

edited Jan 12 '17 at 15:17

answered Jan 12 '17 at 15:00

score 0 · Answer 4 · answered Jan 12 '17 at 15:54

Like this:

urls = ['http://forums.scamadviser.com/site-feedback-issues-feature-requests/20/', 'https://forums.questionablecontent.net/index.php/board,1.50.html', 'https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.30/']
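# Last run of digits: only non-digits may follow it before the end of the line
pattern = /\d+(?=\D*$)/

urls.each do |url|
  # gsub! edits each string in place; the block's return value replaces the match
  url.gsub!(pattern) { |m| m.to_i + 10 }
end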

puts urls

You can also test it online here: https://ideone.com/smBJCQ
