
Given this string:

bc.  some text
 more text
 even more

^ above here is the empty line

I want it to be:

<pre>
some text
more text
even more
</pre>

^ above here is the empty line

How can I regex for "starting from bc. until the first empty line"?

So far I got this:

# note that for some reason a direct .gsub! behaves
# differently/fails when using the block, so I use .gsub
textile_markup = textile_markup.gsub(/^bc.  .*^$/m) { |s| "<pre>#{s[5..(s.length)]}</pre>" }

Understandably, this matches greedily, up to the very last empty line instead of the first one. How can I make the ^$ part non-greedy?

user569825
  • Usually `.*?` is the non-greedy version of `.*`. Would that work? – tadman Mar 19 '13 at 19:04
  • you will need the /s modifier for dot to match new lines – Lodewijk Bogaards Mar 19 '13 at 19:45
  • Do you have only one block, or is this a repeating pattern through the string/file? If it's repeating you need to represent that in your sample data. Also, why does this have to be done using a regular expression? – the Tin Man Mar 19 '13 at 19:46
  • do you know this great site called rubular? http://rubular.com/r/uloTda090y – phoet Mar 19 '13 at 19:47
  • @theTinMan I only have one block. I am open to more efficient solutions also. However I think the shortest path will be a regex. – user569825 Mar 19 '13 at 20:06
  • @phoet Thanks for the hint on the site. The example matcher, however, fails on my sample code. See here: http://rubular.com/r/Tz5MuKg41z (sorry also for the confusion: I updated the sample to display the last string **after** the closing `</pre>`). – user569825 Mar 19 '13 at 20:10
  • @user569825 there is no need to gsub the whole thing! use a matchgroup and then put everything where it belongs – phoet Mar 19 '13 at 20:34
  • @phoet I am having trouble understanding your proposed idea. Could you update the example you posted to reflect it? – user569825 Mar 19 '13 at 21:38
  • @user569825 have a look at the docs http://apidock.com/ruby/String/match – phoet Mar 20 '13 at 07:12
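
The match-group idea from the comments above might look like the following sketch. The method name `wrap_first_bc_block` is made up for illustration, and it assumes, as stated in the comments, that the text contains only one `bc.` block. It uses `String#match` and rebuilds the string from `pre_match`, the captured group, and `post_match`.

```ruby
# Hypothetical helper: wrap the first "bc.  " block in <pre> tags.
def wrap_first_bc_block(text)
  # Lazy .*? with /m stops at the first empty line ("\n\n").
  m = text.match(/^bc\.  (.*?)\n\n/m)
  return text unless m # no block found: return the text unchanged

  # Reassemble: text before the block, the wrapped body, text after it.
  "#{m.pre_match}<pre>\n#{m[1]}\n</pre>\n\n#{m.post_match}"
end

SINGLE = "bc.  some text\n more text\n even more\n\n^ above here is the empty line\n"
puts wrap_first_bc_block(SINGLE)
```

Note that this sketch keeps the leading space on the continuation lines; removing it would need an extra step on `m[1]`.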

2 Answers

str = 
"bc.  some text
more text
even more

^ above here is the empty line

bc.  some text
more text
even more

^ above here is the empty line"

puts str.gsub(/^bc\.  (.*?)\n\n/m, "<pre>\n\\1\n</pre>\n\n")

Output:

<pre>
some text
more text
even more
</pre>

^ above here is the empty line

<pre>
some text
more text
even more
</pre>

^ above here is the empty line

Explanation

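The `?` in `.*?` makes the star quantifier non-greedy (lazy), so the match stops at the first `\n\n` instead of the last one.

The `/m` modifier at the end makes the dot match newlines as well.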
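
To see the two points of the explanation in isolation, here is a small sketch. The sample string and the constant names are made up for illustration; the string contains two empty lines so that greedy and lazy matching give different results.

```ruby
# Illustrative sample with two empty lines ("\n\n" occurs twice).
SAMPLE = "bc.  some text\nmore text\n\nmiddle\n\ntail"

# Greedy: .* first runs to the end of the string, then backtracks
# until it finds a "\n\n", which is the LAST one.
GREEDY = SAMPLE[/^bc\.  (.*)\n\n/m, 1]

# Lazy: .*? grows one character at a time and stops at the FIRST "\n\n".
LAZY = SAMPLE[/^bc\.  (.*?)\n\n/m, 1]

# Without /m the dot cannot cross a newline, so nothing matches here.
NO_M = SAMPLE[/^bc\.  (.*?)\n\n/, 1]

p GREEDY # => "some text\nmore text\n\nmiddle"
p LAZY   # => "some text\nmore text"
p NO_M   # => nil
```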

Yuri Golobokov

It can be done in one go, but it needs some preparation:

txt = <<DOC
bc.  some text
 more text
 even more

bc.  some text
 more text
 even more

DOC

TRANSFORMS = {"bc.  " => "<pre>\n",       # 'bc.  ' becomes <pre> followed by a line end
              /^ /    => "",              # a leading space is removed (the matched " " is not a hash key, so the lookup yields "")
              "\n\n"  => "\n<\/pre>\n\n"} # an empty line is preceded by a closing pre tag

re = Regexp.union(TRANSFORMS.keys)
puts txt.gsub(re, TRANSFORMS)

Output:

<pre>
some text
more text
even more
</pre>

<pre>
some text
more text
even more
</pre>
steenslag
  • I absolutely like the way you coded that! The substitutions apply in other situations as well, which I want to avoid as I am working on huge documents. Yuriy Golobokov's `gsub` variant works and I think it'd be interesting for everyone to see it using your style, if possible. Would be great if you'd update. – user569825 Mar 21 '13 at 00:06
  • It will add `"\n<\/pre>\n\n"` for every empty line even if paragraph wasn't started with `bc. ` – Yuri Golobokov Mar 21 '13 at 01:53
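
One way to address both comments above is to apply the substitutions only inside a matched `bc.` block, using the block form of `gsub`. This is a sketch rather than either answerer's code: the method name `bc_to_pre` and the sample document are assumptions made for illustration.

```ruby
# Hypothetical helper: convert only "bc.  " blocks and leave all other
# empty lines and leading spaces in the document untouched.
def bc_to_pre(text)
  # \n(?=\n) ends the lazy match at the first empty line without
  # consuming the empty line itself.
  text.gsub(/^bc\.  (.*?)\n(?=\n)/m) do
    # Strip one leading space per line, inside the block only.
    body = Regexp.last_match(1).gsub(/^ /, "")
    "<pre>\n#{body}\n</pre>\n"
  end
end

DOC = "intro\n\nbc.  some text\n more text\n even more\n\n^ above here is the empty line\n"
puts bc_to_pre(DOC)
```

The empty line after `intro` gets no closing tag because it is not part of a match. A block that runs to the end of the string without a trailing empty line is not matched by this pattern.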