
The string looks like

Maximum number of Utilisations

A Borrower (or the Parent) may not deliver a Utilisation Request if as a result of the proposed Utilisation:<br/>
[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]<br/>
[15] or more Revolving Company Utilisations would be outstanding[; or<br/>
[20] or more Incremental Company Loans would be outstanding].<br/>
A Borrower (or the Parent) may not request that a Company A Loan [or an Incremental Company Loan] be divided if, as a result of the proposed division, [  25      ] or more Company A Loans [or [  50    ] or more Incremental Company Loans] would be outstanding.<br/>
[A Borrower (or the Parent) may not request that a Company B Loan or a Company C Loan be divided.]

Expected outputs:

[ 10 ] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; 
[ 15 ] or more Revolving Company Utilisations would be outstanding[; or
[ 20 ] or more Incremental Company Loans would be outstanding].

What I tried, which doesn't seem to be working:

Regex = '.*other than Incremental Company Loans.*'

This returns the whole paragraph. There may be other ways to do this, but we have to do it using regex only.

user3657339
  • Are your "lines" separated with `<br/>`? Is that HTML? The easiest is to replace all `<br/>` with `\n`, and use a non-regex solution I already shared with you.
    – Wiktor Stribiżew Oct 31 '18 at 09:23
  • Correct, that's how we are getting the string back from the database. – user3657339 Oct 31 '18 at 09:23
  • So, why regex? `inputText.Split(new[] {"<br/>"}, StringSplitOptions.None).Where(x => x.Contains("other than Incremental Facility Loans"))`.
    – Wiktor Stribiżew Oct 31 '18 at 09:24
  • @WiktorStribiżew - Thanks, but is it doable using REGEX? – user3657339 Oct 31 '18 at 09:29
  • Yes, but why? Could you please explain where you are using it? The performance/readability with regex will be poorer than with pure code. – Wiktor Stribiżew Oct 31 '18 at 09:30
  • BTW, your expected matches do not contain `other than Incremental Facility Loans`. Could you please check your data/expected results? – Wiktor Stribiżew Oct 31 '18 at 09:33
  • @WiktorStribiżew - I made the changes to the output string; I mentioned it correctly. I wasn't aware of the performance issue with regex. Can you help with the inputText.Split approach for the 3 expected output results? Please put this in an answer so that I can accept it quickly. – user3657339 Oct 31 '18 at 09:38
  • After your edit, I see that you want to get the "lines" from the match up to the full stop. Right? I am not sure I understand the rule for extraction here. The sample string and expected result are not clarifying everything here. – Wiktor Stribiżew Oct 31 '18 at 09:40
  • @WiktorStribiżew - I tried to do more formatting, not sure if that helps. But yes, you are right: get the "lines" from the match up to the full stop or semicolon. – user3657339 Oct 31 '18 at 09:49
  • Since you are writing in C#, see [a C# demo](https://ideone.com/ivo6Qg), no regex involved. – Wiktor Stribiżew Oct 31 '18 at 10:02

1 Answer


The pure regex approach may not suffice since you might want to further replace <br/> with line breaks, and the pattern is rather complex:

(?<=^|<br/>)(?:(?!<br/>).)*other than Incremental Company Loans[\s\S]*?(?=[.;]<br/>|$)

See the regex demo

It matches:

  • (?<=^|<br/>) - a position preceded by the start of the string or a <br/> substring
  • (?:(?!<br/>).)* - any char other than a line break, 0+ occurrences, that does not start a <br/> substring
  • other than Incremental Company Loans - the literal search string
  • [\s\S]*? - any 0+ chars, as few as possible
  • (?=[.;]<br/>|$) - a position immediately followed either by . or ; and then <br/>, or by the end of the string.
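To make the breakdown above concrete, here is a minimal C# sketch that applies the pattern and then turns `<br/>` into line breaks. The class name `ClauseRegexDemo`, the helper `Extract`, and the shortened sample string are assumptions made for illustration; only the pattern itself comes from the answer. Note that the trailing full stop is not part of the match, because the lookahead only checks for it without consuming it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class ClauseRegexDemo
{
    // Pattern copied from the answer; a verbatim string keeps the backslashes literal.
    public const string Pattern =
        @"(?<=^|<br/>)(?:(?!<br/>).)*other than Incremental Company Loans[\s\S]*?(?=[.;]<br/>|$)";

    // Returns the matched block with <br/> replaced by newlines, or null if nothing matches.
    public static string Extract(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Value.Replace("<br/>", "\n") : null;
    }

    public static void Main()
    {
        // Shortened sample (an assumption) with the same <br/> layout as the question's string.
        string s = "Intro:<br/>"
                 + "[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]<br/>"
                 + "[15] or more Revolving Company Utilisations would be outstanding[; or<br/>"
                 + "[20] or more Incremental Company Loans would be outstanding].<br/>"
                 + "A later sentence that should not be matched.";

        // Prints the three clause lines; the last one ends at "]" without the full stop.
        Console.WriteLine(Extract(s));
    }
}
```

If the final `.` or `;` is needed in the result, one option is to capture it with an optional group instead of a lookahead, at the cost of a slightly longer pattern.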

As you are writing the code in C#, you may use a non-regex solution that is very readable and easier to adjust:

var s = "A Borrower (or the Parent) may not deliver a Utilisation Request if as a result of the proposed Utilisation:<br/>[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]<br/>[15] or more Revolving Company Utilisations would be outstanding[; or<br/>[20] or more Incremental Company Loans would be outstanding].<br/>A Borrower (or the Parent) may not request that a Company A Loan [or an Incremental Company Loan] be divided if, as a result of the proposed division, [  25      ] or more Company A Loans [or [  50    ] or more Incremental Company Loans] would be outstanding.<br/>[A Borrower (or the Parent) may not request that a Company B Loan or a Company C Loan be divided.]";
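// Split the text into "lines" on the <br/> separator
var result = s.Split(new[] {"<br/>"}, StringSplitOptions.None)
    // Skip everything before the line that contains the search string
    .SkipWhile(x => !x.Contains("other than Incremental Company Loans"))
    // MagicTakeWhile is a custom extension (not built into LINQ): it also returns the line ending with . or ;
    .MagicTakeWhile(x => !x.EndsWith(".") && !x.EndsWith(";"));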
Console.WriteLine(string.Join("\n", result));

Output:

[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]
[15] or more Revolving Company Utilisations would be outstanding[; or
[20] or more Incremental Company Loans would be outstanding].

MagicTakeWhile is not a built-in LINQ method; it is borrowed from "TakeWhile, but get the element that stopped it also". It takes items while the condition holds and also includes the first item for which the condition stops being met.
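For reference, one possible way to write such an extension is sketched below. This is an assumption about how `MagicTakeWhile` could be implemented, not necessarily the code from the linked source; the class name `EnumerableExtensions` and the short sample array are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class for the extension method.
public static class EnumerableExtensions
{
    // Like TakeWhile, but also yields the first element that fails the predicate, then stops.
    public static IEnumerable<T> MagicTakeWhile<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            yield return item;      // always emit the current item
            if (!predicate(item))   // stop right after the item that broke the condition
                yield break;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Short stand-in for the lines produced by Split(new[] {"<br/>"}, ...).
        var lines = new[] { "intro:", "first [or]", "second[; or", "third.", "fourth." };

        var result = lines
            .SkipWhile(x => !x.Contains("first"))
            .MagicTakeWhile(x => !x.EndsWith(".") && !x.EndsWith(";"));

        // Prints "first [or]", "second[; or" and "third." on separate lines.
        Console.WriteLine(string.Join("\n", result));
    }
}
```

Because the method is an iterator, it is evaluated lazily and stops enumerating the source as soon as the terminating line has been returned.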

Wiktor Stribiżew