
We have the following input string.

String inputstr = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus venenatis ultricies pulvinar. Sed sed faucibus orci, at pharetra ex. Donec lacinia massa sed nunc aliquet ultricies. Duis suscipit, purus et commodo auctor, leo tellus molestie dui, quis porttitor orci nulla eu diam. Cras efficitur mauris dignissim, lobortis purus id, luctus erat. Vestibulum a mollis ante, id viverra libero. Vestibulum gravida enim non dignissim varius. Sed velit sapien, blandit quis imperdiet a, vulputate nec turpis. In hac habitasse platea dictumst.\n*TEST06499YGOV 297296+10*\nMorbi auctor fringilla pulvinar. Donec mattis arcu ac metus scelerisque\n 2090 12/15 Page 1 of 3Sed faucibus tempor ex, euismod consequat diam tincidunt sed. Donec sagittis aliquam dolor vitae faucibus. Ut lobortis magna risus, ut sagittis sem convallis eget. Nulla tellus lectus, aliquet ut lacinia quis, sagittis in odio. Ut egestas, sapien id ultrices aliquet, urna mi rutrum nunc, scelerisque rhoncus nulla sem eget risus. Sed eget mollis ante. Vivamus et malesuada neque, ac finibus lectus. Vestibulum consequat purus sit amet elit dapibus gravida. Phasellus in lorem vestibulum, sagittis lacus nec, hendrerit velit. Praesent sapien eros, pharetra eu magna quis, aliquam vestibulum mi. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Donec at aliquet felis, vitae semper ante. Pellentesque aliquam, nisl vitae ullamcorper posuere, lorem neque placerat elit, non hendrerit eros lectus nec quam. Praesent sollicitudin tempor tortor. Ut tellus massa, viverra sed iaculis nec, egestas gravida felis. Nam fringilla placerat volutpat.\n*TEST06499YGOV 297296+10*\nMorbi auctor fringilla pulvinar. Donec mattis arcu ac metus scelerisque\n 2090 12/15 Page 2 of 3Duis ullamcorper, nunc id aliquet luctus, arcu justo tristique nisl, a viverra libero odio sed massa. Duis nec nibh eu risus feugiat dignissim sit amet eu orci. 
Aliquam malesuada tristique augue non venenatis. Sed in viverra mauris. Suspendisse eu leo non augue molestie tempus. Donec ultrices facilisis turpis, vel fringilla mauris semper ut. Aliquam ullamcorper ante vitae porttitor ultricies. Nullam et consectetur justo. Vestibulum non ullamcorper ex"

Expected output after applying the regex to the input string above:

  1. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus venenatis ultricies pulvinar. Sed sed faucibus orci, at pharetra ex. Donec lacinia massa sed nunc aliquet ultricies. Duis suscipit, purus et commodo auctor, leo tellus molestie dui, quis porttitor orci nulla eu diam. Cras efficitur mauris dignissim, lobortis purus id, luctus erat. Vestibulum a mollis ante, id viverra libero. Vestibulum gravida enim non dignissim varius. Sed velit sapien, blandit quis imperdiet a, vulputate nec turpis. In hac habitasse platea dictumst.\n
  2. Sed faucibus tempor ex, euismod consequat diam tincidunt sed. Donec sagittis aliquam dolor vitae faucibus. Ut lobortis magna risus, ut sagittis sem convallis eget. Nulla tellus lectus, aliquet ut lacinia quis, sagittis in odio. Ut egestas, sapien id ultrices aliquet, urna mi rutrum nunc, scelerisque rhoncus nulla sem eget risus. Sed eget mollis ante. Vivamus et malesuada neque, ac finibus lectus. Vestibulum consequat purus sit amet elit dapibus gravida. Phasellus in lorem vestibulum, sagittis lacus nec, hendrerit velit. Praesent sapien eros, pharetra eu magna quis, aliquam vestibulum mi. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Donec at aliquet felis, vitae semper ante. Pellentesque aliquam, nisl vitae ullamcorper posuere, lorem neque placerat elit, non hendrerit eros lectus nec quam. Praesent sollicitudin tempor tortor. Ut tellus massa, viverra sed iaculis nec, egestas gravida felis. Nam fringilla placerat volutpat.\n
  3. Duis ullamcorper, nunc id aliquet luctus, arcu justo tristique nisl, a viverra libero odio sed massa. Duis nec nibh eu risus feugiat dignissim sit amet eu orci. Aliquam malesuada tristique augue non venenatis. Sed in viverra mauris. Suspendisse eu leo non augue molestie tempus. Donec ultrices facilisis turpis, vel fringilla mauris semper ut. Aliquam ullamcorper ante vitae porttitor ultricies. Nullam et consectetur justo. Vestibulum non ullamcorper ex

We used the following regex to try to get this output:

Regex regex = new Regex(@"^(?:(.+?\n)\*.+Page \d+ of \d+)+(.+)$", RegexOptions.Singleline);

But with this regex we get only outputs 1 and 3, not 2. We want to ignore the following strings in the input:

\n*TEST06499YGOV 297296+10*\nMorbi auctor fringilla pulvinar. Donec mattis arcu ac metus scelerisque\n 2090 12/15 Page 1 of 3

\n*TEST06499YGOV 297296+10*\nMorbi auctor fringilla pulvinar. Donec mattis arcu ac metus scelerisque\n 2090 12/15 Page 2 of 3

Matching rule for the input string:

1. Ignore the text starting from * up to and including Page d of d, where d is a number.

e.g. \n*TEST06499YGOV 297296+10*\nMorbi auctor fringilla pulvinar. Donec mattis arcu ac metus scelerisque\n 2090 12/15 Page 2 of 3

Please help us solve this issue.

KARAN
  • It would be clearer if you just told us the matching rules. – revo Feb 27 '19 at 06:07
  • @revo I have updated the question with the matching rules; please check it. – KARAN Feb 27 '19 at 06:11
  • You can just split the input string with this regex [`\*.*?page\s*\d+\s*of\s*\d+`](https://regex101.com/r/tZC1j9/1) – Gurmanjot Singh Feb 27 '19 at 06:20
  • Or you could just grab the contents of Group 1 in [`([^*]*)(?:$|\*.*?page\s*\d+\s*of\s*\d+)`](https://regex101.com/r/tZC1j9/2) – Gurmanjot Singh Feb 27 '19 at 06:26
  • @Potato With the given regex we don't get the desired output. We need to get 3 groups in the output, as described in the expected output. – KARAN Feb 27 '19 at 06:33
  • Why do you need 3 groups? Is an array containing the results not sufficient? – Jerry Feb 27 '19 at 06:40
  • @Jerry Yes, an array containing the results is also sufficient. – KARAN Feb 27 '19 at 06:49
  • Then you can [split](https://ideone.com/4uv5qK) or [grab only group 1](https://ideone.com/MLUW8B) from the array (if you need only the results here, nothing stopping you to append the string to a list from within the for loop). – Jerry Feb 27 '19 at 06:52
  • @Jerry Thanks, but can we get the output in Regex.Match(str).Groups, so we can directly combine all the groups and create one single string? – KARAN Feb 27 '19 at 07:18
  • Then why are you even matching? You know you can replace the parts you want to ignore, right? – Jerry Feb 27 '19 at 07:20
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/189108/discussion-between-admin-and-jerry). – KARAN Feb 27 '19 at 07:21

1 Answer


If you want to keep the string as a whole, you could instead remove each match from the string and replace it with 2 newlines.

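Match (with the Singleline and IgnoreCase options, so that . also matches newlines and Test matches TEST):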

\n\*Test.*? Page \d+ of \d+

.NET regex demo with shortened content.

Replace with:

\n\n

For example:

String inputstr = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus venenatis ultricies pulvinar. Sed sed faucibus orci, at pharetra ex. Donec lacinia massa sed nunc aliquet ultricies. Duis suscipit, purus et commodo auctor, leo tellus molestie dui, quis porttitor orci nulla eu diam. Cras efficitur mauris dignissim, lobortis purus id, luctus erat. Vestibulum a mollis ante, id viverra libero. Vestibulum gravida enim non dignissim varius. Sed velit sapien, blandit quis imperdiet a, vulputate nec turpis. In hac habitasse platea dictumst.\n*TEST06499YGOV 297296+10*\nMorbi auctor fringilla pulvinar. Donec mattis arcu ac metus scelerisque\n 2090 12/15 Page 1 of 3Sed faucibus tempor ex, euismod consequat diam tincidunt sed. Donec sagittis aliquam dolor vitae faucibus. Ut lobortis magna risus, ut sagittis sem convallis eget. Nulla tellus lectus, aliquet ut lacinia quis, sagittis in odio. Ut egestas, sapien id ultrices aliquet, urna mi rutrum nunc, scelerisque rhoncus nulla sem eget risus. Sed eget mollis ante. Vivamus et malesuada neque, ac finibus lectus. Vestibulum consequat purus sit amet elit dapibus gravida. Phasellus in lorem vestibulum, sagittis lacus nec, hendrerit velit. Praesent sapien eros, pharetra eu magna quis, aliquam vestibulum mi. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Donec at aliquet felis, vitae semper ante. Pellentesque aliquam, nisl vitae ullamcorper posuere, lorem neque placerat elit, non hendrerit eros lectus nec quam. Praesent sollicitudin tempor tortor. Ut tellus massa, viverra sed iaculis nec, egestas gravida felis. Nam fringilla placerat volutpat.\n*TEST06499YGOV 297296+10*\nMorbi auctor fringilla pulvinar. Donec mattis arcu ac metus scelerisque\n 2090 12/15 Page 2 of 3Duis ullamcorper, nunc id aliquet luctus, arcu justo tristique nisl, a viverra libero odio sed massa. Duis nec nibh eu risus feugiat dignissim sit amet eu orci. 
Aliquam malesuada tristique augue non venenatis. Sed in viverra mauris. Suspendisse eu leo non augue molestie tempus. Donec ultrices facilisis turpis, vel fringilla mauris semper ut. Aliquam ullamcorper ante vitae porttitor ultricies. Nullam et consectetur justo. Vestibulum non ullamcorper ex";
String pattern = @"\n\*Test.*? Page \d+ of \d+";
Regex regex = new Regex(pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);

String result = regex.Replace(inputstr, "\n\n");
Console.WriteLine(result);

See a C# demo
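
If an array of the three parts is wanted instead (as discussed in the comments), the following is a minimal sketch of two possible variations, not a definitive implementation. The class name `PageFooterDemo`, the method names, and the shortened input text are hypothetical and only serve the illustration. The first method splits on the footer. The second keeps the shape of the regex from the question but makes the inner `.+` lazy, so the repeated group can match once per footer, and then reads every repetition from `Groups[1].Captures` (`Groups[1].Value` alone holds only the last one). Both variations leave the `\n` before `*` out of the returned parts, which is an assumption about the desired result.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PageFooterDemo
{
    // Footer rule from the question: from "\n*" up to and including "Page d of d".
    // The lazy ".*?" stops at the first "Page d of d" instead of the last one.
    private const string Footer = @"\n\*.*?Page \d+ of \d+";

    // Variation 1: split on the footer; the pieces between footers are returned.
    public static string[] SplitOnFooter(string input)
    {
        return Regex.Split(input, Footer, RegexOptions.Singleline);
    }

    // Variation 2: one match; group 1 repeats once per footer, group 2 is the tail.
    public static string[] PartsFromCaptures(string input)
    {
        Match m = Regex.Match(
            input,
            @"^(?:(.+?)\n\*.+?Page \d+ of \d+)+(.+)$",
            RegexOptions.Singleline);

        if (!m.Success)
        {
            // No footer found: treat the whole input as a single part.
            return new[] { input };
        }

        return m.Groups[1].Captures
            .Cast<Capture>()              // every repetition of group 1, in order
            .Select(c => c.Value)
            .Concat(new[] { m.Groups[2].Value })
            .ToArray();
    }

    public static void Main()
    {
        // Shortened, hypothetical stand-in for the input string in the question.
        string input =
            "First part.\n*TEST06499YGOV 297296+10*\nMorbi auctor\n 2090 12/15 Page 1 of 3" +
            "Second part.\n*TEST06499YGOV 297296+10*\nMorbi auctor\n 2090 12/15 Page 2 of 3" +
            "Third part.";

        Console.WriteLine(string.Join(" | ", SplitOnFooter(input)));
        Console.WriteLine(string.Join(" | ", PartsFromCaptures(input)));
    }
}
```

The original pattern returned only parts 1 and 3 because, under `RegexOptions.Singleline`, the greedy `.+` after `\*` runs on to the last `Page d of d` in the input, so the group repeats only once and the second part is swallowed by the footer match.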

The fourth bird