
How can I use a regex to keep only the text between two points and drop everything after it? The starting point is the first date (the date is dynamic, not fixed), for example 08/03/2020, and the endpoint is the last uppercase code (also dynamic, but only up to 3 characters in capital letters), for example TRU in the string below. Everything after that endpoint should be ignored or removed.

Here is my current regex:

Regex.Match(text,"(?<=08/03/2020\s+)[\S\s]*?(?=TRU)").Value.Trim

But it isn't dynamic, because the date and TRU are hard-coded.

Any idea how to design a regex for this? Thank you.

#Data to be removed

The following should be removed, since it comes after the span that runs from 08/03/2020 to the last TRU:

  Processing
       Co-Applicant
       No inquiry records found."

#The String

"08/03/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                     MORTGAGE
   07/08/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
   07/08/2020        FCTUALDATA                                                                                       EFX
   07/08/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                     MORTGAGE
   07/07/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                     MORTGAG
   07/07/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
   07/07/2020        FCTUALDATA                                                                                       EFX
   05/21/2020        CAP ONE NA                  Bank Credit Card                                                     XPN
   05/21/2020        CAPITAL ONE                 Credit Card                                                          TRU
   05/21/2020        CAPITALONE                  Bank                                                                 EFX
   05/20/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                     MORTGAG
   05/20/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
   05/20/2020        FCTUALDATA                                                                                       EFX
   05/20/2020        FINGERHUT/WEBBANK           Finance Company                                                      XPN
   05/07/2020        EMS                                                                                              EFX
   05/07/2020        GROW FINANCIAL CREDI        Credit Bureau/Mortgage                                               TRU
                                                 Processing
   Co-Applicant
   No inquiry records found."

#Expected output

   "08/03/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                         MORTGAGE
       07/08/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
       07/08/2020        FCTUALDATA                                                                                       EFX
       07/08/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                         MORTGAGE
       07/07/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                         MORTGAG
       07/07/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
       07/07/2020        FCTUALDATA                                                                                       EFX
       05/21/2020        CAP ONE NA                  Bank Credit Card                                                     XPN
       05/21/2020        CAPITAL ONE                 Credit Card                                                          TRU
       05/21/2020        CAPITALONE                  Bank                                                                 EFX
       05/20/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                         MORTGAG
       05/20/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
       05/20/2020        FCTUALDATA                                                                                       EFX
       05/20/2020        FINGERHUT/WEBBANK           Finance Company                                                      XPN
       05/07/2020        EMS                                                                                              EFX
       05/07/2020        GROW FINANCIAL CREDI        Credit Bureau/Mortgage                                               TRU
Wiktor Stribiżew
Mr Dave

1 Answer


You can use

(?ms)\A(?:\d{2}/\d{2}/\d{2}(?:\d{2})?|−−DATE−−)\s.*\s\p{Lu}{3}$

See the regex demo

Details

  • (?ms) - RegexOptions.Multiline (^ now matches line start and $ matches line end positions) and RegexOptions.Singleline (. now also matches newline chars) inline modifiers
  • \A - start of a string
  • (?:\d{2}/\d{2}/\d{2}(?:\d{2})?|−−DATE−−) - either two digits, /, two digits, / and two or four digits, or the literal −−DATE−− string
  • \s - a whitespace
  • .* - any zero or more chars, as many as possible (greedy, so the match runs up to the last line that ends with three uppercase letters)
  • \s - a whitespace
  • \p{Lu}{3} - three uppercase letters from any language (use [A-Z]{3} to match only three uppercase ASCII letters)
  • $ - end of a line.
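To see the pieces above working together, here is a minimal sketch in Python rather than .NET, so it is a port and not the exact code you would paste into UiPath. Assumptions: `[A-Z]{3}` stands in for `\p{Lu}{3}` because Python's built-in `re` module has no `\p{...}` classes, the placeholder is typed with plain ASCII hyphens, and the names `DATE_PLACEHOLDER`, `keep_through_last_code` and `sample` are made up for illustration. The sample is a shortened version of the string in the question.

```python
import re

# Assumption: the placeholder uses ASCII hyphens; change it to match the real input.
DATE_PLACEHOLDER = "--DATE--"

# Port of the answer's pattern. (?ms) turns on multiline and dot-all mode.
# [A-Z]{3} replaces \p{Lu}{3}, which Python's re module does not support.
PATTERN = re.compile(
    r"(?ms)\A(?:\d{2}/\d{2}/\d{2}(?:\d{2})?|"
    + re.escape(DATE_PLACEHOLDER)
    + r")\s.*\s[A-Z]{3}$"
)


def keep_through_last_code(text):
    """Return the text from the leading date (or placeholder) through the
    last line that ends with three uppercase letters, or None if no match."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


# Shortened, hypothetical sample based on the string in the question.
sample = (
    "08/03/2020  NOVUS HOME            Mortgage Company        TRU\n"
    "            MORTGAGE\n"
    "07/08/2020  FACTUAL DATA          Mortgage Reporter       XPN\n"
    "05/07/2020  GROW FINANCIAL CREDI  Credit Bureau/Mortgage  TRU\n"
    "                                  Processing\n"
    "Co-Applicant\n"
    "No inquiry records found."
)

result = keep_through_last_code(sample)
print(result)
```

Because `.*` is greedy, the match ends at the TRU of the last dated row, and the trailing Processing / Co-Applicant / No inquiry records found. lines are left out. Note that a continuation line made up of only a three-letter uppercase word would also count as an endpoint, so check that against your real data.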
Wiktor Stribiżew
  • Sir, it no longer works when the sample data is this – Mr Dave Sep 18 '20 at 10:58
  • @MrDave Use `(?ms)^\d{2}/\d{2}/\d{2}(?:\d{2})?\s.*\s\p{Lu}{3}$` ([demo](https://regex101.com/r/73wsdP/8)). – Wiktor Stribiżew Sep 18 '20 at 10:59
  • Some of the mode modifiers are not supported, sir. I am using UiPath. Can you put or test that on this website, sir: https://regexr.com/ ? – Mr Dave Sep 18 '20 at 11:09
  • @MrDave I won't use regexr, it is not user-friendly and less powerful than regex101 (only supports JavaScript and PCRE regex flavors). You are using the regex in UIPath that supports all the inline modifiers I am using in the answer. – Wiktor Stribiżew Sep 18 '20 at 11:10
  • By the way, sir, I am so sorry, this is the original format of the data from the output: https://regex101.com/r/73wsdP/9 – Mr Dave Sep 18 '20 at 11:38
  • Those are the original spaces – Mr Dave Sep 18 '20 at 11:39
  • @MrDave It is crucial to know 1) your input and 2) pattern requirements. Else, you can't even think of a solution, let alone the regex pattern. If you are sure that is your actual input, use `(?sm)\A−−DATE−−\s.*\s\p{Lu}{3}$` regex. See [this regex demo](https://regex101.com/r/73wsdP/10). – Wiktor Stribiżew Sep 18 '20 at 11:40
  • Yes sir, this is one of the formats, and the other one, which you solved earlier, is https://regex101.com/r/73wsdP/4 . I wanted to have a regex that could solve both, hehe – Mr Dave Sep 18 '20 at 11:44
  • This one https://regex101.com/r/73wsdP/4 and this one https://regex101.com/r/73wsdP/10 , without creating a separate regex for each, since they could have the same pattern – Mr Dave Sep 18 '20 at 11:45
  • @MrDave `(?ms)\A(?:\d{2}/\d{2}/\d{2}(?:\d{2})?|−−DATE−−)\s.*\s\p{Lu}{3}$` - [demo](https://regex101.com/r/73wsdP/12). I updated the answer. – Wiktor Stribiżew Sep 18 '20 at 11:47