EDIT: I believe the solution I'm looking for here is with recursion.

This relates to the question "Issue with RegEx Lookaround where new lines are included".

I am trying to search a block of text for a header, capture part of that header, and also capture a specific part of the body below it, which requires a small conditional search.

The text format is like this:

Private Sub NAV_VE124_Click()
    'Open the picture in its description field
    Call ShowPic(Me.NAV_VE124.Description)
End Sub 

And I would like to select VE124 Open the picture in its description field.

Or, more generally, I want everything between NAV_ and Click(), and everything from the ' to the Call (not the line break as some of the descriptions have more than one line of text).

Any thoughts or help would be hugely appreciated. I have about 20000 of these to catalog so I'm kind of at a loss for how else to do it.

logi-kal

1 Answer

Ok I think this should do it:

.*NAV_(.*)Click\(\)((?:\r\n\s*'(?:[^\r\n]*))*)(?:\r\n^((?!Sub NAV_).)*)*

Let me know if any of my assumptions are incorrect.

  • we don't care about the stuff before the first NAV_ (you may want to check for Sub though)
    • .*
  • we want to find the literal NAV_
    • NAV_
  • we want to find anything that is before Click() and capture it (the name contains no whitespace, so [^\s]* is enough; the full pattern above uses the looser (.*), which captures the same text in this example because the Click() that follows forces it to stop)
    • ([^\s]*)
  • we want to match Click() (you may want to only match the opening ( in case there are params)
    • Click\(\)
  • now things get interesting, I'll break these bits up

    • ((?:\r\n\s*'(?:[^\r\n]*))*)
      • \r\n\s*' this bit gets the newline and the single quote that starts a comment
      • (?:[^\r\n]*) this bit gets all the comment text (anything other than a newline)
      • the * at the end allows multiple comment lines
      • the outer parentheses capture the bit we're interested in
  • now we just ignore everything else until the next Sub NAV_ (using lookahead)

    • (?:\r\n^((?!Sub NAV_).)*)*
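The same capture-the-name-then-the-comment-lines idea can be tried outside an editor. Below is a minimal Python sketch, not the exact regex above: it uses a simplified pattern and assumes the header always has the form Sub NAV_<name>_Click() with the comment lines directly underneath. The names HEADER_AND_COMMENTS and extract_descriptions are hypothetical, chosen only for illustration.

```python
import re

# Simplified variant of the pattern above (an assumption, not the exact regex):
# group 1 is the text between NAV_ and _Click(), group 2 is the run of
# comment lines (each starting with ') that directly follows the header.
HEADER_AND_COMMENTS = re.compile(
    r"Sub NAV_(\w+?)_Click\(\)"
    r"((?:\r?\n[ \t]*'[^\r\n]*)+)"
)


def extract_descriptions(source):
    """Return (name, description) pairs for every NAV_*_Click sub in source."""
    results = []
    for match in HEADER_AND_COMMENTS.finditer(source):
        name = match.group(1)
        # Drop indentation and the leading quote from each comment line,
        # then join multi-line descriptions with single spaces.
        parts = [line.strip().lstrip("'").strip()
                 for line in match.group(2).splitlines() if line.strip()]
        results.append((name, " ".join(parts)))
    return results


sample = (
    "Private Sub NAV_VE124_Click()\r\n"
    "    'Open the picture in its description field\r\n"
    "    Call ShowPic(Me.NAV_VE124.Description)\r\n"
    "End Sub\r\n"
)
print(extract_descriptions(sample))
```

Accepting \r?\n rather than a hard-coded \r\n means the sketch should cope with either line-ending style, which is one of the things that can differ between tools.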

Given that you appear to be pulling comments from VB code, it may be worth checking whether tools like Doxygen/Castle Windsor can help. If I have missed something in the question, you may find that this goes beyond a regular language and needs something other than a regex.
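If a regex does turn out to be too brittle, one alternative is a plain line-by-line scan. The following is only a sketch under stated assumptions: each header sits on a single line ending in _Click() with no parameters, and the description is the unbroken run of comment lines immediately after it. The function name catalog_subs is hypothetical.

```python
def catalog_subs(lines):
    """Scan VB source lines and map each NAV_*_Click name to its description."""
    catalog = {}
    current = None  # name of the sub whose comment lines are being collected
    for raw in lines:
        line = raw.strip()
        if "Sub NAV_" in line and line.endswith("_Click()"):
            # Header line: keep the text between NAV_ and _Click().
            start = line.index("NAV_") + len("NAV_")
            current = line[start:-len("_Click()")]
            catalog[current] = []
        elif current is not None and line.startswith("'"):
            # Comment line belonging to the current sub: drop the quote.
            catalog[current].append(line[1:].strip())
        else:
            # The first non-comment line ends the description block.
            current = None
    return {name: " ".join(parts) for name, parts in catalog.items()}


sample_lines = [
    "Private Sub NAV_VE124_Click()",
    "    'Open the picture in its description field",
    "    Call ShowPic(Me.NAV_VE124.Description)",
    "End Sub",
]
print(catalog_subs(sample_lines))
```

Because it works on already-split lines, this approach does not depend on whether the file uses \n or \r\n.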

Edit: finished off the partially complete regex

Ed'
  • Hmm... while this seems like it should be right, it appears to select the full line including `Private Sub` and `Nav`. It also does not select the text after `'`. – itchyspacesuit Feb 12 '16 at 14:18
  • Ok. I've tested using Notepad++ (using 'Wrap around' and not '. matches newline') and replacing that full regex with `$1$2\r\n`, and I get the list of methods with their comments. Maybe the newline character isn't matching correctly, or the tool you're using doesn't behave like Notepad++. I'd try Notepad++ and \n instead of \r\n. – Ed' Feb 14 '16 at 00:30
  • Also... it will still match the full line including `Private Sub` but the replace will ignore that and only use the 2 groups. – Ed' Feb 14 '16 at 00:30