I'm creating a basic text editor and I'm using a regex to implement a find and replace function. So far I have this code:

Private Function GetRegExpression() As Regex
    Dim result As Regex
    Dim regExString As [String]
    ' Get what the user entered
    If TabControl1.SelectedIndex = 0 Then
        regExString = txtbx_Find2.Text
    ElseIf TabControl1.SelectedIndex = 1 Then
        regExString = txtbx_Find.Text
    End If

    If chkMatchCase.Checked Then
        result = New Regex(regExString)
    Else
        result = New Regex(regExString, RegexOptions.IgnoreCase)
    End If

    Return result
End Function

And this is the Find method:

 Private Sub FindText()
    ''
    Dim WpfTest1 As New Spellpad.Tb
    Dim ElementHost1 As System.Windows.Forms.Integration.ElementHost = frm_Menu.Controls("ElementHost1")
    Dim TheTextBox As System.Windows.Controls.TextBox = CType(ElementHost1.Child, Tb).ctrl_TextBox
    ''
    ' Is this the first time find is called?
    ' Then make instances of RegEx and Match
    If isFirstFind Then
        regex = GetRegExpression()
        match = regex.Match(TheTextBox.Text)
        isFirstFind = False
    Else
        ' match.NextMatch() is also ok, except in Replace
        ' In replace as text is changing, it is necessary to
        ' find again
        'match = match.NextMatch();
        match = regex.Match(TheTextBox.Text, match.Index + 1)

    End If

    ' found a match?
    If match.Success Then
        ' then select it
        Dim row As Integer = TheTextBox.GetLineIndexFromCharacterIndex(TheTextBox.CaretIndex)
        MoveCaretToLine(TheTextBox, row + 1)
        TheTextBox.SelectionStart = match.Index
        TheTextBox.SelectionLength = match.Length

    Else
        If TabControl1.SelectedIndex = 0 Then
            MessageBox.Show([String].Format("Cannot find ""{0}""   ", txtbx_Find2.Text), Application.ProductName, MessageBoxButtons.OK, MessageBoxIcon.Information)
        ElseIf TabControl1.SelectedIndex = 1 Then
            MessageBox.Show([String].Format("Cannot find ""{0}""   ", txtbx_Find.Text), Application.ProductName, MessageBoxButtons.OK, MessageBoxIcon.Information)
        End If
        isFirstFind = True
    End If
End Sub

When I run the program I get errors:

  • For ?, parsing "?" - Quantifier {x,y} following nothing.; and
  • For *, parsing "*" - Quantifier {x,y} following nothing.

It's as if I can't use these but I really need to. How can I solve this problem?

Willem Van Onsem
Zer0
  • What is the test input? You seem to provide that through the user interface, but we don't know what you enter in that user interface... – Willem Van Onsem Jan 07 '15 at 04:04
  • What do you mean by the test input? If it helps, the user enters text in a textbox, and when the user clicks Find, the program uses that text to search for and select the matching text. – Zer0 Jan 07 '15 at 04:15
  • What do you enter to get these errors? See answer below, `?` and `*` have a special meaning... – Willem Van Onsem Jan 07 '15 at 04:17

1 Answer

? and * are quantifiers in regular expressions:

  • ? specifies that the preceding item is optional: for instance, b?au matches both bau and au.
  • * means the item it binds to can be repeated zero, one or multiple times: for instance, ba*u matches bu, bau, baau, baaaaaaaau, ...

Most regex flavors also offer a third, general quantifier, {l,u}, where l is the lower bound on the number of times something is repeated and u is the upper bound on the number of occurrences. In those terms, ? is shorthand for {0,1} and * for {0,}, which is why the error message talks about "Quantifier {x,y}".

If you provide a quantifier without any character before it, the regex parser has nothing to apply it to, hence "following nothing". A quantifier is only valid after something it can repeat, as in this session (I used csharp, but the ideas are generally applicable):

$ csharp
Mono C# Shell, type "help;" for help

Enter statements below.
csharp> Regex r = new Regex("fo*bar");
csharp> r.Replace("Fooobar fooobar fbar fobar","<MATCH>");    
"Fooobar <MATCH> <MATCH> <MATCH>"
csharp> r.Replace("fooobar far qux fooobar quux fbar echo fobar","<MATCH>");
"<MATCH> far qux <MATCH> quux <MATCH> echo <MATCH>"
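The difference between a valid and an invalid quantifier can also be reproduced in VB.NET. The following is only a minimal sketch: the module name `QuantifierDemo` is made up for illustration, and it relies on the `Regex` constructor reporting an unparsable pattern through an `ArgumentException`.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical demo module, not part of the asker's project.
Module QuantifierDemo
    Sub Main()
        ' A quantifier that follows a character is a valid pattern.
        Dim valid As New Regex("fo*bar")
        Console.WriteLine(valid.IsMatch("fooobar"))

        ' A quantifier with nothing before it cannot be parsed,
        ' so the constructor throws instead of returning a Regex.
        For Each pattern As String In {"?", "*"}
            Try
                Dim r As New Regex(pattern)
            Catch ex As ArgumentException
                Console.WriteLine("Rejected """ & pattern & """: " & ex.Message)
            End Try
        Next
    End Sub
End Module
```

The messages printed in the `Catch` branch should correspond to the errors quoted in the question.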

If you wish to do a "raw text find and replace", you should use string.Replace.
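For the Find half of a raw text search, `String.IndexOf` plays the same role that `String.Replace` plays for replacing. The sketch below assumes a hypothetical helper named `FindNext`; the `matchCase` parameter stands in for the `chkMatchCase` checkbox from the question.

```vbnet
Imports System

' Hypothetical demo module, not part of the asker's project.
Module PlainTextFindDemo
    ' Returns the index of the next occurrence of searchText at or after
    ' startIndex, or -1 if there is none. No character is treated specially.
    Function FindNext(text As String, searchText As String,
                      startIndex As Integer, matchCase As Boolean) As Integer
        Dim comparison As StringComparison =
            If(matchCase, StringComparison.Ordinal, StringComparison.OrdinalIgnoreCase)
        Return text.IndexOf(searchText, startIndex, comparison)
    End Function

    Sub Main()
        Dim text As String = "What? Really? *yes*"
        ' Searching for "?" is an ordinary substring search here.
        Dim first As Integer = FindNext(text, "?", 0, True)
        Console.WriteLine(first)
        ' Continue after the previous hit to find the next one.
        Console.WriteLine(FindNext(text, "?", first + 1, True))
        ' String.Replace also treats "*" literally.
        Console.WriteLine(text.Replace("*", "_"))
    End Sub
End Module
```

In the editor, the returned index could be assigned to `SelectionStart` and the length of the search text to `SelectionLength`, in the same way `FindText` uses `match.Index` and `match.Length`.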

EDIT:

Another way to process them is by escaping special regex characters. Ironically enough, you can do this by replacing them by a regex ;).

Private Function GetRegExpression() As Regex
    Dim result As Regex
    Dim regExString As [String]
    ' Get what the user entered
    If TabControl1.SelectedIndex = 0 Then
        regExString = txtbx_Find2.Text
    ElseIf TabControl1.SelectedIndex = 1 Then
        regExString = txtbx_Find.Text
    End If

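    'Added code: put a backslash in front of every special regex character,
    'so that the user's text is matched literally
    Dim baseRegex As Regex = New Regex("[\\.$^{\[(|)*+?]")
    regExString = baseRegex.Replace(regExString,"\$0")
    'End added code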

    If chkMatchCase.Checked Then
        result = New Regex(regExString)
    Else
        result = New Regex(regExString, RegexOptions.IgnoreCase)
    End If

    Return result
End Function
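.NET also ships a built-in method for this, `Regex.Escape`, which could replace the hand-written character class. The following is a sketch rather than a drop-in change: `BuildLiteralRegex` is a hypothetical helper that takes the user's text and the match-case flag as parameters instead of reading them from the form controls.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical demo module, not part of the asker's project.
Module EscapeDemo
    ' Builds a regex that matches userText literally.
    Function BuildLiteralRegex(userText As String, matchCase As Boolean) As Regex
        Dim options As RegexOptions =
            If(matchCase, RegexOptions.None, RegexOptions.IgnoreCase)
        ' Regex.Escape puts a backslash in front of regex metacharacters.
        Return New Regex(Regex.Escape(userText), options)
    End Function

    Sub Main()
        Dim r As Regex = BuildLiteralRegex("*?^$()", False)
        Dim m As Match = r.Match("symbols: *?^$() end")
        Console.WriteLine(m.Success)
        Console.WriteLine(m.Index)
    End Sub
End Module
```

Inside `GetRegExpression`, the equivalent change would be a single line, `regExString = Regex.Escape(regExString)`, in place of the two added lines.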
Willem Van Onsem
  • So ? and * are special characters for Regex to do certain operations however, Is there a way to skip it and just pass it as a regular string so that when the user searches for symbols it will just select it and highlight it if it's found? – Zer0 Jan 07 '15 at 04:18
  • @F4z: modified, you can do it by inserting two lines in the function. Since I'm however a C# programmer, there might be some syntax errors, in that case, please comment. – Willem Van Onsem Jan 07 '15 at 04:33
  • The code works fine when searching for plain text without symbols, but when I search for symbols like *?^$() it doesn't seem to work. For ? and * it doesn't select the text, it just moves the caret to the next character, and other regex-related symbols like ^ and $ are simply not found. – Zer0 Jan 07 '15 at 04:36
  • Never mind, looks like I had an extra '/' in '//$0'. Thank you! Help was greatly appreciated. – Zer0 Jan 07 '15 at 04:39