
I am trying to write a routine that finds a string only where it does not follow an opening parenthesis (that is, where it is not inside a parenthesized comment). For instance, if the file open in the RichEdit contains these lines of CNC code, I want it to find the first two and ignore the third. In the second line it should find and highlight only the first occurrence of the search string. The search string (mach.TOOL_CHANGE_CALL) in this example is 'T'.

N1T1M6
N1T1M6(1/4-20 TAP .5 DP.)
(1/4-20 TAP .5 DP.)

I have gotten this far, but am stumped.

procedure TMainForm.ToolButton3Click(Sender: TObject); // find tool number
var
  row:integer;
  sel_str:string;
  par:integer;
  tool:integer;
  tool_flag:integer ;
  line_counter:integer;
  tool_pos:integer;
  line_begin:integer;
  RE:TRichEdit;
begin
  RE:=(ActiveMDIChild as TMDIChild).RichEdit1;
  line_counter:=0;
  tool_flag:=0;
  tool_pos:=0;

  row:=SendMessage(RE.Handle,EM_LINEFROMCHAR,-1, RE.SelStart);

  while  tool_flag =0 do    
  begin
    RE.Perform(EM_LINESCROLL,0,line_counter);
    sel_str := RE.Lines[Line_counter];
    tool:=pos(mach.TOOL_CHANGE_CALL,sel_str);
    par:=pos('(',sel_str);
    if par=0 then 
      par:=pos('[',sel_str);
    tool_pos:=tool_pos+length(sel_str);

    if (tool>0) and (par = 0)  then
    begin
      RE.SetFocus;
      tool_pos:=tool_pos + line_counter-1;
      line_begin:=tool_pos-tool;
      RE.SelStart := line_begin;
      RE.SelLength := Length(sel_str);
      tool_flag:=1;
    end;
    inc (line_counter);
  end;
end;

The result is that it ignores the third line, but it also ignores the second. It also does not find subsequent occurrences of the string in the file; it just starts back at the beginning of the text and finds the first one again. How can I get it to find the match in the second example, and then find the next 'T' at the next click of the button? I also need it to highlight the entire line on which the search string was found.

Ken White
user2662392
  • I'm not sure I understand your requirements. If your search expression is 'T', you're suggesting that that should only match the two occurrences of 'N1T1M6' and not either of the `TAP` occurrences? – Ken White Sep 01 '13 at 23:59
  • Correct. It should only find a T that is not inside parentheses. In CNC code, text inside parentheses is a comment. I am trying to find only the actual tool change ('T') locations in the code, and those are always outside the comments. It must not find any other word containing a T inside parentheses either. – user2662392 Sep 02 '13 at 00:02
  • One final request for info. Is the text as you listed it (one "command" per line)? Or is it a continuous block of text? I'm afraid I'm not familiar with CNC code. If it's one command per line, it should be easy to handle with a regular expression. – Ken White Sep 02 '13 at 00:13
  • Not quite sure I follow the question, but I'll take a stab at it. The CNC code contains many lines of text, sometimes millions of lines. We need to be able to quickly find the tool change locations ('T') in all those lines. There can be as many as 60 of these tool change calls, but they will always look similar to the examples I gave you. Sometimes the comment describing the tool will be on the line before the tool change call ('T'), sometimes on the same line, and sometimes on the line after. So I need to ignore those comment lines even if they contain the T. – user2662392 Sep 02 '13 at 00:15
  • If a 'par' is found, you can search for a matching *par-close* (?), and then ignore the matches in-between. You can use PosEx to begin the search at a certain position. – Sertac Akyuz Sep 02 '13 at 00:45
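
The idea in the last comment (ignore anything between an opening and a closing parenthesis) can be sketched without regular expressions. The helper below is hypothetical and is not code from the thread: `PosOutsideComment` is an invented name, it uses a plain character scan with a nesting counter rather than `PosEx`, and it assumes that both `(...)` and `[...]` delimit comments, as the question's code suggests.

```pascal
// Hypothetical helper (name and behavior are assumptions for illustration).
// Returns the 1-based position of the first Token in Line that starts
// outside any (...) or [...] comment, or 0 if there is none.
function PosOutsideComment(const Token, Line: string): Integer;
var
  i, Depth: Integer;
begin
  Result := 0;
  Depth := 0;               // how many comment delimiters are currently open
  if Token = '' then
    Exit;
  for i := 1 to Length(Line) do
  begin
    if (Line[i] = '(') or (Line[i] = '[') then
      Inc(Depth)
    else if (Line[i] = ')') or (Line[i] = ']') then
    begin
      if Depth > 0 then
        Dec(Depth);
    end
    else if (Depth = 0) and (Copy(Line, i, Length(Token)) = Token) then
    begin
      Result := i;          // first occurrence outside a comment
      Exit;
    end;
  end;
end;
```

For the three sample lines and the token 'T', this returns 3, 3 and 0 respectively, so the `T` in `TAP` is never reported.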

1 Answer


Given the samples you posted, you can use Delphi (XE and higher) regular expressions to match the text you've indicated. Here, I've put the three sample lines you've shown into a TMemo (Memo1 in the code below), evaluated the regular expression against them, and put the matches found into Memo2. As long as your TRichEdit contains only plain text, you can use the same code by replacing Memo1 and Memo2 with RichEdit1 and RichEdit2 respectively.

I've updated the code in both snippets to show how to get the exact position (as an offset from the first character) and length of the match result; you can use this to highlight the match in the richedit using SelStart and SelLength.

uses
  RegularExpressions;

procedure TForm1.Button1Click(Sender: TObject);
var
  Regex: TRegEx;
  MatchResult: TMatch;
begin
  Memo1.Lines.Clear;
  Memo1.Lines.Add('N1T1M6');
  Memo1.Lines.Add('N1T1M6(1/4-20 TAP .5 DP.)');
  Memo1.Lines.Add('(1/4-20 TAP .5 DP.)');
  Memo2.Clear;
  // See the text below for an explanation of the regular expression
  Regex := TRegEx.Create('^\w+T\w+', [roMultiLine]);
  MatchResult := Regex.Match(Memo1.Lines.Text);
  while MatchResult.Success do 
  begin
    Memo2.Lines.Add(MatchResult.Value +
                  ' Index: ' + IntToStr(MatchResult.Index) +
                  ' Length: ' + IntToStr(MatchResult.Length));
    MatchResult := MatchResult.NextMatch;
  end;
end;

This produces the following results:

[Screenshot: capture of results of the above code]

If you're using a version of Delphi that doesn't include regular expression support, you can use the free TPerlRegEx with some minor code changes to produce the same results:

uses
  PerlRegEx;

procedure TForm1.Button1Click(Sender: TObject);
var
  Regex: TPerlRegEx;
begin
  Memo1.Lines.Clear;
  Memo1.Lines.Add('N1T1M6');
  Memo1.Lines.Add('N1T1M6(1/4-20 TAP .5 DP.)');
  Memo1.Lines.Add('(1/4-20 TAP .5 DP.)');
  Memo2.Clear;
  Regex := TPerlRegEx.Create;
  try
    Regex.RegEx := '^\w+T\w+';
    Regex.Options := [preMultiLine];
    Regex.Subject := Memo1.Lines.Text;
    if Regex.Match then 
    begin
      repeat
        Memo2.Lines.Add(Regex.MatchedText +
                        ' Offset: ' + IntToStr(Regex.MatchedOffset) +
                        ' Length: ' + IntToStr(Regex.MatchedLength));
      until not Regex.MatchAgain;
    end;
  finally
    Regex.Free;
  end;
end;

The regular expression above (^\w+T\w+) means:

Options: ^ and $ match at line breaks

Assert position at the beginning of a line (at beginning 
  of the string or after a line break character) «^»
Match a single character that is a “word character” (letters, 
  digits, and underscores) «\w+»
    Between one and unlimited times, as many times as possible, 
    giving back as needed (greedy) «+»
Match the character “T” literally «T»
Match a single character that is a “word character” (letters, 
  digits, and underscores) «\w+»
    Between one and unlimited times, as many times as possible, 
    giving back as needed (greedy) «+»

Created with RegexBuddy

You can find a decent tutorial regarding regular expressions here. The tool I used for working out the regular expression (and actually producing much of the Delphi code for both examples) was RegexBuddy - I'm not affiliated with the company that produces it, but just a user of that product.
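
To step through the tool changes one button click at a time and highlight the whole line, one possible approach is sketched below. It is an illustration under stated assumptions, not part of the tested code above: `SelectNextToolLine` is a hypothetical helper name, and it tests each line separately instead of using the offsets from `Lines.Text`, because a rich edit control may count a line break as a single character in `SelStart`, so offsets computed from text with CR/LF pairs might not line up. It also assumes word wrap is off, so each visual line is one line of CNC code.

```pascal
uses
  Windows, Messages, Classes, Controls, ComCtrls, RegularExpressions;

// Hypothetical helper: selects the next line at or below the caret that
// matches the tool-change pattern. Returns True if such a line was found.
function SelectNextToolLine(RE: TRichEdit): Boolean;
var
  Regex: TRegEx;
  LineNo, LineStart: Integer;
begin
  Result := False;
  // roMultiLine is not needed here because each line is tested on its own
  Regex := TRegEx.Create('^\w+T\w+');

  // CaretPos.Y is the zero-based line that holds the caret
  LineNo := RE.CaretPos.Y;
  // If something is already selected (e.g. by a previous call), start below it
  if RE.SelLength > 0 then
    Inc(LineNo);

  while LineNo < RE.Lines.Count do
  begin
    if Regex.IsMatch(RE.Lines[LineNo]) then
    begin
      // EM_LINEINDEX returns the character index of the line's first character
      LineStart := RE.Perform(EM_LINEINDEX, LineNo, 0);
      RE.SetFocus;
      RE.SelStart := LineStart;
      RE.SelLength := Length(RE.Lines[LineNo]);
      RE.Perform(EM_SCROLLCARET, 0, 0);  // bring the selection into view
      Result := True;
      Exit;
    end;
    Inc(LineNo);
  end;
end;

// Possible use from the question's button handler
procedure TMainForm.ToolButton3Click(Sender: TObject);
begin
  SelectNextToolLine((ActiveMDIChild as TMDIChild).RichEdit1);
end;
```

Reading `RE.Lines[LineNo]` for every line sends one message per line to the control, so for files with millions of lines it may be worth copying the text into a TStringList once and scanning that instead.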

Ken White
  • Can you explain this line : Regex.RegEx := '^\w+T\w+'; ? I'm assuming the ^ signifies a pointer but I don't understand how the line excludes what is in the parentheses. I'm certain your code will work, I want to learn why so I don't have to ask more questions. – user2662392 Sep 02 '13 at 01:56
  • no, it's not a pointer. It's part of a regular expression; it's an anchor for the beginning of a line. I'll edit to provide more information about what the regular expression means, as well as a link to a page that has more information. – Ken White Sep 02 '13 at 02:02
  • If I'm understanding TPerlRegEx correctly, it looks for every instance of T in the file. I basically just want to find an instance of T that is not inside parentheses in the RichEdit, set the caret at that position, and highlight the line. Then, the next time the button is pressed, find the next occurrence, set the caret again, and highlight it, stepping through as many tools (T) as there are in the file. – user2662392 Sep 02 '13 at 02:06
  • No, that's not what it does. See the explanation I posted. You can do that by using the properties of `TRegex` (or `TPerlRegex`) that return the offset of the matched text. (The regular expression I posted meets your condition - given the three lines of text you posted, it matches the first two lines (in the second, the portion up to but not including the opening `(`), but does not match the `T` in `TAP` in line two, or anything in line 3, just as the right memo in the screen image indicates.) Regexes are much more powerful than `Pos` for matching text that meets quite complex conditions. – Ken White Sep 02 '13 at 02:17
  • (continued) Regexes can even do things as complex as 'find a 5 digit number followed by a period followed by another 5 digit number that is higher than the first 5 digit number'. I actually have one at the office that does that, found in a post here at SO, that I saved just because it was interesting. – Ken White Sep 02 '13 at 02:21
  • " found in a post here at SO" Any chance you could post a link to the original? – MartynA Sep 02 '13 at 09:50
  • @MartynA: I don't know if I kept a link or not. As I said, it's at the office; it's a holiday weekend here, so I won't be there until tomorrow to check. It was in the last six weeks or so, IIRC. – Ken White Sep 02 '13 at 12:44
  • @Martyn: [Here's one](http://stackoverflow.com/a/17750585/62576) that matches a 5 digit number, a `:`, and then a 5 digit number and ensures the second number is greater than the first. – Ken White Sep 02 '13 at 19:44