Using Delphi 7, if I have some text like that shown in the left-hand window below, how could I extract all the words and punctuation in a paragraph and copy them to another window like that on the right, each followed by a #?
What part of this task are you struggling with? What version of Delphi? Your tags are confusing. – David Heffernan Nov 18 '14 at 21:40
Study Delphi 7 or Delphi XE5. – Ramy Réz Nov 18 '14 at 22:07
1 Answer
If I'm understanding you correctly, you need what's called a "tokeniser" or "lexer".
Delphi 7 comes with one built in, in the Classes unit, misleadingly called TParser (misleadingly because parsing normally means the "grammatical analysis" step which may follow the tokenisation of the text, as happens, for instance, in the operation of a compiler).
Anyway, if I recall correctly, Delphi's TParser was intended to do things like process the text of DFM files, so it will not necessarily split the text up exactly as you want, but it's a start. For example, when it tokenises ":=", it returns the ":" and "=" as two separate tokens, but of course you are free to join them up again when NextToken/TokenString return these in succession. By the way, there are several alternative ways of implementing a tokeniser, for instance using classes in the freeware Jedi JCL and JVCL libraries.
If the text window on the left of your question is in your own app, code like the following may do what you want:
procedure TForm1.Tokenise;
var
  SS: TStringStream;
  TokenText: string;
  Parser: TParser;
begin
  // TParser reads from a TStream, so wrap the source memo's text in one
  SS := TStringStream.Create(Memo1.Lines.Text);
  try
    Parser := TParser.Create(SS);
    try
      // Token is #0 (toEOF) once the end of the text has been reached
      while Parser.Token <> #0 do
      begin
        TokenText := Parser.TokenString;
        Memo2.Lines.Add(TokenText + '#');
        Parser.NextToken;
      end;
    finally
      Parser.Free;
    end;
  finally
    SS.Free;
  end;
end;
If the text window is in another app, you would need a method of retrieving the text from it, too, of course.
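The remark above about ":=" coming back as two tokens can be handled with a one-token look-ahead. What follows is only a minimal sketch, assuming Delphi 7 and the same TParser from the Classes unit; the procedure name TokeniseJoiningAssign and its parameters are hypothetical and not part of the code above. Note that it also joins ":" and "=" when they are separated by whitespace, because TParser skips blanks between tokens.

```delphi
uses
  Classes;

// Hypothetical helper: tokenises AText with TParser and appends each token
// followed by '#' to Dest, re-joining ':' followed by '=' into ':='.
procedure TokeniseJoiningAssign(const AText: string; Dest: TStrings);
var
  SS: TStringStream;
  Parser: TParser;
  TokenText: string;
begin
  SS := TStringStream.Create(AText);
  try
    Parser := TParser.Create(SS);
    try
      while Parser.Token <> #0 do
      begin
        TokenText := Parser.TokenString;
        if Parser.Token = ':' then
        begin
          // Look ahead one token; NextToken returns the new current token
          if Parser.NextToken = '=' then
          begin
            TokenText := ':=';
            Parser.NextToken; // step past the '=' that was just consumed
          end;
          // Otherwise the parser already sits on the token after ':'
        end
        else
          Parser.NextToken;
        Dest.Add(TokenText + '#');
      end;
    finally
      Parser.Free;
    end;
  finally
    SS.Free;
  end;
end;
```

With the form from the answer, it could be called as TokeniseJoiningAssign(Memo1.Lines.Text, Memo2.Lines).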

Glad if it helps. If it does, please feel free to "accept" the answer by clicking the "tick" icon on the left ... – MartynA Nov 18 '14 at 22:46
Thanks @JerryDodge. Perhaps someone will tell us. Maybe the -1 was somehow out of irritation with the OP's later q, now deleted. – MartynA Nov 19 '14 at 09:04