Indy v9 in Delphi 7 - how to extract readable text portion of TIDMessagePart.Body when ContentType is not text/plain

Question

I am trying to extract the readable portion of the Body.Text property of a TIdMessagePart object that is of type TIdText, using something like the code below. However, if the ContentType of the TIdText message part is text/html rather than text/plain, this fills sBody with all the HTML tags. I just want the readable text, but I don't see a way to get that in the version 9 library. Am I missing something?

var
  email: TIdMessage;
  sBody: String;

...

for j := 0 to Pred(email.MessageParts.Count) do
begin
  if email.MessageParts.Items[j] is TIdText then
  begin
    sBody := TIdText(email.MessageParts.Items[j]).Body.Text;
  end;
end;

score 2 · Answer 1 · answered Jan 18 '13 at 22:52


You have to manually parse the HTML to extract the plain text you want from it. TIdMessage is just a container for email data; it does not parse body content for you, other than to deal with charset conversions.
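As a rough illustration of what "parse it yourself" could look like, here is a minimal sketch of a naive tag stripper that should compile in Delphi 7. `StripHtmlTags` is a hypothetical helper, not part of Indy. It assumes simple, well-formed markup: it drops everything between `<` and `>` and decodes only a handful of common entities. It does not handle `<script>` or `<style>` contents, comments containing `>`, or other entities, so a real third-party HTML parser is the better choice for anything beyond simple messages.

```pascal
program StripDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not an Indy routine): removes anything between
// '<' and '>' and decodes a few common entities. Naive by design.
function StripHtmlTags(const Html: string): string;
var
  i, Len: Integer;
  InTag: Boolean;
begin
  // The output can never be longer than the input, so allocate once.
  SetLength(Result, Length(Html));
  Len := 0;
  InTag := False;
  for i := 1 to Length(Html) do
  begin
    if Html[i] = '<' then
      InTag := True                 // start skipping tag content
    else if (Html[i] = '>') and InTag then
      InTag := False                // tag closed, resume copying
    else if not InTag then
    begin
      Inc(Len);
      Result[Len] := Html[i];       // ordinary text character
    end;
  end;
  SetLength(Result, Len);

  // Decode a small, fixed set of entities. '&amp;' goes last so that
  // input such as '&amp;lt;' is not decoded twice.
  Result := StringReplace(Result, '&nbsp;', ' ', [rfReplaceAll, rfIgnoreCase]);
  Result := StringReplace(Result, '&lt;', '<', [rfReplaceAll, rfIgnoreCase]);
  Result := StringReplace(Result, '&gt;', '>', [rfReplaceAll, rfIgnoreCase]);
  Result := StringReplace(Result, '&quot;', '"', [rfReplaceAll, rfIgnoreCase]);
  Result := StringReplace(Result, '&amp;', '&', [rfReplaceAll, rfIgnoreCase]);
end;

begin
  // Prints: Hello & world
  Writeln(StripHtmlTags('<p>Hello &amp; <b>world</b></p>'));
end.
```

In the loop from the question, one way to use it would be to check the part's ContentType and call the helper only when it contains text/html, for example `if Pos('text/html', LowerCase(TIdText(email.MessageParts.Items[j]).ContentType)) > 0 then sBody := StripHtmlTags(sBody);`. It may also be worth checking whether the message carries a separate text/plain part, since many HTML emails include one as an alternative.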


Remy Lebeau


This remains true for Indy 10, and I expect it to remain the same for future Indy versions. – jachguate Jan 18 '13 at 23:06

True, there are no plans to implement a full HTML parser in Indy (there are plenty of third-party parsers available for that). However, Indy 10 does have a small HTML parser in the `ParseMetaHTTPEquiv()` function of the `IdGlobalProtocols` unit, which `TIdHTTP` uses for parsing `<meta>` tags from HTML data. – Remy Lebeau Jan 18 '13 at 23:21
