5

I use Delphi 7 and would like to extract only the text of a web page displayed in a TWebBrowser (no images, etc.). Can this be done, and if so, how?

Charles
M0-3E

2 Answers

6

I used the following...

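// Requires MSHTML in the uses clause (declares IHTMLDocument2).
procedure TForm1.WebBrowser1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
var
  Document: IHTMLDocument2;
begin
  Edit1.Text := URL;
  // Document is nil until a page has loaded and may not be an HTML document.
  if not Supports(WebBrowser1.Document, IHTMLDocument2, Document) then
    Exit;
  if Document.body = nil then
    Exit;
  Memo2.Lines.Add(Trim(Document.body.innerHTML));  // to get the HTML markup
  Memo1.Lines.Add(Trim(Document.body.innerText));  // to get the visible text only
end;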
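
Note that DocumentComplete fires once per frame as well as for the page itself, so on a page with frames or iframes the handler above can run several times and add the text repeatedly. A minimal sketch of one common way to react only to the top-level document, assuming the same Form1, WebBrowser1 and Memo1 as above (the comparison against DefaultInterface is my assumption about how you would want to filter, not part of the original answer):

```delphi
// Assumes SysUtils, SHDocVw and MSHTML are in the uses clause.
procedure TForm1.WebBrowser1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
var
  Document: IHTMLDocument2;
begin
  // pDisp is the browser object (top-level or frame) that just finished loading.
  // Compare both sides as IUnknown so that COM identity rules apply.
  if (pDisp as IUnknown) <> (WebBrowser1.DefaultInterface as IUnknown) then
    Exit;  // a frame finished, not the whole page

  // Replace the memo contents with the visible text of the whole page.
  if Supports(WebBrowser1.Document, IHTMLDocument2, Document) and
     (Document.body <> nil) then
    Memo1.Lines.Text := Document.body.innerText;
end;
```

Assigning to Memo1.Lines.Text instead of calling Lines.Add replaces the previous page's text rather than appending to it; keep Lines.Add if you want to accumulate text across navigations.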
PA.
  • Thank you, PA: this is exactly what I need. I would like to copy the text into a TRichEdit. Is there any way to keep the formatting (bold, H1, ...) of the text? – M0-3E Jan 28 '10 at 15:36
  • You might need to remove all the tags and display the HTML back to the browser. – PA. Jan 28 '10 at 16:12
1

If you want to load this into a TRichEdit, I suggest looking at the WPTools component, which can load data from an HTML stream and export it as RTF. I use this component for my internal email editor (which appears to be what you're after).

skamradt