
I have a question about loading HTML files into a grid in Lazarus/Free Pascal. Is that even possible? I've tried a few approaches and include my code below.

First, I'm talking about the kind of HTML file that contains a table, like this:

<html><head><title>201401061209</title></head><body><table width=451
border=0 cellspacing=0 cellpadding=0 class="realtable">
<tr class="trcolor">
<th></th>
<th align="left" width=40>Di</th>
<th align="left" width=40>Wo</th>
<th align="left" width=40>Do</th>
<th align="left" width=40>Vr</th>
<th align="left" width=40>Za</th>
<th align="left" width=40>Zo</th>
</tr>
<tr>
<td>Zonneschijn (%)</td>
<td align="left" width=40>    30</td>
<td align="left" width=40>20</td>
<td align="left" width=40>10</td>
<td align="left" width=40>10</td>
<td align="left" width=40>10</td>
<td align="left" width=40>10</td>
</tr>
<tr>
<td>Neerslagkans (%)</td>
<td align="left" width=40>    30</td>
<td align="left" width=40>50</td>
<td align="left" width=40>80</td>
<td align="left" width=40>40</td>
<td align="left" width=40>20</td>
<td align="left" width=40>30</td>
</tr>
</table></body></html>

I've already tried to load it into a grid (using StringList.Delimiter:=#9;), but that didn't work at all. I've also tried using Pos, which came closer, but with this code I'm literally stuck in a loop (I think so because the program stops responding once I've opened a file). So what can I do to make this work?

procedure HtmlToGrid(Grid: TStringGrid; const FileName: string;
Sender: TObject);
var
  StringList, Line: TStringList;
  Row, Col: Integer;
  i, positie1, positie2: Integer;
  tekst, gekniptetekst: string;
  einddoc: boolean;
begin
  Grid.RowCount := 0;  //clear any previous data
  StringList := TStringList.Create;
  StringList.LoadFromFile(filename);
  Grid.RowCount := Stringlist.Count;
  try
  Line := TStringList.Create;
  try
  einddoc:=False;
    while einddoc=False do begin
      positie1:= Pos('<tr ', StringList.Text);
      positie2:= Pos('</tr>', StringList.Text);
    for Row := 0 to StringList.Count-1 do begin    //for each row
        tekst:=StringList.Strings[Row];
        //Line[Row]:= GetPart(['<td align="left" width=40>'],['</td>'],tekst);

      if positie1 > 0 then begin
      //positie1:= positie1 + 26;
      if positie2 > 0 then begin               //if both positions exist
        //einderij:=False;
        for Col := 0 to Grid.ColCount-1 do begin     //for each column
          gekniptetekst:= GetPart(['<td align="left" width=40>'],['</td>'],tekst);
          if Col<Line.Count then
            Grid.Cells[Col,Row]:= gekniptetekst
          else
            Grid.Cells[Col,Row]:= '';
        end;

      end;

    end;
  end;
    if Pos('</table>', StringList.Text)-30<positie2 then
      einddoc:=True;
  end;
  finally
    Line.Free;
  end;
  finally
    StringList.Free;
  end;


  {prob := Pos('<th', StringList.Text);
  if (Sender is TLabel) then
    TLabel(Sender).Caption := IntToStr(prob);
   }
end;
Smiley17
  • You can't load HTML into a TDBGrid, except by loading it into a TDataSet descendant type and even then you would have to parse the individual HTML rows into separate records in the dataset. Apart from a TStringGrid, which I think is the only practicable possibility, what other type of grid are you asking about? – MartynA Mar 01 '21 at 16:53
  • Btw, if you are not doing this as homework/coursework you might want to take a look at this Lazarus component: https://wiki.freepascal.org/THtmlPort. This is an FPC port of Dave Baldwin's since-discontinued THtmlViewer component. – MartynA Mar 01 '21 at 17:02
  • @MartynA I was thinking that maybe TGrid could work too... Well, it's not exactly homework, but it is a big project at school. I have a lot of HTML files which need to be combined into one big table. I thought this was the best way to achieve that..? – Smiley17 Mar 02 '21 at 18:15
  • Btw, I'll try THtmlViewer – Smiley17 Mar 02 '21 at 18:19
  • Well, you could certainly get much further much faster using THtmlViewer. But working with HTML as a source of data is a pita. Isn't your source data available in another format? – MartynA Mar 02 '21 at 18:34
  • I'm afraid not... – Smiley17 Mar 02 '21 at 18:40
  • Btw, depending on the scale of what you're doing, it might be worthwhile spending a few hours investigating whether an HTML parser could help you. The JEDI project has one - https://wiki.delphi-jedi.org/wiki/JVCL_Help:TJvHTMLParser - but I've never used it myself. – MartynA Mar 02 '21 at 18:49
  • What's the difference exactly between a viewer and a parser? – Smiley17 Mar 02 '21 at 19:05
  • Well, that's a rather broad subject to answer in a comment. A parser is essentially a software machine that processes an input, e.g. a text file, and analyzes it in terms of the grammatical rules of the input language. An HTML viewer is just that: a viewer that displays text according to the rules of HTML. You can probably guess that any HTML viewer must be built around an HTML parser, and that's true, but an HTML viewer doesn't necessarily make the HTML building blocks of the input text readily accessible for you to process. A specialist form of parser that might help you is a DOM parser. – MartynA Mar 02 '21 at 19:25
  • [cont] See https://en.wikipedia.org/wiki/Document_Object_Model. MS's IE contains a very powerful if ageing DOM interface (Windows only) that can in principle be accessed from a Windows Lazarus app, but that would require a small book to explore and explain. – MartynA Mar 02 '21 at 19:29
  • Jesus, I'm sorry for that question ;) Thanks for all the details. I think THtmlViewer is an option worth trying out. I've downloaded it and added "HtmlViewer" to the uses clause, but when I compile, the compiler can't find it. Which part of the download has to go where? Or do I have to create a new question? I've tried Project -> Project Options -> Paths -> Other unit files, but that doesn't work... – Smiley17 Mar 02 '21 at 20:14
  • I'm a bit busy this evening. If you've downloaded it correctly you should have a top folder named e.g. HTMLViewer with two subfolders, demo_src and package. In the package subfolder there should be a file htmlcomp.lpk. You need to open that in Lazarus and then compile and install it. Then you can use the HtmlViewer component in a new Lazarus GUI project. – MartynA Mar 02 '21 at 20:39
  • Don't worry if you haven't got the time to answer. I've downloaded the file from https://github.com/BerndGabriel/HtmlViewer and, as you can see on that site, the package folder does not contain a file named htmlcomp.lpk. The only option is FrameViewer09.lpk, and that doesn't work. – Smiley17 Mar 02 '21 at 21:07
  • Smiley, how about downloading from the site @MartynA suggested? – Tom Brunberg Mar 03 '21 at 08:06
  • @TomBrunberg I have. When you click on his link you get to a site where everything is explained, but you can't download it directly from that site. You have to click another link to get to a site where you can. So I did that. – Smiley17 Mar 03 '21 at 19:45
  • Ok, sorry. I didn't download it myself, so I probably misunderstood the problem. – Tom Brunberg Mar 03 '21 at 21:58
  • The FPSpreadsheet package contains an HTML reader. You can use the TsWorksheetGrid to display the HTML table as in a TStringGrid, but including all cell attributes such as font, color, text alignment, merged cells etc. You can install FPSpreadsheet from the Online Package Manager with a single click. – wp_1233996 Mar 05 '21 at 15:28
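
As a rough illustration of the "parser" idea from the comments above, here is a minimal Free Pascal sketch that pulls the cell texts out of a table like the one in the question by plain string scanning. It is not a real HTML parser and it is not the method of any of the libraries mentioned. The names FindOpenTag, ExtractRows and Sample are hypothetical and made up for this example, and the sketch assumes that every cell holds plain text without nested tags.

```pascal
program HtmlTableSketch;

{$mode objfpc}{$H+}

uses
  SysUtils, Classes, StrUtils;

const
  // Shortened copy of the table from the question (assumption: same layout).
  Sample =
    '<html><body><table width=451 border=0 class="realtable">' + #10 +
    '<tr class="trcolor"><th></th><th align="left" width=40>Di</th>' +
    '<th align="left" width=40>Wo</th></tr>' + #10 +
    '<tr><td>Zonneschijn (%)</td><td align="left" width=40>    30</td>' +
    '<td align="left" width=40>20</td></tr>' + #10 +
    '</table></body></html>';

// Position of the next opening tag <Name ...> at or after From, or 0.
// The character after the name must be '>' or whitespace, so that
// '<th' does not match '<thead'.
function FindOpenTag(const Lc, Name: string; From: Integer): Integer;
var
  P, After: Integer;
begin
  Result := 0;
  P := PosEx('<' + Name, Lc, From);
  while P > 0 do
  begin
    After := P + Length(Name) + 1;
    if (After <= Length(Lc)) and (Lc[After] in ['>', ' ', #9, #10, #13]) then
      Exit(P);
    P := PosEx('<' + Name, Lc, P + 1);
  end;
end;

// Adds one string per <tr> to Rows; the cell texts are joined by tabs.
procedure ExtractRows(const Html: string; Rows: TStrings);
var
  Lc, Line: string;
  RowStart, RowEnd, Scan, TdPos, ThPos: Integer;
  CellStart, GtPos, TextEnd, CellCount: Integer;
begin
  Lc := LowerCase(Html);   // same length as Html, so positions carry over
  RowStart := FindOpenTag(Lc, 'tr', 1);
  while RowStart > 0 do
  begin
    RowEnd := PosEx('</tr>', Lc, RowStart);
    if RowEnd = 0 then
      RowEnd := Length(Lc) + 1;   // unterminated row: read to the end
    Line := '';
    CellCount := 0;
    Scan := RowStart;
    while True do
    begin
      // Take whichever cell tag (<td> or <th>) comes first.
      TdPos := FindOpenTag(Lc, 'td', Scan);
      ThPos := FindOpenTag(Lc, 'th', Scan);
      if (TdPos = 0) or ((ThPos > 0) and (ThPos < TdPos)) then
        CellStart := ThPos
      else
        CellStart := TdPos;
      if (CellStart = 0) or (CellStart >= RowEnd) then
        Break;                    // no more cells in this row
      GtPos := PosEx('>', Lc, CellStart);
      if GtPos = 0 then
        Break;                    // malformed tag: give up on this row
      TextEnd := PosEx('<', Lc, GtPos + 1);
      if TextEnd = 0 then
        TextEnd := Length(Lc) + 1;
      if CellCount > 0 then
        Line := Line + #9;
      // In a GUI program this is where Grid.Cells[Col, Row] could be set.
      Line := Line + Trim(Copy(Html, GtPos + 1, TextEnd - GtPos - 1));
      Inc(CellCount);
      Scan := TextEnd;
    end;
    Rows.Add(Line);
    RowStart := FindOpenTag(Lc, 'tr', RowEnd);
  end;
end;

var
  Rows: TStringList;
  I: Integer;
begin
  Rows := TStringList.Create;
  try
    ExtractRows(Sample, Rows);
    for I := 0 to Rows.Count - 1 do
      WriteLn(StringReplace(Rows[I], #9, ' | ', [rfReplaceAll]));
  finally
    Rows.Free;
  end;
end.
```

Each search starts from the position where the previous match ended, so the scan always moves forward and ends when no further tag is found. The code in the question, by contrast, recomputes positie1 and positie2 from the start of StringList.Text on every pass, so the exit condition never changes. For anything beyond simple, regular tables like this one, a real HTML or DOM parser as suggested in the comments is likely the more robust route.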

0 Answers