46

If I have an HTML file on disk, how can I read it all at once into a string variable at run time? I then need to do some processing on that string.

The HTML file looks something like this:

<html>
    <table cellspacing="0" cellpadding="0" rules="all" border="1" style="border-width:1px;border-style:solid;width:274px;border-collapse:collapse;">
        <COLGROUP><col width=35px><col width=60px><col width=60px><col width=60px><col width=59px></COLGROUP>
        <tr style="height:20px;">
            <th style="background-color:#A9C4E9;"></th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">A</th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">B</th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">C</th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">D</th>
        </tr><tr style="height:20px;">
            <th align="center" valign="middle" style="color:buttontext;background-color:#E4ECF7;">1</th><td align="left" valign="top" style="color:windowtext;background-color:window;">Hi</td><td align="left" valign="top" style="color:windowtext;background-color:window;">Cell Two</td><td align="left" valign="top" style="color:windowtext;background-color:window;">Actually a longer text</td><td align="left" valign="top" style="color:windowtext;background-color:window;">Final Word</td>
        </tr>
    </table>
</html>
vela
Bohn

8 Answers

69

Use File.ReadAllText, passing the file location as an argument.
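A minimal sketch of that call, in case a complete program helps. The file name sample.html and the helper ReadHtml are made up for illustration, and the UTF-8 encoding is an assumption about how the file was saved (ReadAllText still honors a byte order mark if the file has one).

```csharp
using System;
using System.IO;
using System.Text;

public static class HtmlFileReader
{
    // Hypothetical helper: reads the whole file into one string.
    // UTF-8 is an assumed fallback; a byte order mark in the file takes precedence.
    public static string ReadHtml(string path)
    {
        return File.ReadAllText(path, Encoding.UTF8);
    }

    public static void Main()
    {
        // Hypothetical file, created here only so the sketch runs on its own.
        string path = Path.Combine(Path.GetTempPath(), "sample.html");
        File.WriteAllText(path, "<html><table><tr><td>Hi</td></tr></table></html>", Encoding.UTF8);

        string html = ReadHtml(path);

        // Simple processing on the in-memory string.
        Console.WriteLine(html.Length);
        Console.WriteLine(html.Contains("<td>Hi</td>"));

        File.Delete(path);
    }
}
```

ReadAllText opens the file, reads it to the end, and closes it again, so there is nothing to dispose afterwards.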

However, if your real goal is to parse the HTML, I would recommend using Html Agility Pack.

Eonasdan
empi
25

Use System.IO.File.ReadAllText(fileName)

L.B
19
string html = File.ReadAllText(path);
Forte L.
13

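This is mostly covered already, but one addition, as I ran into an issue with the previous code samples: in an ASP.NET application, a virtual path such as "~/folder/filename.html" has to be mapped to a physical path with Server.MapPath before it is passed to ReadAllText. In VB.NET: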

Dim strHTML as String = System.IO.File.ReadAllText(HttpContext.Current.Server.MapPath("~/folder/filename.html"))
s15199d
5

Use File.ReadAllText(path_to_file) to read the file.

Srijan
4

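What kind of processing are you trying to do? If the file is well-formed XML (XHTML, for instance), you can do XmlDocument doc = new XmlDocument(); followed by doc.Load(filename). The document can then be queried in memory. Note that the sample in the question has unquoted attribute values and unclosed <col> tags, so Load would throw an XmlException on it as-is.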
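A rough sketch of the XmlDocument route, assuming the markup is well-formed XML. The class XmlTableSketch, the helper CellTexts, and the inline sample string are made up for illustration; for a file on disk, doc.Load(filename) would take the place of LoadXml.

```csharp
using System;
using System.Xml;

public static class XmlTableSketch
{
    // Hypothetical helper: returns the text of every <td> element.
    public static string[] CellTexts(string xhtml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xhtml); // throws XmlException if the markup is not well-formed XML

        XmlNodeList cells = doc.SelectNodes("//td");
        string[] texts = new string[cells.Count];
        for (int i = 0; i < cells.Count; i++)
        {
            texts[i] = cells[i].InnerText;
        }
        return texts;
    }

    public static void Main()
    {
        // Well-formed stand-in for the table in the question.
        string xhtml = "<html><table><tr><td>Hi</td><td>Cell Two</td></tr></table></html>";
        foreach (string text in CellTexts(xhtml))
        {
            Console.WriteLine(text);
        }
    }
}
```

The XPath query //td only matches here because the sample declares no XML namespace; a real XHTML file that declares one would need an XmlNamespaceManager.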

Read here for more information on XmlDocument:

Stacked
Ted Spence
4

You can do it the simple way:

string pathToHTMLFile = @"C:\temp\someFile.html";
string htmlString = File.ReadAllText(pathToHTMLFile);

Or you could stream it in with FileStream/StreamReader:

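// FileShare.ReadWrite lets other handles keep reading and writing the file.
using (FileStream fs = File.Open(pathToHTMLFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    using (StreamReader sr = new StreamReader(fs))
    {
        htmlString = sr.ReadToEnd();
    }
}

This latter method allows you to open the file while still permitting others to perform Read/Write operations on the file, because it passes FileShare.ReadWrite explicitly. The File.Open overload without a FileShare argument opens the file with no sharing at all, and asking for FileAccess.ReadWrite when you only read requires write permission you don't need. I can't imagine an HTML file being very big, and ReadToEnd still builds the whole string in memory, so the practical benefit over the first method is control over access and sharing rather than any saving in memory.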
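Whether others can read or write at the same time is governed by the FileShare argument, which the sketch below tries to make visible. It assumes Windows sharing semantics (other platforms enforce FileShare differently), and the class SharedReadSketch, the helper ReadAllowingWriters, and the temp file name are made up for illustration.

```csharp
using System;
using System.IO;

public static class SharedReadSketch
{
    // Hypothetical helper: reads the whole file while allowing other
    // handles to keep reading and writing it.
    public static string ReadAllowingWriters(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (StreamReader sr = new StreamReader(fs))
        {
            return sr.ReadToEnd();
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "shared-sample.html");
        File.WriteAllText(path, "<html><body>Hi</body></html>");

        // Simulate another party that keeps the file open for writing.
        using (FileStream writer = new FileStream(path, FileMode.Open, FileAccess.Write, FileShare.ReadWrite))
        {
            try
            {
                // ReadAllText opens the file with FileShare.Read, which
                // conflicts with the open write handle on Windows.
                Console.WriteLine(File.ReadAllText(path).Length);
            }
            catch (IOException ex)
            {
                Console.WriteLine("ReadAllText failed: " + ex.Message);
            }

            // Succeeds because this handle tolerates concurrent writers.
            Console.WriteLine(ReadAllowingWriters(path).Length);
        }

        File.Delete(path);
    }
}
```

So File.ReadAllText does not block other readers, but it can fail while someone else holds the file open for writing; passing FileShare.ReadWrite explicitly is what relaxes that.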

vapcguy
  • "This latter method allows you to open the file while still permitting others to perform Read/Write operations on the file." Are you implying the first method prohibits others from performing Read operations on the file? Because I don't suppose so. – om-ha Jan 04 '20 at 07:57
  • Waiting for your response @vapcguy – om-ha Jan 04 '20 at 07:59
  • 1
    @om-ha Wasn't trying to imply that at all - honestly, I never tested reading and writing operations at the same time while doing a `File.ReadAllText`. I have tested it with multiple combinations of the latter code block, however. I only meant it as a description for what is in that last block of code because there are various options to use with `FileMode` and `FileAccess` that don't actually work that you might think actually would. – vapcguy Jan 06 '20 at 08:00
1
var htmlText = System.IO.File.ReadAllText(@"C:/filename.html");

And if the file is at the application root, use the following:

var htmlText = System.IO.File.ReadAllText(HttpContext.Current.Server.MapPath(@"~/filename.html"));