I am using CsQuery to parse a website in Arabic. When I call Text(), it returns the text as is, but when I call Html(), the text comes back HTML-encoded as numeric character references. For example, given this HTML tag:
<div>تعلن عن إرسالها مركبة فضائية للمريخ قريباً جداً</div>
When I use:
dom["div"].Text();
It returns: "تعلن عن إرسالها مركبة فضائية للمريخ قريباً جداً". However when I use:
dom["div"].Html();
It returns:
&#1578;&#1593;&#1604;&#1606; &#1593;&#1606; &#1573;&#1585;&#1587;&#1575;&#1604;&#1607;&#1575; &#1605;&#1585;&#1603;&#1576;&#1577; &#1601;&#1590;&#1575;&#1574;&#1610;&#1577; &#1604;&#1604;&#1605;&#1585;&#1610;&#1582; &#1602;&#1585;&#1610;&#1576;&#1575;&#1611; &#1580;&#1583;&#1575;&#1611;
How can I use Html() while preserving the actual text, without the encoding? I need Html() rather than Text() because I want to retrieve any tags nested inside the selected element.
Edit: here is the Content-Type declaration from the original HTML page:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">