I'm using TidyManaged, a .NET wrapper for HTML Tidy: https://github.com/markbeaton/TidyManaged

It comes with a simple example:

using System;
using TidyManaged;

public class Test
{
    public static void Main(string[] args)
    {
        using (Document doc = Document.FromString("<hTml><title>test</tootle>     <body>asd</body>"))
        {
            doc.ShowWarnings = false;
            doc.Quiet = true;
            doc.OutputXhtml = true;
            doc.CleanAndRepair();
            string parsed = doc.Save();
            Console.WriteLine(parsed);
        }
    }
}

I want to use the library on an HTML fragment rather than a full page with "html" and "body" tags. Is that possible?

Basically, I want to validate opening and closing tags and remove any closing tags that have no matching opening tag.

TylerH
arik
1 Answer

I found the answer: set OutputBodyOnly so that only the contents of the body are returned:

 doc.OutputBodyOnly = AutoBool.Yes;
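
Combined with the example from the question, a minimal sketch for a fragment could look like the following. The input string is made up for illustration; every TidyManaged call is one that already appears above.

```csharp
using System;
using TidyManaged;

public class FragmentTest
{
    public static void Main(string[] args)
    {
        // Hypothetical fragment: no <html>/<body> wrapper, an unclosed <b>,
        // and a stray </i> that has no matching opening tag.
        using (Document doc = Document.FromString("<p>some <b>text</p></i>"))
        {
            doc.ShowWarnings = false;
            doc.Quiet = true;
            doc.OutputXhtml = true;
            doc.OutputBodyOnly = AutoBool.Yes; // emit only the body contents
            doc.CleanAndRepair();
            Console.WriteLine(doc.Save());
        }
    }
}
```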

However, the result contains garbled characters instead of UTF-8 (my text includes Hebrew characters). It turns out this is an open bug: https://github.com/markbeaton/TidyManaged/issues/2

Setting the encodings explicitly didn't solve the issue:

doc.InputCharacterEncoding = TidyManaged.EncodingType.Utf8;
doc.OutputCharacterEncoding = TidyManaged.EncodingType.Utf8;
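
One thing that may be worth trying is to avoid the string-based methods and pass UTF-8 bytes through streams instead. This is an untested sketch: it assumes TidyManaged exposes Document.FromStream and a Save(Stream) overload, and it is not confirmed to work around the bug linked above. The Hebrew input string is only an illustration.

```csharp
using System;
using System.IO;
using System.Text;
using TidyManaged;

public class Utf8FragmentTest
{
    public static void Main(string[] args)
    {
        // Illustrative fragment containing Hebrew text and an unclosed <b>.
        string html = "<p>שלום <b>עולם</p>";
        byte[] inputBytes = Encoding.UTF8.GetBytes(html);

        using (var input = new MemoryStream(inputBytes))
        using (Document doc = Document.FromStream(input)) // assumed API
        using (var output = new MemoryStream())
        {
            doc.InputCharacterEncoding = EncodingType.Utf8;
            doc.OutputCharacterEncoding = EncodingType.Utf8;
            doc.OutputBodyOnly = AutoBool.Yes;
            doc.CleanAndRepair();

            // Assumed overload: write the raw bytes, then decode them ourselves
            // so that no intermediate string conversion touches the text.
            doc.Save(output);
            string parsed = Encoding.UTF8.GetString(output.ToArray());
            Console.WriteLine(parsed);
        }
    }
}
```

If the console itself shows question marks, that can be a console encoding problem rather than a Tidy one; writing the bytes to a file and opening it in an editor is a simple way to tell the two apart.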