8

I'm using the RubyTidy Ruby bindings for HTML Tidy to make sure the HTML I receive is well-formed. Currently this library is the only thing holding me back from getting a Rails application on Ruby 1.9. Are there any alternative libraries out there that will tidy up chunks of HTML on Ruby 1.9?

Christian

4 Answers

7

tidy_ffi (http://github.com/libc/tidy_ffi/blob/master/README.rdoc) works with Ruby 1.9 (latest version).

If you are working on Windows, you need to set the library_path, e.g.:

    require 'tidy_ffi'
    TidyFFI.library_path = 'lib\\tidy\\bin\\tidy.dll'
    tidy = TidyFFI::Tidy.new('test')
    puts tidy.clean

(It uses the same DLL as tidy.) The link above gives more usage examples.

surajz
7

I am using Nokogiri to fix invalid HTML:

    require 'nokogiri'
    Nokogiri::HTML::DocumentFragment.parse(html).to_html

Laurynas
3

Here is a nice example of how to make your HTML look better using tidy:

    require 'tidy'
    Tidy.path = '/opt/local/lib/libtidy.dylib' # or wherever your tidylib resides

    nice_html = ""
    Tidy.open(:show_warnings=>true) do |tidy|
      tidy.options.output_xhtml = true
      tidy.options.wrap = 0
      tidy.options.indent = 'auto'
      tidy.options.indent_attributes = false
      tidy.options.indent_spaces = 4
      tidy.options.vertical_space = false
      tidy.options.char_encoding = 'utf8'
      nice_html = tidy.clean(my_nasty_html_string)
    end

    # remove excess newlines
    nice_html = nice_html.strip.gsub(/\n+/, "\n")
    puts nice_html

For more tidy options, check out the man page.
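
The final cleanup step in the snippet above can be tried on its own, without tidy installed. This is a small sketch; the helper name `squeeze_newlines` and the sample string are made up for illustration, and only the `strip.gsub(/\n+/, "\n")` expression comes from the answer.

```ruby
# Hypothetical helper wrapping the answer's cleanup expression:
# trim leading/trailing whitespace, then collapse each run of
# newlines into a single newline.
def squeeze_newlines(html)
  html.strip.gsub(/\n+/, "\n")
end

# Made-up sample standing in for tidy's output.
sample = "<html>\n\n\n<body>\n\n</body>\n</html>\n\n"

puts squeeze_newlines(sample)
# prints:
# <html>
# <body>
# </body>
# </html>
```

One thing to keep in mind: the substitution is applied to the whole string, so blank lines inside `<pre>` blocks are collapsed as well.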

thomax
  • As of now it appears the tidy gem is incompatible with Ruby 1.9. There appears to be a fork at https://github.com/ShogunPanda/tidy but I haven't investigated it. – aceofspades Jan 19 '12 at 18:40
1

Currently this library is the only thing holding me back from getting a Rails application on Ruby 1.9.

Watch out: the Ruby Tidy bindings have some nasty memory leaks. They're currently unusable in long-running processes. (For the record, I'm using http://github.com/ak47/tidy.)

I just had to remove it from a production Rails 2.3 application because it was leaking about 1 MB/min.

Xavier