
I'm using the Curb gem (https://github.com/taf2/curb) to get the HTML of a page that has special characters:

http = Curl.get("http://www.baidu.com/")
puts http.body_str

http.body_str.encoding is ASCII-8BIT. How do I get body_str as UTF-8 without having to convert it after the fact?

ill_always_be_a_warriors

1 Answer


You can use Curl::Easy.encoding. http://curb.rubyforge.org/classes/Curl/Easy.html#M000035

Auli
  • I just tried this, and it seems my encoding is still not UTF-8: `2.1.2 :192 > c = Curl::Easy.new("http://www.baidu.com") => # 2.1.2 :193 > c.encoding = 'utf-8' => "utf-8" 2.1.2 :194 > c.perform => true 2.1.2 :195 > c.body_str.encoding => #` – ill_always_be_a_warriors Dec 28 '15 at 11:03
  • It's a curb problem (see github.com/vcr/vcr/issues/150) that has not been solved yet. For now you can only re-encode the body yourself afterwards. – Auli Dec 28 '15 at 11:24
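
For reference, re-encoding after the fact could look roughly like the sketch below. It assumes the page is actually served as UTF-8. `body_as_utf8` is a hypothetical helper, not part of curb, and the `raw` string only stands in for `http.body_str` so the snippet runs without a network call.

```ruby
# Hypothetical helper (not part of curb): relabel raw response bytes as UTF-8.
# Assumption: the server really sent UTF-8 encoded bytes.
def body_as_utf8(raw)
  # force_encoding changes only the encoding label, not the bytes;
  # dup keeps the original string untouched.
  text = raw.dup.force_encoding(Encoding::UTF_8)
  # Replace any byte sequences that are not valid UTF-8 with "?".
  text.valid_encoding? ? text : text.scrub("?")
end

# Stand-in for http.body_str: a binary (ASCII-8BIT) string holding UTF-8 bytes.
raw = "百度一下".b
html = body_as_utf8(raw)
puts html.encoding
```

With curb you would pass `http.body_str` instead of `raw`. If the page uses a different charset (check the Content-Type header or the page's meta tag), label the string with that encoding first and then call `encode("UTF-8")`.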