
I'm getting data from an API where special characters are getting double encoded. By that I mean ’ is encoded as &amp;#8217;. I know how to decode, but I am unable to double decode. I've tried raw and html_safe, but neither will decode it past &#8217;, even if I double up, i.e. raw raw or .html_safe.html_safe. How can I completely decode these characters?

user3688241
    Possible duplicate of [How do I encode/decode HTML entities in Ruby?](http://stackoverflow.com/questions/1600526/how-do-i-encode-decode-html-entities-in-ruby) – infused Feb 05 '16 at 22:57
    @infused I don't think this is a duplicate... OP has a condition of double encoding, it's not straight encode / decode HTML. – SteveTurczyn Feb 05 '16 at 23:49

2 Answers


This works...

require 'rubygems'
require 'nokogiri'

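my_string = "This is Sam&amp;#8217;s bicycle"

# Undo the extra layer of encoding (&amp; becomes &), then let Nokogiri decode the remaining entity
decoded_string = Nokogiri::HTML(my_string.gsub('&amp;', '&')).text

puts decoded_string
# => This is Sam’s bicycle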
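If the number of encoding layers isn't known in advance, one option is to keep decoding until the string stops changing. The following is only a minimal sketch, not the approach from this answer: it swaps Nokogiri for the standard library's CGI.unescapeHTML, and both the helper name fully_unescape and the max_passes cap are made up for illustration.

```ruby
require 'cgi'

# Hypothetical helper: unescape repeatedly until the string stops changing.
# max_passes is an assumed safety cap, not something from the question.
def fully_unescape(text, max_passes: 5)
  max_passes.times do
    decoded = CGI.unescapeHTML(text)
    return decoded if decoded == text # nothing left to decode
    text = decoded
  end
  text
end

puts fully_unescape("This is Sam&amp;#8217;s bicycle")
```

Note that CGI.unescapeHTML only handles &amp;, &lt;, &gt;, &quot;, &apos; and numeric references, so named entities such as &rsquo; would still need something like Nokogiri for the final pass. Repeated decoding will also over-decode text that was meant to contain a literal entity, so it is only appropriate when the input is known to be over-encoded.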
SteveTurczyn

I used to have the same problem. My workaround for all HTML problems is as follows:

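require 'nokogiri'

# Removes literal backslash-r / backslash-n sequences, then strips tags and decodes entities once
def format_html_sentence(sentence)
  Nokogiri::HTML.parse(sentence.gsub(/(\\r|\\n)/, '')).text
end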
Datpmt