I have an HTML form with a single input field. The input is pasted from the clipboard after being copied from other programs. Sometimes the copied text is not UTF-8 but ANSI (tested with Notepad++). Then umlauts like ü arrive as Ã¼. As I don't want to change the encoding of the clipboard text every time (e.g. with Notepad++), I would like to do this in JavaScript directly when parsing and splitting the input text.

Is there a way to do this without writing my own function (which would be my next step, covering the most common umlauts)?

Sammy
  • OK. As always: I search, test and look around, then I finally ask, and right away an answer like this one turns up: http://stackoverflow.com/questions/18222665/huge-string-replace-in-javascript Maybe it's the best fit for my case. – Sammy Mar 11 '15 at 09:21
  • I was looking for a generic UTF-8 converter and found this: http://jsfromhell.com/geral/utf-8 . However, Ã¼ is not being converted to ü... In any case, if you want to, you can listen to the "paste" event and change the input value to the encoded one. I made this codepen while trying: http://codepen.io/anon/pen/GgYZeb . Every time you paste something it will automatically convert it to UTF-8, but it doesn't seem to work for your specific case. – briosheje Mar 11 '15 at 09:37
  • Oh dang, that's because it is utf8.decode, I'm an idiot. Please check this out: http://codepen.io/anon/pen/GgYZeb – briosheje Mar 11 '15 at 09:41

1 Answer

Borrowing this from the internet:

//+ Jonas Raoni Soares Silva
//@ http://jsfromhell.com/geral/utf-8 [rev. #1]

var UTF8 = {
    encode: function(s){
        for(var c, i = -1, l = (s = s.split("")).length, o = String.fromCharCode; ++i < l;
            s[i] = (c = s[i].charCodeAt(0)) >= 127 ? o(0xc0 | (c >>> 6)) + o(0x80 | (c & 0x3f)) : s[i]
        );
        return s.join("");
    },
    decode: function(s){
        for(var a, b, i = -1, l = (s = s.split("")).length, o = String.fromCharCode, c = "charCodeAt"; ++i < l;
            ((a = s[i][c](0)) & 0x80) &&
            (s[i] = (a & 0xfc) == 0xc0 && ((b = s[i + 1][c](0)) & 0xc0) == 0x80 ?
            o(((a & 0x03) << 6) + (b & 0x3f)) : o(128), s[++i] = "")
        );
        return s.join("");
    }
};

You can then add your input:

<input type="text" id="test">

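Then listen for the paste event and, after a few milliseconds, replace the entire value of the input with the decoded one. The delay is needed because the event fires before the pasted text is inserted, so reading .val() immediately would give you the old value ("" for an empty input):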

$('#test').on('paste', function(e) {
  var controller = $(this);
  setTimeout(function(){
    controller.val(UTF8.decode(controller.val()));
  },10);
});
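
One caveat with the handler above: it decodes the whole value of the input, including text that was already correct. As far as I can tell, UTF8.decode replaces an already-decoded character such as ü with a placeholder and blanks the character after it, so pasting into a field that already holds text can eat characters. Here is a minimal sketch that decodes only the pasted text instead. It assumes the browser exposes e.clipboardData on the paste event; insertDecoded is a hypothetical helper name, not part of the original snippet:

```javascript
// Hypothetical helper: replace the selected range [start, end) of `value`
// with the decoded form of `pasted`, leaving the rest of the value untouched.
function insertDecoded(value, start, end, pasted, decode) {
  var decoded = decode(pasted);
  return {
    value: value.slice(0, start) + decoded + value.slice(end),
    caret: start + decoded.length // where the cursor should end up
  };
}

// Browser-only wiring; skipped when there is no DOM (e.g. under Node).
if (typeof document !== 'undefined') {
  var input = document.getElementById('test');
  if (input) {
    input.addEventListener('paste', function (e) {
      // Read the clipboard text directly, so no setTimeout is needed.
      var pasted = e.clipboardData.getData('text/plain');
      e.preventDefault(); // we insert the text ourselves
      var result = insertDecoded(input.value, input.selectionStart,
                                 input.selectionEnd, pasted, UTF8.decode);
      input.value = result.value;
      input.setSelectionRange(result.caret, result.caret);
    });
  }
}
```

The decoder is passed in as a parameter, so the same helper works with UTF8.decode from above or with any other decoding function.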

Codepen:

http://codepen.io/anon/pen/GgYZeb

Please note that this only listens for the paste event. You can bind other events as well if you need them.
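
If the browser supports TextDecoder, a more defensive variant is possible. The sketch below maps every character back to the Windows-1252 byte it presumably came from and decodes those bytes as strict UTF-8; if either step fails, the text is returned unchanged, so input that was already correct is normally left alone. fixMojibake, toByte and CP1252_HIGH are names made up for this example, and the assumption that the wrong encoding was Windows-1252 ("ANSI" on Western European Windows) may not hold for every source program:

```javascript
// Code points that Windows-1252 assigns to bytes 0x80..0x9F (0 = unassigned).
var CP1252_HIGH = [
  0x20AC, 0, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0, 0x017D, 0,
  0, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0, 0x017E, 0x0178
];

// Hypothetical helper: map one UTF-16 code unit back to a single byte,
// or return -1 if the character cannot have come from a Windows-1252 byte.
function toByte(code) {
  if (code <= 0xFF) return code; // ASCII and Latin-1 map one-to-one
  var idx = CP1252_HIGH.indexOf(code);
  return idx === -1 ? -1 : 0x80 + idx;
}

// Try to undo "UTF-8 read as Windows-1252"; return the input unchanged
// when it does not look like that kind of mis-decoding.
function fixMojibake(text) {
  var bytes = new Uint8Array(text.length);
  for (var i = 0; i < text.length; i++) {
    var b = toByte(text.charCodeAt(i));
    if (b === -1) return text; // not a single-byte character: leave as is
    bytes[i] = b;
  }
  try {
    // fatal: true makes decode() throw on invalid UTF-8 instead of
    // silently inserting replacement characters.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (e) {
    return text; // bytes are not valid UTF-8, so the text was probably fine
  }
}
```

With this, fixMojibake("Ã¼") gives "ü" and fixMojibake("Ãœber") gives "Über", while a correctly pasted "ü" stays as it is. Text that legitimately contains a sequence like "Ã¼" would be changed too, so this is a heuristic rather than a guaranteed fix.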

briosheje
  • Thanks. I'll try this - I think, I don't have to understand every edge of that code ;-) – Sammy Mar 11 '15 at 09:50
  • @Sammy: I've noticed that it is quite buggy though: if you paste one piece of text and then paste another, it will for some reason delete some characters. I don't really know how that function (the UTF8 object) works, so I can't take it any further, but as long as you paste an entire encoded ANSI string it will decode it correctly. – briosheje Mar 11 '15 at 09:51
  • @Sammy: You may also want to take a further look at this: http://stackoverflow.com/questions/6607799/javascript-convert-ansi-to-utf8 – briosheje Mar 11 '15 at 09:53
  • My quick-and-dirty way was to use the link I posted. After saving my JS file as UTF-8 it worked well... I will try this solution later. – Sammy Mar 11 '15 at 13:02