12

I have this code, and I have no idea why it doesn't work. It should take the value of a text input and replace each listed national character with its HTML entity, for compatibility purposes. But when I click the button, the function returns the string unchanged. Any ideas?

(jsfiddle)

<a id="reminder1" onclick="document.getElementById('reminder2').style.display = ''; document.getElementById('reminder1').style.display = 'none';">
    Set reminder
</a>
<a id="reminder2" class="reminder" style="display:none;">
    <input type="text" id="reminderh" size=40 style="font-size:20px;">
    <input type="button" value="Set" onclick="csere(document.getElementById('reminderh').value);">
</a>

<script>
function csere(qwe){
document.getElementById('reminder2').style.display = 'none';

var rtz0  = qwe.replace("á","&aacute;");
var rtz1  = rtz0.replace("Á","&Aacute;");

var rtz2  = rtz1.replace("é","&eacute;");
var rtz3  = rtz2.replace("É","&Eacute;");

var rtz4  = rtz3.replace("í","&iacute;");
var rtz5  = rtz4.replace("Í","&Iacute;");

var rtz6  = rtz5.replace("ö","&ouml;");
var rtz7  = rtz6.replace("Ö","&Ouml;");
var rtz8  = rtz7.replace("ő","&&#337;");
var rtz9  = rtz8.replace("Ő","&#336;");
var rtz10 = rtz9.replace("ó","&oacute;");
var rtz11 = rtz10.replace("Ó","&Oacute;");

var rtz12 = rtz11.replace("ü","&uuml;");
var rtz13 = rtz12.replace("Ü","&Uuml;");
var rtz14 = rtz13.replace("ű","&#369;");
var rtz15 = rtz14.replace("Ű","&#368;");
var rtz16 = rtz15.replace("ú","&uacute;");
var uio = rtz16.replace("Ú","&Uacute;");

//Creates a cookie with the final value (different function)
createCookie('reminder',uio,1500);

document.getElementById('reminder1').style.display = '';
}
</script>
SeinopSys
  • Works for me (I used `console.log`) – SomeKittens Aug 02 '12 at 17:16
  • you never assign the value back to the element after doing all of the replaces. – jbabey Aug 02 '12 at 17:29
  • Turns out I didn't actually need the entity encoding. I was only using it because the entire website uses it, and I thought it was necessary. Basically, I don't need code to replace national characters, because they display fine even without it. – SeinopSys Aug 02 '12 at 18:32

4 Answers

11

You could create an object that has key/value pairs for each character to replace:

var chars = {
    "á" : "&aacute;",
    "Á" : "&Aacute;",
    "é" : "&eacute;",
    "É" : "&Eacute;",
    ...
}

And then use a function in your .replace call:

var uio = qwe.replace(/[áÁéÉ]/g,function(c) { return chars[c]; });

Your object and regular expression will obviously need to grow to include all the characters you want to replace.
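One way to keep the map and the regular expression in sync is to build the character class from the map's keys. The following is a minimal sketch of that idea, not part of the original answer: the names `entities`, `pattern`, and `encodeEntities` are made up for illustration, and it assumes every key is a single character that has no special meaning inside a regex character class.

```javascript
// Hypothetical map of characters to entities; extend it as needed.
// Keys are written as escapes so the file's encoding cannot change them.
const entities = {
  "\u00E1": "&aacute;", // á
  "\u00C1": "&Aacute;", // Á
  "\u00E9": "&eacute;", // é
  "\u00C9": "&Eacute;", // É
  "\u0151": "&#337;",   // ő
  "\u0150": "&#336;",   // Ő
};

// Build the character class from the keys, so adding a map entry
// automatically extends the pattern. The "g" flag replaces every match.
const pattern = new RegExp("[" + Object.keys(entities).join("") + "]", "g");

function encodeEntities(text) {
  // The callback receives each matched character and looks it up in the map.
  return text.replace(pattern, (ch) => entities[ch]);
}

console.log(encodeEntities("Árvíztűrő tükörfúrógép"));
```

Characters that are not in the map (such as `í` or `ű` here) pass through unchanged until you add them.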

jackwanders
9

You can just replace everything programmatically, not using named entities:

return input.replace(/[^ -~]/g, function(chr) {
//                    ^^^^^^ 
// this is a regexp for "everything other than printable ASCII characters"
// and even works in an ASCII-only charset. Identical to: [^\u0020-\u007E]
    return "&#"+chr.charCodeAt(0)+";";
});

If you want to use named entities, you can combine this with a key-value map (as in @jackwanders' answer):

var chars = {
    "á" : "&aacute;",
    "Á" : "&Aacute;",
    "é" : "&eacute;",
    "É" : "&Eacute;",
    ...
}
return input.replace(/[^ -~]/g, function(chr) {
    return (chr in chars) 
      ? chars[chr]
      : "&#"+chr.charCodeAt(0)+";";
});

However, you should never need to use HTML entities in JavaScript. Use UTF-8 as the character encoding for everything, and it will work.
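If you do go the numeric-entity route, one caveat is that `charCodeAt(0)` works on UTF-16 code units, so a character outside the Basic Multilingual Plane (an emoji, for instance) would come out as two surrogate values rather than one entity. Below is a hedged variation on the snippet above, assuming an ES2015+ environment; the names `named` and `toEntities` are illustrative only.

```javascript
// Hypothetical map of named entities; anything missing falls back to &#N;.
const named = {
  "\u00E1": "&aacute;", // á
  "\u00E9": "&eacute;", // é
};

function toEntities(input) {
  // The "u" flag makes the regex match whole code points, so a character
  // stored as a surrogate pair reaches the callback as a single string.
  // Note that the class also matches control characters such as newlines.
  return input.replace(/[^ -~]/gu, (ch) =>
    Object.prototype.hasOwnProperty.call(named, ch)
      ? named[ch]
      : "&#" + ch.codePointAt(0) + ";"
  );
}

console.log(toEntities("\u00E9 \u0151 \u{1F600}"));
```

`codePointAt(0)` returns the full code point for the matched character, which is what a numeric character reference expects.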

Bergi
  • Using @jackwanders' answer, how do I write the characters in the second part of the code? If I try to use the /\u00E1/ format, I get errors. – SeinopSys Aug 02 '12 at 17:58
  • Where do you use that? If you still have a faulty encoding, you'd need to escape the keys of the map object. – Bergi Aug 02 '12 at 18:03
3

The characters are subject to the encoding of the HTML page, the JavaScript file, and the HTTP request. Try replacing the characters with their Unicode escape sequences:

<a id="reminder1" onclick="document.getElementById('reminder2').style.display = ''; document.getElementById('reminder1').style.display = 'none';">
    Set reminder
</a>
<a id="reminder2" class="reminder" style="display:none;">
    <input type="text" id="reminderh" size=40 style="font-size:20px;">
    <input type="button" value="Set" onclick="csere(document.getElementById('reminderh').value);">
</a>

<script>
function csere(qwe){
document.getElementById('reminder2').style.display = 'none';

var rtz0  = qwe.replace(/\u00E1/,"&aacute;");
var rtz1  = rtz0.replace(/\u00C1/,"&Aacute;");

var rtz2  = rtz1.replace(/\u00E9/,"&eacute;");
var rtz3  = rtz2.replace(/\u00C9/,"&Eacute;");

var rtz4  = rtz3.replace(/\u00ED/,"&iacute;");
var rtz5  = rtz4.replace(/\u00CD/,"&Iacute;");

var rtz6  = rtz5.replace(/\u00F6/,"&ouml;");
var rtz7  = rtz6.replace(/\u00D6/,"&Ouml;");
var rtz8  = rtz7.replace(/\u0151/,"&#337;");
var rtz9  = rtz8.replace(/\u0150/,"&#336;");
var rtz10 = rtz9.replace(/\u00F3/,"&oacute;");
var rtz11 = rtz10.replace(/\u00D3/,"&Oacute;");

var rtz12 = rtz11.replace(/\u00FC/,"&uuml;");
var rtz13 = rtz12.replace(/\u00DC/,"&Uuml;");
var rtz14 = rtz13.replace(/\u0171/,"&#369;");
var rtz15 = rtz14.replace(/\u0170/,"&#368;");
var rtz16 = rtz15.replace(/\u00FA/,"&uacute;");
var uio = rtz16.replace(/\u00DA/,"&Uacute;");

//Creates a cookie with the final value (different function)
createCookie('reminder',uio,1500);

document.getElementById('reminder1').style.display = '';
}
</script>

Double check my conversions to be sure. I used the grid on Wikibooks.
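To double-check the conversions without a lookup table, a small helper can print the escape for each character. This is only a sketch: `toUnicodeEscape` is a made-up name, and it assumes you run it somewhere the characters are read correctly (the browser console, for example), since a mis-encoded source file would misread the literals here too.

```javascript
// Return the \uXXXX escape for the first UTF-16 code unit of a character.
function toUnicodeEscape(ch) {
  const hex = ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
  return "\\u" + hex;
}

// Print the escape for each Hungarian accented letter used in the question.
for (const ch of "áÁéÉíÍöÖőŐóÓüÜűŰúÚ") {
  console.log(ch, toUnicodeEscape(ch));
}
```

For instance, it reports `\u0151` for `ő` and `\u0171` for `ű`, both of which lie outside the Latin-1 range.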

Micah Henning
  • YES! Thank you, looks like the Unicode replacement solved it. – SeinopSys Aug 02 '12 at 17:31
  • 1
    If that was a solution, then you should fix your encoding. Anyway, you should improve your algorithm, see jackwanders' answer or mine. – Bergi Aug 02 '12 at 17:43
  • 1
    This will only replace the first instance of every character. Here is an example: http://jsfiddle.net/hHZhv/1 – Josh Mein Aug 02 '12 at 18:03
1

I think you are having an issue with only the first instance of a character being replaced. In JavaScript you have to specify a global replace using a regex, like this:

var rtz0  = qwe.replace(new RegExp("á", "g"), "&aacute;");

It would be best to create an array as mentioned by PPvG or jackwanders, but otherwise at least reuse the existing variable. You could easily do it like this:

qwe  = qwe.replace(new RegExp("á", "g"), "&aacute;");
qwe  = qwe.replace(new RegExp("Á", "g"), "&Aacute;");
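The difference between a string pattern and a global regex is easy to see on a word that contains the same letter twice. A minimal sketch (the variable names are illustrative, and the split/join line is an alternative technique that is not part of the answer above):

```javascript
const input = "\u00E1rv\u00E1cska"; // "árvácska", which contains "á" twice

// A plain string pattern replaces only the first occurrence.
const firstOnly = input.replace("\u00E1", "&aacute;");

// A regex with the "g" flag replaces every occurrence.
const everyMatch = input.replace(/\u00E1/g, "&aacute;");

// split/join is a regex-free way to get the same result.
const viaSplitJoin = input.split("\u00E1").join("&aacute;");

console.log(firstOnly);    // the second "á" is left untouched
console.log(everyMatch);   // both occurrences are replaced
console.log(viaSplitJoin); // same as everyMatch
```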
Josh Mein
  • Like I said, this method again just returns the value as it is and does nothing with it. – SeinopSys Aug 02 '12 at 17:29
  • @DJDavid98 I agree this does not seem to be the reason why it is not replacing, however you will need to do a global replace as I recommended. I also highly recommend using an array as mentioned by PPvG and jackwanders. – Josh Mein Aug 02 '12 at 17:33
  • The problem was that the characters were not being recognized because they were not in the correct form; switching to Unicode escapes solved it. – SeinopSys Aug 02 '12 at 17:37
  • @DJDavid98 I noticed that. However, if a character is used more than once, your current code will only replace the first instance of that character, and using an array will drastically reduce your code. – Josh Mein Aug 02 '12 at 17:38
  • Josh, the String.replace() method does not need a regular expression. A string is fine. See here: https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/String/replace – Micah Henning Aug 02 '12 at 17:42
  • @Josh Mein God, you're right. I'll try to add an array, if I figure out how to implement it xD – SeinopSys Aug 02 '12 at 17:43
  • @MicahHenning I agree it does not require it, but if you do not use it then you will only replace the first instance of a character. Here is an example: http://jsfiddle.net/hHZhv/1/ – Josh Mein Aug 02 '12 at 17:43
  • Oh right, good point. DJDavid98, you can just add the letter g after the Unicode patterns, since they're already in regex form: rtz3.replace(/\u00ED/g,"&iacute;"); – Micah Henning Aug 02 '12 at 19:04