I'm looking to detect internationalized domain names and local portions in email addresses, and would like to know if there is a quick and easy way to do this with regex or otherwise in JavaScript.
26
-
What do you mean by ASCII? Remember that NUL (\0), BEL (\7 - causes PC to beep), ESC (\033) are also valid ASCII characters but most wouldn't consider them to be valid ASCII text. – slebetman Nov 23 '12 at 04:13
-
@slebetman very fair point to add. – wwaawaw Nov 23 '12 at 05:34
5 Answers
31
This should do it...
var hasMoreThanAscii = !/^[\u0000-\u007f]*$/.test(str);
...also...
var hasMoreThanAscii = str
    .split("")
    .some(function(char) { return char.charCodeAt(0) > 127; });
ES6 goodness...
let hasMoreThanAscii = [...str].some(char => char.charCodeAt(0) > 127);
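For the email case in the question, the same test could be run on each side of the `@` separately. This is only a sketch: `nonAsciiParts` is a hypothetical helper name, and splitting on the last `@` is an assumption that ignores the finer points of address syntax.

```javascript
// Hypothetical helper: reports which parts of an address contain non-ASCII characters.
function nonAsciiParts(email) {
  var nonAscii = /[^\u0000-\u007f]/;
  // Split on the last "@", since a quoted local part may itself contain "@".
  var at = email.lastIndexOf("@");
  if (at === -1) {
    return null; // not of the form local@domain
  }
  return {
    local: nonAscii.test(email.slice(0, at)),
    domain: nonAscii.test(email.slice(at + 1))
  };
}

console.log(nonAsciiParts("user@example.com"));    // { local: false, domain: false }
console.log(nonAsciiParts("用户@example.com"));     // { local: true, domain: false }
console.log(nonAsciiParts("user@bücher.example")); // { local: false, domain: true }
```

Note that a domain already written in its Punycode form (`xn--...`) is plain ASCII, so a check like this would not flag it.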

alex
-
Shouldn't the `+` be a `*`? This requires that the string has characters in it, but the empty string `""` fulfills the OP's strict requirements: it doesn't have any non-ASCII characters in it. – Jeff Nov 23 '12 at 03:50
-
If you change your `.filter` to `.some`, you can get rid of `.length > 0` – I Hate Lazy Nov 23 '12 at 03:52
-
Nah, any browser that supports `.filter()`, supports `.some()`. They're both ES5 additions. :) – I Hate Lazy Nov 23 '12 at 03:54
-
@alex: Sorry, my regex doesn't work somehow (also I deleted the post before seeing your reply, sorry) – slebetman Nov 23 '12 at 04:14
-
the second one was broken for me before I added a `return` before the `char.charCodeAt...` – MalcolmOcean Apr 27 '14 at 03:01
-
`/^[\u0-\u7f]*$/.test("a/b")` returns false for some reason, it turns out the fix is `/^[\u0000-\u007f]*$/`. I've edited the answer. – Flimm Sep 14 '15 at 16:00
-
The variable name is inverted. `/^[\u0000-\u007f]*$/.test(str)` is true when the string is ASCII, so the variable name should be `isAscii`: `var isAscii = /^[\u0000-\u007f]*$/.test(str)` – Munawwar Aug 28 '21 at 08:02
-
Or it should be negated like `var hasMoreThanAscii = !/^[\u0000-\u007f]*$/.test(str);` – ypresto Nov 04 '21 at 14:23
-
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/rest_parameters – mustafa candan Aug 27 '22 at 10:52
23
Try this regex. It tests for the printable ASCII characters, from space (32) to tilde (126):
var ascii = /^[ -~]+$/;
if ( !ascii.test( str ) ) {
    // string has non-ascii characters
}
Edit: with tabs and newlines:
/^[ -~\t\n\r]+$/;
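To make the difference between "any ASCII" and "printable ASCII" concrete, here is a small comparison. The variable names are made up for illustration, and `*` is used instead of `+` so that the empty string passes, which is an assumption about what you want.

```javascript
var anyAscii = /^[\u0000-\u007f]*$/;    // every 7-bit code, control characters included
var printableAscii = /^[ -~\t\n\r]*$/;  // space..tilde plus tab, LF and CR

var bell = "ding\u0007";                // BEL (7) is ASCII but not printable
console.log(anyAscii.test(bell));              // true
console.log(printableAscii.test(bell));        // false
console.log(printableAscii.test("a\tb"));      // true
console.log(printableAscii.test("caf\u00e9")); // false
```

Which of the two you want depends on whether control characters are acceptable in your input.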

elclanrs
-
@elclanrs I'm glad that you differentiated, though, because for many use cases they wouldn't be desired. – wwaawaw Nov 23 '12 at 05:38
-
All ASCII characters have meanings, but not all of them are allowed or suitable in a particular context. The variable name `ascii` would be misleading here. – Jukka K. Korpela Nov 23 '12 at 07:45
11
`charCodeAt` can be used to get the character code at a certain position in a string.
function isAsciiOnly(str) {
    for (var i = 0; i < str.length; i++)
        if (str.charCodeAt(i) > 127)
            return false;
    return true;
}
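One detail worth knowing: `charCodeAt` returns UTF-16 code units, not whole code points. A character outside the Basic Multilingual Plane is stored as two surrogate units, and both are well above 127, so a loop like the one above should still reject it without special handling. A quick illustration:

```javascript
var emoji = "\uD83D\uDE00";       // U+1F600, stored as two UTF-16 code units
console.log(emoji.length);        // 2
console.log(emoji.charCodeAt(0)); // 55357 (high surrogate)
console.log(emoji.charCodeAt(1)); // 56832 (low surrogate)
```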

Nathan Wall
-
I believe the largest ASCII code now is 255; check here: http://www.ascii-code.com/ – repzero Feb 05 '17 at 23:04
1
Simpler alternative to @alex's solution:
const hasNonAsciiCharacters = str => /[^\u0000-\u007f]/.test(str);

Munawwar
0
You can use string.match() or regex.test() to achieve this. The following functions return true if the string contains only ASCII characters.
string.match()
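function isAsciiString(text) {
    let isAscii = true;
    // An empty or missing string contains no non-ASCII characters.
    // match() returns null when the pattern does not match.
    if (text && !text.match(/^[\x00-\x7F]*$/)) {
        isAscii = false;
    }
    return isAscii;
}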
regex.test()
function isAsciiString(text) {
    // No g flag is needed for a single anchored test; * also accepts the empty string.
    return /^[\x00-\x7F]*$/.test(text);
}
Example
console.log(isAsciiString("hello")); // true
console.log(isAsciiString("hello©")); // false
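One caveat if you combine `regex.test()` with the `g` flag: a global regex remembers where its last match ended (`lastIndex`), so reusing the same regex object can give alternating results. The snippet below is just an illustration of that behavior; the variable names are made up.

```javascript
var re = /^[\x00-\x7F]+$/g;    // shared regex object with the g flag
console.log(re.test("hello")); // true  (lastIndex is now 5)
console.log(re.test("hello")); // false (the search resumed from index 5)
console.log(re.test("hello")); // true  (lastIndex was reset after the failed match)

var plain = /^[\x00-\x7F]+$/;  // no g flag: test() keeps no state between calls
console.log(plain.test("hello")); // true
console.log(plain.test("hello")); // true
```

A regex literal written inside a function is recreated on every call, so the problem mostly shows up when the regex is stored in a variable and reused.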

Ruchira Nawarathna