26

I'm looking to detect internationalized domain names and local parts in email addresses, and would like to know if there is a quick and easy way to do this with a regex or otherwise in JavaScript.

wwaawaw
  • 4
    What do you mean by ASCII? Remember that NUL (\0), BEL (\7 - causes PC to beep), ESC (\033) are also valid ASCII characters but most wouldn't consider them to be valid ASCII text. – slebetman Nov 23 '12 at 04:13
  • @slebetman very fair point to add. – wwaawaw Nov 23 '12 at 05:34

5 Answers

31

This should do it...

var hasMoreThanAscii = !/^[\u0000-\u007f]*$/.test(str);

...also...

var hasMoreThanAscii = str
                       .split("")
                       .some(function(char) { return char.charCodeAt(0) > 127 });

ES6 goodness...

let hasMoreThanAscii = [...str].some(char => char.charCodeAt(0) > 127);
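
As a quick sanity check, the sketch below wraps the negated regex and the `.some()` variant in two helper functions and runs them on a few inputs. The names `hasNonAsciiRegex` and `hasNonAsciiSome` are made up for this illustration. Characters outside the Basic Multilingual Plane are stored as surrogate pairs whose code units are all above 127, so both approaches should flag them.

```javascript
// Hypothetical helper names, for illustration only.
function hasNonAsciiRegex(str) {
  // True when at least one code unit lies outside U+0000..U+007F.
  return !/^[\u0000-\u007f]*$/.test(str);
}

function hasNonAsciiSome(str) {
  // Spread iterates by code point; charCodeAt(0) reads the first code unit of each.
  return [...str].some(ch => ch.charCodeAt(0) > 127);
}

// Empty string, plain ASCII, an accented letter, and an emoji (surrogate pair).
const samples = ["", "user@example.com", "jos\u00e9@example.com", "\u{1F600}"];
const results = samples.map(s => [hasNonAsciiRegex(s), hasNonAsciiSome(s)]);
console.log(results);
```

Both functions treat the empty string as containing no non-ASCII characters.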
alex
  • Shouldn't the `+` be a `*`? This requires that the string has characters in it, but the empty string `""` fulfills the OP's strict requirements: it doesn't have any non-ASCII characters in it. – Jeff Nov 23 '12 at 03:50
  • If you change your `.filter` to `.some`, you can get rid of `.length > 0` – I Hate Lazy Nov 23 '12 at 03:52
  • @user1689607 True. I'll also get rid of a bit of browser support ;) – alex Nov 23 '12 at 03:52
  • Nah, any browser that supports `.filter()`, supports `.some()`. They're both ES5 additions. :) – I Hate Lazy Nov 23 '12 at 03:54
  • @alex: Sorry, my regex doesn't work somehow (also I deleted the post before seeing your reply, sorry) – slebetman Nov 23 '12 at 04:14
  • the second one was broken for me before I added a `return` before the `char.charCodeAt...` – MalcolmOcean Apr 27 '14 at 03:01
  • `/^[\u0-\u7f]*$/.test("a/b")` returns false for some reason, it turns out the fix is `/^[\u0000-\u007f]*$/`. I've edited the answer. – Flimm Sep 14 '15 at 16:00
  • 2
    The variable name is inverted. /^[\u0000-\u007f]*$/.test(str) = true when ascii so variable name should be: var isAscii = /^[\u0000-\u007f]*$/.test(str) – Munawwar Aug 28 '21 at 08:02
  • Or it should be negated like `var hasMoreThanAscii = !/^[\u0000-\u007f]*$/.test(str);` – ypresto Nov 04 '21 at 14:23
  • https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/rest_parameters – mustafa candan Aug 27 '22 at 10:52
23

Try this regex. It tests for the printable ASCII characters, the ones that have some meaning in a string, from space (32) to tilde (126):

var ascii = /^[ -~]+$/;

if ( !ascii.test( str ) ) {
  // string has non-ascii characters
}

Edit: with tabs and newlines:

/^[ -~\t\n\r]+$/;
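
To see how this stricter pattern differs from a plain 0-127 range check, here is a small sketch. The function name `isPrintableAscii` is hypothetical, and it uses `*` rather than `+` on the assumption that an empty string should count as acceptable.

```javascript
// Hypothetical helper: accepts printable ASCII (space to tilde) plus tab, LF and CR.
function isPrintableAscii(str) {
  return /^[ -~\t\n\r]*$/.test(str);
}

console.log(isPrintableAscii("hello, world\n")); // true
console.log(isPrintableAscii("beep\u0007"));     // false: BEL is ASCII but not printable
console.log(isPrintableAscii("na\u00efve"));     // false: the accented i is outside ASCII
```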
elclanrs
11

charCodeAt can be used to get the character code at a certain position in a string.

function isAsciiOnly(str) {
    for (var i = 0; i < str.length; i++)
        if (str.charCodeAt(i) > 127)
            return false;
    return true;
}
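
Applied to the original question, the same loop can be run on each half of an address separately. The sketch below assumes that the last `@` separates the local part from the domain and does no other validation; `classifyEmail` is a hypothetical name. Note that a domain already encoded in Punycode (an `xn--` label) is pure ASCII and would not be flagged by this check.

```javascript
function isAsciiOnly(str) {
  for (var i = 0; i < str.length; i++)
    if (str.charCodeAt(i) > 127)
      return false;
  return true;
}

// Hypothetical helper: reports which parts of an address contain non-ASCII characters.
function classifyEmail(address) {
  var at = address.lastIndexOf("@");
  if (at === -1) return null; // no "@", so this sketch cannot split the address
  var local = address.slice(0, at);
  var domain = address.slice(at + 1);
  return {
    internationalLocal: !isAsciiOnly(local),
    internationalDomain: !isAsciiOnly(domain)
  };
}

console.log(classifyEmail("user@b\u00fccher.example"));
// { internationalLocal: false, internationalDomain: true }
```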
Nathan Wall
1

Simpler alternative to @alex's solution:

const hasNonAsciiCharacters = str => /[^\u0000-\u007f]/.test(str);
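
If it is also useful to know which characters triggered the match, a global version of the same character class can list them. This is only a sketch: `findNonAscii` is a hypothetical name, `matchAll` needs an ES2020 environment, and the `u` flag is added so that astral characters are reported whole rather than as two surrogate halves.

```javascript
// Hypothetical helper: lists every non-ASCII character with its position.
function findNonAscii(str) {
  // With the u flag, the negated class matches whole code points.
  return Array.from(str.matchAll(/[^\u0000-\u007f]/gu), m => ({
    char: m[0],
    index: m.index
  }));
}

console.log(findNonAscii("jos\u00e9@example.com"));
// [ { char: 'é', index: 3 } ]
```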
Munawwar
0

You can use string.match() or regex.test() to achieve this. Each of the following functions returns true if the string contains only ASCII characters.

string.match()

function isAsciiString(text) {
    // Empty or missing input is treated as ASCII-only.
    if (!text) return true;
    // match() returns null when the pattern does not match.
    return text.match(/^[\x00-\x7F]+$/) !== null;
}

regex.test()

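function isAsciiString(text) {
    // `*` instead of `+` so the empty string counts as ASCII-only,
    // matching the string.match() version. The g flag is not needed
    // for a single anchored test, so it is dropped.
    return /^[\x00-\x7F]*$/.test(text);
}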

Example

console.log(isAsciiString("hello"));   // true
console.log(isAsciiString("hello©"));  // false
Ruchira Nawarathna