-1

I want to check in C whether a string contains only alphanumeric characters, without using the isalnum function. Why doesn't the code below work correctly?

#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    printf("test regular expression\n");
    int retval = 0;
    regex_t re;
    char line[8] = "4a.zCCb";
    char msgbuf[100];
    const char *tofind = "[a-zA-z0-9]{2,8}";  /* the pattern under test */
    if (regcomp(&re, tofind, REG_EXTENDED) != 0)
    {
        fprintf(stderr, "Failed to compile regex '%s'\n", tofind);
        return EXIT_FAILURE;
    }
    /* nmatch is 0: only the match / no-match result is needed */
    if ((retval = regexec(&re, line, 0, NULL, 0)) == 0)
        printf("Match : %s\n", line);
    else  if (retval == REG_NOMATCH)
        printf("does not match : %s\n", line);
    else {
        regerror(retval, &re, msgbuf, sizeof(msgbuf));
        fprintf(stderr, "Regex match failed: %s\n", msgbuf);
        regfree(&re);
        exit(1);
    }
    regfree(&re);
    return 0;
}
Jonathan Leffler
  • @KarthikT: The code includes the sample input. Granted, it doesn't express the actual output (which is 'Match') compared with the expected output (which is 'does not match'). – Jonathan Leffler Jul 16 '14 at 06:11

2 Answers

2

If you want the entire string to be alphanumeric, you need to include begin and end anchors in the regex:

"^[a-zA-Z0-9]{2,8}$"

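As it stands, the regex is satisfied by any run of 2 to 8 alphanumerics anywhere in the string: both the "4a" at the start and the 4 alphanumerics at the end ("zCCb") match the original regex, so regexec reports a match.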

Jonathan Leffler
  • +1, deleted mine as for some reason I read `C#`... Thanks for letting me know. :) – zx81 Jul 16 '14 at 06:07
  • @Jonathan Leffler. it should be `a-zA-Z`? – jian Mar 02 '23 at 04:35
  • @jian: Why? The word "alphanumeric" used in the questions means "letters and numbers", and the regex shown uses `[a-zA-Z0-9]` (the same as my answer). I could have used `[[:alnum:]]` in place of `[a-zA-Z0-9]` as it means the same thing, but I followed the OP's notation so the answer was familiar to them (whereas `[[:alnum:]]` and its relatives are probably not, though they're rather useful). – Jonathan Leffler Mar 02 '23 at 04:43
  • My mistake. I mean it should be `"^[a-zA-Z0-9]{2,8}$"`? Even `"^[a-zA-z0-9]{2,8}$"` still works. – jian Mar 02 '23 at 04:48
  • I see I have `A-z` where I should have `A-Z` — you're right about that. Thanks! The notation with A-z 'works' but picks up the punctuation after `Z` and before `a` — namely ```[\]^_` ``` (assuming a code set based on ISO 8859 or ASCII, which includes UTF-8). – Jonathan Leffler Mar 02 '23 at 05:29
0

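Try using \w* to match all word characters. Note that \w is not part of POSIX extended regular expressions (some implementations, such as glibc, accept it as an extension), that it also matches the underscore, and that the pattern still needs the ^ and $ anchors to test the whole string.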