0

wcslen(argv[1]) seems to report one more character than I typed. The result for argv[2] is correct.

#include <stdio.h>
int main(int argc, wchar_t *argv[])
{
  printf("%d %d\n",wcslen(argv[1]),wcslen(argv[2]) );
  return 0;
}

I'm using mingw32 and compile with `gcc myprog.c`.

Why is this so?

  • 3
    I'm pretty sure main *only* gets ASCII (non-wide) characters .. or, at least that's the only way I've ever seen it. –  Jan 19 '13 at 02:39
  • 1
    so specifying wchar_t as an argument type is useless? –  Jan 19 '13 at 02:40
  • Maybe there is compiler switch to use? –  Jan 19 '13 at 02:41
  • Well, first, find a resource (tutorial, program, reference) that *does* use a wide-char argv - what does it do/requires? (It looks like it depends on compiler, see `wmain` as well: not sure where it is defined, but it does show up in MSVC++ documentation.) –  Jan 19 '13 at 02:42
  • 2
    possible duplicate of [wWinmain, Unicode, and Mingw](http://stackoverflow.com/questions/3571250/wwinmain-unicode-and-mingw) – Alexey Frunze Jan 19 '13 at 02:44
  • Does `mingw` support `int main(int argc, wchar_t *argv[])`? Do you have some documentation that says this is legal? – David Schwartz Jan 19 '13 at 02:51
  • @AlexeyFrunze Yeah I just read that same article. Looks like I need to use a regular `main` and then use `GetCommandLineW`. I also remember using a wrapper function for this a long time ago but I can't remember what it was. –  Jan 19 '13 at 02:55

2 Answers

1

Here's a quote from the C standard draft, n1570.pdf:

5.1.2.2.1 Program startup

1 The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent;10) or in some other implementation-defined manner.

10) Thus, int can be replaced by a typedef name defined as int, or the type of argv can be written as char ** argv, and so on.

This should be fairly simple to comprehend. If your implementation supports argv with the type wchar_t**, then it'll work on your implementation in an implementation-defined manner. If you require portability, don't rely on anything implementation-defined.

Additionally, wcslen() is declared to return a size_t value, which you ought to use with the %zu directive to print, and it's probably also a good idea to #include <wchar.h>.

I don't think either of these caused your issue, but they both cause undefined behavior nonetheless.
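As a minimal sketch of the portable form described above, the helper below takes the standard `char *argv[]`, measures the strings with `strlen`, and formats the `size_t` results with `%zu`. The name `format_arg_lengths` and the output buffer are hypothetical, introduced only for illustration. It also assumes a C99-conforming `printf` family; some older MinGW runtimes may not understand `%zu` without extra configuration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the lengths of argv[1] and argv[2] into buf.
   Returns 0 on success, or -1 if fewer than two arguments were supplied. */
int format_arg_lengths(int argc, char *argv[], char *buf, size_t bufsize)
{
    if (argc < 3)
        return -1;
    /* strlen returns size_t, so %zu is the matching conversion. */
    snprintf(buf, bufsize, "%zu %zu", strlen(argv[1]), strlen(argv[2]));
    return 0;
}
```

A program declared as `int main(int argc, char *argv[])` could call `format_arg_lengths(argc, argv, buf, sizeof buf)` and print `buf`.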

Remy Lebeau
autistic
1

main expects parameters of type int and char** (or char*[], which is equivalent in a parameter list). Many implementations also accept an optional third parameter, the array of environment strings, as an extension.

But what's happening is that most compilers are relaxed about the type safety of main's parameters. They happily let you declare main with whatever argument types (or no arguments) you want for argc and argv. I think this is largely historical, for backwards compatibility with older C. No conversion takes place: the runtime still passes a char*[], and your code reads that memory as if it were a wchar_t*[], so the strings get interpreted in wildly different ways.

So it's not correct to say that you are getting one more than expected from wcslen. It's technically undefined behavior.
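The misreading can be illustrated without invoking undefined behavior. The sketch below walks a narrow byte buffer two bytes at a time and counts units until it finds an all-zero unit, which is roughly what `wcslen` would do if `wchar_t` is 2 bytes wide (an assumption that holds on Windows but not everywhere). The function name `length_seen_as_16bit` is hypothetical, and the idea that arguments sit back to back in memory is an assumption for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts 16-bit units before the first all-zero unit, the way a wcslen
   with a 2-byte wchar_t would scan a narrow byte buffer.
   Never reads beyond nbytes. */
size_t length_seen_as_16bit(const unsigned char *bytes, size_t nbytes)
{
    size_t count = 0;
    for (size_t i = 0; i + 1 < nbytes; i += 2) {
        uint16_t unit;
        memcpy(&unit, bytes + i, sizeof unit); /* avoids alignment issues */
        if (unit == 0)
            break;
        count++;
    }
    return count;
}
```

With the bytes of "abc" followed directly by "de" (each with its terminator), the single zero byte after "abc" pairs up with 'c' and is not seen as a terminator, so the scan runs on into the next string. The reported length therefore depends on whatever happens to follow the argument in memory.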

Two possible fixes:

The easy fix is to declare the second parameter as an array of char strings instead of wchar_t strings.

int main(int argc, char* argv[])

If your compiler were Visual Studio and you wanted Unicode arguments passed, the fix would be to declare your program's entry point as wmain instead of main:

int wmain(int argc, wchar_t* argv[])

The above wmain fix will certainly compile with mingw, but I'm not sure if the linker has support for enabling wmain as the program entry point. Try it and find out.
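Another route, mentioned in the comments on the question, is to keep a regular `main` and fetch the wide command line from Windows directly. The sketch below uses `GetCommandLineW` and `CommandLineToArgvW` for that. The helper name `print_wide_arg_lengths` is hypothetical, the code is guarded so that it only does real work when `_WIN32` is defined, and depending on the toolchain you may need to link against shell32 explicitly.

```c
#include <stdio.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>   /* CommandLineToArgvW */
#endif

/* Hypothetical helper: prints the length of each wide argument.
   Returns the argument count, or -1 if wide arguments are unavailable. */
int print_wide_arg_lengths(void)
{
#ifdef _WIN32
    int argc = 0;
    wchar_t **argv = CommandLineToArgvW(GetCommandLineW(), &argc);
    if (argv == NULL)
        return -1;
    for (int i = 1; i < argc; i++)
        printf("%lu\n", (unsigned long)wcslen(argv[i]));
    LocalFree(argv);    /* the array is allocated by the system */
    return argc;
#else
    return -1;          /* not Windows: these APIs do not exist here */
#endif
}
```

A plain `int main(void)` could simply return after calling `print_wide_arg_lengths()`, ignoring the narrow `argv` entirely.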

selbie