Credit to @mp3ferret for having the right idea. But there was no example of a solution using Environment.CommandLine, so I went ahead and created an OriginalCommandLine class that will get the command-line arguments as originally entered.
An argument is defined in the tokenizer regex as either a double-quoted string containing any characters, or an unquoted string of non-whitespace characters. Within a quoted string, the quote character can be escaped with a backslash. However, a trailing backslash followed by a double quote and then whitespace is not treated as an escape; the quote closes the string instead.
The reason I made this whitespace exception to the escape rule was to accommodate quoted paths that end with a backslash. I believe it is far less likely that you'll encounter a situation where you'd actually want an escaped double quote followed by whitespace.
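To make the rule concrete, here is a minimal sketch that applies the same pattern as the tokenizer regex to a fixed string. The TokenizerDemo class and its Tokenize method are names made up for this illustration; they are not part of the solution itself, and unlike Parse they do not skip the first token.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of OriginalCommandLine.
public static class TokenizerDemo
{
    // Same pattern as the tokenizer regex used by OriginalCommandLine.
    static readonly Regex Tokenizer = new Regex(@"""(?:\\""(?!\s)|[^""])*""|[^\s]+");

    // Splits a command line into raw tokens, leaving quotes and backslashes untouched.
    public static string[] Tokenize(string commandLine)
    {
        return Tokenizer.Matches(commandLine).Cast<Match>().Select(m => m.Value).ToArray();
    }
}
```

Calling TokenizerDemo.Tokenize(@"app.exe ""C:\My Dir\"" plain ""a\""b""") should yield four tokens: app.exe, "C:\My Dir\", plain, and "a\"b". In the second token the backslash stays part of the path because whitespace follows the quote, while in the last token \" is read as an escaped quote.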
Code
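using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OriginalCommandLine
{
    // Token: a double-quoted string (\" is an escape unless whitespace follows), or a run of non-whitespace.
    static readonly Regex tokenizer = new Regex(@"""(?:\\""(?!\s)|[^""])*""|[^\s]+");
    // An escaped quote (\") that is not followed by whitespace or the end of the token.
    static readonly Regex unescaper = new Regex(@"\\("")(?!\s|$)");
    // A leading or trailing double quote, including any surrounding whitespace.
    static readonly Regex unquoter = new Regex(@"^\s*""|""\s*$");
    // The whole string is a single double-quoted value.
    static readonly Regex quoteTester = new Regex(@"^\s*""(?:\\""|[^""])*""\s*$");

    // Returns the arguments with their quotes intact; the first token (the executable) is skipped.
    public static string[] Parse(string commandLine = null)
    {
        return tokenizer.Matches(commandLine ?? Environment.CommandLine).Cast<Match>()
            .Skip(1).Select(m => unescaper.Replace(m.Value, @"""")).ToArray();
    }

    // Strips the enclosing quotes if the text is quoted; otherwise returns it unchanged.
    public static string UnQuote(string text)
    {
        return IsQuoted(text) ? unquoter.Replace(text, "") : text;
    }

    public static bool IsQuoted(string text)
    {
        return text != null && quoteTester.Match(text).Success;
    }
}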
Results
As the results below show, the method above preserves the quotes while more gracefully handling a realistic scenario you might encounter: a quoted path that ends with a backslash.
Test:
ConsoleApp1.exe foo1 notepad.exe "C:\Progra\"m Files\MyDocuments\" "C:\Program Files\bar.txt"
args[]:
[0]: foo1
[1]: notepad.exe
[2]: C:\Progra"m Files\MyDocuments" C:\Program
[3]: Files\bar.txt
OriginalCommandLine.Parse():
[0]: foo1
[1]: notepad.exe
[2]: "C:\Progra"m Files\MyDocuments\"
[3]: "C:\Program Files\bar.txt"
Finally
I debated using an alternative scheme for escaping double quotes. I feel that using "" is better, given that command lines so often deal with backslashes. I kept the backslash escaping method because it is backward compatible with how command-line arguments are normally processed.
If you want to use the "" scheme, make the following changes to the regexes:
static Regex tokenizer = new Regex(@"""(?:""""|[^""])*""|[^\s]+");
static Regex unescaper = new Regex(@"""""");
static Regex unquoter = new Regex(@"^\s*""|""\s*$");
static Regex quoteTester = new Regex(@"^\s*""(?:""""|[^""])*""\s*$");
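As a rough illustration of the "" scheme, the sketch below wires the alternative tokenizer and unescaper into a stand-alone helper. DoubledQuoteDemo and its Parse method are hypothetical names used only here, and this version does not skip the first token the way OriginalCommandLine.Parse does.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class DoubledQuoteDemo
{
    // Tokenizer and unescaper for the "" escaping scheme, copied from the regexes above.
    static readonly Regex Tokenizer = new Regex(@"""(?:""""|[^""])*""|[^\s]+");
    static readonly Regex Unescaper = new Regex(@"""""");

    // Tokenizes the input, then collapses each doubled quote into a single one.
    public static string[] Parse(string commandLine)
    {
        return Tokenizer.Matches(commandLine).Cast<Match>()
            .Select(m => Unescaper.Replace(m.Value, @""""))
            .ToArray();
    }
}
```

For the input "say ""hi"" now" next, this should return two tokens: "say "hi" now" and next. One thing to watch for: an empty quoted argument ("") also matches the unescaper, so it would be collapsed to a single quote.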
If you want to get closer to what you expect from args, but with the quotes intact, change the following two regexes. There is still a minor difference: "abc"d will return abcd from args, but [0] = "abc", [1] = d from my solution.
static Regex tokenizer = new Regex(@"""(?:\\""|[^""])*""|[^\s]+");
static Regex unescaper = new Regex(@"\\("")");
If you really, really want to get the same number of elements as args, use the following:
static Regex tokenizer = new Regex(@"(?:[^\s""]*""(?:\\""|[^""])*"")+|[^\s]+");
static Regex unescaper = new Regex(@"\\("")");
Result of exact match
Test: "zzz"zz"Zzz" asdasd zz"zzz" "zzz"
args             OriginalCommandLine
-------------    -------------------
[0]: zzzzzZzz    [0]: "zzz"zz"Zzz"
[1]: asdasd      [1]: asdasd
[2]: zzzzz       [2]: zz"zzz"
[3]: zzz         [3]: "zzz"
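To see how the tokenizer choice changes the element count, the sketch below runs the default tokenizer and the exact-match tokenizer over the same test string. TokenizerVariants and its members are hypothetical names for this comparison only, and no token is skipped, since the test string has no executable name.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class TokenizerVariants
{
    // The default tokenizer from the class, and the exact-match variant shown above.
    public static readonly Regex Default = new Regex(@"""(?:\\""(?!\s)|[^""])*""|[^\s]+");
    public static readonly Regex ExactMatch = new Regex(@"(?:[^\s""]*""(?:\\""|[^""])*"")+|[^\s]+");

    // Returns the raw tokens produced by the given tokenizer.
    public static string[] Tokenize(Regex tokenizer, string commandLine)
    {
        return tokenizer.Matches(commandLine).Cast<Match>().Select(m => m.Value).ToArray();
    }
}
```

With the test string above, the default tokenizer should produce five tokens, because it splits "zzz"zz"Zzz" into "zzz" and zz"Zzz", while the exact-match tokenizer keeps it as one token and produces four, matching the number of elements in args.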