I'm developing a little network command interpreter for .NET Micro Framework 4.3 running on a Netduino. I use a regular expression to parse user input that arrives from the network via a stream socket. Commands are in the following format:
<T1,0,CommandVerb=Payload>
That's a device address, a transaction ID (which can be any integer), and a command verb, optionally followed by an equals sign and any text. The whole thing is delimited by angle brackets, much like an XML tag, which helps with parsing.
Here's the regular expression I use:
/*
 * Regex matches command strings in format "<Dn,TT,CommandVerb=Payload>"
 * D is the Device class
 * n is the device index
 * TT is a numeric transaction ID of at least 1 digit.
 * CommandVerb is, obviously, the command verb ;-)
 * Payload is optional and is used to supply any parameter values to the command.
 *
 * N.B. Micro Framework doesn't support named captures and will throw exceptions if they are used.
 */
const string CommandRegex = @"<(\w\d),(\d+),([A-Za-z]\w+)(=((\d+)|(.+)))?>";
static readonly Regex Parser = new Regex(CommandRegex);
This expression is designed to tease out the various parts of the command so I can access them easily in code. The last part, (=((\d+)|(.+)))?, differentiates between a numeric payload, a text payload, and no payload at all.
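To make the role of the capture groups concrete, here is a minimal sketch of how the three cases can be told apart on the desktop framework. The class name `CommandGroupsDemo` and the helper `Describe` are hypothetical and not part of my project, and the sketch relies on `Group.Success`, which I have only checked on desktop .NET; the Micro Framework may behave differently.

```csharp
using System;
using System.Text.RegularExpressions;

static class CommandGroupsDemo
{
    // Same pattern as in the question.
    const string CommandRegex = @"<(\w\d),(\d+),([A-Za-z]\w+)(=((\d+)|(.+)))?>";
    static readonly Regex Parser = new Regex(CommandRegex);

    // Hypothetical helper: summarises a command by inspecting the capture groups.
    // Group 6 participates only for a numeric payload, group 7 only for text.
    public static string Describe(string input)
    {
        Match m = Parser.Match(input);
        if (!m.Success) return "no match";

        string head = m.Groups[1].Value + " #" + m.Groups[2].Value + " " + m.Groups[3].Value;
        if (m.Groups[6].Success) return head + " numeric:" + m.Groups[6].Value;
        if (m.Groups[7].Success) return head + " text:" + m.Groups[7].Value;
        return head + " (no payload)";
    }

    static void Main()
    {
        Console.WriteLine(Describe("<F1,234,Move=12345>"));    // F1 #234 Move numeric:12345
        Console.WriteLine(Describe("<F1,234,Nickname=Fred>")); // F1 #234 Nickname text:Fred
        Console.WriteLine(Describe("<F1,234,Stop>"));          // F1 #234 Stop (no payload)
    }
}
```

Checking group 6 before group 7 mirrors the order of the alternation in the pattern, so a purely numeric payload is never reported as text.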
This has been working well for me, and validates OK in ReSharper's regex validator. Here's the output I expect to get (I think this is subtly different from the results you'd get from the full NetFX; I had to work this out by trial and error):
/* Command with numeric payload has the following groups
* Group[0] contains [<F1,234,Move=12345>]
* Group[1] contains [F1]
* Group[2] contains [234]
* Group[3] contains [Move]
* Group[4] contains [=12345]
* Group[5] contains [12345]
* Group[6] contains [12345]
* -----
* Command with text payload has the following groups:
* Group[0] contains [<F1,234,Nickname=Fred>]
* Group[1] contains [F1]
* Group[2] contains [234]
* Group[3] contains [Nickname]
* Group[4] contains [=Fred]
* Group[5] contains [Fred]
* Group[7] contains [Fred]
* -----
* Command with verb only (no payload) produces these groups:
* Group[0] contains [<F1,234,Stop>]
* Group[1] contains [F1]
* Group[2] contains [234]
* Group[3] contains [Stop]
*/
...and it does work like that. Right up to the point where I tried to pass a URL as the payload. As soon as I have a dot (.) in my payload string, the regex breaks and I actually get back the third form, where it clearly thinks there's no payload at all. As an example:
<W1,0,HttpPost=http://deathstar.com/route>
What I expect to get back is the 'command with text payload' result, but what I actually get back is the 'command with no payload' result. If I take out the dot, then it parses as I expect and I get 'command with text payload'. As soon as I put the dot back in, then (ironically) .+ no longer seems to match.
Again note: this validates correctly in ReSharper's regex validator and appears to work on the normal 'desktop' framework as expected, but not in .NET Micro Framework. The Micro Framework regex implementation is a subset of the full version, but the documentation about what is supposed to work and what doesn't is pretty much non-existent.
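For anyone who wants to reproduce the desktop behaviour, this is roughly the comparison harness I mean. It is a sketch for the full .NET Framework only; the names `DesktopGroupDump` and `GroupValues` are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class DesktopGroupDump
{
    // Same pattern as in the question.
    const string CommandRegex = @"<(\w\d),(\d+),([A-Za-z]\w+)(=((\d+)|(.+)))?>";
    static readonly Regex Parser = new Regex(CommandRegex);

    // Hypothetical helper: returns the value of every group, or null if there is no match.
    public static string[] GroupValues(string input)
    {
        Match match = Parser.Match(input);
        if (!match.Success) return null;

        var values = new string[match.Groups.Count];
        for (int i = 0; i < values.Length; i++)
            values[i] = match.Groups[i].Value; // empty string for a group that did not participate
        return values;
    }

    static void Main()
    {
        string[] groups = GroupValues("<W1,0,HttpPost=http://deathstar.com/route>");
        for (int i = 0; i < groups.Length; i++)
            Console.WriteLine("Group[" + i + "]: [" + groups[i] + "]");
        // On the desktop framework, Group[5] and Group[7] both contain
        // http://deathstar.com/route and Group[6] is empty.
    }
}
```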
I can't understand why .+ doesn't match text with a dot in it. Can anyone see why it's not working?
Update 1 - Added Diagnostics
Here's the output:
[Cmd Processor ] Parser matched 8 groups
[Cmd Processor ] Group[0]: <W1,0,HttpPost=http://deat
[Cmd Processor ] Group[1]: W1
[Cmd Processor ] Group[2]: 0
[Cmd Processor ] Group[3]: HttpPost
A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll
So it's not that Group[4] is null; it's throwing an ArgumentOutOfRangeException for that indexer, even though there are 8 groups. Also, Group[0] is mysteriously truncated. Hmmm...
Update 2 - Improved the Diagnostic
I added this diagnostic method to my code, based on the answer from @Shar1er80:
// Requires: using System; using System.Diagnostics; using System.Text.RegularExpressions;
[Conditional("DEBUG")]
static void PrintMatches(Match match)
{
    if (!match.Success)
    {
        Dbg.Trace("No match", Source.CommandProcessor);
        return;
    }
    Dbg.Trace("Parser matched " + match.Groups.Count + " groups", Source.CommandProcessor);
    for (int i = 0; i < match.Groups.Count; i++)
    {
        string value;
        try
        {
            var group = match.Groups[i];
            value = group == null ? "null group" : (group.Value ?? "null value");
        }
        catch (Exception ex)
        {
            // Parentheses matter here: + binds tighter than ??, so without them
            // the null check would apply to the whole concatenated string.
            value = "threw " + ex.GetType() + " " + (ex.Message ?? string.Empty);
        }
        Dbg.Trace(" Groups[" + i + "]: " + value, Source.CommandProcessor);
    }
}
With the test input of <W1,0,HttpPost=http://deathstar.com> the output was:
[Cmd Processor ] Parser matched 8 groups
[Cmd Processor ] Groups[0]: <W1,0,HttpPost=http://deaths
[Cmd Processor ] Groups[1]: W1
[Cmd Processor ] Groups[2]: 0
[Cmd Processor ] Groups[3]: HttpPost
A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll
[Cmd Processor ] Groups[4]: threw System.ArgumentOutOfRangeException Exception was thrown: System.ArgumentOutOfRangeException
A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll
[Cmd Processor ] Groups[5]: threw System.ArgumentOutOfRangeException Exception was thrown: System.ArgumentOutOfRangeException
A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll
[Cmd Processor ] Groups[6]: threw System.ArgumentOutOfRangeException Exception was thrown: System.ArgumentOutOfRangeException
A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll
[Cmd Processor ] Groups[7]: threw System.ArgumentOutOfRangeException Exception was thrown: System.ArgumentOutOfRangeException
A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll
Clearly that's not right: 8 groups are reported, but trying to access anything above Groups[3] throws an exception. The stack trace for the exception is:
System.String::Substring
System.Text.RegularExpressions.Capture::get_Value
TA.NetMF.WeatherServer.CommandParser::PrintMatches
TA.NetMF.WeatherServer.CommandParser::ParseCommand
[snip]
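One possible reading of that stack trace, and it is only a guess on my part, is that the capture stores an (index, length) pair into the input and that Value slices the input with Substring. If that pair is wrong, Substring would throw exactly this exception. The sketch below only demonstrates the Substring behaviour on the desktop framework; `SubstringRangeDemo`, `Slice`, and `SliceThrows` are hypothetical names, and I have not confirmed how the Micro Framework implements Capture.

```csharp
using System;

static class SubstringRangeDemo
{
    // Hypothetical stand-in for what a capture's Value getter might do:
    // slice the input using a stored (index, length) pair.
    public static string Slice(string input, int index, int length)
    {
        return input.Substring(index, length);
    }

    // Returns true if slicing with the given pair throws ArgumentOutOfRangeException.
    public static bool SliceThrows(string input, int index, int length)
    {
        try
        {
            Slice(input, index, length);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    static void Main()
    {
        const string input = "<W1,0,HttpPost=http://deathstar.com>";
        Console.WriteLine(Slice(input, 1, 2));                   // W1
        Console.WriteLine(SliceThrows(input, 15, input.Length)); // True: range runs past the end
    }
}
```

A bad length would also be one way to end up with a truncated Groups[0], but that part is speculation.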
I have opened an issue against the .NET Micro Framework.