Looks like you've got a fairly straightforward infix expression grammar that goes something like this:
variable = [ a-z | A-Z ]
string = [ a-z | A-Z | 0-9 ]
atom = variable [ "==" | "!=" ] string
expr = [ atom | "(" expr ")" | expr [ "," | "|" ] expr ]
Assuming you can nest expressions to an arbitrary depth and you want to handle operator precedence properly (e.g. does w,x|y,z
mean ((w,x)|y),z)
or (w,x)|(y,z)
or even w,(x|y),z
, all of which give different results) you're going to struggle to evaluate the expression with basic string manipulation and you might need use a more complicated approach:
- Lex the string into tokens
- Change the order of the tokens (infix -> postfix) to make it easier to calculate results
- Evaluate the expression given some values for variables
Lexing
This basically breaks the expression string down into a list of logical chunks - for example A==test
might become identifier "A"
, operator "=="
, identifier "test"
.
It's called "infix" notation because the operator sits in-between the operands - e.g. A==test
.
function ConvertTo-InfixTokens
{
param( [string] $Source )
$chars = $Source.ToCharArray();
$index = 0;
while( $index -lt $chars.Length )
{
$token = [pscustomobject] [ordered] @{
"Type" = $null
"Value" = [string]::Empty
};
$char = $chars[$index];
switch -regex ( $char )
{
# var or str
"[a-zA-z0-9]" {
$token.Type = "identifier";
$token.Value += $char;
$index += 1;
while( ($index -lt $chars.Length) -and ($chars[$index] -match "[a-zA-z0-9]") )
{
$token.Value += $chars[$index];
$index += 1;
}
}
# eq
"=" {
if( $chars[$index+1] -eq "=" )
{
$token.Type = "eq";
$token.Value = $chars[$index] + $chars[$index+1];
$index += 2;
}
else
{
throw "LEX01 - unhandled char '$($chars[$index+1])'";
}
}
# ne
"!" {
if( $chars[$index+1] -eq "=" )
{
$token.Type = "ne";
$token.Value = $chars[$index] + $chars[$index+1];
$index += 2;
}
else
{
throw "LEX02 - unhandled char '$($chars[$index+1])'";
}
}
"," {
$token.Type = "and";
$token.Value = $char;
$index += 1;
}
"\|" {
$token.Type = "or";
$token.Value = $char;
$index += 1;
}
"\(" {
$token.Type = "lb";
$token.Value = $char;
$index += 1;
}
"\)" {
$token.Type = "rb";
$token.Value = $char;
$index += 1;
}
default {
throw "LEX03 - unhandled char '$char'";
}
}
write-output $token;
}
}
Examples:
PS> ConvertTo-InfixTokens -Source "A==test"
Type Value
---- -----
identifier A
eq ==
identifier test
PS> ConvertTo-InfixTokens -Source "E==test3,(F==test4|F==test5),username!=abc"
Type Value
---- -----
identifier E
eq ==
identifier test3
and ,
lb (
identifier F
eq ==
identifier test4
or |
identifier F
eq ==
identifier test5
rb )
and ,
identifier username
ne !=
identifier abc
Convert to Postfix
Infix notation looks more natural because it's similar to normal maths, but it relies on you handling the precedence rules for operations when you evaluate expressions - e.g. is 1 + 1 * 2 + 2
equal to (((1 + 1) * 2) + 2) => 6
or 1 + (1 * 2) + 2 => 5
or (1 + 1) * (2 + 2) => 8
? - which means you might need to look ahead to see what other operators are coming in case they need to be processed first.
It's easier to evaluate the expression if we convert it to postfix notation (or Reverse Polish Notation) - see this answer for some additional advantages. This basically puts the operands first and then follows with the operator - e.g. 1 + 1 * 2 + 2
becomes 1 1 2 * + 2 +
, which will be logically processed as: ((1 (1 2 *) +) 2 +)
=> ((1 2 +) 2 +)
=> (3 2 +)
=> 5
The following code will apply this conversion:
# based on https://www.tutorialspoint.com/Convert-Infix-to-Postfix-Expression#:~:text=To%20convert%20infix%20expression%20to,maintaining%20the%20precedence%20of%20them.
function Convert-InfixToPostfix
{
param( [pscustomobject[]] $Tokens )
$precedence = @{
"or" = 1
"and" = 2
"eq" = 9
"ne" = 9
};
$stack = new-object System.Collections.Generic.Stack[pscustomobject];
for( $i = 0; $i -lt $Tokens.Length; $i++ )
{
$token = $Tokens[$i];
switch( $token.Type )
{
"identifier" {
write-output $token;
}
"lb" {
$stack.Push($token);
}
"rb" {
while( $stack.Peek().Type -ne "lb" )
{
write-output $stack.Pop();
}
$null = $stack.Pop();
}
default {
# must be a known operator
if( -not $precedence.ContainsKey($token.Type) )
{
throw "PF01 - unknown operator '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i";
}
while( ($stack.Count -gt 0) -and ($precedence[$token.Type] -le $precedence[$stack.Peek().Type]) )
{
write-output $stack.Pop();
}
$stack.Push($token);
}
}
}
while( $stack.Count -gt 0 )
{
write-output $stack.Pop();
}
}
Examples:
PS> $infix = ConvertTo-InfixTokens -Source "A==test"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $postfix
Type Value
---- -----
identifier A
identifier test
eq ==
PS> $infix = ConvertTo-InfixTokens -Source "E==test3,(F==test4|F==test5),username!=abc"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $postfix
Type Value
---- -----
identifier E
identifier test3
eq ==
identifier F
identifier test4
eq ==
identifier F
identifier test5
eq ==
or |
and ,
identifier username
identifier abc
ne !=
and ,
Evaluation
The last step is to take the postfix tokens and evaluate them. "All" we need to do is read the tokens from left to right - when we get an operand we put it on a stack, and when we get an operator we read some operands off the stack, apply the operator and push the result back onto the stack...
function Invoke-PostfixEval
{
param( [pscustomobject[]] $Tokens, [hashtable] $Variables )
$stack = new-object System.Collections.Generic.Stack[pscustomobject];
for( $i = 0; $i -lt $Tokens.Length; $i++ )
{
$token = $Tokens[$i];
if( $token.Type -eq "identifier" )
{
$stack.Push($token);
}
elseif( $token.Type -in @( "eq", "ne" ) )
{
$str = $stack.Pop().Value;
$var = $stack.Pop();
if( -not $Variables.ContainsKey($var.Value) )
{
throw "undefined variable '$($var.Value)' in token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i";
}
$val = $Variables[$var.Value];
$result = switch( $token.Type ) {
"eq" { $val -eq $str }
"ne" { $val -ne $str }
default { throw "unhandled operator '$($token.Type)' in token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i"; }
}
#write-host "'$val' $($token.Type) '$str' => $result";
$stack.Push([pscustomobject] [ordered] @{ "Type" = "bool"; Value = $result });
}
elseif( $token.Type -in @( "and", "or" ) )
{
$left = $stack.Pop().Value;
$right = $stack.Pop().Value;
$result = switch( $token.Type ) {
"and" { $left -and $right }
"or" { $left -or $right }
default { throw "unhandled operator '$($token.Type)' in token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i"; }
}
#write-host "'$left' $($token.Type) '$right' => $result";
$stack.Push([pscustomobject] [ordered] @{ "Type" = "bool"; Value = $result });
}
else
{
throw "EVAL01 - unhandled token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i";
}
}
return $stack.Peek().Value;
}
Examples:
PS> $variables = [ordered] @{
"A" = "test";
"C" = "test1";
"D" = "test2";
"E" = "test3";
"F" = "test4";
"username" = "abc"
}
PS> $infix = ConvertTo-InfixTokens -Source "A==test"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $value = Invoke-PostfixEval -Tokens $postfix -Variables $variables
PS> $value
True
PS> $infix = ConvertTo-InfixTokens -Source "E==test3,(F==test4|F==test5),username!=abc"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $value = Invoke-PostfixEval -Tokens $postfix -Variables $variables
PS> $value
False # fails because "username!=abc" is false
PS> $infix = ConvertTo-InfixTokens -Source "E==test3,F==test4|F==test5,username!=abc"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $value = Invoke-PostfixEval -Tokens $postfix -Variables $variables
PS> $value
True # operator precedence means it's evaluated as "(E==test3,F==test4)|(F==test5,username!=abc)" and "(E==test3,F==test4)" is true
You'll probably want to add more error handling for production code, and write some Pester tests to make sure it works for a larger variety of inputs, but it worked for the limited tests I did here. If you find any examples that don't work feel free to post them in comments...