
I wrote a parser that extracts variables from a string and populates them with values, but I am having trouble detecting multiple values for the same variable within one string. Let me illustrate:

The following message contains the variables 'mass' and 'vel', and the boolean operators (as plain strings) 'AND' and 'OR':

message = '"I have two variables" -mass "12" --vel "18" OR "this is just another descriptor" AND "that new thing" OR "that new fangled thing"'

Given the above message, the parser should detect the variables and return a dictionary mapping each one to a list of its values:

{'OR': ['this is just another descriptor', 'that new fangled thing'], 'vel': [18], 'AND': ['that new thing'], 'mass': [12.0]}

Here's the code:

import shlex
message = '"I have two variables" -mass "12" --vel "18" OR "this is just another descriptor" AND "that new thing" OR "that new fangled thing"'
args = shlex.split(message)
data = {}
attributes = ['mass', 'vel', 'OR', 'AND']
var_types = ['float', 'int', 'str', 'str']
for attribute in attributes: data[attribute] = []
for attribute, var_type in zip(attributes, var_types):
        options = {k.strip('-'): True if v.startswith('-') else v
                for k,v in zip(args, args[1:]+["--"]) if k.startswith('-') or k.startswith('')}
        if (var_type == "int"):
                data[attribute].append(int(options[attribute]))   #Updates if "attribute" exists, else adds "attribute".
        if (var_type == "str"):
                data[attribute].append(str(options[attribute]))   #Updates if "attribute" exists, else adds "attribute".
        if (var_type == "float"):
                data[attribute].append(float(options[attribute]))   #Updates if "attribute" exists, else adds "attribute".
print(data)

The above code only returns the following dictionary:

{'OR': ['that new fangled thing'], 'vel': [18], 'AND': ['that new thing'], 'mass': [12.0]}

The first element of the 'OR' list ('this is just another descriptor') is not being detected. Where am I going wrong?

EDIT: I tried changing attributes = ['mass', 'vel', 'OR', 'OR', 'AND'], but this returned: {'OR': ['that new fangled thing'], 'OR': ['that new fangled thing'], 'vel': [18], 'AND': ['that new thing'], 'mass': [12.0]}

Code Monkey
  • Try printing `zip(args, args[1:]+["--"])`. Is that what you are really expecting? It looks like some arguments become keys in there, and the argument you want might get lost in the process – RafazZ Oct 29 '18 at 21:12

1 Answer


Your dict comprehension {k.strip('-'): True if ... } sees the OR key twice, but the second overwrites the first, as a dict can only contain a key once.
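
To see the overwrite in isolation, here is a minimal sketch. The `pairs` list is a simplified stand-in for the (key, value) pairs the question's comprehension iterates over; it is an assumption for illustration, not the exact output of `zip(args, args[1:]+["--"])`.

```python
# Simplified (key, value) pairs in the order they appear in the message.
# "OR" occurs twice, once for each descriptor.
pairs = [
    ("OR", "this is just another descriptor"),
    ("AND", "that new thing"),
    ("OR", "that new fangled thing"),
]

# A dict comprehension assigns each key in turn, so the second "OR"
# replaces the value stored by the first one.
overwritten = {key: value for key, value in pairs}
print(overwritten)
# {'OR': 'that new fangled thing', 'AND': 'that new thing'}
```

Adding 'OR' to the attributes list a second time cannot help, because the first descriptor is already gone by the time options is read.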

John Gordon
  • That is true...but what can I do to make sure it doesn’t override the first? – Code Monkey Oct 29 '18 at 21:40
  • I don't think you can use a dict comprehension in cases where a key can occur more than once and you want to update the earlier value. You'll have to use a plain for loop to build the dict. – John Gordon Oct 29 '18 at 21:55
  • I tried to include 2 ‘OR’ elements in the attributes list, however this only gives me 2 of the last substrings ‘that new fangled thing’. – Code Monkey Oct 29 '18 at 22:05
  • Update the question to include this new code. Without seeing the code, we're just guessing. – John Gordon Oct 29 '18 at 23:10
  • You're still using a dict comprehension, so you still have the same problem. – John Gordon Oct 30 '18 at 00:59
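
The plain for loop suggested in the comments could look roughly like the sketch below. The `converters` mapping and the `parse` function are hypothetical names introduced here; they mirror the question's attributes and var_types lists. The sketch assumes that every token following a known attribute is its value, and that no value is itself spelled like an attribute name.

```python
import shlex

message = ('"I have two variables" -mass "12" --vel "18" '
           'OR "this is just another descriptor" '
           'AND "that new thing" OR "that new fangled thing"')

# Hypothetical mapping: attribute name -> conversion function.
converters = {"mass": float, "vel": int, "OR": str, "AND": str}


def parse(text):
    """Collect every value that follows a known attribute into a list."""
    args = shlex.split(text)
    data = {name: [] for name in converters}
    # Walk over adjacent token pairs: (candidate key, candidate value).
    for key, value in zip(args, args[1:]):
        name = key.lstrip("-")  # "-mass" and "--vel" become "mass" and "vel"
        if name in converters:
            # Append instead of assigning, so repeated keys keep all values.
            data[name].append(converters[name](value))
    return data


print(parse(message))
```

Because each match is appended to a list rather than assigned to a dict key, both 'OR' descriptors survive.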