
I am trying to separate all the functions inside the square brackets and store them in a dictionary. However, every result except the last one loses its closing parenthesis.

import re
line="[f(x,y),g(y,z),f1(x1,y1)]"
matches = re.match(r"(.*)(\[)(.*)(\])(.*)", line)
if matches:
    all_action_labels = matches.group(3)
    sep_action_labels = re.split(r'\),',all_action_labels)
    j=0
    for x in sep_action_labels:
        print(f'Function #{j+1} : {x}')

As you can see, every output except the last one is missing the closing parenthesis ')':

Function #1 : f(x,y
Function #1 : g(y,z
Function #1 : f1(x1,y1)

What regular expression should I use?

Further, how can I store these outputs in a dictionary?

Raj

2 Answers


My general rule for extracting data is to call re.findall() with fairly simple regular expressions.

Perhaps this meets your needs:

import re
line="[f(x,y),g(y,z),f1(x1,y1)]"
all_action_labels = re.findall(r"\[(.*?)]", line)
for all_action_label in all_action_labels:
    sep_action_labels = re.findall(r"[a-z0-9]+\(.*?\)", all_action_label)
    for j, x in enumerate(sep_action_labels, 1):
        print(f'Function #{j} : {x}')

I use one simple regular expression to extract data from [] and another to extract the individual function calls.
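
The question also asks how to store the results in a dictionary. A minimal sketch, assuming keys of the form "Function #N" (the key format and the name `functions` are illustrative choices, not something the question specifies):

```python
import re

line = "[f(x,y),g(y,z),f1(x1,y1)]"

functions = {}
# Same two-step extraction: first the bracketed list, then each call in it.
for all_action_label in re.findall(r"\[(.*?)]", line):
    for x in re.findall(r"[a-z0-9]+\(.*?\)", all_action_label):
        # Number the entries in the order they are found.
        functions[f"Function #{len(functions) + 1}"] = x

print(functions)
# {'Function #1': 'f(x,y)', 'Function #2': 'g(y,z)', 'Function #3': 'f1(x1,y1)'}
```

Numbering by the current dictionary size keeps the keys unique even if the line contains more than one bracketed list.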

Robᵩ
  • It does work; however, it drops part of the function name in cases such as line="[f1_11(x,y),g222(y,z),f1(x1,y1)]". The output is: Function #1 : 11(x,y) Function #2 : g222(y,z) Function #3 : f1(x1,y1). Everything up to and including the underscore (_) is removed. – Raj May 12 '18 at 09:43
  • Modify `[a-z0-9]` to include any valid characters that can appear in your function name. For example, `[a-z0-9_]`. – Robᵩ May 14 '18 at 23:46
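
As a sketch of the change suggested in the comment above: `\w` matches letters, digits, and the underscore, so `\w+` is one way to cover names such as `f1_11`. This assumes function names contain only those characters and that calls are not nested.

```python
import re

line = "[f1_11(x,y),g222(y,z),f1(x1,y1)]"

# \w+ matches letters, digits and underscores, so "f1_11" is kept whole.
sep_action_labels = re.findall(r"\w+\(.*?\)", line)
for j, x in enumerate(sep_action_labels, 1):
    print(f"Function #{j} : {x}")
# Function #1 : f1_11(x,y)
# Function #2 : g222(y,z)
# Function #3 : f1(x1,y1)
```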

If you're not required to use regular expressions, this might be easier. It is easy to follow: it walks through the string, collects each function string into a list, and keeps track of parentheses so that commas inside a function's argument list do not split it.

def getFuncList(line):
  """
  Assumes a comma-separated list that opens and closes with square brackets
  """
  line = line[1:-1] # strip square brackets
  funcs = []

  current = ""
  brack_stack = 0 # depth of open parentheses; commas inside a function must not split it
  for char in line:
    if char == "(":
      brack_stack += 1 
    elif char == ")":
      brack_stack -= 1 

    if char == "," and brack_stack == 0:
      # new function, clear current and append to list
      funcs.append(current)
      current = ""
    else:
      current += char
  funcs.append(current)
  return funcs


line="[f(x,y),g(y,z),f1(x1,y1)]"
func_list = getFuncList(line)
print({f"Function {i}": func for i, func in enumerate(func_list, 1)}) # make and print the dictionary
# {'Function 1': 'f(x,y)', 'Function 2': 'g(y,z)', 'Function 3': 'f1(x1,y1)'}
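
To illustrate the depth-tracking point, here is a self-contained sketch that applies the same idea to a nested call, a case where splitting on `),` or matching a non-greedy `\(.*?\)` would cut the outer call short. The name `split_top_level` and the nested sample input are assumptions made for illustration.

```python
def split_top_level(line):
    """Split a '[...]' list on commas that are not inside parentheses."""
    body = line.strip()[1:-1]  # drop the surrounding square brackets
    parts, current, depth = [], [], 0
    for char in body:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == "," and depth == 0:
            # top-level comma: the current function is complete
            parts.append("".join(current))
            current = []
        else:
            current.append(char)
    parts.append("".join(current))
    return parts


nested = "[f(g(x,y),z),h(a)]"
result = {f"Function {i}": func for i, func in enumerate(split_top_level(nested), 1)}
print(result)
# {'Function 1': 'f(g(x,y),z)', 'Function 2': 'h(a)'}
```

Collecting characters in a list and joining once per function avoids repeated string concatenation, though for short inputs like these the difference is negligible.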
user1762507