I have a string containing variable names separated by 'and's and 'or's, such as 'x[1] and x[2] or x[3]'. The number of variable names varies, as does whether an 'and' or an 'or' comes between them. I want to wrap parentheses around each run of variables joined by 'or's. For example, if the string is 'x[1] and x[2] or x[3] and x[4] or x[5] or x[6] and x[7]', I want to change it to 'x[1] and (x[2] or x[3]) and (x[4] or x[5] or x[6]) and x[7]'.

I'm not even a novice at regex. Is there a fairly elegant and efficient way to do this using regex in Python? Any help would be greatly appreciated.

Josh

  • This is doable, but it sounds like the wrong way to solve whatever problem you want this string for. – user2357112 Jul 27 '16 at 22:44
  • If you don't know regex, having someone give you a complex regex won't be very helpful because you won't understand it and won't be able to maintain it. I suggest you spend one whole day at a computer with regex documentation, and figure it out yourself. The effort will pay off for your entire career. Regular expressions are only hard if you don't take the time to learn them. One day is likely all you'll need. – Bryan Oakley Jul 27 '16 at 22:58

2 Answers

This might do what you want:

import re

s = 'x[1] and x[2] or x[3] and x[4] or x[5] or x[6] and x[7]'
s = re.sub(r'(\S+(?:\s*or\s*\S+)+)', r'(\1)', s)
assert s == 'x[1] and (x[2] or x[3]) and (x[4] or x[5] or x[6]) and x[7]'
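
One caveat with this pattern is that a bare `or` also matches inside longer words. The sketch below is an illustration added here, not part of the original answer; the names `naive`, `bounded`, and `sample` are made up for the demonstration. It compares the pattern with a variant that puts `\b` word boundaries around `or`:

```python
import re

# Pattern from the snippet above: 'or' can match anywhere, even inside a word.
naive = r'(\S+(?:\s*or\s*\S+)+)'
# Same pattern with word boundaries, so only a standalone 'or' counts.
bounded = r'(\S+(?:\s*\bor\b\s*\S+)+)'

sample = 'door and floor'  # contains no 'or' operator at all

naive_result = re.sub(naive, r'(\1)', sample)
bounded_result = re.sub(bounded, r'(\1)', sample)

print(naive_result)    # altered: the 'or' inside 'door' is taken as an operator
print(bounded_result)  # left unchanged
```

With the inputs in the question (names like `x[1]`) the difference never shows up, so it only matters if variable names can contain the letters "or".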

EDIT: A slightly more robust expression and more test cases:

import re

tests = (
    ('x[1] and x[2] or x[3] and x[4] or x[5] or x[6] and x[7]',
    'x[1] and (x[2] or x[3]) and (x[4] or x[5] or x[6]) and x[7]'),
    ('door and floor', 'door and floor'),
    ('more and more and more', 'more and more and more')
)
for test, expected in tests:
    actual = re.sub(r'\S+(?:\s*\bor\b\s*\S+)+', r'(\g<0>)', test)
    assert actual == expected
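
For readers new to regex, the same pattern can be written with `re.VERBOSE` so that each piece carries a comment. This is only a sketch of how the expression above breaks down; the names `OR_RUN` and `wrap_or_runs` are hypothetical and not from the original answer.

```python
import re

# The same expression as above, spread out and annotated.
OR_RUN = re.compile(r"""
    \S+          # first operand: a run of non-whitespace characters
    (?:          # start of a non-capturing group
        \s*      # optional whitespace before the operator
        \bor\b   # the standalone word 'or' (word boundaries on both sides)
        \s*      # optional whitespace after the operator
        \S+      # the next operand
    )+           # one or more further 'or <operand>' pieces
""", re.VERBOSE)


def wrap_or_runs(expr):
    """Wrap each run of operands joined by 'or' in parentheses."""
    # \g<0> in the replacement refers to the entire match.
    return OR_RUN.sub(r'(\g<0>)', expr)


print(wrap_or_runs('x[1] and x[2] or x[3] and x[4]'))
```

Like the original, this assumes operands contain no whitespace and that the input has no parentheses of its own.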
Robᵩ

As you already have an answer with a regex method, here is a method that doesn't require a regex:

>>> s = 'x[1] and x[2] or x[3] and x[4] or x[5] or x[6] and x[7]'
>>> ' and '.join(['(%s)' % w if ' or ' in w else w for w in s.split(' and ')])
'x[1] and (x[2] or x[3]) and (x[4] or x[5] or x[6]) and x[7]'

How it works

The first step is to split on ' and ':

>>> s.split(' and ')
['x[1]', 'x[2] or x[3]', 'x[4] or x[5] or x[6]', 'x[7]']

The next step is to decide whether each substring needs to be surrounded by parens. That is done with a conditional (ternary) expression:

>>> w = 'x[2] or x[3]'; '(%s)' % w if ' or ' in w else w
'(x[2] or x[3])'
>>> w = 'x[1]'; '(%s)' % w if ' or ' in w else w
'x[1]'

The last step is to reassemble the string with ' and '.join(...).
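
The three steps above can be collected into a small function. This is a minimal sketch; the name `wrap_or_groups` and its separator parameters are illustrative assumptions, and it relies on the operators being surrounded by exactly one space on each side, as in the question's examples.

```python
def wrap_or_groups(expr, and_sep=' and ', or_sep=' or '):
    """Parenthesize every 'and'-separated chunk that contains an 'or'."""
    # Step 1: split the expression on the 'and' separator.
    parts = expr.split(and_sep)
    # Step 2: wrap only the chunks that contain an 'or'.
    wrapped = ['(%s)' % p if or_sep in p else p for p in parts]
    # Step 3: put the chunks back together.
    return and_sep.join(wrapped)


s = 'x[1] and x[2] or x[3] and x[4] or x[5] or x[6] and x[7]'
print(wrap_or_groups(s))
```

Because the separators include their surrounding spaces, words such as 'door' or 'band' are not mistaken for operators.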

John1024