1

My specific concerns are

  1. When parsing a parameter, is it intuitive for a future maintainer to understand code that depends on throwing an error?
  2. Is it expensive to be throwing exceptions as a matter of course for the default case? (seems like it might be according to https://stackoverflow.com/a/9859202/776940 )

Context

I have a parameter counter that determines the name of a counter to increment, and optionally can increment by a positive or negative integer separated from the counter name by an =. If no increment value is provided, the default magnitude of the increment is 1. The function is fed by breaking up a comma delimited list of counters and increments, so a valid input to the whole process can look like:

"counter1,counter2=2,counter3=-1"

which would increment "counter1" by 1, increment "counter2" by 2 and decrement "counter3" by 1.

How I Originally Wrote It

counterDescriptor = counterValue.split('=')
if len(counterDescriptor) == 1:
    counterName = counterDescriptor[0]
    counterIncr = 1
elif len(counterDescriptor) == 2:
    counterName = counterDescriptor[0]
    counterIncr = int(counterDescriptor[1])
else:
    counterName, counterIncr = ('counterParsingError', 1)

which strikes me, as I recently came back to look at it, as overly verbose and clunky.

Is this a more or less Pythonic way to code that behavior?

def cparse(counter):
    try:
        desc,mag = counter.split('=')
    except ValueError:
        desc = counter
        mag = ''
    finally:
        if mag == '':
            mag = 1
    return desc, int(mag)

With these test cases, I see:

>>> cparse("byfour=4")
('byfour', 4)
>>> cparse("minusone=-1")
('minusone', -1)
>>> cparse("equalAndNoIncr=")
('equalAndNoIncr', 1)
>>> cparse("noEqual")
('noEqual', 1)

These test cases that would have been caught how I originally wrote it (above) won't get caught this way:

>>> cparse("twoEquals=2=3")
('twoEquals=2=3', 1)
>>> cparse("missingComma=5missingComma=-5")
('missingComma=5missingComma=-5', 1)

and this last test case doesn't get caught by either way of doing it. Both make the int() vomit:

>>> cparse("YAmissingComma=5NextCounter")
ValueError: invalid literal for int() with base 10: '5NextCounter'

I'm glad I discovered this problem by asking this question. The service that consumes this value would eventually choke on it. I suppose I could change the one line return desc, int(mag) of the function to this:

    if desc.find("=")<0 and (mag=='0' or (mag if mag.find('..') > -1 else mag.lstrip('-+').rstrip('0').rstrip('.')).isdigit()):
        return desc, int(mag)
    else:
        return 'counterParsingError: {}'.format(desc), 1

(hat tip to https://stackoverflow.com/a/9859202/776940 for figuring out that this was the fastest way offered in that discussion to determine if a string is an integer)

Community
  • 1
  • 1
Rob Fagen
  • 774
  • 1
  • 8
  • 24
  • 1
    I would only ever consider handling ValueErrors or this case if parsing an object may fail. The way it's being used here is only raising because the list you obtained could not be unpacked. I wouldn't consider it pythonic. Same reason we shouldn't handle AttributeErrors if we knew we were dealing with `None` values when they could just test for. – Jeff Mercado Oct 21 '17 at 00:19
  • @JeffMercado it gets raised either if the parameter is defaulting to 'increment=1' and completely leaves off the equals, or if the parameter is malformed (e.g. with more than one equals, or with a non-numeric character in the increment or if the full string of parameters was missing a comma between two of the counter values). The way I did it initially was explicitly parsing and testing, and I was missing a unit test case for 'foo=3bar=5' where there should be a comma before 'bar'. Thanks for the feedback. – Rob Fagen Oct 21 '17 at 01:19

1 Answers1

2

I would consider that pythonic, though you might perhaps prefer:

def cparse(counter):
    if "=" not in counter:
        # early exit for this expected case
        return (counter, 1)
    desc, mag = counter.split("=", maxsplit=1)
    # note the use of the optional maxsplit to prevent ValueErrors on "a=b=c"
    # and since we've already tested and short-circuited out of the "no equals" case
    # we can now consider this handled completely without nesting in a try block.
    try:
        mag = int(mag)
    except ValueError:
        # can't convert mag to an int, this is unexpected!
        mag = 1
    return (desc, mag)

You can tweak this to ensure you get the right output while parsing strings like a=b=c. If you expect to receive ('a', 1) then keep the code as-is. If you expect ('a=b', 1) you can use counter.rsplit instead of counter.split.

Adam Smith
  • 52,157
  • 12
  • 73
  • 112