
It's hard to describe this in the abstract, so let me just give a (simplified & snipped) example:

class ClassificationResults(object):

  #####################################################################################################################
  # These methods all represent aggregate metrics. They all follow the same interface: they return a tuple
  # consisting of the numerator and denominator of a fraction, and a format string that describes the result in terms
  # of that numerator, denominator, and the fraction itself.
  #####################################################################################################################
  metrics  = ['recall', 'precision', 'fmeasure', 'f2measure', 'accuracy']

  # ...

  def recall(self):
    # Recall: of the truly positive instances, how many did we label positive?
    tpos, true = 0, 0
    for prediction in self.predictions:
      if prediction.true_label == 1:
        true += 1
        if prediction.predicted_label == 1:
          tpos += 1
    return tpos, true, "{1} positive instances. We labelled {0} correctly (recall={2:.2})"

  def precision(self):
    # Precision: of the instances we labelled positive, how many are truly positive?
    tpos, pos = 0, 0
    for prediction in self.predictions:
      if prediction.predicted_label == 1:
        pos += 1
        if prediction.true_label == 1:
          tpos += 1
    return tpos, pos, "{1} instances labelled positive. {0} of them correct (precision={2:.2})"

  # ...

  def printResults(self):
    for methodname in self.metrics:
      (num, denom, msg) = getattr(self, methodname)()
      dec = num/float(denom)
      print(msg.format(num, denom, dec))

Is there a better way to indicate that these methods all belong to the same 'family', and to allow them to be called in a loop without naming them every time?

Another way I've done it in the past is to name methods with a common prefix, e.g.

  def metric_precision(self):
    # Precision: of the instances we labelled positive, how many are truly positive?
    tpos, pos = 0, 0
    for prediction in self.predictions:
      if prediction.predicted_label == 1:
        pos += 1
        if prediction.true_label == 1:
          tpos += 1
    return tpos, pos, "{1} instances labelled positive. {0} of them correct (precision={2:.2})"

  # ...

  def printResults(self):
    for methodname in dir(self):
      meth = getattr(self, methodname)
      if methodname.startswith('metric_') and callable(meth):
        (num, denom, msg) = meth()
        dec = num/float(denom)
        print(msg.format(num, denom, dec))

But this feels even more hackish.

I could also turn each method into an instance of a common superclass, but this feels like overkill.

Coquelicot
  • A list/dictionary of methods is a perfectly valid approach for that, especially if the class that hosts those methods has some other methods. – J0HN May 31 '13 at 19:58

3 Answers


Why don't you simply store the actual methods in the list and avoid calls to getattr altogether?

>>> class SomeClass(object):
...     
...     def method_one(self):
...         print("First!")
...         return 0
...     
...     def method_two(self):
...         print("Second!")
...         return 1
...     
...     def method_three(self):
...         print("Third!")
...         return 2
...     
...     _METHODS = (method_one, method_two, method_three)
...     
...     def call_all(self):
...         for method in SomeClass._METHODS:
...             # _METHODS holds plain functions, not bound methods, so pass self explicitly
...             print("Result: {}".format(method(self)))
... 
>>> obj = SomeClass()
>>> obj.call_all()
First!
Result: 0
Second!
Result: 1
Third!
Result: 2

In some other languages, design patterns such as the command pattern may be used, but this is mostly because those languages do not have first-class function/method objects. Python has this kind of pattern built in.
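If defining the list in the middle of the class feels awkward, one possible variation is to let a small decorator fill the list as each method is defined. This is only a sketch: `METRICS`, `metric`, `summary`, and the `(true_label, predicted_label)` pairs are hypothetical names and data invented for illustration, not part of the question's code.

```python
METRICS = []  # hypothetical module-level registry, filled at class definition time


def metric(func):
    """Record func as a metric and return it unchanged (illustrative helper)."""
    METRICS.append(func)
    return func


class ClassificationResults(object):
    def __init__(self, pairs):
        # pairs: list of (true_label, predicted_label) tuples; an assumed,
        # simplified stand-in for the question's prediction objects
        self.pairs = pairs

    @metric
    def recall(self):
        true_pos = sum(1 for t, p in self.pairs if t == 1 and p == 1)
        actual_pos = sum(1 for t, p in self.pairs if t == 1)
        return true_pos, actual_pos

    @metric
    def precision(self):
        true_pos = sum(1 for t, p in self.pairs if t == 1 and p == 1)
        predicted_pos = sum(1 for t, p in self.pairs if p == 1)
        return true_pos, predicted_pos

    def summary(self):
        # METRICS holds plain functions, so self is passed explicitly
        return [(f.__name__, f(self)) for f in METRICS]


results = ClassificationResults([(1, 1), (1, 0), (0, 1), (1, 1), (1, 0)])
for name, (num, denom) in results.summary():
    print("{0}: {1}/{2}".format(name, num, denom))
```

The registry still holds plain functions in definition order, so nothing has to be looked up by name and no method name is spelled twice.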

Bakuriu
  • Cool. I think I found this approach hard to come by because I'm used to defining class variables at the top of the class, but in this case they need to be in the middle so that the methods are defined. I do think it's cleaner though. – Coquelicot Jun 03 '13 at 17:11
1
  • You could use a class decorator to generate the list of metric methods. The advantage of doing this is that you can generate the list of metric methods at class definition time instead of regenerating the list each time printResults is called.

    Another advantage is that you do not have to manually maintain the ClassificationResults.metrics list. You won't have to spell the name of the method in two places, so it is DRY-er, and if you ever add another metric, you do not have to remember to also update ClassificationResults.metrics. You just have to give it a name that starts with metric_.

  • Since each metric method returns a similar object, you might consider formalizing that notion in a class (such as Metric, below). One advantage to doing this is that you could define a __repr__ method to handle how the result is printed. Notice how simple printResults (below) becomes.


def register_metrics(cls):
    for methodname in dir(cls):
        if methodname.startswith('metric_'):
            method = getattr(cls, methodname)
            cls.metrics.append(method)
    return cls


class Metric(object):
    def __init__(self, pos, total):
        self.pos = pos
        self.total = total

    def __repr__(self):
        msg = "{p} instances labelled positive. {t} of them correct (recall={d:.2g})"
        dec = self.pos / float(self.total)
        return msg.format(p=self.total, t=self.pos, d=dec)


@register_metrics
class ClassificationResults(object):
    metrics = []

    def metric_recall(self):
        tpos, pos = 1, 2
        return Metric(tpos, pos)

    def metric_precision(self):
        tpos, true = 3, 4
        return Metric(tpos, true)

    def printResults(self):
        for method in self.metrics:
            print(method(self))

foo = ClassificationResults()
foo.printResults()
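The Metric class above hard-codes one message, so every metric would print as "recall". A possible variant, sketched here under the assumption that each metric supplies its own template (the `template` argument and the `value` property are hypothetical additions), keeps the question's numerator/denominator/format-string interface:

```python
class Metric(object):
    """A fraction plus the template that describes it (illustrative variant)."""

    def __init__(self, num, denom, template):
        self.num = num
        self.denom = denom
        self.template = template

    @property
    def value(self):
        # Returning 0.0 for an empty denominator is an assumption made here
        # to avoid ZeroDivisionError; the original code does not handle it.
        return self.num / float(self.denom) if self.denom else 0.0

    def __str__(self):
        return self.template.format(self.num, self.denom, self.value)


recall = Metric(1, 2, "{1} positive instances. We labelled {0} correctly (recall={2:.2g})")
precision = Metric(3, 4, "{1} instances labelled positive. {0} of them correct (precision={2:.2g})")
print(recall)
print(precision)
```

printResults stays a one-line loop, since each Metric knows how to render itself.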

unutbu

So you basically want to eliminate the getattr call and the need to specify the functions in two places. That, or a command pattern.

Seems like a decent case for an augmented callable, perhaps something like this:

class Command(object):
    def __init__(self, function=None):
        self._function = function

    def function(self, *args):
        return self._function(*args)

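    def name(self):
        # Use the wrapped function's name; fall back to the class name
        # when a subclass overrides function() and no function was passed in.
        return getattr(self._function, '__name__', type(self).__name__)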

    def __call__(self, *args):
        return self.function(*args)

Then some combination of:

commands = []
def recall(my, args):
    ...
commands.append(Command(recall))

class Precision(Command):
    def function(self, my, args):
        ...
commands.append(Precision())  # append an instance, not the class itself

and then

results = [command() for command in commands]

or perhaps

results = [(command.name(), command()) for command in commands]
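Put together, a minimal runnable version of the pieces above might look like the following. It is a sketch only: the two-argument `recall` and `precision` functions and the sample counts are made up for illustration, and the wrapper is trimmed to what the example needs.

```python
class Command(object):
    """Wrap a callable so it can report a name and be invoked uniformly."""

    def __init__(self, function):
        self._function = function

    def name(self):
        return self._function.__name__

    def __call__(self, *args):
        return self._function(*args)


# Hypothetical metric functions taking pre-computed counts.
def recall(true_pos, actual_pos):
    return true_pos / float(actual_pos)


def precision(true_pos, predicted_pos):
    return true_pos / float(predicted_pos)


commands = [Command(recall), Command(precision)]

# Every command is called with the same arguments and paired with its name.
results = [(command.name(), command(2, 4)) for command in commands]
print(results)
```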

Or a Runner:

import itertools

class Runner(object):
    def __init__(self, commands):
        # Group commands by class name so a whole group can be run at once.
        groupings = {}
        for command in commands:
            groupings.setdefault(command.__class__.__name__, []).append(command)
        self.groupings = groupings

    def run(self, group=None):
        # Run one group if named, otherwise every command in every group.
        commands = self.groupings.get(group, []) if group else itertools.chain(*self.groupings.values())
        return [command() for command in commands]

Yadda yadda yadda.

Wrote this code quickly so it may have a typo or two.

Adam

Adam Donahue