I'm working on a Python project that requires me to compile certain attributes of some objects into a dataset. The code I'm currently using looks something like the following:

class VectorBuilder(object):
    SIZE = 5

    def __init__(self, player, frame_data):
        self.player = player
        self.fd = frame_data

    def build(self):
        self._vector = []

        self._add(self.player)
        self._add(self.fd.getSomeData())
        self._add(self.fd.getSomeOtherData())

        char = self.fd.getCharacter()
        self._add(char.getCharacterData())
        self._add(char.getMoreCharacterData())

        assert len(self._vector) == self.SIZE
        return self._vector

    def _add(self, element):
        self._vector.append(element)

However, this code is slightly unclean because adding/removing attributes to/from the dataset also requires correctly adjusting the SIZE variable. The reason I even have the SIZE variable is that the size of the dataset needs to be known at runtime before the dataset itself is created.

I've thought of instead keeping a list of all the functions used to construct the dataset as strings (as in attributes = ['getPlayer', 'fd.getSomeData', ...]) and then defining the build function as something like:

def build(self):
    self._vector = []
    for att in attributes:
        self._vector.append(getattr(self, att)())
    return self._vector

This would let me access the size as simply len(attributes) and I only ever need to edit attributes, but I don't know how to make this approach work with the chained function calls, such as self.fd.getCharacter().getCharacterData().

Is there a cleaner way to accomplish what I'm trying to do?

EDIT:

Some additional information and clarification is necessary.

  1. I was using __ due to some bad advice I read online (essentially saying I should use _ for module-private members and __ for class-private members). I've edited them to _ attributes now.
  2. The getters are a part of the framework I'm using.
  3. The vector is stored as a private class member so I don't have to pass it around the construction methods, which are in actuality more numerous than the simple _add, doing some other stuff like normalisation and bool->int conversion on the elements before adding them to the vector.
  4. SIZE, as it currently stands, is a true constant. It is only ever given a value in the first line of VectorBuilder and never changed at runtime. I realise that I did not clarify this properly in the main post, but new attributes never get added at runtime. The adjustment I was talking about would take place at programming time. For example, if I wanted to add a new attribute, I would need to add it in the build function, e.g.:

    self._add(self.fd.getCharacter().getAction().getActionData().getSpeed())

    and also change the SIZE definition to SIZE = 6.

  5. The attributes are compiled into what is currently a simple Python list (but will probably be replaced with a numpy array), then passed into a neural network as an input vector. However, the neural network itself needs to be built first, and this happens before any data is made available (i.e. before any input vectors are created). To be built successfully, the network needs to know the size of the input vectors it will be receiving. This is why SIZE is necessary, and it is also the reason for the assert statement: to verify that the vectors I'm passing to the network are in fact the size I claimed I would be passing to it.

I'm aware the code is unpythonic; that is why I'm here. The code works, it's just ugly.

Mate de Vita
  • Why are you using `__` attributes? Why are you storing the values in a list instead of just as plain attributes, or at least in a dict? Why do you want getters like `getCharacter()` in the first place? Why is the vector stored anywhere at all when the only time you access it, it's to replace it with a brand-new value that you immediately return? What is that `assert` actually intended to protect against? Everything you're trying to do seems overcomplicated and unpythonic, and without knowing the reasons, it's very hard to say whether one particular element of it is particularly overcomplicated – abarnert Jul 03 '18 at 20:00
  • By convention, an attribute whose name is all capital letters, like `SIZE`, is a constant value that won't ever be changed, yet you mention "adjusting the `SIZE` variable". If you're going to write Python code, I highly recommend reading (and following) the [PEP 8 - Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/) suggestions. Lastly, as @abarnert said, lose the `__` prefix on the attribute names; a simple `_` to indicate something is private would be good enough here. – martineau Jul 03 '18 at 20:10
  • @martineau I _think_ the implication is that `SIZE` should get created at class-creation time (by some module-level code, presumably, since I doubt he'd use a metaclass just for that… but you never know…) and then never change. If so, it's effectively a constant. For example, if he were using a normal `__slots__` class, he could write a trivial decorator that sets `cls.SIZE = len(cls.__slots__)`, and you'd still call that a constant even though technically it's being assigned after the `type` call returns, right? – abarnert Jul 03 '18 at 20:23
  • I've edited the OP to try and answer the questions posed in the above comments. – Mate de Vita Jul 04 '18 at 11:12
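
The decorator idea from abarnert's comment above could look roughly like the following. This is only a sketch: `with_size`, `InputVector`, and the field names are hypothetical and chosen for illustration, and it assumes the vector elements are stored in a `__slots__` class rather than a list.

```python
def with_size(cls):
    """Class decorator: set cls.SIZE to the number of declared slots."""
    cls.SIZE = len(cls.__slots__)
    return cls


@with_size
class InputVector:
    # Hypothetical field names, one per vector element.
    __slots__ = (
        "player",
        "some_data",
        "some_other_data",
        "character_data",
        "more_character_data",
    )

    def __init__(self, *values):
        # Refuse to build a vector whose length differs from SIZE.
        if len(values) != self.SIZE:
            raise ValueError("expected %d values, got %d" % (self.SIZE, len(values)))
        for name, value in zip(self.__slots__, values):
            setattr(self, name, value)

    def as_list(self):
        # Return the values in slot-declaration order.
        return [getattr(self, name) for name in self.__slots__]


vector = InputVector(0, 1, 2, 0.5, 0.25)
```

Here `InputVector.SIZE` is available as soon as the class is defined, and adding a field only means editing `__slots__`.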

1 Answer

Instead of providing the attribute names as a list of strings, why not pass the build function the values returned by your getter functions directly?

SIZE would then simply be the length of the variable argument list, for example with a signature like build(self, *args).
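
A minimal sketch of this suggestion follows, with the values handed over as `*args`. Two things here are assumptions added for illustration and are not part of the answer: the `Fake*` classes stand in for the framework's objects, and the getters are kept in a `FEATURES` list of callables so that `SIZE` can be read before any data exists. The helper name `_from_values` is likewise hypothetical.

```python
# Hypothetical stand-ins for the framework's objects, for illustration only.
class FakeCharacter:
    def getCharacterData(self):
        return 0.5

    def getMoreCharacterData(self):
        return 0.25


class FakeFrameData:
    def getSomeData(self):
        return 1

    def getSomeOtherData(self):
        return 2

    def getCharacter(self):
        return FakeCharacter()


class VectorBuilder:
    # One callable per vector element (an assumption added for this sketch).
    # Each takes the player and the frame data and returns a single value,
    # so chained getter calls are written out as ordinary Python.
    FEATURES = [
        lambda player, fd: player,
        lambda player, fd: fd.getSomeData(),
        lambda player, fd: fd.getSomeOtherData(),
        lambda player, fd: fd.getCharacter().getCharacterData(),
        lambda player, fd: fd.getCharacter().getMoreCharacterData(),
    ]
    # Available as soon as the class is defined, before any data exists.
    SIZE = len(FEATURES)

    def __init__(self, player, frame_data):
        self.player = player
        self.fd = frame_data

    def build(self):
        # Evaluate every feature, then pass the values on as *args.
        values = [feature(self.player, self.fd) for feature in self.FEATURES]
        return self._from_values(*values)

    def _from_values(self, *args):
        # The vector length is simply the number of values passed in.
        return list(args)


vector = VectorBuilder(0, FakeFrameData()).build()
```

With `*args` alone, the length is only known once a vector has been built; the `FEATURES` list is what makes it available up front, and adding an attribute then means adding one entry to that list.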

I. Amon