Preface

I was wondering how to model data classes in a Pythonic way. Specifically, I’m talking about DTOs (Data Transfer Objects).

I found a good answer in @jeff-oneill’s question “Using Python class as a data container”, where @joe-kington made a good point about using the built-in namedtuple.

Question

Section 8.3.4 of the Python 2.7 documentation has a good example of how to combine several named tuples. My question is how to achieve the reverse.

Example

Considering the example from documentation:

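>>> from collections import namedtuple
>>> Point = namedtuple('Point', 'x y')
>>> p = Point(11, 22)
>>> p._fields            # view the field names
('x', 'y')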

>>> Color = namedtuple('Color', 'red green blue')
>>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
>>> Pixel(11, 22, 128, 255, 0)
Pixel(x=11, y=22, red=128, green=255, blue=0)

How can I deduce a “Color” or a “Point” instance from a “Pixel” instance?

Preferably in pythonic spirit.

kuza
  • Do you mean that you want to split a `Pixel` namedtuple into a `Point` and a `Color`? – PM 2Ring Feb 21 '17 at 14:22
  • Not exactly *split*, but to be able to instantiate “Color” or “Point” while only having an instance of “Pixel”, just as shown in the accepted answer. – kuza Feb 22 '17 at 07:15
  • Ok. You may be interested in my alternative implementation, and in my old answer that shows how to safely combine multiple namedtuples that may have duplicate field names. – PM 2Ring Feb 22 '17 at 10:34

5 Answers

5

Here it is. By the way, if you need this operation often, you can write a function that creates color_ins from pixel_ins, or even one that works for any sub-namedtuple!

from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

pixel_ins = Pixel(x=11, y=22, red=128, green=255, blue=0)
color_ins = Color._make(getattr(pixel_ins, field) for field in Color._fields)

print(color_ins)

Output: Color(red=128, green=255, blue=0)

Function for extracting arbitrary subnamedtuple (without error handling):

def extract_sub_namedtuple(parent_ins, child_cls):
    return child_cls._make(getattr(parent_ins, field) for field in child_cls._fields)

color_ins = extract_sub_namedtuple(pixel_ins, Color)
point_ins = extract_sub_namedtuple(pixel_ins, Point)
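
The function above skips error handling. As a minimal sketch of what that handling might look like (the name `extract_sub_namedtuple_checked` and the choice of `TypeError` are assumptions for illustration, not part of the original answer), the parent's fields can be checked before building the child:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)


def extract_sub_namedtuple_checked(parent_ins, child_cls):
    """Build child_cls from the same-named fields of parent_ins."""
    # Collect the child fields that the parent instance does not provide.
    missing = [f for f in child_cls._fields if f not in parent_ins._fields]
    if missing:
        raise TypeError('{} has no field(s) {} required by {}'.format(
            type(parent_ins).__name__, ', '.join(missing), child_cls.__name__))
    return child_cls._make(getattr(parent_ins, f) for f in child_cls._fields)


pixel_ins = Pixel(x=11, y=22, red=128, green=255, blue=0)
color_ins = extract_sub_namedtuple_checked(pixel_ins, Color)
point_ins = extract_sub_namedtuple_checked(pixel_ins, Point)
```

Without the check, a missing field would surface as an `AttributeError` from `getattr`; the explicit test only makes the message name both types involved.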
Nikolay Prokopyev
  • You can avoid creating a temporary list by using [`namedtuple._make()`](https://docs.python.org/2.7/library/collections.html#collections.somenamedtuple._make): `Color._make(getattr(pixel_ins, field) for field in Color._fields)` – farsil Feb 21 '17 at 14:59
  • Thanks @NikolayProkopyev, that’s way better than what I came up with (which was converting the super-tuple to a dictionary and instantiating the sub-tuple using the “**” (double-asterisk) operator). – kuza Feb 22 '17 at 07:10
1

Point._fields + Color._fields is simply a tuple. So given this:

from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

f = Point._fields + Color._fields

type(f) is just tuple. Therefore, there is no way to know where it came from.

I recommend that you look into attrs for easily defining classes that mainly hold data. This will allow you to use proper inheritance and avoid the overhead of writing all the usual methods for accessing fields by hand.

So you can do

import attr

@attr.s
class Point:
    x, y = attr.ib(), attr.ib()

@attr.s
class Color:
    red, green, blue = attr.ib(), attr.ib(), attr.ib()

class Pixel(Point, Color):
    pass

Now, Pixel.__bases__ will give you (__main__.Point, __main__.Color).

chthonicdaemon
1

Here's an alternative implementation of Nikolay Prokopyev's extract_sub_namedtuple that uses a dictionary instead of getattr.

from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

def extract_sub_namedtuple(tup, subtype):
    d = tup._asdict()
    return subtype(**{k:d[k] for k in subtype._fields})

pix = Pixel(11, 22, 128, 255, 0)

point = extract_sub_namedtuple(pix, Point)
color = extract_sub_namedtuple(pix, Color)
print(point, color)

Output:

Point(x=11, y=22) Color(red=128, green=255, blue=0)

This could be written as a one-liner:

def extract_sub_namedtuple(tup, subtype):
    return subtype(**{k:tup._asdict()[k] for k in subtype._fields})

but it's less efficient because it has to call tup._asdict() for each field in subtype._fields.

Of course, for these specific namedtuples, you can just do

point = Point(*pix[:2])
color = Color(*pix[2:])

but that's not very elegant because it hard-codes the parent field positions and lengths.
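
If positional access is still preferred, the indexes can be looked up from the field names instead of being hard-coded. This is only a sketch; the function name `extract_by_position` is made up for illustration:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)


def extract_by_position(tup, subtype):
    parent_fields = type(tup)._fields
    # tuple.index() raises ValueError if the parent lacks one of the fields.
    indexes = [parent_fields.index(name) for name in subtype._fields]
    return subtype._make(tup[i] for i in indexes)


pix = Pixel(11, 22, 128, 255, 0)
point = extract_by_position(pix, Point)
color = extract_by_position(pix, Color)
```

This keeps working if the fields of `Pixel` are reordered, which the slices `pix[:2]` and `pix[2:]` would not.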

FWIW, there's code to combine multiple namedtuples into one namedtuple, preserving field order and skipping duplicate fields in this answer.
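
The linked code is not reproduced here. As a rough sketch of the idea only (the helper name `merge_namedtuples` is hypothetical and this is not necessarily how the linked answer does it), duplicate names can be skipped while keeping the order of first appearance:

```python
from collections import namedtuple


def merge_namedtuples(name, *types):
    """Create a namedtuple type with the fields of all given types."""
    fields = []
    for t in types:
        for field in t._fields:
            # Keep only the first occurrence of each field name.
            if field not in fields:
                fields.append(field)
    return namedtuple(name, fields)


Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Hue = namedtuple('Hue', 'red green blue')

Merged = merge_namedtuples('Merged', Point, Color, Hue)
```

Note that after such a merge the shared fields hold a single value, so `Color` and `Hue` extracted from a `Merged` instance would always be equal.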

PM 2Ring
  • It is very close to the idea I mentioned in the comments on the accepted answer. I came up with a method where I extract a dictionary, `sub_dict = {k: v for k, v in super_type.iteritems() if k in sub_type._fields}`, and instantiate the sub-namedtuple as `sub_type(**sub_dict)`. But I was looking for a way to reuse the `namedtuple` interface. – kuza Feb 23 '17 at 08:43
0

Another way you could do this is to make the arguments for "Pixel" align with what you actually want instead of flattening all of the arguments for its constituent parts.

Instead of combining Point._fields + Color._fields to get the fields for Pixel, I think you should just have two parameters: location and color. These two fields could be initialized with your other tuples and you wouldn't have to do any inference.

For example:

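from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', 'location color')

# Instead of Pixel(x=11, y=22, red=128, green=255, blue=0)
pixel_ins = Pixel(Point(x=11, y=22), Color(red=128, green=255, blue=0))

# Get the named tuples that the pixel is parameterized by
pixel_color = pixel_ins.color
pixel_point = pixel_ins.location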

By mashing all the parameters together (e.g. x, y, red, green, and blue all on the main object) you don't really gain anything, but you lose a lot of legibility. Flattening the parameters also fails outright if your namedtuples share field names:

from collections import namedtuple 

Point = namedtuple('Point', ['x', 'y'])
Color = namedtuple('Color', 'red green blue')
Hue = namedtuple('Hue', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields + Hue._fields)
# Results in:
#    Traceback (most recent call last):
#      File "<stdin>", line 1, in <module>
#      File "C:\Program Files\Python38\lib\collections\__init__.py", line 370, in namedtuple
#        raise ValueError(f'Encountered duplicate field name: {name!r}')
#    ValueError: Encountered duplicate field name: 'red'
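
For comparison, here is a small sketch of the nested layout with the same three types. The field name `hue` is an assumption added for illustration; the answer itself only names `location` and `color`:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
Color = namedtuple('Color', 'red green blue')
Hue = namedtuple('Hue', 'red green blue')

# One field per constituent tuple, so shared inner names cannot collide.
Pixel = namedtuple('Pixel', 'location color hue')

pixel_ins = Pixel(Point(11, 22), Color(128, 255, 0), Hue(10, 20, 30))

color_red = pixel_ins.color.red
hue_red = pixel_ins.hue.red
```

Both `red` values stay reachable because each one lives on its own inner tuple.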

Noah
0

Background

I originally asked this question because I had to support a spaghetti codebase that used tuples a lot without giving any explanation of the values inside them. After some refactoring, I noticed that I needed to extract typed information from other tuples and was looking for a boilerplate-free and type-safe way of doing it.

Solution

You can subclass the namedtuple definition and implement a custom __new__ method to support that, optionally carrying out some data formatting and validation along the way. See this reference for more details.

Example

from __future__ import annotations

from collections import namedtuple
from typing import Union, Tuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

# Redeclare "Color" to provide a custom creation method
# that can deduce values from several different types
# (the walrus operator below requires Python 3.8+)
class Color(Color):

    def __new__(cls, *subject: Union[Pixel, Color, Tuple[float, float, float]]) -> Color:
        # If we got a single argument of type "Pixel" or "Color"
        if len(subject) == 1 and isinstance((it := subject[0]), (Pixel, Color)):
            # Create from its validated color properties
            return super().__new__(cls, *cls.validate(it.red, it.green, it.blue))
        else:  # Otherwise treat the arguments as raw values and pass them on after validation
            return super().__new__(cls, *cls.validate(*subject))

    @classmethod
    def validate(cls, r, g, b) -> Tuple[float, float, float]:
        # Convert values to float
        r, g, b = (float(it) for it in (r, g, b))
        # Ensure that all values are in the valid range
        assert all(0 <= it <= 1.0 for it in (r, g, b)), 'Some RGB values are invalid'
        return r, g, b

Now you can instantiate Color from any of the supported value types (Color, Pixel, a triplet of numbers) without boilerplate.

color = Color(0, 0.5, 1)
from_color = Color(color)
from_pixel = Color(Pixel(3.4, 5.6, 0, 0.5, 1))

And you can verify that all of the values are equal:

>>> (0.0, 0.5, 1.0) == color == from_color == from_pixel
True
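
If overriding __new__ feels too heavy, a named alternative constructor is another option. The following is only a sketch under a few assumptions: it uses typing.NamedTuple (Python 3.6+) instead of redeclaring the class, the method name `from_fields` is made up for illustration, and it leaves out the range validation shown above:

```python
from typing import NamedTuple


class Pixel(NamedTuple):
    x: float
    y: float
    red: float
    green: float
    blue: float


class Color(NamedTuple):
    red: float
    green: float
    blue: float

    @classmethod
    def from_fields(cls, source) -> 'Color':
        # Read each of our own fields from any object that has the same
        # attribute names, converting the values to float on the way.
        return cls._make(float(getattr(source, name)) for name in cls._fields)


from_pixel = Color.from_fields(Pixel(3.4, 5.6, 0, 0.5, 1))
```

The plain `Color(0, 0.5, 1)` call keeps its default behavior here, and the conversion from another tuple is spelled out at the call site.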
kuza