Initializing an sql query as a dataclass

Question

The goal is to achieve the selected columns in SQL based on the some manual classification of the fields, e.g.

from dataclasses import dataclass


@dataclass
class Q:
    fruits = ('taste', 'color', 'shine')
    vege = ('is_green', 'color', 'juiceable')
    meat = ('is_white', 'is_dark', 'is_red', 'meat_name')
    carbs = ('is_refined', 'gi')

    @static_method
    def get_fields(self, add_fruits=False, add_vege=False, add_meat=False, add_carbs=False):
         fields = set()
         fields.update(self.fruits) if add_fruits else None
         fields.update(self.vege) if add_vege else None
         fields.update(self.meat) if add_meat else None
         fields.update(self.carbs) if add_carbs else None
         return sorted(",".join(sorted(fields)))


print(f"SELECT {Q.get_fields(add_fruits=True, add_vege=True)} from food_features")

Given a large list of categories of fields in Q object, how do we avoid the hardcoded in the .get_fields function?

fields.update(self._field_name_) if add_field_name else None

score 2 · Answer 1 · answered Apr 11 '22 at 05:23

Is this the output/functionality you're looking for?

class Q :
    def __init__(self):
        self.categories = {
            'fruits': ('taste', 'color', 'shine'),
            'vege': ('is_green', 'color', 'juiceable'),
            'meat': ('is_white', 'is_dark', 'is_red', 'meat_name'),
            'carbs': ('is_refined', 'gi')
        }
    
    def get_fields(self, groups):
        fields = []
        for group in groups:
            fields.extend(self.categories[group])
        return", ".join(sorted(set(fields)))

value = Q()
print(f"SELECT {value.get_fields(['fruits', 'vege'])} FROM food_features")

Output:

SELECT color, is_green, juiceable, shine, taste FROM food_features

This might have the correct functionality, but does not use a dataclass as the question title indicates (the choice of a dataclass here may be arguable, but is separate matter). — optimus_prime, Apr 17 '22 at 16:38

score 1 · Answer 2 · answered Apr 15 '22 at 13:13

1

You may just use a dict for the categories, and a set comprehension:

>>> @dataclass
class Q:
    fields = {
              'fruits' : ('taste', 'color', 'shine'),
              'vege' : ('is_green', 'color', 'juiceable'),
              'meat' : ('is_white', 'is_dark', 'is_red', 'meat_name'),
              'carbs' : ('is_refined', 'gi')
             }

    @staticmethod
    def get_fields(to_add):
        return ', '.join({f for x in to_add for f in Q.fields[x]})

print(f"SELECT {Q.get_fields(('fruits','vege'))} from food_features")

SELECT taste, juiceable, shine, color, is_green from food_features

answered Apr 15 '22 at 13:13

gimix

3,431
2
5
21

This requires a minor rework of the original code, but I'd say it's definitely the best answer here - easy to read, and does the trick. – optimus_prime Apr 17 '22 at 16:40
@optimus_prime what do u think about my improved solution ? – PleSo Apr 19 '22 at 00:23
1

@PleSo it works....but is still to me ultimately a workaround to implementing a dictionary in the first place - the point here is that we have a key-value situation with the 'fruit'/'vege', etc., and their associated sets - so why reinvent the wheel? I'm guessing it will ultimately just cause you more time later when you come back to the code. Plus, the set comprehension here is quite elegant. – optimus_prime Apr 19 '22 at 02:54
1

@optimus_prime thanks for the feedback. It is true, the dictionary solution is more easy to think about and implement. About reinventing the wheel, I wanted to give him a fully backwards compatible solution for his code, the thing with add_* prefix in arguments and the option to keep fields in the class, which imo is questionable in this case – PleSo Apr 19 '22 at 03:16

PleSo · Answer 3 · 2022-04-19T00:33:08.660

You can achieve this using kwargs and __ getattribute __ method

EDIT:

A more readable & efficient approach:

from dataclasses import dataclass
from functools import reduce

@dataclass
class Q:
    fruits = ('taste', 'color', 'shine')
    vege = ('is_green', 'color', 'juiceable')
    meat = ('is_white', 'is_dark', 'is_red', 'meat_name')
    carbs = ('is_refined', 'gi')

    @classmethod
    def get_fields(cls, **kwargs):
        fields = set()

        for key, is_key_true in kwargs.items():
          if is_key_true:
            fields.update(getattr(cls, key.replace('add_', '')))

        return ", ".join(sorted(fields))

print(f"SELECT {Q.get_fields(add_fruits=True, add_vege=True, add_meat=False)} from food_features")

This will output

SELECT color, is_green, juiceable, shine, taste from food_features

Old code & some explains:

Using kwargs you will receive a dictionary of function arguments, where you can put anything you want to add. For example add_fruits=True, add_vege=True, etc. We will use that dictionary to map the arguments in a tuple (parameter, value), but removing the add_ prefix. We add all of them to a list and you will end up having the values list of attributes and a boolean that tell us if we want to add them to our query or not. [('fruits', True), ('vege', True)]

We iterate in values list, and for every attribute that has True we get the data from the class attribute using __getattribute__ method.

We will have a list of tuples of attributes, we just have to convert them to a list, in order to have all values that we want in a single list.

Now that we have the list of values, we just have to join the list with ", " to comma space them.

And.. that's it..

Check the code below.

from dataclasses import dataclass


@dataclass
class Q:
    fruits = ('taste', 'color', 'shine')
    vege = ('is_green', 'color', 'juiceable')
    meat = ('is_white', 'is_dark', 'is_red', 'meat_name')
    carbs = ('is_refined', 'gi')

    
    def get_fields(self, **kwargs):
        values = [(key.replace('add_', ''), value) for key, value in kwargs.items()]
        tuples_query = [self.__getattribute__(key) for key, value in values if value]
        query = [field for single_tuple in tuples_query for field in single_tuple]
        
        return ", ".join(query)

q = Q()

print(f"SELECT {q.get_fields(add_fruits=True, add_vege=True, add_meat=False)} from food_features")

This will output

SELECT taste, color, shine, is_green, color, juiceable from food_features

rv.kvetch · Answer 4 · 2022-04-15T16:12:05.800

I would personally suggest using an Enum to model the classification of fields or groups for the SQL query.

For example:

from dataclasses import dataclass
from enum import Enum
from typing import Sequence


@dataclass
class Q:

    class Group(Enum):
        FRUIT = ('taste', 'color', 'shine')
        VEG = ('is_green', 'color', 'juiceable')
        MEAT = ('is_white', 'is_dark', 'is_red', 'meat_name')
        CARBS = ('is_refined', 'gi')

    @staticmethod
    def get_fields(groups: Sequence['Q.Group'] = None):
        fields = set()

        if not groups:  # if no groups specified, just select everything
            return '*'

        for g in groups:
            fields.update(g.value)

        return ','.join(sorted(fields))

    @classmethod
    def get_query(cls, *groups: 'Q.Group'):
        return f'SELECT {cls.get_fields(groups)} from food_features'


print(Q.get_query())
print(Q.get_query(Q.Group.FRUIT, Q.Group.VEG))

Output:

SELECT * from food_features
SELECT color,is_green,juiceable,shine,taste from food_features

Initializing an sql query as a dataclass

4 Answers4