2

Given a Django JSONField that is structured as a list of dictionaries:

# JSONField "materials" on MyModel:
[
    {"some_id": 123, "someprop": "foo"},
    {"some_id": 456, "someprop": "bar"},
    {"some_id": 789, "someprop": "baz"},
]

and given a list of values to look for:

myids = [123, 789]

I want to query for all MyModel instances that have a matching some_id anywhere in those lists of dictionaries. I can do this to search in dictionaries one at a time:

# Search inside the third dictionary in each list:
MyModel.objects.filter(materials__2__some_id__in=myids)

But I can't seem to construct a query to search in all dictionaries at once. Is this possible?

shacker
  • 14,712
  • 8
  • 89
  • 89

2 Answers2

2

Given the clue here from Davit Tovmasyan to do this by incrementing through the match_targets and building up a set of Q queries, I wrote this function that takes a field name to search, a property name to search against, and a list of target matches. It returns a new list containing the matching dictionaries and the source objects they come from.

from iris.apps.claims.models import Claim
from django.db.models import Q


def json_list_search(
    json_field_name: str,
    property_name: str, 
    match_targets: list
    ) -> list:
    """

    Args:
        json_field_name: Name of the JSONField to search in
        property_name: Name of the dictionary key to search against
        match_targets: List of possible values that should constitute a match

    Returns:
        List of dictionaries: [
            {"claim_id": 123, "json_obj": {"foo": "y"},
            {"claim_id": 456, "json_obj": {"foo": "z"}
        ]

    Example:
        results = json_list_search(
            json_field_name="materials_data", 
            property_name="material_id", 
            match_targets=[1, 22]
        )

        # (results truncated):
        [
            {
                "claim_id": 1,
                "json_obj": {
                    "category": "category_kmimsg",
                    "material_id": 1,
                },
            },
            {
                "claim_id": 2,
                "json_obj": {
                    "category": "category_kmimsg",
                    "material_id": 23,
                }
            },
        ]
    """

    q_keys = Q()
    for match_target in match_targets:
        kwargs = {
            f"{json_field_name}__contains": [{property_name: match_target}]
        }
        q_keys |= Q(**kwargs)

    claims = Claim.objects.filter(q_keys)

    # Now we know which ORM objects contain references to any of the match_targets 
    # in any of their dictionaries. Extract *relevant* objects and return them
    # with references to the source claim.
    results = []
    for claim in claims:
        data = getattr(claim, json_field_name)
        for datum in data:
            if datum.get(property_name) and datum.get(property_name) in match_targets:
                results.append({"claim_id": claim.id, "json_obj": datum})

    return results
shacker
  • 14,712
  • 8
  • 89
  • 89
1

contains might help you. Should be something like this:

q_keys = Q()

for _id in myids:
    q_keys |= Q(materials__contains={'some_id': _id})

MyModel.objects.filter(q_keys)
Davit Tovmasyan
  • 3,238
  • 2
  • 20
  • 36
  • Hmm. I'm very familiar with using `contains` expressions in JSONField queries, but that's actually not the interesting bit here. The interesting bit is that you are building up a Q query incrementally by iterating through the target matches. Not sure how that will scale if comparing to thousand of possible matches in a single query, but I will do some experiments and report back. – shacker Nov 21 '19 at 06:41
  • Thanks for pointing me in the right direction. I've posted another answer containing a custom function I wrote to accomplish this, using this "incremental Q building" technique. – shacker Nov 21 '19 at 18:51