Documents is the parent table. Paragraphs is the child table.
Users filter Documents based on various search criteria. Then I wish to annotate Documents with certain Paragraphs filtered by a text query. The same text query is used to filter Documents and rank them (SearchRank). This ranking makes it necessary to start from Documents and annotate them with Paragraphs, instead of starting from Paragraphs and grouping them by Document.
The postgresql way of concatenating one text field from multiple rows in Paragraphs would be the following:
SELECT array_to_string(
ARRAY(
SELECT paragraph.text
FROM paragraph
WHERE document id = '...'
ORDER BY paragraph.number),
', ');
I am trying to translate this into django coding.
I have tried numerous django approaches, to no avail. I can annotate 1 Paragraph. Query_sum is a Q() object built from user input.
results = Documents.filter(Query_sum)
sub_paragraphs = Paragraphs.filter(Query_sum).filter(document=OuterRef('id'))
results = results.annotate(paragraphs=Subquery(sub_paragraphs.values('text')[:1], output_field=TextField()))
Problems start when I get rid of slicing [:1].
results = results.annotate(paragraphs=Subquery(sub_paragraphs.values('text'), output_field=TextField()))
I then get the following error: "more than one row returned by a subquery used as an expression".
To fix that, I tried to use ArrayAgg and StringAgg. I made quite a mess ;-)
The Documents queryset (result) should be annotated either with a list of relevant Paragraphs (ArrayAgg), or a string of Paragraphs separated by any delimiter (StringAgg).
Any idea of how to proceed? I would be extremely grateful