I have two models, one of which has a TreeManyToManyField (a many-to-many field for django-mptt models).
class ProductCategory(models.Model):
    pass

class Product(models.Model):
    categories = TreeManyToManyField('ProductCategory')
I'm trying to add categories to the Product objects in bulk by using the through-table, as seen in this thread or this.
I do this with a list of pairs: product_id and productcategory_id, which is also what my through-table contains. This yields the following error:
sqlite3.IntegrityError: UNIQUE constraint failed: products_product_categories.product_id, products_product_categories.productcategory_id
My code looks as follows:
def bulk_add_cats(generic_df):
    # A pandas dataframe where ["product_object"] is the product_id
    # and ["found_categories"] is the productcategory_id to add to that product.
    generic_df = generic_df.explode("found_categories")
    generic_df["product_object"] = generic_df.apply(
        lambda x: Product.objects.filter(
            merchant=x["forhandler"], product_id=x["produktid"]
        ).values('id')[0]['id'], axis=1
    )

    # The actual code
    # Here row.product_object is the product_id and row.found_categories is one
    # particular productcategory_id, so they make up a pair to add to the through-table.
    through_objs = [
        Product.categories.through(
            product_id=row.product_object,
            productcategory_id=row.found_categories
        ) for row in generic_df.itertuples()
    ]
    Product.categories.through.objects.bulk_create(through_objs, batch_size=1000)
I have also done the following, to check that there are no duplicate pairs in the set of pairs that I want to add. There were none:
print(generic_df[generic_df.duplicated(subset=['product_object','found_categories'], keep=False)])
I suspect that the error happens because some of the product-to-productcategory relations already exist in the table. So maybe I should first check whether that is the case for each of the pairs, and only then do the bulk_create. I would like to retain efficiency, though, and would be sad if I had to iterate through each pair to check. Is there a way to bulk-update-or-create for this type of problem?
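To make the check-first idea concrete, here is a minimal sketch of what I have in mind. The helper name split_new_pairs is made up for illustration, and the commented part assumes that the existing rows can be fetched from the through-table in a single query with values_list, so nothing is checked pair by pair against the database. I have not verified this against my real data.

```python
def split_new_pairs(candidate_pairs, existing_pairs):
    """Return the (product_id, productcategory_id) pairs that are not stored yet.

    Hypothetical helper: duplicates inside candidate_pairs are dropped as well,
    and the original order of the candidates is kept.
    """
    seen = set(tuple(pair) for pair in existing_pairs)
    new_pairs = []
    for pair in candidate_pairs:
        pair = tuple(pair)
        if pair not in seen:
            seen.add(pair)
            new_pairs.append(pair)
    return new_pairs


# Intended use with the through-table (not executed here, it needs Django):
#
#   Through = Product.categories.through
#   candidates = list(zip(generic_df["product_object"],
#                         generic_df["found_categories"]))
#   existing = Through.objects.filter(
#       product_id__in={p for p, _ in candidates}
#   ).values_list("product_id", "productcategory_id")
#   Through.objects.bulk_create(
#       [Through(product_id=p, productcategory_id=c)
#        for p, c in split_new_pairs(candidates, existing)],
#       batch_size=1000,
#   )

# Small stand-alone example with made-up ids:
existing = [(1, 10), (2, 20)]
candidates = [(1, 10), (1, 11), (2, 20), (3, 30), (3, 30)]
print(split_new_pairs(candidates, existing))  # [(1, 11), (3, 30)]
```

As far as I can tell, bulk_create also accepts ignore_conflicts=True from Django 2.2 onwards, which might make the pre-check unnecessary, but I am not sure how that behaves with the through-table of a TreeManyToManyField.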
Or what do you think? Any help is highly appreciated :-)