I have a Django system that runs billing for thousands of customers on a regular basis. Here are my models:
class Invoice(models.Model):
balance = models.DecimalField(
max_digits=6,
decimal_places=2,
)
class Transaction(models.Model):
amount = models.DecimalField(
max_digits=6,
decimal_places=2,
)
invoice = models.ForeignKey(
Invoice,
on_delete=models.CASCADE,
related_name='invoices',
null=False
)
When billing is run, thousands of invoices with tens of transactions each are created using several nested for
loops, which triggers an insert for each created record. I could run bulk_create()
on the transactions for each individual invoice, but this still results in thousands of calls to bulk_create()
.
How would one bulk-create thousands of related models so that the relationship is maintained and the database is used in the most efficient way possible?
Notes:
- I'm looking for a native Django solution that would work on all databases (with the possible exception of SQLite).
- My system runs billing in a celery task to decouple long-running code from active requests, but I am still concerned with how long it takes to complete a billing cycle.
- The solution should assume that other requests or running tasks are also reading from and writing to the tables in question.