
Thank you in advance for your answers.

How can I avoid duplicated queries? If I loop over only 6 products it takes 50 ms (6 duplicated queries), but when I loop over 7000 products it takes too long (7000 duplicated queries). Is there some way of caching all the products so that I don't have to run a query on every iteration of the for loop, or should I iterate in a different way?

Thanks.

models.py

class TareaDeScrap(Tarea):
  producto = models.ForeignKey(Producto)
  tipo = models.CharField(max_length=20, default='all')
  idioma = models.ForeignKey(Idioma)

class Dato(models.Model):
  campo = models.CharField(max_length=45)
  tarticulo = models.IntegerField(default=0)
  db_field = models.IntegerField(default=0)

  def __str__(self):
    return self.campo

class Producto(models.Model):
  tdbaseges = models.CharField(max_length=45, unique=True)
  codigo_O = models.CharField(max_length=60, null=True)
  referencia = models.CharField(max_length=500, null=True)
  codigo_limpio = models.CharField(max_length=60)
  codigo_fabricante = models.CharField(max_length=10)
  categoria = models.ForeignKey(Categoria, null=True)
  fabricante = models.ForeignKey('tarifas.Fabricante', blank=True, null=True)

class DatoProducto(models.Model):
  producto = models.ForeignKey(Producto)
  dato = models.ForeignKey(Dato)
  informacion = models.TextField()
  tarea = models.ForeignKey('main.Tarea', null=True)

views.py

tareas = TareaDeScrap.objects.filter(batch_id=batch_id, estado='SAVED').select_related("producto")

for tarea in tareas:
    producto_modificados.setdefault(tarea.producto.tdbaseges, {})

    datos_modificados = DatoProducto.objects.filter(producto=tarea.producto).select_related("dato")  # tarea=tarea)

    for dato in datos_modificados:
        producto_modificados[tarea.producto.tdbaseges].setdefault(dato.dato.campo, dato.informacion)
        cabeceras_productos_modificados.setdefault(dato.dato.campo, [])

Django Debug Toolbar SQL duplicates

SELECT ••• FROM `importacion_datoproducto` INNER JOIN 
`importacion_dato` ON (`importacion_datoproducto`.`dato_id` = 
`importacion_dato`.`id`) WHERE 
`importacion_datoproducto`.`producto_id` = 132142
   Duplicated 6 times.

SELECT ••• FROM `importacion_datoproducto` INNER JOIN 
`importacion_dato` ON (`importacion_datoproducto`.`dato_id` = 
`importacion_dato`.`id`) WHERE 
`importacion_datoproducto`.`producto_id` = 132144
   Duplicated 6 times.  

SELECT ••• FROM `importacion_datoproducto` INNER JOIN 
`importacion_dato` ON (`importacion_datoproducto`.`dato_id` = 
`importacion_dato`.`id`) WHERE 
`importacion_datoproducto`.`producto_id` = 100613
   Duplicated 6 times.  

SELECT ••• FROM `importacion_datoproducto` INNER JOIN 
`importacion_dato` ON (`importacion_datoproducto`.`dato_id` = 
`importacion_dato`.`id`) WHERE 
`importacion_datoproducto`.`producto_id` = 100613
   Duplicated 6 times.  

SELECT ••• FROM `importacion_datoproducto` INNER JOIN 
`importacion_dato` ON (`importacion_datoproducto`.`dato_id` = 
`importacion_dato`.`id`) WHERE 
`importacion_datoproducto`.`producto_id` = 100613
   Duplicated 6 times.

SELECT ••• FROM `importacion_datoproducto` INNER JOIN 
`importacion_dato` ON (`importacion_datoproducto`.`dato_id` = 
`importacion_dato`.`id`) WHERE 
`importacion_datoproducto`.`producto_id` = 100613
   Duplicated 6 times.
Luis
  • Read this [Django--Database access optimization](https://docs.djangoproject.com/en/2.0/topics/db/optimization/) – JPG Jul 18 '18 at 10:59
  • Perform `prefetch_related`s. – Willem Van Onsem Jul 18 '18 at 11:32
  • @WillemVanOnsem I know I must use prefetch_related, but I don't know what form it should take in this case. Because of the sparse information and the few examples, I can't work it out, although I have tried it in different ways. If you look at my code you can see that I already use prefetch_related and select_related; I have reduced the queries from 400 to 11 – Luis Jul 18 '18 at 11:39
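
One possible direction, following the comments above: instead of one `DatoProducto` query per task, fetch the rows for every product in the batch with a single query and group them in memory. The sketch below is only an illustration under assumptions. The helper name `agrupar_datos`, its `codigos` parameter, and the sample rows are hypothetical, and the Django query shown in the comments has not been tested against this schema. The grouping itself is plain Python, so it runs without a database.

```python
def agrupar_datos(filas, codigos=()):
    """Group (tdbaseges, campo, informacion) rows like the original nested loop.

    `codigos` optionally seeds an empty dict for products that have no rows,
    which the original loop also creates through its first setdefault call.
    """
    producto_modificados = {codigo: {} for codigo in codigos}
    cabeceras_productos_modificados = {}
    for tdbaseges, campo, informacion in filas:
        # setdefault keeps the first value seen for a field, as in the question
        producto_modificados.setdefault(tdbaseges, {}).setdefault(campo, informacion)
        cabeceras_productos_modificados.setdefault(campo, [])
    return producto_modificados, cabeceras_productos_modificados


# In the view, the rows could come from two queries in total instead of one
# per task (assumed, untested against this schema):
#
#   tareas = TareaDeScrap.objects.filter(batch_id=batch_id, estado='SAVED')
#   codigos = tareas.values_list("producto__tdbaseges", flat=True)
#   filas = DatoProducto.objects.filter(
#       producto__in=tareas.values("producto")
#   ).values_list("producto__tdbaseges", "dato__campo", "informacion")

# Made-up stand-in rows so the sketch can be run on its own.
filas_ejemplo = [
    ("P-1", "nombre", "Taladro"),
    ("P-1", "peso", "2 kg"),
    ("P-2", "nombre", "Sierra"),
    ("P-1", "nombre", "valor repetido"),  # ignored: the first value wins
]
productos, cabeceras = agrupar_datos(filas_ejemplo, codigos=["P-1", "P-2", "P-3"])
```

Because the number of queries no longer depends on the number of tasks, the cost for 7000 products should be dominated by transferring the rows rather than by round trips. Note that without an explicit `order_by`, which value counts as "first" for a repeated field is up to the database, just as in the original loop.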
