0

I'm working on a page that lists companies and their employees. Employees have sales. These are saved in a database. Now I need to list all of them. My problem is that the current solution is not fast. One page load takes over 15 seconds.

Currently I have done the following:

companies = {}
employees = {}

for company in Company.objects.all():
    sales_count = 0
    sales_sum = 0

    companies[company.id] = {}
    companies[company.id]["name"] = company.name

    for employee in company.employees.all():
        employee_sales_count = 0
        employee_sales_sum = 0

        employees[employee.id] = {}
        employees[employee.id]["name"] = employee.first_name + " " + employee.last_name

        for sale in employee.sales.all():
            employee_sales_count+= 1
            employee_sales_sum += sale.total

        employees[employee.id]["sales_count"] = employee_sales_count
        employees[employee.id]["sales_sum"] = employee_sales_sum

        sales_count += employee_sales_count
        sales_sum += employee_sales_sum

    companies[company.id]["sales_count"] = sales_count
    companies[company.id]["sales_sum"] = sales_sum

I'm new to Python, not sure if this is a "pythonic" way to do things.

This makes 1500 queries to the database with 100 companies and some employees and sales for each. How should I improve my program to make it efficient?

MikkoP
  • 4,864
  • 16
  • 58
  • 106

1 Answers1

0

Avoid nesting of database queries in loops - it's a fine way to performance hell! :-)

Since you're counting all sales for all employees I suggest building your employee and sales dicts on their own. Don't forget to import defaultdict and you may want to lookup how group by and suming/counting works in Django :-)

Lets see... this should give you a indication where to go from here:

# build employee dict
employee_qset = Employee.objects.all()
employees = defaultdict(dict)
for emp in employee_qset.iterator():
    employees[emp.company_id][emp.id] = emp


# build sales dict
sales_qset = Sales.objects.all()
sales = defaultdict(dict)
for sale in sales_qset.iterator():
    # you could do some calculations here, like sum, or better yet do sums via annotate and group_by in the database
    sales[sale.employee_id][sale.id] = sale


# get companies
companies_qset = Companies.objects.all()
companies = {company.id: company for company in companies_qset.iterator()}


for company in companies.itervalues():
    # assign employees, assign sales, etc.
    pass
ACimander
  • 1,852
  • 13
  • 17
  • Thanks! What about if I want to paginate them using the build in Paginator? As I need to combine those dictionaries, I could simply use the paginator on that one. But this will cause a lot of unrelated data to be loaded. So if I want to show 15 per page, I have 500 loaded. – MikkoP Mar 08 '15 at 07:33
  • You could load and paginate the companies before the other data, build a list with the company ids and filter Sales and Employees via company__in=id_list – ACimander Mar 08 '15 at 11:32
  • Yea, now these are executed very quickly. But how do I show the data then? I can't use `{{ len(sales[company.id]) }}` for example to show the total amount of orders for one company. I came up with this: http://pastebin.com/KDyzEUGD – MikkoP Mar 08 '15 at 12:12
  • You have to do the count and sum work in your view and just display assigned context variables in your template. Are you returning your dict with the data in the get_context_data method? – ACimander Mar 08 '15 at 14:41