
I have 7 tables in PostgreSQL integrated with a Django 1.3 application. We access all 7 tables by a name column that exists in every table, while the other columns differ from table to table. Some names have around 100,000 rows, and my problem is that I need to retrieve all the rows that belong to a particular name.

I implemented the retrieval with Django ORM querysets, but it doesn't seem to be effective. The page takes a huge amount of time to load, even though I added pagination and created indexes on all columns of all tables.

I am still getting an nginx 504 Gateway Time-out.

So I need to optimize the query so that it can fetch from all tables even when a name covers 100,000 rows.

What are the ways to optimize this query?

How should I approach such heavy operations in Django?

Here is my code: http://pastebin.com/yNy5pxGG

Please point me in the right direction!

EDIT1:

My Query: http://pastebin.com/A9B2ExVk

TABLES:

 fda_sheet1 = 208778 unique rows
 fda_sheet2 = 291958 unique rows
 fda_sheet3 = 191958 unique rows
 fda_sheet4 = 676363 unique rows
 fda_sheet5 = 262651 unique rows
 fda_sheet6 = 742707 unique rows
 fda_sheet7 = 300000 unique rows

These are my individual tables and their row counts. Here I am LEFT JOINing all of them into one table for performance.

This is also not good, since the combined table takes more than 700 GB.

What is the right way to process this query?

NOTES:

These tables contain various levels of the data. Mainly we have a drug name and an ISR number, and the ISR number exists in all the other tables. Based on the drug we fetch the ISR numbers, and using those ISR numbers we query the other data in the other tables. I hope you understand the bottleneck.
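To illustrate the pattern (the model and field names below are only placeholders, not my real code, which is in the pastebin above), the lookup is roughly:

    # Placeholder models: FdaSheet1 holds drug_name and isr_no,
    # the other sheets hold isr_no plus their own columns.
    isr_numbers = list(
        FdaSheet1.objects.filter(drug_name=drug)
                         .values_list('isr_no', flat=True)
    )
    # Every other sheet is then filtered by those ISR numbers,
    # and each sheet returns many rows per ISR number.
    sheet2_rows = FdaSheet2.objects.filter(isr_no__in=isr_numbers)
    sheet3_rows = FdaSheet3.objects.filter(isr_no__in=isr_numbers)
    # ...and so on for the remaining sheets.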

So:

1. Combining the tables into one is a problem, since the combined table crosses 700 GB.

2. Querying across multiple tables is also a problem, since multiple rows in each table share the same ISR number.

Can you think of any better approaches?

If you have any questions, please let me know.

Please suggest how to fix this.

Nava

1 Answer


You are doing pagination after you evaluate all your queries (casting them to lists) and building your total_data. So all your data has to be retrieved and loaded into memory before any pagination happens. You should avoid that.
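For example (a rough sketch; I'm guessing a model called FdaSheet1 with a name field, since I don't know your exact models), keep the queryset lazy and let the paginator slice it, so the database does the LIMIT/OFFSET:

    from django.core.paginator import Paginator
    from django.shortcuts import render_to_response

    from myapp.models import FdaSheet1  # placeholder model name

    def sheet_list(request, name):
        # No list() here: the queryset stays lazy until the page is sliced.
        qs = FdaSheet1.objects.filter(name=name).order_by('isr_no')
        paginator = Paginator(qs, 100)  # 100 rows per page
        page = paginator.page(request.GET.get('page', 1))
        # page.object_list runs a single query with LIMIT/OFFSET.
        return render_to_response('sheet_list.html', {'page': page})

The same idea applies to each of your per-sheet queries: never cast them to lists before slicing.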

Also, as far as I can see, you have data about one object split across 7 different tables, which is bad for Django's ORM. If you can't modify the database schema, you can try to create a view in the database that simulates one grand table by joining them all together.

That way all the joining and checking for empty fields is done in the database, you will have much clearer Python code, and it will run much faster.
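Something along these lines (an untested sketch; the view name and columns are just guesses, and in Django 1.3 you would normally create the view by hand in psql or in a South migration):

    from django.db import connection, models

    # One-off: create the view in PostgreSQL (column names are guesses).
    CREATE_VIEW_SQL = """
    CREATE VIEW fda_combined AS
    SELECT s1.isr_no,
           s1.drug_name,
           s2.some_column  AS sheet2_column,
           s3.other_column AS sheet3_column
           -- ...join the remaining sheets the same way
    FROM fda_sheet1 s1
    LEFT JOIN fda_sheet2 s2 ON s2.isr_no = s1.isr_no
    LEFT JOIN fda_sheet3 s3 ON s3.isr_no = s1.isr_no;
    """

    def create_combined_view():
        cursor = connection.cursor()
        cursor.execute(CREATE_VIEW_SQL)

    # Read-only model over the view; managed=False stops Django from
    # trying to create or alter the underlying "table".
    class FdaCombined(models.Model):
        # Django needs some primary key; use the most unique column you have.
        isr_no = models.CharField(max_length=20, primary_key=True)
        drug_name = models.CharField(max_length=200)
        sheet2_column = models.CharField(max_length=200, null=True)
        sheet3_column = models.CharField(max_length=200, null=True)

        class Meta:
            managed = False
            db_table = 'fda_combined'

Then FdaCombined.objects.filter(drug_name=...) gives you everything in one query, and a plain view (unlike your 700 GB combined table) does not store the joined rows on disk.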

ilvar