I know this question has probably been asked many times, but bear with me; I couldn't find the answer on my specific situation on stackoverflow. I'm also not a wiz at SQL and specifically not DB2 SQL. We're using DB2 12 running on z/OS.
I need to speed up a select from a table of persons, with more than 5 million rows. It's an old DB2 table, it's not intelligently structured, and it has no primary keys or anything. I can, however, create keys or indexes, if it helps. So my question is: Will it help? and if yes, what would be the best way to create the key / index? This is the situation:
Apart from information such as name and address, the table contains a PERSONTYPE (char 1), a PERSONCODE (char 8) and a DNUM (char 16), where DNUM is a kind of a case-number that the person is tied to. (If a row has a DNUM, it does not have a PERSONCODE.) I need to find all rows that have a specific subset of DNUM. Several rows can have the same DNUM, as several persons can be tied to a case. So this is the SQL:
SELECT NAME, ADRESS, SHOESIZE FROM PERSONTABLE WHERE DNUM IN ([a set of 1000 DNUMS])
This query takes about 55 seconds to produce a result. As a comparison, this query:
SELECT NAME, ADRESS, SHOESIZE FROM PERSONTABLE P JOIN OTHERTABLE O ON (P.PERSONCODE = O.PERSONCODE) WHERE O.DNUM IN ([same set of 1000 DNUMS])
takes about 2 seconds. (Yes, this OTHERTABLE also has the DNUM column.)
This is part of a data migration project, and 95% of the entire execution time is eaten up by the first SQL, so I'm keen on trying to speed that up.
I am able to index and insert keys for PERSONTABLE, but as DNUM isn't a unique column, I can't make it a primary key. There is, however, an ORDER_NUMBER column in the PERSONTABLE that numbers each person involved in a case, so DNUM+ORDER_NUMBER could possibly be made a primary key. But would this speed up the select? It's an old table and don't want to mess with it without knowing what it'll do.
To repeat, the system is DB2 12 running on z/OS.
Does anyone of you have a suggestion?