I have more than 10,000,000 rows that need to be inserted into one table. Since the number of rows is large, I want to split the big table into 100 small tables first, inserting into table_001 through table_100 in turn. I expected to do something like this:

#include <stdio.h>
#include <stdlib.h>

int main(){
    int i = 0;
    int j = 0;
    int k = 0;
EXEC SQL BEGIN DECLARE SECTION;
    char *buff[100] = { 0 };
EXEC SQL END DECLARE SECTION;

    /* build the table names table_001 .. table_100 */
    while( i < 100 ){
        buff[i] = malloc(12);
        sprintf(buff[i], "table_%3.3d", i+1);
        i++;
    }

    i = 0;
    while( i < 10000000 ){
        /* v1 and v2 are not important here, so I omit the code
           that generates them */
        EXEC SQL INSERT INTO :buff[j] VALUES(:v1,:v2);
        i++;
        k++;
        if( 100000 == k ){  /* 10,000,000 rows / 100 tables = 100,000 rows each */
            k = 0;
            j++;
        }
    }

    return 0;
}

But it seems that the table name can't be a variable. Is there a similar way to do this? Or is there another good way to deal with a big table in embedded SQL?

    Seems like an X-Y problem to me. Why not use the [db2Load API](http://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.apdv.api.doc/doc/r0008704.html)? – mustaccio Jun 07 '16 at 13:55
    10 million rows isn't considered that large these days. What is your problem with loading it all into one table? – Klas Lindbäck Jun 07 '16 at 13:55
  • The table name can't be a variable in a static SQL statement because the pre-compiler needs to know what the column definitions are. You must instead use dynamic SQL and PREPARE the statement each time you need to use it. But other methods would probably be better, e.g., blocked INSERTs. Even so, 10 million rows shouldn't be that much trouble, especially when compared to the whole process of splitting and processing multiple small tables. – user2338816 Jun 07 '16 at 14:58

1 Answer


For a variable table name, you need to PREPARE and EXECUTE a dynamic SQL statement in which you've concatenated the table name into the statement text.

Normally, dynamic SQL built from concatenated text is a bad idea, as it is open to SQL injection. However, since you're not dealing with input from a user, and it's the only way to handle a variable table name, you should be OK.
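A minimal sketch of what that looks like in embedded SQL for C, assuming DB2; the host variable names (stmt_text, v1, v2) and the column types are illustrative:

EXEC SQL BEGIN DECLARE SECTION;
    char stmt_text[100];
    long v1;
    char v2[51];
EXEC SQL END DECLARE SECTION;

/* Build the INSERT with the table name concatenated into the text;
   parameter markers (?) stand in for the row values. */
sprintf(stmt_text, "INSERT INTO table_%3.3d VALUES(?, ?)", j + 1);
EXEC SQL PREPARE ins_stmt FROM :stmt_text;

/* EXECUTE can then be run repeatedly with different values;
   re-PREPARE only when the table name changes. */
EXEC SQL EXECUTE ins_stmt USING :v1, :v2;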

But I see no advantage to inserting the 10M rows into 100 smaller tables first, then inserting from the smaller tables into the main one.

Just insert directly into the final table, committing every 1,000 rows or so, as you probably don't want 10M rows in a single commit.
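As a sketch (again assuming DB2 embedded SQL; final_table and the row-generation step are placeholders):

long n = 0;

while( n < 10000000 ){
    /* ... generate v1 and v2 for the next row ... */
    EXEC SQL INSERT INTO final_table VALUES(:v1, :v2);
    n++;

    /* commit in batches so the transaction log stays manageable */
    if( n % 1000 == 0 ){
        EXEC SQL COMMIT;
    }
}
EXEC SQL COMMIT;  /* pick up the final partial batch */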

You don't mention the platform and version of DB2 you are using; there may be a bulk-loading facility available that would be an alternative.

  • I don't know, I am not familiar with databases, but 10M rows is large for me, and I need to query the table many times a day. So I want to create a view using UNION ALL on these smaller tables. Is this a good idea when we need to query a 10M-row table a lot? – Tee Jun 07 '16 at 15:03
  • Seems like you're trying to roll your own table partitioning; not a completely invalid idea. But you'd be best served by using an actual partitioned table, assuming it's available, as sketched below. Otherwise, I'd just make sure I have the indexes needed. My "large" table is 18M rows and it's just a regular table. – Charles Jun 07 '16 at 15:32
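For reference, a range-partitioned table in DB2 for LUW looks roughly like this (a minimal sketch; the table name, columns, and the choice of v1 as the partitioning key are assumptions):

-- 100 partitions of 100,000 rows each, mirroring the 100-table scheme
CREATE TABLE big_table (
    v1 INTEGER NOT NULL,
    v2 VARCHAR(50)
)
PARTITION BY RANGE (v1)
    (STARTING FROM (1) ENDING AT (10000000) EVERY (100000));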