
I got this error when writing into TiDB using the JDBC driver in Spark 2.2:

java.sql.BatchUpdateException: transaction too large, len:300200

I get the error under this condition:
selecting the whole table

I do not get the error under this condition:
selecting with limit 10000000

I haven't got any clue.
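For reference, a minimal sketch of the kind of Spark 2.2 JDBC write involved; the URLs, table names and credentials below are placeholders, not my real ones:

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

object WriteToTiDB {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("write-to-tidb").getOrCreate()

    val props = new Properties()
    props.setProperty("user", "root")
    props.setProperty("password", "")
    props.setProperty("driver", "com.mysql.jdbc.Driver")

    // Reading the whole source table produces large partitions; each
    // partition is written back in a single transaction.
    val df = spark.read.jdbc("jdbc:mysql://source-host:4000/db", "src_table", props)

    df.write
      .mode("append")
      .jdbc("jdbc:mysql://tidb-host:4000/db", "dst_table", props)
  }
}
```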

no123ff
  • I dug in a little and it seems only set/update should trigger that error message. It confuses me that a select query would go through the update code path. Could you paste the full exception stack and explain a little what you were doing when the error happened? – user1192878 Oct 27 '17 at 15:11

1 Answer


Copied from the TiDB documentation:

The error message transaction too large is displayed.

As distributed transactions need to conduct two-phase commit and the bottom layer performs Raft replication, if a transaction is very large, the commit process would be quite slow and the Raft replication flow would get stuck. To avoid this problem, we limit the transaction size:

  • Each Key-Value entry is no more than 6MB
  • The total number of Key-Value entries is no more than 300,000 rows
  • The total size of Key-Value entries is no more than 100MB

There are similar limits on Google Cloud Spanner.

Solution:

When you import data, insert in batches, and it is better to keep each batch within 10,000 rows.
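For example, a minimal sketch of batched inserts over plain JDBC, committing every 10,000 rows; the table `t(id, v)`, URL and credentials are placeholders:

```scala
import java.sql.DriverManager

object BatchedInsert {
  def main(args: Array[String]): Unit = {
    val conn = DriverManager.getConnection(
      "jdbc:mysql://tidb-host:4000/db", "root", "")
    conn.setAutoCommit(false)

    val ps = conn.prepareStatement("INSERT INTO t (id, v) VALUES (?, ?)")
    val batchSize = 10000 // keep each transaction within 10,000 rows

    try {
      var count = 0
      for (id <- 1 to 1000000) {
        ps.setLong(1, id.toLong)
        ps.setString(2, s"row-$id")
        ps.addBatch()
        count += 1
        if (count % batchSize == 0) {
          ps.executeBatch()
          conn.commit() // close the current transaction before it grows too large
        }
      }
      ps.executeBatch()
      conn.commit() // flush the remaining rows
    } finally {
      ps.close()
      conn.close()
    }
  }
}
```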

As for insert and select, you can enable the hidden parameter with set @@session.tidb_batch_insert=1;, and insert will execute large transactions in batches. This avoids the timeout caused by large transactions, but it may sacrifice atomicity: an error during execution can leave the transaction partially inserted. Therefore, use this parameter only when necessary, and set it at session scope to avoid affecting other statements. When the transaction is finished, use set @@session.tidb_batch_insert=0 to turn it off.
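A sketch of how this session variable might be toggled around a large INSERT ... SELECT over JDBC; the table names, URL and credentials are placeholders:

```scala
import java.sql.DriverManager

object BatchInsertSession {
  def main(args: Array[String]): Unit = {
    val conn = DriverManager.getConnection(
      "jdbc:mysql://tidb-host:4000/db", "root", "")
    val st = conn.createStatement()
    try {
      // Split the big insert into smaller transactions; note this gives up atomicity.
      st.execute("SET @@session.tidb_batch_insert = 1")
      st.executeUpdate("INSERT INTO dst_table SELECT * FROM src_table")
    } finally {
      // Switch it back off so later statements in this session keep atomicity.
      st.execute("SET @@session.tidb_batch_insert = 0")
      st.close()
      conn.close()
    }
  }
}
```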

As for delete and update, you can use limit in a loop to operate in batches.
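A sketch of such a delete loop; the table name, predicate and connection details are placeholders:

```scala
import java.sql.DriverManager

object DeleteInBatches {
  def main(args: Array[String]): Unit = {
    val conn = DriverManager.getConnection(
      "jdbc:mysql://tidb-host:4000/db", "root", "")
    val st = conn.createStatement()
    try {
      var affected = 1
      while (affected > 0) {
        // Each statement is its own small transaction of at most 10,000 rows;
        // stop once no more rows match.
        affected = st.executeUpdate(
          "DELETE FROM t WHERE created_at < '2017-01-01' LIMIT 10000")
      }
    } finally {
      st.close()
      conn.close()
    }
  }
}
```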

user1192878