I am writing to HBase through the Thrift server (TThreadPoolServer). I have the following HBase setting for the maximum number of worker threads:
hbase-site.xml
<property>
<name>hbase.thrift.maxWorkerThreads</name>
<value>50000</value>
<source>hbase-site.xml</source>
</property>
This is the script I use to do concurrent writes:
test.py
import happybase
from random import randint

# Connect to the HBase Thrift server (happybase timeout is in milliseconds)
connection = happybase.Connection('ec2-xx-xx-xx-xx.compute-1.amazonaws.com', timeout=50000)

# Each instance writes to its own randomly named table
table_name = 'test' + str(randint(0, 1000000))
families = {
    'cf1': dict(max_versions=1),
}
connection.create_table(table_name, families)
table = connection.table(name=table_name)

# Write one million rows, one put per row
for x in range(1000000):
    table.put('row-key' + str(x), {b'cf1:qual1': b'testtesttest', b'cf1:qual2': b'testestest'})
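A minimal sketch of how several instances of the script can be started in parallel, for anyone who wants to reproduce this. The `launch` helper is a hypothetical name used only for illustration, and it assumes test.py sits in the current working directory:

```python
import subprocess
import sys


def launch(n, cmd):
    """Start n copies of cmd in parallel and return their exit codes."""
    # Start all processes first so they really run concurrently
    procs = [subprocess.Popen(cmd) for _ in range(n)]
    # Then wait for each one to finish and collect its exit code
    return [p.wait() for p in procs]


# Example (assumes test.py is in the current directory):
# codes = launch(25, [sys.executable, "test.py"])
# print(codes.count(0), "of", len(codes), "instances exited cleanly")
```

A non-zero exit code here would correspond to an instance that failed, for example because its connection timed out.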
Now, if I run 25 instances of test.py concurrently, only the first 18-20 manage to connect; all the others fail with a timeout error. I checked on the HBase server: Thrift creates only 300 threads, and once that limit is reached, new connections are no longer accepted and time out.
The system is not under stress even with 300 threads: CPU and memory consumption are very low. I therefore think the limit comes from some configuration setting.
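One way to check the thread count of the Thrift server process on a Linux host is to read the `Threads:` field of `/proc/<pid>/status`. This is only a sketch: it assumes Linux and that the PID of the Thrift server is already known, and the function names are hypothetical.

```python
def parse_thread_count(status_text):
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid):
    """Read the current thread count of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_thread_count(f.read())


# Example (the PID is a placeholder for the Thrift server's PID):
# print(thread_count(12345))
```

Polling this while the writers start up would show whether the count really stops growing at a fixed value.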
Can somebody explain why Thrift is not creating more threads when the maximum Thrift worker thread count in my HBase configuration is much higher?