
I am importing data from a Kafka server using the VoltDB Kafka importer. My current setup is in AWS, and all 9 nodes have the following config:

8 vCPUs

200 GB HDD

8 GB RAM

The import rate is 10,000 records per second.

My problem is that the cluster enters read-only mode after importing 42 million records, even though the other 7 nodes are using only 20-30% of memory.

My table and stored procedure are partitioned. I also enabled auto snapshots with a 1-hour frequency.
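For reference, the cluster and snapshot settings above map onto a deployment.xml along these lines (a sketch only: hostcount, sitesperhost, and kfactor values come from this thread, and the 80% memorylimit is VoltDB's documented default threshold for switching to read-only mode; verify element names against your VoltDB version's documentation):

```xml
<!-- Sketch of a deployment.xml fragment; values taken from this thread. -->
<deployment>
    <cluster hostcount="9" sitesperhost="8" kfactor="1"/>
    <!-- Auto snapshot every hour, keeping the last 2 -->
    <snapshot enabled="true" frequency="1h" retain="2"/>
    <systemsettings>
        <resourcemonitor>
            <!-- When memory use crosses this limit, the cluster goes read-only -->
            <memorylimit size="80%"/>
        </resourcemonitor>
    </systemsettings>
</deployment>
```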

I am expecting 144 million records in total.

What changes should I make to my configuration, and how can I move data from memory to disk?

Please help.

Arjun
  • It sounds like one or perhaps two servers are reaching 80% RAM, while others are using much less. What is the partitioning key for your main tables? It sounds like perhaps you have partitioned on a column that does not have high cardinality, so most of the workload is going to one partition. – BenjaminBallard Sep 30 '18 at 03:35
  • Here are my table and partition details: create table tsensor (sensor_id integer not null, clockTime timestamp, airTemp varchar(30), windSpeed varchar(30), surfaceTemp varchar(30), latitude varchar(30), longitude varchar(30), receive_time timestamp default now()); PARTITION TABLE TSENSOR ON COLUMN SENSOR_ID; The procedure which inserts data is also partitioned on the same key; there are 100 possible sensor_ids. – Arjun Oct 01 '18 at 04:09
  • What is the range of values of sensor_id? With only 100 values, there could be a data skew. What are your sitesperhost and kfactor settings? The web interface Analysis tab, Data sub-tab can show the distribution of the tsensor table across your partitions. It may be very uneven. – BenjaminBallard Oct 01 '18 at 14:27
  • The sensor_id range is 0-99. Sitesperhost is 8 and kfactor is 1. – Arjun Oct 01 '18 at 16:18
  • You could try a lower sitesperhost, that might give a more even distribution. Do the sensors emit data evenly? – BenjaminBallard Oct 02 '18 at 14:22
  • Very surprising... isn't this something VoltDB is supposed to handle itself? If there are 100 possible ids and only 9 nodes, why doesn't it get distributed evenly? – Marut Singh Jan 29 '19 at 04:03
  • After reducing sitesperhost and fixing the code, the scalability improved. The code was randomly generating the ids, which somehow contributed to the issue. Assigning them in round-robin fashion helped. – Arjun Jan 29 '19 at 13:49
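The round-robin fix Arjun describes can be sketched as follows (a minimal Python illustration, not the actual ingest code; the id range of 0-99 comes from this thread). Cycling through the sensor_ids guarantees each id, and therefore each VoltDB partition, receives an even share of the insert load, whereas random assignment can leave the distribution skewed:

```python
import itertools
from collections import Counter

# The 100 possible sensor_ids mentioned in the thread.
SENSOR_IDS = list(range(100))

def round_robin_ids(n):
    """Assign sensor_ids to n records in round-robin order."""
    cycle = itertools.cycle(SENSOR_IDS)
    return [next(cycle) for _ in range(n)]

# With 10,000 records over 100 ids, every id gets exactly 100 records,
# so the insert workload is spread evenly across partitions.
counts = Counter(round_robin_ids(10_000))
```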

0 Answers