I have a Kinesis stream with 2 shards that looks like this:
{
  "StreamDescription": {
    "StreamStatus": "ACTIVE",
    "StreamName": "my-stream",
    "Shards": [
      {
        "ShardId": "shardId-000000000001",
        "HashKeyRange": {
          "EndingHashKey": "17014118346046923173168730371587",
          "StartingHashKey": "0"
        }
      },
      {
        "ShardId": "shardId-000000000002",
        "HashKeyRange": {
          "EndingHashKey": "340282366920938463463374607431768211455",
          "StartingHashKey": "17014118346046923173168730371588"
        }
      }
    ]
  }
}
The sender sets a partition key that is usually a UUID. Records always land in shardId-000000000002 above, which leaves the stream unbalanced and therefore not scalable.
As a side note, Kinesis computes the MD5 hash of the partition key and sends the record to the shard whose hash key range contains the resulting value. When I tested this on the UUIDs I use, they did in fact always fall in the same shard:
echo -n 80f6302fca1e48e590b09af84f3150d3 | md5sum
4527063413b015ade5c01d88595eec11
17014118346046923173168730371588 < 4527063413b015ade5c01d88595eec11 < 340282366920938463463374607431768211455
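The same check can be sketched in Python. This is only a rough illustration: it assumes the MD5 digest is read as an unsigned 128-bit big-endian integer, and the helper names `hash_key` and `shard_for` are made up for this post, not part of any AWS SDK. The hex digest printed by md5sum has to be converted to an integer before it can be compared with the decimal StartingHashKey and EndingHashKey values.

```python
import hashlib

# Shard ranges copied from the stream description above (decimal values).
SHARDS = [
    ("shardId-000000000001", 0, 17014118346046923173168730371587),
    ("shardId-000000000002", 17014118346046923173168730371588,
     340282366920938463463374607431768211455),
]


def hash_key(partition_key: str) -> int:
    """MD5 of the partition key, read as an unsigned 128-bit integer (assumption)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shards=SHARDS) -> str:
    """Return the ID of the shard whose inclusive range contains the key's hash."""
    value = hash_key(partition_key)
    for shard_id, start, end in shards:
        if start <= value <= end:
            return shard_id
    raise ValueError("hash key not covered by any shard range")


if __name__ == "__main__":
    key = "80f6302fca1e48e590b09af84f3150d3"
    # Print the decimal hash value and the shard it maps to.
    print(hash_key(key), shard_for(key))
```

Running `shard_for` over a batch of real partition keys and counting the results per shard gives a quick picture of the distribution. If the ranges are exactly as listed above, the first shard ends at a 32-digit value while the key space runs up to a 39-digit value, so this check would place almost any MD5 value in the second shard.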
Any idea on how to solve this?