I am trying to use redis-cli --pipe to bulk upload commands to my AWS ElastiCache for Redis cluster. The commands are HSET commands generated by a custom awk command that parses a file; the awk command lives in a custom shell script. When my ElastiCache for Redis server had cluster mode disabled, the following worked like a charm:

sh script_containing_awk.sh $FILE_TO_PARSE | redis-cli -h <Primary_endpoint> -p <port> --tls --cacert <path/to/cert> --pipe

Due to an internal project requirement, the ElastiCache for Redis server has been re-created with cluster mode enabled, so I am adding the -c flag to the above command to enable cluster mode in redis-cli.

I see the following results when working with the cluster-mode-enabled ElastiCache for Redis server:

  • I can connect to the cluster via the configuration endpoint no problem!
  • Single command uploads work (e.g. redis-cli -h <config_endpoint> -p <port> -c --tls --cacert <path/to/certs> SET key value)

It would be extremely convenient to just pipe output from my script to the cli:

sh script_containing_awk.sh $FILE_TO_PARSE | redis-cli -h <config_endpoint> -p <port> -c --tls --cacert <path/to/cert> --pipe

but adding the --pipe flag results in "MOVED" errors.

I have tried modifying the script to include {} hash tags (e.g. HSET {user1}:hash field1 val1 field2 val2 ...) to try to force keys into the same hash slot, but I still get the "MOVED" errors. I am also attempting to bulk upload millions of keys, so I don't think they would all fit in the same slot anyway.

Does anyone have experience getting --pipe to work with cluster-mode-enabled Redis/ElastiCache?

Thanks!

DBOY

1 Answer

As you probably know, the core difference between Cluster Mode Disabled (CMD) and Cluster Mode Enabled (CME) is that with CME the total set of key slots is split across shards.

To put this in context: with CMD, let's say we have a 4-node cluster with 1 primary and 3 replicas. If we have 100 key slots, all 100 are present on every node. The 3 replicas serve read-only commands and the primary serves all commands. (Redis actually uses 16384 hash slots; 100 just keeps the numbers simple.)

With CME, let's say we have 4 nodes split into 2 shards, each with 1 primary and 1 replica. We can look at the shards as logical sub-clusters, i.e. they hold different sets of key slots, ideally a 50-50 split.
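
As a rough illustration of that split: Redis Cluster always uses 16384 hash slots, so an even two-shard layout gives each shard half of them. The Python sketch below uses a hypothetical helper, `even_slot_ranges`, and assumes an idealised even split; a real cluster's ranges should be read from `CLUSTER SHARDS` or `CLUSTER SLOTS` rather than assumed.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def even_slot_ranges(num_shards):
    """Return inclusive (start, end) slot ranges for an idealised even split.

    Hypothetical helper for illustration only; the actual assignment on a
    given cluster may differ.
    """
    base, extra = divmod(TOTAL_SLOTS, num_shards)
    ranges = []
    start = 0
    for i in range(num_shards):
        # Spread any remainder over the first `extra` shards.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(even_slot_ranges(2))  # [(0, 8191), (8192, 16383)]
```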

Now, the MOVED message is not necessarily an error. When you connect to the configuration endpoint, you are connected to one of the primary nodes (chosen at random, at first). When you issue a command, the client sends it to that node, and the node decides whether it owns the hash slot needed to serve that command.

As explained here, if the node does not have the hash-slot that your client is looking for, it will redirect you with a MOVED message.
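
To make the redirection concrete: a node decides whether it owns a key by hashing the key to a slot (CRC16 of the key, modulo 16384), and a `{...}` hash tag restricts the hash to the tagged substring. A minimal Python sketch of that calculation follows. `key_slot` is a hypothetical helper name, and `binascii.crc_hqx` with an initial value of 0 is assumed here to be the CRC16 (XMODEM) variant that the Redis Cluster specification describes.

```python
import binascii

TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def key_slot(key):
    """Return the hash slot for a key, honouring {hash tags}.

    Hypothetical helper: if the key contains '{' followed later by '}' with
    at least one character in between, only that substring is hashed.
    """
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx(data, 0) is CRC-CCITT with a zero initial value (XMODEM).
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


# Keys sharing a tag map to the same slot; different tags usually do not.
print(key_slot("{user1}:hash"), key_slot("{user1}:other"))
print(key_slot("{user2}:hash"))
```

Because each distinct tag such as `{user1}` or `{user2}` hashes independently, tagging every key does not put all keys on one node, which would be consistent with MOVED replies still appearing when everything is sent to a single node.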

So I would expect MOVED messages to be a normal occurrence with CME clusters.

tedd_selene
  • Thank you for the reply @tedd_selene. Yes, I have slowly become familiar with the nuances of CME and CMD Redis. So I guess what you are saying is that redis-cli (despite having a -c option) is not sophisticated enough to track slot redirection for CME? It seems as though --pipe fails with CME, and I am wondering whether this is a limitation of the cli or whether there is a better way to bulk insert keys/hashes etc. I have since implemented pipelining using the hiredis-cluster lib and the redis-py library, but for millions of keys the uploads are still slower than with --pipe – DBOY May 05 '22 at 14:28
  • True about redis-cli. I believe there are other clients, Java-based ones for example, that are "cluster-aware", i.e. they maintain their own map of which slot lives on which node and send each request directly to that node. I am not sure about the rest of your query, though. I haven't worked enough with redis-cli; I mostly use it to test things like connections or stunnel. – tedd_selene May 10 '22 at 07:19
  • Hi @tedd_selene. Thank you again for the suggestion. I ended up working with the Redis client for Python, as it supports pipelining and cluster mode. I am back on the order of seconds/minutes for uploading millions of keys using pipelining, although the --pipe mass insertion flag for the cli is still loads faster. There seems to be some talk on GitHub about this issue with the cli ([here](https://github.com/redis/redis/issues/6098) and [here](https://github.com/redis/redis/issues/6294)). I will accept your answer as correct. Perhaps later I can see if any of the client libraries address this – DBOY May 11 '22 at 13:03
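
For reference, the client-side pipelining approach mentioned in the last comment might look roughly like the sketch below. It assumes redis-py 4.1 or later (which provides `redis.cluster.RedisCluster`). The names `connect` and `bulk_hset`, the batch size, and the endpoint placeholders are all hypothetical; this is not the code actually used above.

```python
def connect(host, port, ca_cert_path):
    """Open a TLS connection to the cluster's configuration endpoint."""
    # Imported lazily so bulk_hset can be exercised without redis-py installed.
    from redis.cluster import RedisCluster

    return RedisCluster(host=host, port=port, ssl=True, ssl_ca_certs=ca_cert_path)


def bulk_hset(client, rows, batch_size=1000):
    """Send HSET commands in pipelined batches and return how many were sent.

    `rows` yields (key, {field: value, ...}) pairs. `batch_size` is an
    arbitrary illustrative default, not a tuned value.
    """
    pipe = client.pipeline()
    pending = 0
    total = 0
    for key, fields in rows:
        pipe.hset(key, mapping=fields)  # queued locally, not sent yet
        pending += 1
        if pending >= batch_size:
            pipe.execute()  # flush this batch to the cluster
            total += pending
            pending = 0
            pipe = client.pipeline()
    if pending:
        pipe.execute()  # flush the final partial batch
        total += pending
    return total
```

Against a real cluster this would be called along the lines of `bulk_hset(connect("<config_endpoint>", <port>, "<path/to/cert>"), rows)`. Unlike redis-cli --pipe, the cluster-aware client routes each queued command to the node that owns its slot, so the caller does not have to handle MOVED replies itself.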