
I am using spark-cassandra-connector_2.11-2.0.0.jar to connect to Cassandra (version 2.1.9). The cluster's partitioner is `ByteOrderedPartitioner`.

However, when I submit a Spark driver program, it fails with: `Exception in thread "main" java.lang.IllegalArgumentException: Unsupported partitioner: org.apache.cassandra.dht.ByteOrderedPartitioner`

It seems that only `Murmur3Partitioner` and `RandomPartitioner` are supported in the connector's source code.

But Hadoop supports `ByteOrderedPartitioner`. How can I get this working if I have to use `ByteOrderedPartitioner`?

Thanks for your help.

Jenny.D

1 Answer


It's not supported by the Spark connector because nobody should be using the ByteOrderedPartitioner anymore. This is because:

A) It exists only for backward compatibility.

B) Its creation (and subsequent use) is widely recognized as a bad idea.
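
To make point B concrete, here is a minimal Python sketch of why order-preserving placement tends to create hot spots. It is not Cassandra's actual code: the four-node ring, the `ordered_node` and `hashed_node` helpers, and the `userNNNNN` keys are all hypothetical, and MD5 simply stands in for the hash a hashing partitioner would apply.

```python
import hashlib
from collections import Counter

NODES = 4  # hypothetical cluster size


def ordered_node(key: str) -> int:
    """Order-preserving placement: pick the node from the key's first byte."""
    # Each node owns an equal slice of the 0..255 first-byte range.
    return key.encode("utf-8")[0] * NODES // 256


def hashed_node(key: str) -> int:
    """Hash-based placement: pick the node from a digest of the whole key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODES


# Hypothetical keys that share a common prefix, as real-world keys often do.
keys = [f"user{i:05d}" for i in range(10_000)]

ordered_counts = Counter(ordered_node(k) for k in keys)
hashed_counts = Counter(hashed_node(k) for k in keys)

print("ordered:", dict(ordered_counts))
print("hashed: ", dict(sorted(hashed_counts.items())))
```

With these keys, the order-preserving scheme sends every row to the single node that owns the `u` range, while the hashed scheme spreads rows across all four nodes. Balancing an ordered ring requires knowing the key distribution in advance and moving tokens by hand whenever it shifts.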

This has been discussed ad nauseam. See my answer to a similar question: Cassandra ByteOrderedPartitioner

I recommend you:

  • Rebuild your cluster using the Murmur3Partitioner.
  • Or build a new cluster and load it with data from the original.
  • Find whoever built the original cluster and slap them.
Aaron
  • Ok, I was kidding about slapping the person who built that cluster. Maybe. At least, that should probably be my official statement. – Aaron Sep 23 '18 at 19:55
  • I discarded `Murmur3Partitioner` because the amount of data ingested differed greatly from node to node. `ByteOrderedPartitioner` was a better choice for load balancing, and it worked. I also found that Cassandra's source code includes InputFormat implementations that support both `Murmur3Partitioner` and `ByteOrderedPartitioner`, and they worked with both Hadoop and Spark. – Jenny.D Sep 24 '18 at 02:25
  • If the data isn't well distributed with the Murmur3 partitioner, that's a sign of problems with the data model... – Alex Ott Sep 24 '18 at 09:50
  • @Jenny.D Alex makes a good point, in that your data model itself needs to be designed for good data distribution. The bottom line is that the `ByteOrderedPartitioner` is deprecated, will be removed from Cassandra at some point, and getting support and help around using it will be extremely difficult (as you are finding). – Aaron Sep 24 '18 at 14:26
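
The data-model point in the last two comments can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the four-node ring, the `node_for` helper, the sensor workload, and the 64-bucket scheme are hypothetical, and MD5 stands in for the partitioner's hash. It shows that a hashing partitioner can only balance as well as the partition key allows, and that adding a bucket to the key (one common approach) spreads a hot key out.

```python
import hashlib
from collections import Counter

NODES = 4     # hypothetical cluster size
BUCKETS = 64  # hypothetical number of buckets per sensor


def node_for(partition_key: str) -> int:
    """Hash-based placement: pick the node from a digest of the partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODES


# Hypothetical workload: 90% of the rows come from a single "hot" sensor.
rows = [("sensor-hot" if i % 10 else f"sensor-{i}", i) for i in range(10_000)]

# Data model 1: the sensor id alone is the partition key.
coarse = Counter(node_for(sensor) for sensor, _ in rows)

# Data model 2: the partition key is (sensor id, bucket).
bucketed = Counter(node_for(f"{sensor}:{i % BUCKETS}") for sensor, i in rows)

print("sensor only:     ", dict(sorted(coarse.items())))
print("sensor + bucket: ", dict(sorted(bucketed.items())))
```

In the first model, all 9,000 rows of the hot sensor land on one node no matter which hashing partitioner is used. In the second, the same rows are split into 64 partitions that the hash scatters across the ring, at the cost of having to read several buckets to fetch one sensor's data.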