I've written a small application to start an embedded instance of Cassandra 1.2.

I'm trying to create a cluster of 3 of these embedded instances locally, by running 3 instances of this application. Each one looks at a different cassandra.yaml on the filesystem. Each file has:

  • the same cluster name
  • blank initial_token
  • unique listen address (all mapped to 127.0.0.1 in my hosts file)
  • unique rpc, storage and ssl_storage ports
  • the same seed (the listen address of the first server, without a port)
  • unique -Dcom.sun.management.jmxremote.port value passed at application launch

When I launch the applications, all come up fine, and have separate storage on the filesystem. However, when I use nodetool to inspect each one, each appears to be in a cluster by itself:

C:\Program Files\DataStax Community\apache-cassandra\bin>nodetool -h 127.0.0.1 -p 7197 ring
Starting NodeTool

Datacenter: datacenter1
==========
Replicas: 1

Address    Rack        Status State   Load            Owns                Token

127.0.0.1  rack1       Up     Normal  198,15 KB       100,00%             8219116491729144532


C:\Program Files\DataStax Community\apache-cassandra\bin>nodetool -h 127.0.0.2 -p 7198 ring
Starting NodeTool

Datacenter: datacenter1
==========
Replicas: 1

Address    Rack        Status State   Load            Owns                Token

127.0.0.2  rack1       Up     Normal  152,13 KB       100,00%             -3632227916915216562

Blogs and docs online suggest this should be sufficient. Is it possible to cluster embedded instances? If so, does anyone know how my configuration or understanding is incorrect/insufficient?

Code to launch the embedded instances is below. Hope you can help, thanks.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.cassandra.service.CassandraDaemon;

public class EmbeddedCassandraDemo {

    private static final String CONF_PATH_FORMAT = "D:\\embedded_cassandra\\Node%d\\";

    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final int nodeNumber;
    private CassandraDaemon cassandraDaemon;

    public EmbeddedCassandraDemo(int nodeNumber) {
        this.nodeNumber = nodeNumber;
    }

    public static void main(String[] args) {
        new EmbeddedCassandraDemo(Integer.parseInt(args[0])).run();
    }

    private void run() {
        // The system properties must be set before the daemon loads its configuration.
        setProperties();
        activateDaemon();
    }

    private void activateDaemon() {
        // Start Cassandra on a background thread so the caller is not blocked.
        executor.execute(new Runnable() {
            @Override
            public void run() {
                cassandraDaemon = new CassandraDaemon();
                cassandraDaemon.activate();
            }
        });
    }

    private void setProperties() {
        // Each node reads cassandra.yaml and log4j-server.properties from its own NodeN directory.
        String confPath = String.format(CONF_PATH_FORMAT, nodeNumber);
        System.setProperty("cassandra.config", "file:" + confPath + "cassandra.yaml");
        System.setProperty("log4j.configuration", "file:" + confPath + "log4j-server.properties");
        System.setProperty("cassandra-foreground", "true");
    }
}
Ben Kirby

2 Answers

"blank initial_token"

Are you using virtual nodes? If not, I wonder if that could be your issue. Without vnodes, each node should be given a different initial_token. For a 3-node cluster (with RandomPartitioner, whose token range is 0 to 2^127), those initial tokens should be 56,713,727,820,156,410,577,229,101,238,628,035,242 apart from each other.

Using DataStax's Python script for computing initial tokens, these values should suit your needs:

node 0: 0
node 1: 56713727820156410577229101238628035242
node 2: 113427455640312821154458202477256070485
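
If you would rather compute these yourself, the same values fall out of dividing the RandomPartitioner token range (0 to 2^127) evenly among the nodes. Below is a minimal sketch of that arithmetic; the class and method names (`InitialTokens`, `forNodes`) are made up for illustration and are not part of Cassandra or the DataStax script. It only applies to RandomPartitioner; Murmur3Partitioner uses a different token range.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: evenly spaced initial_token values for RandomPartitioner. */
final class InitialTokens {

    // RandomPartitioner tokens range from 0 to 2^127.
    private static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    /** Returns token i = floor(i * 2^127 / nodeCount) for i = 0 .. nodeCount - 1. */
    static List<BigInteger> forNodes(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        BigInteger count = BigInteger.valueOf(nodeCount);
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            tokens.add(RING_SIZE.multiply(BigInteger.valueOf(i)).divide(count));
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<BigInteger> tokens = forNodes(3);
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println("node " + i + ": " + tokens.get(i));
        }
    }
}
```

Running it for 3 nodes prints the three values listed above; each one goes into the initial_token setting of the corresponding node's cassandra.yaml.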

Also, which endpoint_snitch are you using? If you are using "PropertyFileSnitch" make sure that your cassandra-topology.properties file contains a definition for each node (along with DC and rack).

Give that a try and see if it helps.

Aaron
  • Thanks @BryceAtNetwork23. I've changed to RandomPartitioner from the 1.2 default, Murmur3Partitioner, and used your initial token values. Still no joy, unfortunately - the calls with nodetool still return each of the servers in a cluster on their own. Am I using that tool correctly? I have a feeling I'm doing something stupid which is preventing this from working... – Ben Kirby Dec 06 '13 at 09:30
  • @BenKirby Just a thought, but can you ping 127.0.0.1 from 127.0.0.2 and vice-versa? – Aaron Dec 06 '13 at 15:07
  • That all seems to be ok, yes. Sorted now, thanks for your help! – Ben Kirby Dec 06 '13 at 16:46

Sorted now. I looked through the code and intro on https://github.com/pcmanus/ccm (linked to from the question "Run multiple cassandra nodes (a cluster) from the same machine?"). ccm keeps all port values the same on every node, apart from the JMX port.

Having made those changes, plus setting the initial token on each thanks to @BryceAtNetwork23, and specifying the IPs of all 3 servers as seeds, they now form a cluster.
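
To make the resulting layout concrete, here is a rough sketch that prints the settings that differ per node. It is an illustration rather than my exact configuration: the class name `LocalClusterLayout`, the third address 127.0.0.3, the JMX port for the third node, and setting rpc_address per node are assumptions on my part. Everything not printed (storage_port, ssl_storage_port, rpc_port and so on) stays identical in all three cassandra.yaml files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical summary of the per-node overrides for a 3-node loopback cluster. */
final class LocalClusterLayout {

    // One loopback address per node; 127.0.0.3 is assumed for the third node.
    static final String[] ADDRESSES = {"127.0.0.1", "127.0.0.2", "127.0.0.3"};

    // RandomPartitioner tokens from the answer above.
    static final String[] TOKENS = {
        "0",
        "56713727820156410577229101238628035242",
        "113427455640312821154458202477256070485"
    };

    // JMX is the one port that must differ; 7197 and 7198 are the values used earlier.
    static final int FIRST_JMX_PORT = 7197;

    /** All three addresses are listed as seeds in every node's file. */
    static String seeds() {
        return String.join(",", ADDRESSES);
    }

    /** Lines that differ (or matter) in cassandra.yaml for the given node index (0-based). */
    static List<String> overridesFor(int node) {
        List<String> lines = new ArrayList<>();
        lines.add("initial_token: " + TOKENS[node]);
        lines.add("listen_address: " + ADDRESSES[node]);
        // Assumption: rpc_address also needs to be per node, since rpc_port is shared.
        lines.add("rpc_address: " + ADDRESSES[node]);
        lines.add("seeds: \"" + seeds() + "\"");
        return lines;
    }

    static int jmxPortFor(int node) {
        return FIRST_JMX_PORT + node;
    }

    public static void main(String[] args) {
        for (int node = 0; node < ADDRESSES.length; node++) {
            System.out.println("# Node" + (node + 1)
                    + " (-Dcom.sun.management.jmxremote.port=" + jmxPortFor(node) + ")");
            for (String line : overridesFor(node)) {
                System.out.println(line);
            }
            System.out.println();
        }
    }
}
```

In the real file the seeds line sits under seed_provider / parameters; it is flattened here to keep the sketch short.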

Ben Kirby
  • Is this embedded Cassandra style still valid for 2.1? I'm looking for a way to configure it programmatically and start the Cassandra nodes within my own application. – Martin Kersten Mar 19 '15 at 19:22
  • @MartinKersten We haven't tried, but looking on github the CassandraDaemon and StorageService classes are still there, so you should be able to. http://wiki.apache.org/cassandra/Embedding has more info. – Ben Kirby Mar 23 '15 at 09:25
  • I checked the latest code. One is able to completely rewrite the setup code, making it possible to embed it in various ways. One thing I have not figured out is how best to stop it, and whether Cassandra should run in a child process or in the same JVM. Killing off a child process in Linux is always the cleanest way if you do not want memory leaking. So there is a certain amount of testing required. – Martin Kersten Mar 23 '15 at 09:38
  • http://wiki.apache.org/cassandra/Embedding looks rather outdated. What's the reference to Table class (in Cassandra 2.1.9)? Also, it mentions Thrift which, apparently, must not be used in new projects in favor of native protocol. – Ivan Balashov Dec 25 '15 at 14:44
  • @MartinKersten I've had 2.1.1 embedded and used it stand alone for quite some time here... https://github.com/nsoft/jesterj/blob/5bd88f1f353ef55dabaeda78437b99f963ff1021/code/ingest/src/main/java/org/jesterj/ingest/persistence/Cassandra.java but upgrading to 3.11 is proving a challenge... https://stackoverflow.com/questions/44760920/embedding-cassandra-security-manager-issues – Gus Jun 26 '17 at 19:11