Questions tagged [cassandra]

Apache Cassandra is a highly scalable, eventually consistent, distributed, structured row store. Questions about Cassandra server administration should be asked on https://dba.stackexchange.com/questions/tagged/cassandra.

Apache Cassandra is a highly scalable, eventually consistent, distributed, structured row/column store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.

Cassandra's Dynamo-based cluster model provides linear scalability and fault tolerance on commodity hardware or cloud infrastructure. Its support for replicating across multiple data centers is best-in-class, providing low latency and the ability to survive entire data center outages.

Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates and powerful built-in caching. Its write performance is among the fastest of comparable database solutions, which makes it a compelling option for big data processing. It scales linearly, and nodes can be added or removed on the fly without downtime.
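
The idea behind log-structured updates can be illustrated with a toy sketch. This is not Cassandra's implementation: the class name `ToyLogStructuredStore` and the `flush_threshold` parameter are hypothetical, and the "disk" is just an in-memory list. It only shows why writes are cheap: a write touches an in-memory buffer, and flushed data is never modified in place.

```python
class ToyLogStructuredStore:
    """Toy write path: buffer writes in memory, flush to immutable sorted runs.

    Illustration only; names and structure are assumptions, not Cassandra's API.
    """

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.memtable = {}   # mutable in-memory buffer
        self.sstables = []   # immutable sorted runs, oldest first

    def put(self, key, value):
        # A write only touches memory; flushed runs are never updated in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the buffer as a sorted, immutable run and start a new buffer.
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the buffer, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            for k, v in run:
                if k == key:
                    return v
        return None


store = ToyLogStructuredStore(flush_threshold=2)
store.put("a", 1)
store.put("b", 2)   # reaches the threshold, so the buffer is flushed
store.put("a", 3)   # newer value for "a" lives in the fresh buffer
```

In this sketch an overwrite never rewrites the older run; reads resolve the newest value instead, which is the trade-off a log-structured design makes in favor of write speed.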

Cassandra was open-sourced by Facebook in 2008 and quickly became a top-level Apache project. Today, it's widely used by companies in many markets.

Official links:

Documentation

20596 questions
5
votes
1 answer

Spark Cassandra Connector keyBy and shuffling

I am trying to optimize my spark job by avoiding shuffling as much as possible. I am using cassandraTable to create the RDD. The column family's column names are dynamic, thus it is defined as follows: CREATE TABLE "Profile" ( key text, column1…
Shai
  • 119
  • 1
  • 8
5
votes
2 answers

How to use OpsCenter with CCM?

I'm new to Cassandra and want to run OpsCenter on my development cluster, which I created with CCM. I see CCM has a -o option for configuring OpsCenter, as mentioned here. However, it is not clear how to use this option. Here is what I've attempted…
Justin
  • 6,031
  • 11
  • 48
  • 82
5
votes
3 answers

Cassandra: Not enough replica error in single node cluster

Over the weekend, we started seeing errors in Cassandra. Essentially, complaining that it couldn't get enough nodes together for SERIAL consistency. This appeared to be a problem with AWS vpn across regions. So, to simplify, I dropped one the…
mtyson
  • 8,196
  • 16
  • 66
  • 106
5
votes
2 answers

Cassandra CQL3 conditional insert/update

I have a list of unordered events and my task is to store the first and last occurrences for them. I have the following column family in Cassandra: CREATE TABLE events ( event_name TEXT, first_occurrence BIGINT, last_occurrence BIGINT, PRIMARY…
board reader
  • 344
  • 3
  • 11
5
votes
2 answers

Scala - Cassandra: cluster read fails with error "Can't use this Cluster instance because it was previously closed"

I'm getting this error when reading from a table in a 5 node cluster using datastax drivers. 2015-02-19 03:24:09,908 ERROR [akka.actor.default-dispatcher-9] OneForOneStrategy akka://user/HealthServiceChecker-49e686b9-e189-48e3-9aeb-a574c875a8ab…
Kasun Kumara
  • 61
  • 1
  • 4
5
votes
1 answer

Delete query in cassandra

Delete delete = QueryBuilder.delete() .from("addresbook", "contact") .where(eq("username", "dgarcia")); What is the type of "eq" in the where clause? Delete example here
ING
  • 219
  • 1
  • 3
  • 8
5
votes
3 answers

Cassandra eats up all the disk space

I have a single-node Cassandra cluster. I use the current minute as the partition key and insert rows with a TTL of 12 hours. I see a couple of issues I can't explain: The /var/lib/cassandra/data// contains multiple files, lots of…
5
votes
1 answer

Cassandra "default_time_to_live" property is not deleting data

I've created a table like: CREATE TABLE IF NOT EXISTS metrics_second( timestamp timestamp, value counter, PRIMARY KEY ((timestamp)) ) WITH default_time_to_live=1; And inserted some data like: UPDATE metrics_second SET value = value + 1 WHERE…
Mark
  • 67,098
  • 47
  • 117
  • 162
5
votes
3 answers

Database/NoSQL - Lowest latency way to retrieve the following data

I have a real estate application and a "house" contains the following information: house: - house_id - address - city - state - zip - price - sqft - bedrooms - bathrooms - geo_latitude - geo_longitude I need to perform an EXTREMELY fast (low…
Nickb
  • 51
  • 3
5
votes
3 answers

Using Thrift to connect to Cassandra from .NET

I'm interested in Cassandra and I'd like to test it at home on my Windows XP computer. I've found instructions to install and run Cassandra on Windows, and it's already up and running; I've also found the thrift executable for Windows and generate…
vtortola
  • 34,709
  • 29
  • 161
  • 263
5
votes
1 answer

Adding an existing non-seed Cassandra node to the list of seeds

I have an existing Cassandra cluster with the following setup: DC1 Node1 Node2 Node3 DC2 Node4 Node5 Node6 Current seeds list in all nodes' yamls is "Node1, Node4" I would like to add one more node from each datacenter to the seed list, i.e. I…
CRCerr0r
  • 525
  • 1
  • 8
  • 17
5
votes
1 answer

No Response Data When using Cassandra JMeter

I am new to JMeter and Cassandra and am trying to use the Apache JMeter Cassandra Plugin for testing purposes: https://github.com/Netflix/CassJMeter/wiki By following the steps given there, I was able to configure the JMeter Cassandra Plugin. In the JMeter…
Yasmeen
  • 801
  • 1
  • 7
  • 20
5
votes
2 answers

phantom-dsl_2.11 error implicit session

I'm trying to connect to the Cassandra database (with Scala 2.11.2) using the phantom Scala driver. I followed this article on their blog: http://blog.websudos.com/2014/08/a-series-on-cassandra-part-1-getting-rid-of-the-sql-mentality/ (note on github…
Guillaume
  • 694
  • 1
  • 6
  • 15
5
votes
2 answers

Spark- Saving JavaRDD to Cassandra

This link shows a way to save a JavaRDD to Cassandra in this way: import static com.datastax.spark.connector.CassandraJavaUtil.*; JavaRDD productsRDD = sc.parallelize(products); javaFunctions(productsRDD,…
chrisTina
  • 2,298
  • 9
  • 40
  • 74
5
votes
1 answer

NoSuchMethodError Sets.newConcurrentHashSet() while running jar using hadoop

I'm using cassandra-all 2.0.7 api with hadoop 2.2.0.
prayagupa
  • 30,204
  • 14
  • 155
  • 192