Questions tagged [cassandra]

Apache Cassandra is a highly scalable, eventually consistent, distributed, structured row store. Questions about Cassandra server administration should be asked on https://dba.stackexchange.com/questions/tagged/cassandra .

Apache Cassandra is a highly scalable, eventually consistent, distributed, structured row/column store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.

Cassandra's Dynamo-based cluster model provides linear scalability and fault tolerance on commodity hardware or cloud infrastructure. Its support for replicating across multiple data centers is best-in-class, providing low latency and the ability to survive entire data center outages.

Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, along with powerful built-in caching. Its write performance is very fast compared with other database solutions, which makes it a compelling option for big data processing. Nodes can be added or removed on the fly without downtime, and capacity scales linearly with the number of nodes.

Cassandra was open-sourced by Facebook in 2008 and quickly became a top-level Apache project. Today, it's widely used by companies in many markets.

Official links:

Documentation

20596 questions
5
votes
1 answer

Why does Spark Cassandra Connector fail with NoHostAvailableException?

I am having problems getting Spark Cassandra Connector working in Scala. I'm using these versions: Scala 2.10.4, spark-core 1.0.2, cassandra-thrift 2.1.0 (my installed Cassandra is v2.1.0), cassandra-clientutil 2.1.0, cassandra-driver-core 2.0.4…
Greg
  • 10,696
  • 22
  • 68
  • 98
5
votes
1 answer

Cassandra database, which python interface?

I'm going to write a web portal using Cassandra databases. Can you advise me which Python interface to use: thrift, lazygal or pycassa? Are there any benefits to using the more complicated thrift rather than the cleaner pycassa? What about performance - is the same…
Robert Zaremba
  • 8,081
  • 7
  • 47
  • 78
5
votes
6 answers

Cassandra and asp.net (C#)

I am interested in creating a portal on Cassandra services, since I faced some performance and scaling issues starting from 1 million records. Definitely, they could be solved, but I am interested in other options. My main issue is the cost of updating all…
st78
  • 8,028
  • 11
  • 49
  • 68
5
votes
2 answers

Which Cassandra partitioner is better: Random or Murmur3 (in terms of throughput) and what is the difference between them?

What difference could the choice of partitioner make to my Cassandra throughput and latency? I have gone through all three partitioners, and one thing I noticed is that the ByteOrdered partitioner has overhead, so I do not use it. Now I am a bit split…
5
votes
1 answer

Thrift, .NET, Cassandra - Is this the right combination?

I've been evaluating a technology stack for developing a social network based application. Below is the stack I think could be well suited to this type of application: GUI -- ASP.NET MVC, Flash (Flex) Business Services -- Thrift based…
asyncwait
  • 4,457
  • 4
  • 40
  • 53
5
votes
1 answer

Cassandra's atomicity and "rollback"

The Cassandra 2.0 documentation contains the following paragraph on Atomicity: For example, if using a write consistency level of QUORUM with a replication factor of 3, Cassandra will replicate the write to all nodes in the cluster and wait for…
Chris Lercher
  • 37,264
  • 20
  • 99
  • 131
5
votes
1 answer

Error on Cassandra server: Unable to gossip with any seeds

I'm adding a second node to a single-node cassandra cluster, and getting a stack trace on the second node: ERROR 18:13:42,841 Exception encountered during startup java.lang.RuntimeException: Unable to gossip with any seeds at…
Don Branson
  • 13,631
  • 10
  • 59
  • 101
5
votes
2 answers

Difference between Document-oriented-DB and Bigtable clones

Can someone give a head-to-head comparison between them? We are looking for a suitable storage engine for our weblog history data. We looked at the Bigtable paper and understand that it suits us well. However, I also understand that…
chen
  • 4,302
  • 6
  • 41
  • 70
5
votes
1 answer

Is Cassandra database row size limited by available memory?

I'm working with very long time series -- hundreds of millions of data points in one series -- and am considering Cassandra as a data store. In this question, one of the Cassandra committers (the über helpful jbellis) says that Cassandra rows can be…
Adam Hollidge
  • 759
  • 6
  • 12
5
votes
2 answers

cqlsh equivalent to mysql -e

I'm trying to create a small shell script where it would be very handy for me to run a command directly from the command line via cqlsh. In MySQL I could do something like: mysql -u root -e "show databases;" Is there a cqlsh equivalent to -e, or…
pcalcao
  • 15,789
  • 1
  • 44
  • 64
5
votes
2 answers

When cassandra-driver was executing the query, cassandra-driver returned error OperationTimedOut

I use a Python script that passes a batch query to Cassandra, like this: query = 'BEGIN BATCH ' + 'insert into ... ; insert into ... ; insert into ...; ' + ' APPLY BATCH;' session.execute(query) It works for some time, but about 2 minutes after…
fervid
  • 2,033
  • 3
  • 13
  • 13
5
votes
1 answer

How to query for only 1 field with Spring Data Cassandra?

I am using Spring Data Cassandra 1.0.0. I have managed to persist and read back my entity. However, now I want to do a query that only returns 1 field of the entity. This is what I have tried: public Optional
Wim Deblauwe
  • 25,113
  • 20
  • 133
  • 211
5
votes
1 answer

How do I insert a row with a TimeUUIDType column in Cassandra?

In Cassandra, I have the following Column Family: I'm trying to insert a record into it as follows, using a C++ function generated by Thrift: ColumnPath…
mixmasteralan
  • 489
  • 4
  • 11
5
votes
1 answer

Cassandra CQL3: JSON or UDT

I need to store records about user locations based on IP addresses, but I'm not sure how best to model it. For each address, we need to record the machine's details (ipaddress, agentId) and the machine's location (isocode, city). This information will…
beterthanlife
  • 1,668
  • 2
  • 18
  • 30
5
votes
1 answer

Using Mesos to manage a cluster of Web App + Databases

I'm just learning Apache Mesos. I would like to run a cluster of web apps (Scala Play) integrating with a database cluster (Cassandra), managed by Mesos. When a web app goes down or a database goes down, my understanding is that Mesos will auto…