Questions tagged [cassandra]

Apache Cassandra is a highly scalable, eventually consistent, distributed, structured row store. Questions about Cassandra server administration should be asked on https://dba.stackexchange.com/questions/tagged/cassandra .

Apache Cassandra is a highly scalable, eventually consistent, distributed, structured row/column store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.
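
Eventual consistency in Cassandra is tunable per request. As a rough illustration (a sketch of the usual rule of thumb, not Cassandra's own code), a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names below are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replica count that forms a majority (QUORUM) for a replication factor."""
    return replication_factor // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """True when any read set must share at least one replica with any write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)
print("QUORUM for RF=3:", q)
print("QUORUM reads + QUORUM writes overlap:", overlaps(q, q, rf))
print("ONE reads + ONE writes overlap:", overlaps(1, 1, rf))
```

With a replication factor of 3, reading and writing at QUORUM always overlap, while reading and writing at ONE may briefly return stale data.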

Cassandra's Dynamo-based cluster model provides linear scalability and fault tolerance on commodity hardware or cloud infrastructure. Its support for replicating across multiple data centers is best-in-class, providing low latency and the ability to survive entire data center outages.
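
The Dynamo-style placement described above can be pictured as a token ring: each partition key is hashed to a token, and the row is stored on the node that owns that token plus the next nodes around the ring. The following is a deliberately simplified sketch under stated assumptions: it uses one MD5-derived token per node and a plain clockwise walk, whereas a real cluster uses its configured partitioner, virtual nodes, and a replication strategy. The class name `TokenRing` and the node names are hypothetical.

```python
import bisect
import hashlib


class TokenRing:
    """Toy consistent-hashing ring; an illustration, not Cassandra's implementation."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Assumption: one token per node, derived from hashing the node name.
        self.ring = sorted((self._token(node), node) for node in nodes)
        self.tokens = [token for token, _ in self.ring]

    @staticmethod
    def _token(value: str) -> int:
        # Assumption: first 8 bytes of MD5 stand in for the partitioner's hash.
        return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")

    def replicas(self, partition_key: str):
        """Return the nodes that would hold the given partition key."""
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, self._token(partition_key)) % len(self.ring)
        return [
            self.ring[(start + i) % len(self.ring)][1]
            for i in range(self.replication_factor)
        ]


ring = TokenRing(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
print(ring.replicas("user:42"))
```

Because placement depends only on the key's hash and the node tokens, any node can compute where a row lives without consulting a central directory, which is one reason capacity grows roughly with the number of nodes.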

Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, along with powerful built-in caching. Its write performance is very high compared to many other database solutions, which makes it a compelling option for big data processing. It scales linearly, and nodes can be added or removed on the fly without downtime.

Cassandra was open-sourced by Facebook in 2008 and quickly became a top-level Apache project. Today, it's widely used by companies in many markets.

Official links:

Documentation

20596 questions
5
votes
1 answer

How to get good performance on reading cassandra partitions in spark?

I am reading data from a Cassandra partition into Spark using cassandra-connector. I tried the solutions below for reading partitions. I tried to parallelize the task by creating as many RDDs as possible, but both solution ONE and solution TWO had the same…
Knight71
5
votes
1 answer

Cassandra cleanup on several servers at once

We have a big Cassandra cluster of 18 servers (nearly 5 TB of data on each server). We have added new nodes following this documentation: http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html . After we added the new…
5
votes
2 answers

What node does Cassandra store data on?

Is there a command or any way at all to know what data is stored on what nodes of Cassandra? I'm pretty new to Cassandra and haven't had much luck googling this question. Thanks!
user3376961
5
votes
2 answers

Cassandra Datastax Driver Retry Policy

We can create a cluster instance like this: cluster = Cluster.builder().addContactPoint("192.168.0.30").withRetryPolicy(DefaultRetryPolicy.INSTANCE).build(); Where do we specify the number of times a request has to be retried…
Jobs
5
votes
1 answer

Cassandra storage internal

I'm trying to understand what exactly happens internally at the storage engine level when a row (columns) is inserted in a CQL-style table. CREATE TABLE log_date ( userid bigint, time timeuuid, category text, subcategory text, itemid text, …
Woojun Kim
5
votes
2 answers

delete multiple elements from a MAP in cassandra?

I have a field cat_to_pub with type as MAP. {1: '9-20-21', 2: '2-5-21', 4: '2-5-21', 5: '2', 6: '2', 9: '2-83-153-149', 11: '2-5-21-31', 29: '100', 32: '113-198-21'} I can delete an individual element from this MAP by using DELETE cat_to_pub[1]…
HIRA THAKUR
5
votes
1 answer

How to search a cassandra collection map using QueryBuilder

In my Cassandra table I have a map collection, and I have also indexed the map keys. CREATE TABLE IF NOT EXISTS test.collection_test( name text, year text, attributeMap map, PRIMARY KEY ((name, year)) ); CREATE INDEX ON…
Bibhu Biswal
5
votes
2 answers

why HBase count operation so slow

The command is: count 'tableName'. It's very slow to get the total row number of the whole table. My situation is: I have one master and two slaves, each node with 16 cpus and 16G memory. My table only has one column family with two columns:…
Jack
5
votes
1 answer

How to do polling in cassandra?

I'm trying to find a way to do polling over a Cassandra database, but I'm new at this and I don't know how. Let's say I have a table "users" like this -> users -> user_name -> gender -> state and I want to poll constantly so I know…
5
votes
1 answer

Cassandra - advantages of custom type

I am planning to use a Java object as a custom type and store it in Cassandra. I am taking 2 data members out of the class and making them the primary key, and keeping the rest of the data members in the custom type. Data members of my class: name,…
summer
5
votes
1 answer

Cassandra Keyspace name with hyphen (-)

I am using Cassandra version 1.2.15. Using the Cassandra CQL Java driver, I will be creating a keyspace. My problem is that I am not able to create a keyspace whose name contains a hyphen (test-hyphen). Code: String query = "CREATE KEYSPACE \"test-hyphen\" WITH…
Jaya Ananthram
5
votes
1 answer

cassandra.InvalidRequest: code=2200 [Invalid query] message="Keyspace '' does not exist"

I'm trying to use the Python driver for Cassandra, but when I run these three lines in the Python shell: from cassandra.cluster import Cluster cluster = Cluster() session = cluster.connect('demo') I get this error: cassandra.InvalidRequest: code=2200 [Invalid…
micheal
5
votes
1 answer

How do I store unsigned integers in Cassandra?

I am storing some data in Cassandra via the Datastax driver, and I have the need to store unsigned 16-bit and 32-bit integers. For unsigned 16-bit integers, I can easily store them as signed 32-bit integers and cast them as needed. For unsigned…
Mark
5
votes
1 answer

CQL with a wide row - how to get most recent set?

How would I write the CQL to get the most recent set of data from each row? I'm investigating transitioning from MSSQL to Cassandra and am starting to grasp the concepts. Lots of research has helped tremendously, but I haven't found an answer to this (I…
5
votes
1 answer

Cassandra data aggregation by Spark

I would like to use server-side data selection and filtering with the Cassandra Spark connector. We have many sensors that send values every second, and we are interested in aggregating this data by month, day, hour, etc. I have proposed the…
Wassim