Questions tagged [clustering-key]
52 questions
1
vote
1 answer
Cassandra Partition Key and Clustering Column Size
How does Cassandra calculate the size of the partition key and clustering key? We have tables with relatively large partition keys (UUID and combinations of UUIDs) along with a large clustering key for…

invincible
- 73
- 5
1
vote
2 answers
How does clustering help with query pruning in Snowflake?
I have a table clustered on S_NATIONKEY, as below.
create or replace table t1
( S_SUPPKEY string,
S_NAME string,
S_NATIONKEY string,
S_ADDRESS string,
S_ACCTBAL string) cluster by (S_NATIONKEY);
Now I have added data to it:
INSERT INTO …

HimanshuSPaul
- 278
- 1
- 4
- 19
1
vote
3 answers
Snowflake - Clustering
What is the best approach for clustering Snowflake tables?
Absolute clustering, by manually reloading the tables at a certain frequency based on retrieval order
Create a cluster key and turn on auto-recluster, but keep it suspended most of the time and run it only at…

Rajib Deb
- 1,496
- 11
- 30
1
vote
0 answers
I want to know the x and y axis labels of the DBSCAN (sklearn) algorithm
https://scikit-learn.org/stable/auto_examples/cluster/plot_dbscan.html#sphx-glr-auto-examples-cluster-plot-dbscan-py
This is the link to the sklearn DBSCAN example.

J.l
- 31
- 1
1
vote
1 answer
Performance of query with only partition key
Is performance impacted if I provide only the partition key when querying a table that has both a partition key and a clustering key?
For example, for a table with partition key p1 and clustering key c1, would
SELECT * FROM table1 where p1 =…

tourniquet_grab
- 792
- 9
- 14
1
vote
1 answer
Cassandra TimeUUID floods file descriptors when uuid is used as the default
I have a Cassandra model as follows:
import uuid
from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model
class MyModel(Model):
    ...
    ...
    created_at = columns.TimeUUID(primary_key=True,
…

Nilesh
- 20,521
- 16
- 92
- 148
1
vote
1 answer
Cassandra sort and a changing clustering key
I have a data modeling question for cases where data needs to be sorted by keys that can be modified.
So, say we have a user table:
{
dept_id text,
user_id text,
user_name text,
mod_date timestamp,
PRIMARY KEY (dept_id,user_id)
}
Now…

factotum
- 900
- 10
- 13
1
vote
1 answer
WSO2 Instances Configuration
I'm trying to configure my Worker-Manager API instances with wso2am-2.0.0. I have not created any databases or made any related configuration. However, running bin/wsoserver.sh throws a database exception…

user3584564
- 150
- 1
- 1
- 8
1
vote
1 answer
Cassandra CQL3 clustering order and pagination
I am building out a user favourites service using Cassandra. I want the favourites sorted by latest and then to paginate over the track_ids, i.e. the front end sends back the last track_id in the 200 page.
CREATE TABLE…

Justin Holmes
- 73
- 6
1
vote
4 answers
Clustering Algorithm for average energy measurements
I have a data set which consists of data points having attributes like:
average daily consumption of energy
average daily generation of energy
type of energy source
average daily energy fed into the grid
daily energy tariff
I am new to clustering…

Keya Patel
- 29
- 3
0
votes
1 answer
Create and assign groups based on overlapping/nonoverlapping values between two columns
I have a data frame which includes min and max columns:
df <- data.frame(min=c(2, 4, 3, 3, 2, 6),
max=c(2.9, 5.9, 3.9, 4.9, 7.9, 7.9))
I am interested in creating and assigning groups based on overlap/non-overlap between the two…

Bradley S
- 13
- 2
0
votes
1 answer
Snowflake delete query scanning all partitions
I have an ETL process that deletes a couple hundred thousand rows from a table with 18 billion rows, using a unique hashed surrogate key like: 1801b08dd8731d35bb561943e708f7e3
delete from CUSTOMER_CONFORM_PROD.c360.engagement
where…

Luis Lema
- 307
- 1
- 3
- 10
0
votes
0 answers
How can I do clustering with a single input variable
I have one-dimensional data with 10,000 rows.
I would like to group/cluster these values.
I was trying to do k-means clustering, but it looks like it's not quite possible with one variable.
I have tried to do clustering as follows but it looks…

Zerone
- 127
- 1
- 12
0
votes
1 answer
Snowflake Automatic Clustering RESUMED accidentally, now always turns on again even after SUSPEND
I have a base table for which I've built two MVs:
one to be filtered by LOCAL_TS (epoch in milliseconds)
the other to be filtered by UTC_TS
I initially clustered both by date(TS) and it was working fine, until I accidentally ran the command to…

neverMind
- 1,757
- 4
- 29
- 41
0
votes
1 answer
n_jobs got an unexpected keyword argument
I have a parameter in k-means clustering. How do I resolve this error to solve the clustering problem? I have tried every method but can't find a solution.