Questions tagged [clustering-key]

52 questions
1 vote, 1 answer

Cassandra Partition Key and Clustering Column Size

How does Cassandra calculate the size of the partition key and clustering key? We have tables with relatively large partition keys (UUID and combinations of UUIDs) along with a large clustering key for…
1 vote, 2 answers

How clustering is helping in query pruning in Snowflake?

I have a table clustered on S_NATIONKEY as below: create or replace table t1 ( S_SUPPKEY string, S_NAME string, S_NATIONKEY string, S_ADDRESS string, S_ACCTBAL string) cluster by (S_NATIONKEY); Now I have added data to it: INSERT INTO …
HimanshuSPaul
1 vote, 3 answers

Snowflake - Clustering

What is the best approach for clustering Snowflake tables? Absolute clustering by manually reloading the tables at a certain frequency based on retrieval order. Create a cluster key and turn on auto recluster, but keep it suspended most of the time, run it only at…
Rajib Deb
1 vote, 0 answers

I want to know the x and y axes labels of dbscan (sklearn) algorithm

https://scikit-learn.org/stable/auto_examples/cluster/plot_dbscan.html#sphx-glr-auto-examples-cluster-plot-dbscan-py This is the link to the sklearn DBSCAN example.
J.l
1 vote, 1 answer

Performance of query with only partition key

Is performance impacted if I provide only the partition key while querying a table that has both a partition key and a clustering key? For example, for a table with partition key p1 and clustering key c1, would SELECT * FROM table1 where p1 =…
tourniquet_grab
1 vote, 1 answer

Cassandra TimeUUID floods file descriptors when using uuid as default

I have a Cassandra model as follows: import uuid from cassandra.cqlengine import columns from cassandra.cqlengine.models import Model class MyModel(Model): ... ... created_at = columns.TimeUUID(primary_key=True, …
Nilesh
1 vote, 1 answer

Cassandra sort and a changing clustering key

I have a data modeling question for cases where data needs to be sorted by keys that can be modified. So, say we have a user table { dept_id text, user_id text, user_name text, mod_date timestamp PRIMARY KEY (dept_id,user_id) } Now…
factotum
1 vote, 1 answer

WSO2 Instances Configuration

I'm trying to configure my Worker-Manager API instances with wso2am-2.0.0. I have not created any databases or made any related configuration. However, running bin/wsoserver.sh throws a database exception…
user3584564
1 vote, 1 answer

Cassandra CQL3 clustering order and pagination

I am building out a user favourites service using Cassandra. I want to have the favourites sorted by latest and then be able to paginate over the track_ids, i.e. the front end sends back the last track_id in the 200 page. CREATE TABLE…
1 vote, 4 answers

Clustering Algorithm for average energy measurements

I have a data set which consists of data points having attributes like: average daily consumption of energy, average daily generation of energy, type of energy source, average daily energy fed in to the grid, daily energy tariff. I am new to clustering…
Keya Patel
0 votes, 1 answer

Create and assign groups based on overlapping/nonoverlapping values between two columns

I have a data frame which includes min and max columns: df <- data.frame(min=c(2, 4, 3, 3, 2, 6), max=c(2.9, 5.9, 3.9, 4.9, 7.9, 7.9)) I am interested in creating and assigning groups based on overlap/non-overlap between the two…
0 votes, 1 answer

Snowflake delete query scanning all partitions

I have an ETL process that is deleting a couple hundred thousand rows from a table with 18 billion rows using a unique hashed surrogate key like: 1801b08dd8731d35bb561943e708f7e3 delete from CUSTOMER_CONFORM_PROD.c360.engagement where…
0 votes, 0 answers

How can I do clustering with a single input variable?

I have one-dimensional data with 10,000 rows. I would like to group/cluster these values. I was trying to do k-means clustering, but it looks like with one variable it's not quite possible. I have tried to do clustering as follows but it looks…
Zerone
0 votes, 1 answer

Snowflake Automatic Clustering accidentally RESUMED, now always turns on again even after SUSPEND

I have a base table for which I've built two MVs: one to be filtered by LOCAL_TS (epoch in milliseconds), the other to be filtered by UTC_TS. I clustered both initially by date(TS) and it was working fine, until I accidentally ran the command to…
0 votes, 1 answer

n_jobs got an unexpected keyword argument

I have a parameter in k-means clustering. How do I resolve this error to solve the problem in clustering? I tried all methods but can't find the solution.