Questions tagged [elki]

ELKI is open-source data mining software focused on cluster analysis and outlier detection. Unlike most other tools, it supports index structures, which it uses to accelerate these algorithms.

164 questions
2
votes
2 answers

Clustering algorithm with different epsilons on different axes

I am looking for a clustering algorithm such as DBSCAN to deal with 3D data, in which it is possible to set different epsilons depending on the axis: for instance, an epsilon of 10 m in the x-y plane and an epsilon of 0.2 m on the z axis. Essentially, I…
yamayama
2
votes
1 answer

ELKI PAM clustering

I'm using ELKI for the first time and I'm having a problem understanding its structure. It seems that I need to do a lot to produce some results. How can I perform PAM k-medoids clustering with a custom distance measure?
Kobe-Wan Kenobi
2
votes
1 answer

How to work with sparse data using ELKI?

I'm trying to use a sparse matrix as input data for ELKI's SOD algorithm to detect outliers. I looked for help on sparse data in the HowTo and FAQ pages, and I've tried to use SparseNumberVectorLabelParser and SparseVectorFieldFilter like…
Wesin Alves
2
votes
1 answer

Top n outliers in ResultWriter

I am dealing with a large, high-dimensional dataset, so I need only the top N outliers from the output of ResultWriter. Is there an option in ELKI to get just the top N outliers from this output?
Wesin Alves
2
votes
1 answer

Outlier detection using ELKI

I am using the ELKI data mining software for outlier detection. It has many outlier detection techniques, but all of them provide the same results (the same outliers with every technique; the only difference is the size of the circle around the points, as shown in…
Fallak Asad
2
votes
1 answer

ELKI - k-means clustering.

I'd like to run ELKI k-means clustering from the command line. The running time seems too short compared with R: when I ran k-means clustering in R, it took about 100 seconds. Moreover, there is no change among k=5, k=10 and so…
akiniwa
2
votes
1 answer

Pass Java array as an input for ELKI DBSCAN

I have been able to use ELKI for DBSCAN from Java code, and it's amazingly fast compared to any other tool. Until now I was working with a CSV file and using the following to give that as an…
sau
2
votes
2 answers

How to see ELKI DBSCAN clustering result

I am using ELKI for DBSCAN clustering of some ~14,000 GPS points. It's running fine, but I want to see information about the clusters, such as how many points are in each cluster.
sau
2
votes
1 answer

Using a Geo Distance Function on ELKI

I am using ELKI to mine some geospatial data (lat,long pairs) and I am quite concerned about using the right data types and algorithms. In the parameterizer of my algorithm, I tried to replace the default distance function with a geo function…
doublebyte
2
votes
1 answer

How to use DimensionSelectingLatLngDistanceFunction in ELKI

Does anyone know how I am supposed to use the DimensionSelectingLatLngDistanceFunction in ELKI? When I try to use it I get "Constraint: distance.latitudedim >= 0." But what is -distance.latitudedim exactly? Does it let me specify meters instead of…
2
votes
2 answers

OPTICSXi - ELKI ResultWriter

I'm using ELKI to cluster, in a hierarchical way, a dataset of geolocations using OPTICSXi. The result of the execution of the algorithm is a set of files. The content of a file could be: # Cluster: nameOfCluster # OPTICSModel # Parents:…
Deborah
2
votes
1 answer

Clustering string data with ELKI

I need to cluster a large number of strings using ELKI based on the edit distance (Levenshtein distance). Since the data set is too large, I'd like to avoid file-based precomputed distance matrices. How can I (a) load string data in ELKI from a file…
Stahli
1
vote
1 answer

sample_weight option in the ELKI implementation of DBSCAN

My goal is to find outliers in a dataset that contains many near-duplicate points, and I want to use the ELKI implementation of DBSCAN for this task. As I don't care about the clusters themselves, just the outliers (which I assume are relatively far from…
user1541776
1
vote
1 answer

How can I cluster data using a distance matrix with the ELKI library?

I have a distance matrix and I want to use it when clustering my data. I've read the ELKI documentation, and it states that I can override the distance method when extending the AbstractNumberVectorDistanceFunction class. The…
Vahe Karapetyan
1
vote
1 answer

Running DBSCAN on GPS Data: Memory Error

For a project that I am currently working on, I need to cluster a relatively large number of GPS coordinate pairs into different location clusters. After reading many posts and suggestions here on Stack Overflow and trying different approaches, I still have…
Timothy.L