I've been working on a project that involves clustering data with periodic boundary conditions, so I am looking for clustering algorithms that can handle datasets where periodicity plays a significant role.
My data is 3D, and I would like to know whether DBSCAN or HDBSCAN can implement periodic boundary conditions for this case. I found in the literature that K-means has a way of doing it: https://doi.org/10.3390/sym14061237
Thank you in advance for your valuable input!
For HDBSCAN, I do it this way:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import hdbscan
import seaborn as sns
from ase.io import read  # assuming ASE, since atoms.positions is used below

def fit_and_visualize_clusters(filename, atom_indices):
    atoms = read(filename)
    # Get the positions of the specified atoms
    data = [atoms.positions[i] for i in atom_indices]
    X = [i[0] for i in data]
    Y = [i[1] for i in data]
    Z = [i[2] for i in data]
    # Create an instance of HDBSCAN
    clusterer = hdbscan.HDBSCAN(min_cluster_size=4, gen_min_span_tree=True)
    # Perform clustering (plain Euclidean distances, no periodicity)
    cluster_labels = clusterer.fit_predict(data)
    print(cluster_labels)
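One possible way to bring periodicity into this is to compute the pairwise distances myself with the minimum image convention and hand the clusterer a distance matrix instead of raw coordinates. The following is only a minimal sketch under stated assumptions: the box is orthorhombic, and `periodic_distance_matrix`, `box`, and the two test points are hypothetical names and values for illustration, not part of any library.

```python
import numpy as np

def periodic_distance_matrix(points, box):
    """Pairwise distances under the minimum image convention.

    Hypothetical helper; assumes an orthorhombic box with edge lengths `box`.
    """
    points = np.asarray(points, dtype=float)
    box = np.asarray(box, dtype=float)
    # Displacement between every pair of points, shape (n, n, 3)
    delta = points[:, None, :] - points[None, :, :]
    # Shift each component to the nearest periodic image
    delta -= box * np.round(delta / box)
    return np.sqrt((delta ** 2).sum(axis=-1))

# Two points near opposite faces of a 10 x 10 x 10 box (made-up values)
pts = [[0.5, 5.0, 5.0], [9.5, 5.0, 5.0]]
D = periodic_distance_matrix(pts, box=[10.0, 10.0, 10.0])
print(D[0, 1])  # distance through the boundary, not across the box
```

As far as I understand, a matrix like `D` could then be passed to `hdbscan.HDBSCAN(metric='precomputed')` in place of the coordinates. If `atoms` is an ASE `Atoms` object, `atoms.get_all_distances(mic=True)` may give the same kind of matrix directly, including for non-orthorhombic cells, but I have not checked that against this sketch.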
For DBSCAN:
from sklearn.cluster import DBSCAN
import pandas as pd
# Convert the dataset to a DataFrame
# (data is the list of (x, y, z) positions built above, not the x-only list X)
DBSCAN_clustered = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
# Perform DBSCAN clustering
DBS_clustering = DBSCAN(eps=6.25, min_samples=6).fit(DBSCAN_clustered)
# Assign cluster labels to the DataFrame
DBSCAN_clustered['Cluster'] = DBS_clustering.labels_
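To check whether periodicity changes the DBSCAN result, a small test along these lines might help. It is a sketch under assumptions, not a validated recipe: the box is taken to be orthorhombic, and the box lengths, the points, `eps`, and `min_samples` are all made-up illustration values. scikit-learn's `DBSCAN` accepts `metric="precomputed"`, so the minimum image distances are computed first and passed in as a square matrix.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Assumed orthorhombic box lengths (illustration values only)
box = np.array([10.0, 10.0, 10.0])

# Four points straddling the x boundary, four points in the middle of the box
points = np.array([
    [0.2, 5.0, 5.0], [0.6, 5.0, 5.0], [9.6, 5.0, 5.0], [9.9, 5.0, 5.0],
    [5.0, 5.0, 5.0], [5.3, 5.0, 5.0], [5.6, 5.0, 5.0], [5.0, 5.4, 5.0],
])

# Minimum image distances between all pairs
delta = points[:, None, :] - points[None, :, :]
delta -= box * np.round(delta / box)
dist = np.sqrt((delta ** 2).sum(axis=-1))

# DBSCAN on the periodic distance matrix
periodic_labels = DBSCAN(eps=1.5, min_samples=3, metric="precomputed").fit(dist).labels_

# Same parameters on raw coordinates, ignoring periodicity
plain_labels = DBSCAN(eps=1.5, min_samples=3).fit(points).labels_

print("periodic:", periodic_labels)
print("plain:   ", plain_labels)
```

With these values the boundary group should come out as a single cluster in the periodic run, while the plain run splits it into two pairs that are too small to satisfy `min_samples` and are marked as noise. Note that the precomputed matrix grows as n², so this may not scale to very large systems.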