I have a classification problem in which the data points are a set of blocks. One attribute I can use to classify a block is its tag, which is simply the block number of another block. Blocks also have other attributes (such as size) that can be used for classification. The tag attribute is meant to be used as follows: if two blocks have tags (block numbers) that belong to the same cluster, those two blocks should be clustered together. The difficulty is that I do not know beforehand which cluster a tagged block will end up in. For example:
Block 1 [Tag 4] size 10
Block 2 [Tag 3] size 20
Block 3 [Tag 1] size 100
Block 4 [Tag 2] size 110
Here, based on the tag attribute, Block 1 and Block 2 tag Block 4 and Block 3 respectively, and Block 3 and Block 4 tag Block 1 and Block 2 respectively. Hence Block 1 and Block 2 can belong to cluster id 1, and Block 3 and Block 4 can belong to cluster id 2. Also, the sizes of Blocks 1 and 2 (10 and 20) are close to each other, as are the sizes of Blocks 3 and 4 (100 and 110). The end result of the classification should be:
cluster id 1: Block 1, Block 2
cluster id 2: Block 3, Block 4
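To state the tag rule precisely, here is a small Python sketch of how I would check whether a given assignment respects it. The dictionary layout and the function name `tags_consistent` are just my own choices for illustration, not part of any library.

```python
# Hypothetical encoding of the example: block number -> (tagged block, size).
blocks = {1: (4, 10), 2: (3, 20), 3: (1, 100), 4: (2, 110)}

def tags_consistent(assignment, blocks):
    """Return True if blocks whose tags fall in the same cluster share a cluster.

    assignment maps block number -> cluster id.
    """
    seen = {}  # cluster of the tagged block -> cluster of the tagging block
    for block, (tag, _size) in blocks.items():
        target = assignment[tag]   # cluster the tagged block landed in
        own = assignment[block]    # cluster of the block doing the tagging
        # The first block tagging into `target` fixes the expected cluster;
        # every later block tagging into `target` must match it.
        if seen.setdefault(target, own) != own:
            return False
    return True

desired = tags_consistent({1: 1, 2: 1, 3: 2, 4: 2}, blocks)
mixed = tags_consistent({1: 1, 2: 2, 3: 1, 4: 2}, blocks)
print(desired, mixed)
```

The desired result above passes this check, while an assignment that puts Blocks 1 and 3 together does not. Note that putting every block in a single cluster also passes trivially, which is why I think the other attributes (size) are needed as well.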
Is there a way to classify such data points? As I understand it, a Naive Bayes classifier treats the attributes as independent of one another. Here, the tag attribute depends on something that is not yet known: the cluster id that the tagged block will be assigned. What form or class of clustering algorithms should I look at to solve this problem? One approach I can think of is to run k-means on the other attributes, such as size, and then, once I approximately know the cluster ids, attach the cluster id of each tagged block to the tagging block and use it as an additional attribute for classification. Are there better alternative approaches for writing classifiers whose attributes depend on the resulting clusters themselves? Any help would be appreciated.
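To make the k-means idea concrete, here is a rough sketch in plain Python. It is only my own guess at how the two steps could fit together, not an established algorithm. The helper names (`kmeans`, `canonical`, `cluster_blocks`), the min-max scaling of size, the one-hot encoding of the tagged block's cluster, and the `tag_weight` parameter are all assumptions I made for illustration.

```python
def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    # Deterministic init: take k points spread over the sorted data.
    ordered = sorted(points)
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def canonical(labels):
    """Renumber labels by first appearance so equal partitions compare equal."""
    mapping = {}
    return [mapping.setdefault(lab, len(mapping)) for lab in labels]

def cluster_blocks(blocks, k=2, tag_weight=1.0, rounds=10):
    """blocks maps block number -> (tagged block number, size)."""
    ids = sorted(blocks)
    index = {b: i for i, b in enumerate(ids)}
    sizes = [blocks[b][1] for b in ids]
    lo, hi = min(sizes), max(sizes)
    scaled = [(s - lo) / ((hi - lo) or 1) for s in sizes]  # min-max to [0, 1]

    # Step 1: initial guess from size alone.
    labels = canonical(kmeans([(s,) for s in scaled], k))

    # Step 2: append the current cluster of each tagged block (one-hot) and
    # re-cluster, repeating until the partition stops changing.
    for _ in range(rounds):
        features = []
        for b, s in zip(ids, scaled):
            tagged_label = labels[index[blocks[b][0]]]
            one_hot = tuple(tag_weight * (c == tagged_label) for c in range(k))
            features.append((s,) + one_hot)
        new_labels = canonical(kmeans(features, k))
        if new_labels == labels:
            break
        labels = new_labels
    return {b: lab + 1 for b, lab in zip(ids, labels)}

blocks = {1: (4, 10), 2: (3, 20), 3: (1, 100), 4: (2, 110)}
result = cluster_blocks(blocks)
print(result)
```

On the four example blocks this puts Blocks 1 and 2 in one cluster and Blocks 3 and 4 in the other. I have no argument that the loop converges to a good answer in general, or that it is not overly sensitive to the initial size-only clustering, which is part of why I am asking whether a more principled method exists.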