I am looking for some robust classification/clustering models, e.g. decision trees, that would utilise hierarchical information present in the dataset.

The dataset consists of unique rows (customer IDs) and columns of purchased products. The columns form a three-level hierarchy: class - product - product type.

For example: 'Bedroom' (class) - 'Beds' (product) - 'King size beds' (product type).

The values in the table are counts, i.e. they indicate whether the customer in question bought, e.g., a king size bed, and how many.
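
To make the layout concrete, here is a minimal sketch of such a table in pandas. All names and counts besides the 'Bedroom' / 'Beds' / 'King size beds' example are made up for illustration, and storing the hierarchy as MultiIndex columns is just one assumed representation:

```python
import pandas as pd

# Hypothetical column hierarchy: (class, product, product type).
columns = pd.MultiIndex.from_tuples(
    [
        ("Bedroom", "Beds", "King size beds"),
        ("Bedroom", "Beds", "Single beds"),
        ("Bedroom", "Wardrobes", "Sliding wardrobes"),
        ("Kitchen", "Tables", "Dining tables"),
    ],
    names=["class", "product", "product_type"],
)

# Made-up purchase counts, one row per customer ID.
counts = pd.DataFrame(
    [[1, 0, 0, 2], [0, 2, 1, 0], [0, 0, 0, 1]],
    index=pd.Index(["c1", "c2", "c3"], name="customer_id"),
    columns=columns,
)

# Roll the counts up to the coarser levels of the hierarchy.
by_class = counts.T.groupby(level="class").sum().T
by_product = counts.T.groupby(level=["class", "product"]).sum().T
```

Here `by_class` has one column per class and `by_product` one column per (class, product) pair, so the same customers can be described at each level of the hierarchy.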

I am looking for a classification model that would classify customers first by the 'class' of product, then by 'product', then by 'product type'.

Perhaps I am looking for some kind of classification-within-classification method. Is there anything like this available, preferably in Python?

kikatuso

1 Answer

Did you try simply using the sklearn.cluster module (https://scikit-learn.org/stable/modules/classes.html#module-sklearn.cluster) to first cluster your data at the class level, then re-apply a clustering method within each cluster at the product level, and so on?
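
A rough sketch of that idea, not a definitive implementation. It assumes the counts sit in a pandas DataFrame with three-level MultiIndex columns named `class`, `product` and `product_type`; the helper `nested_kmeans`, the choice of KMeans, `n_clusters=2` and the random toy data are all my own assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans


def nested_kmeans(counts, levels, n_clusters=2, random_state=0):
    """Cluster on the coarsest level, then re-cluster inside each cluster."""
    labels = pd.Series("", index=counts.index)
    for depth in range(1, len(levels) + 1):
        # Aggregate counts to the first `depth` levels of the column hierarchy.
        agg = counts.T.groupby(level=levels[:depth]).sum().T
        new_labels = labels.copy()
        for parent, members in labels.groupby(labels).groups.items():
            sub = agg.loc[members]
            # Never ask for more clusters than there are distinct rows.
            k = min(n_clusters, len(sub.drop_duplicates()))
            km = KMeans(n_clusters=k, n_init=10, random_state=random_state)
            child = km.fit_predict(sub.to_numpy())
            # Build a path-like label such as "0/1/0", one part per level.
            new_labels.loc[members] = [f"{parent}/{c}".lstrip("/") for c in child]
        labels = new_labels
    return labels


# Made-up toy data: 12 customers, 6 product types in 2 classes.
levels = ["class", "product", "product_type"]
columns = pd.MultiIndex.from_tuples(
    [
        ("Bedroom", "Beds", "King size beds"),
        ("Bedroom", "Beds", "Single beds"),
        ("Bedroom", "Wardrobes", "Sliding wardrobes"),
        ("Kitchen", "Tables", "Dining tables"),
        ("Kitchen", "Tables", "Side tables"),
        ("Kitchen", "Chairs", "Bar stools"),
    ],
    names=levels,
)
rng = np.random.default_rng(0)
counts = pd.DataFrame(
    rng.poisson(1.0, size=(12, 6)),
    index=[f"c{i}" for i in range(12)],
    columns=columns,
)

labels = nested_kmeans(counts, levels)
print(labels)
```

Each customer ends up with a label made of three parts, one per level, so customers sharing a prefix were grouped together at the coarser levels. Any other estimator from sklearn.cluster could be swapped in for KMeans.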

Or maybe I misunderstood the question?

shinwoa