I'm using the WEKA tool for clustering analysis. However, some of my attributes have a very large number of distinct values in their domain. Specifically, I need to represent information about proteins, and the information I need to include is the set of terms associated with their functions.
For example, these values are all included in the same attribute, "Function":
"RNA-Binding protein", "RNA bindingstructural constituent of ribosomerRNA binding", "translation", "intracellularribosomeribonucleoprotein complex".
These terms vary enormously.
Can someone help me?