
From what I've seen in the documentation and various examples, the typical workflow with data in Mallet requires you to work with a feature list, which you usually obtain by passing your data through "pipes" while iterating over it with some sort of iterator. The data is usually stored in a CSV file.

I am trying to obtain a feature list from two arrays of doubles. One array stores the actual features and is of size n x m (where n is the number of features and m is the number of feature vectors); the other is of size 1 x m and contains binary labels. How should I convert these into a feature list so that I can use them in classifiers?

dkaras
  • Show us some code. At least an example of the data you have and what you are aiming to achieve. – R.Costa Dec 09 '16 at 10:06
  • Given a Double[][] containing the features and a Double[] containing the label for each feature vector, my aim is to obtain an object of class InstanceList that I can use in training/classification. There's no need to describe how I obtain these values, since it doesn't really matter. You can even imagine two arrays filled with values at compile time. – dkaras Dec 09 '16 at 10:38

1 Answer


I ended up writing a custom iterator similar to the one in Mallet called "ArrayDataAndTargetIterator". I also had to use a pipe defined like this:

new SerialPipes(Arrays.asList(new Target2Label(), new Array2FeatureVector()));
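
One detail worth spelling out: with the n x m layout from the question, each feature vector is a column, while a per-instance pipe wants one array per instance. Below is a minimal sketch of that splitting step using only the standard library. The names FeatureColumns, LabeledVector and split are hypothetical helpers of mine, not part of Mallet, and the use of primitive double arrays is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mallet: turns an n x m feature matrix
// (n features, m feature vectors) into m per-instance vectors with labels.
final class FeatureColumns {

    /** One feature vector together with its binary label. */
    static final class LabeledVector {
        final double[] features;
        final double label;

        LabeledVector(double[] features, double label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Copies column j of the matrix into the j-th vector and pairs it with labels[j]. */
    static List<LabeledVector> split(double[][] features, double[] labels) {
        int n = features.length;
        int m = labels.length;
        // Every feature row must hold exactly one value per feature vector.
        for (double[] row : features) {
            if (row.length != m) {
                throw new IllegalArgumentException("expected " + m + " columns, got " + row.length);
            }
        }
        List<LabeledVector> result = new ArrayList<>(m);
        for (int j = 0; j < m; j++) {
            double[] column = new double[n];
            for (int i = 0; i < n; i++) {
                column[i] = features[i][j];
            }
            result.add(new LabeledVector(column, labels[j]));
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] features = {{1.0, 2.0, 3.0}, {4.0, 5.0, 6.0}}; // n = 2, m = 3
        double[] labels = {0.0, 1.0, 1.0};
        for (LabeledVector v : split(features, labels)) {
            System.out.println(Arrays.toString(v.features) + " -> " + v.label);
        }
    }
}
```

The custom iterator can then wrap each LabeledVector in a Mallet Instance, with the features array as the data and the label as the target, and the resulting instances go through the pipe above. As far as I can tell, Array2FeatureVector expects a primitive array as the instance data, so a Double[][] would need unboxing first.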
dkaras