I am trying to implement Apriori algorithm in Java, and have problems with generating Candidate itemsets. To create candidates for k-itemset I use all combinations of k-1 and 1-itemsets. For example, for Frequent 1-itemset: bread:9, milk:9, coffee:9, sugar:10.
Candidate 2-itemsets generated should be: bread milk, bread coffee, bread sugar, milk coffee, milk sugar, coffee sugar.
But my code returns: bread coffee, bread milk, bread sugar, coffee bread, coffee milk, coffee sugar, milk bread, milk coffee, milk sugar, sugar bread, sugar coffee, sugar milk (all permutations; returns both bread milk and milk bread, however, these two are the same thing).
My code:
public static ArrayList<ArrayList<String>> getCandidates(Map<String, Long> one_itemset_1, Map<String, Long> n_itemset_1){
ArrayList<ArrayList<String>> tuples = new ArrayList<ArrayList<String>>();
Map<String, Long> one_itemset = sortbykey(one_itemset_1);
Map<String, Long> n_itemset = sortbykey(n_itemset_1);
for(String item : one_itemset.keySet()) {
for(String item2 : n_itemset.keySet()) {
if(!item.equals(item2) && !item2.contains(item)) {
ArrayList<String> singleList = new ArrayList<String>();
singleList.add(item);
String item2_sep[] = item2.split(" ");
for(int i = 0; i < item2_sep.length; i++)
singleList.add(item2_sep[i]);
//singleList.add(item2);
tuples.add(singleList);
}
//index2++;
}
//index2 = 0;
//index1++;
}
return tuples;
}
Is there a way to modify this method to get rid of repetitive itemsets? Please advise. Thank you.