I'm building a decision tree system in Scala, but some of the entries in my data have identical attributes. I've worked around this by implementing a "random" node type, which lets the query randomly choose which branch to traverse. However, I get a "MatchError" when I try to split the remaining examples at random. My current code:
def splitRandom(examples: Array[String]): Array[String] = {
  examples.collect { case x if (r.nextInt(100) < 50) => x }
}
"examples" is an array of strings, with each string being a line containing a single data entry with all of its attributes.
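For comparison, here is a minimal sketch of the same split written with `filter` instead of `collect`. It assumes `r` is a `scala.util.Random` (the question does not show its definition); the object name `SplitRandomSketch` and the helper `splitRandomBoth` are hypothetical and only for illustration. One possible cause of the error, depending on the Scala version, is that `collect` can evaluate a guard twice (once through `isDefinedAt`, once when the partial function is applied), and a random guard may give a different answer the second time. `filter` calls its predicate once per element, so it should not have that problem.

```scala
import scala.util.Random

object SplitRandomSketch {
  // Assumption: `r` in the question is a scala.util.Random.
  // The seed is fixed here only to make runs repeatable.
  val r = new Random(42)

  // Keep each example with probability of roughly 0.5.
  // The predicate ignores the element and is evaluated once per element.
  def splitRandom(examples: Array[String]): Array[String] =
    examples.filter(_ => r.nextInt(100) < 50)

  // Hypothetical variant: return both halves of the split,
  // so the examples that were not selected are not lost.
  def splitRandomBoth(examples: Array[String]): (Array[String], Array[String]) =
    examples.partition(_ => r.nextInt(100) < 50)
}
```

If both branches of the random node need examples, the `partition` variant may be the better fit, since every example ends up in exactly one of the two arrays.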