I have a list of tuple list which have overlapping elements.
val tupLis:Seq[(List[(Integer,Char)],Int)] = null//data
I am trying to merge overlapping elements in the tuple list. Here is a code that i am working on which uses foldleft to merger overlapping tuple list from the list.The merging doesn't seam to be working as it misses out some elements of the tuple list.Each tuple list contains 4 tuples present in them. Each tuple list in the list often overlapping as they are generated from a bigger list using sliding function.
val alLis:Seq[(List[(Integer,Char)],Int)] = snGrMap.map(_._2).flatten.toList.sortBy(_._1.head._1)
val res = alLis.foldLeft(mutable.HashMap.empty[Int,(List[Integer],List[(Integer,Char)],Int)]) { (map, value) =>
if(map.size<=0){
map.put(0,(value._1.map(_._1),value._1,value._2))
}else{
val cads = map.filter(p=>value._1.intersect(p._2._2).size>=3)
if(cads.size>=1) {
cads.foreach { i =>
val cmnPos = i._2._1.intersect(value._1.map(_._1))
val cmnBase = i._2._2.filter(p=>cmnPos.contains(p._1)).intersect(value._1.filter(p=>cmnPos.contains(p._1)))
println(cmnBase.size,cmnPos.size,value._1, i._2._2)
if(cmnBase.size == cmnPos.size)
map.put(i._1,((i._2._1++value._1.map(_._1)).distinct,(i._2._2++value._1).distinct,i._2._3+value._2))
else
map.put(map.size,(value._1.map(_._1),value._1,value._2))
}
}else{
map.put(map.size,(value._1.map(_._1),value._1,value._2))
}
}
map
}
here is the example data which i am using:
(List((306,c), (328,g), (336,a), (346,g)),282)
(List((306,g), (328,c), (336,g), (346,a)),22)
(List((306,c), (328,c), (336,g), (346,a)),4)
(List((328,g), (336,a), (346,g), (348,t)),164)
(List((328,g), (336,a), (346,g), (348,c)),161)
(List((328,c), (336,g), (346,a), (348,c)),28)
(List((336,a), (346,g), (348,t), (358,a)),168)
(List((336,a), (346,g), (348,c), (358,a)),154)
(List((336,g), (346,a), (348,c), (358,g)),30)
(List((346,g), (348,t), (358,a), (361,c)),178)
(List((346,g), (348,c), (358,a), (361,c)),166)
(List((346,a), (348,c), (358,g), (361,g)),34)
The merged list look like:
List((306,c), (328,g), (336,a), (346,g), (348,t), (358,a), (361,c),792)
List((306,c), (328,g), (336,a), (346,g), (348,c), (358,a), (361,c) ),763)
List((306,g), (328,c), (336,g), (346,a), (348,c), (358,g), (361,g) ),96)
Update 1:
Overlap: If two list of tuples have the 3 or more exact tuples present in both list,then they are supposed to be overlapping list of tuples.But there should not be any difference when two list is merged.If one of the tuple values in both list have same integer but different char,then they shall not be merged. Merge: Combining two or more list of tuple list when they overlap.
Update 2: I have come up with a small solution,but not sure how efficient is it.
val alLisWithIndex = alLis.zipWithIndex
val interGrps = new ListBuffer[(Int,Int)]()
alLisWithIndex.foreach{i=>
val cads = alLisWithIndex.filter(p=>p._1._1.take(3).intersect(i._1._1.takeRight(3)).size>=3)
cads.foreach(p=>interGrps.append((i._2,p._2)))
}
println(interGrps.sortBy(_._1))
so when i print the above code, i get tuple list grouped in this manner. i have printed only the index of the each tuple group that should be merged.
Result generated: ListBuffer((0,2), (0,3), (1,4), (2,5), (3,6), (4,7), (5,8), (6,9), (7,10))
here is the list of tuples with their index that was used
List(((List((306,c), (328,g), (336,a), (346,g)),282),0),
((List((306,g), (328,c), (336,g), (346,a)),22),1),
((List((328,g), (336,a), (346,g), (348,t)),164),2),
((List((328,g), (336,a), (346,g), (348,c)),161),3),
((List((328,c), (336,g), (346,a), (348,c)),28),4),
((List((336,a), (346,g), (348,t), (358,a)),168),5),
((List((336,a), (346,g), (348,c), (358,a)),154),6),
((List((336,g), (346,a), (348,c), (358,g)),30),7),
((List((346,g), (348,t), (358,a), (361,c)),178),8),
((List((346,g), (348,c), (358,a), (361,c)),166),9),
((List((346,a), (348,c), (358,g), (361,g)),34),10))
So now all i had to do was use the interGrps
,link the groups based on second value and finally replace the indexes with list of tuples..