In comparative genomics, identifying orthologous genes (genes in different species that descend from a common ancestral gene and often retain a similar function) in two genomes is important for a variety of applications. The relationship between these genes across the two genomes can be 1:1, 1:M, M:1, or M:M. In Scala, I wrote this simple case class to represent a gene:

case class Gene(id: Int, protId: String, geneId: String)

and this function to do the mappings:

def orthologyMapping(genome1: Array[Gene], genome2: Array[Gene]): Vector[HashMap[Gene, Gene]] = { ...

I couldn't find any built-in type in the documentation for this kind of collection of mapping relations. As you can see, the return type of orthologyMapping() is Vector[HashMap[Gene, Gene]], where each HashMap in the Vector holds 1:1 relationships.

7kemZmani

2 Answers

Have you considered modeling this set of relationships as a graph? Because that seems like a natural fit to me. If you'd like a library that is ready to use, have a look at Quiver from Verizon's OnCue team: https://verizon.github.io/quiver/
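
If you would rather stay with the standard library, the graph idea can be sketched with plain Scala collections. This is only a minimal illustration, not Quiver's API: it assumes the orthology calls are already available as (genome 1 gene, genome 2 gene) pairs, and the names OrthologGroup, kind and orthologGroups are hypothetical. Each connected component of the bipartite graph becomes one group, and the sizes of its two sides give the 1:1, 1:M, M:1 or M:M label.

```scala
// Gene mirrors the case class from the question.
case class Gene(id: Int, protId: String, geneId: String)

// Hypothetical type: one connected component of the orthology graph.
// left holds genes from genome 1, right holds genes from genome 2.
case class OrthologGroup(left: Set[Gene], right: Set[Gene]) {
  // "1:1", "1:M", "M:1" or "M:M", depending on the size of each side.
  def kind: String = {
    val l = if (left.size == 1) "1" else "M"
    val r = if (right.size == 1) "1" else "M"
    s"$l:$r"
  }
}

// Groups the edges of a bipartite graph into connected components.
// Each edge is (gene from genome 1, gene from genome 2).
def orthologGroups(edges: Seq[(Gene, Gene)]): Vector[OrthologGroup] = {
  // Adjacency in both directions, built from the edge list.
  val leftToRight: Map[Gene, Set[Gene]] =
    edges.groupBy(_._1).map { case (g, es) => g -> es.map(_._2).toSet }
  val rightToLeft: Map[Gene, Set[Gene]] =
    edges.groupBy(_._2).map { case (g, es) => g -> es.map(_._1).toSet }

  // Expand a component until no new gene is reachable on either side.
  @annotation.tailrec
  def grow(left: Set[Gene], right: Set[Gene]): OrthologGroup = {
    val newRight = left.flatMap(g => leftToRight(g))
    val newLeft = newRight.flatMap(g => rightToLeft(g))
    if (newLeft == left && newRight == right) OrthologGroup(left, right)
    else grow(left ++ newLeft, right ++ newRight)
  }

  // Visit each genome 1 gene once, skipping genes already placed in a group.
  edges.map(_._1).distinct.foldLeft(Vector.empty[OrthologGroup]) { (groups, seed) =>
    if (groups.exists(_.left.contains(seed))) groups
    else groups :+ grow(Set(seed), Set.empty[Gene])
  }
}

// Made-up example data: one 1:1, one 1:M and one M:1 relationship.
val a = Gene(1, "protA", "geneA")
val b = Gene(2, "protB", "geneB")
val c = Gene(3, "protC", "geneC")
val d = Gene(4, "protD", "geneD")
val w = Gene(11, "protW", "geneW")
val x = Gene(12, "protX", "geneX")
val y = Gene(13, "protY", "geneY")
val z = Gene(14, "protZ", "geneZ")

val exampleEdges = Seq(a -> w, b -> x, b -> y, c -> z, d -> z)
val exampleKinds = orthologGroups(exampleEdges).map(_.kind)
// exampleKinds == Vector("1:1", "1:M", "M:1")
```

With a return type of Vector[OrthologGroup], neither side of a group can be empty, because every group is grown from at least one edge.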

Bob Gleason
  • Quiver did work in my case, though I was looking for something simpler and more native to Scala. My initial thought was to make the return type Vector[HashMap[List[Gene], List[Gene]]], given that in each HashMap neither the key list nor the value list can be empty (each must have at least one item) – 7kemZmani Dec 23 '16 at 15:30

A HashMap[T, U] represents an M:1 relationship, for instance (a -> 1), (b -> 1). To represent an M:M relationship, you can use HashMap[Gene, Set[Gene]], e.g. (a -> (1, 2)), (b -> (1, 2)).
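
As a rough sketch of that idea, the map can be built from a flat list of pairs. It assumes the orthology calls arrive as (Gene, Gene) tuples and uses the immutable Map from the standard library rather than HashMap specifically; toMultiMap and the sample genes are hypothetical names made up for the illustration.

```scala
// Gene mirrors the case class from the question.
case class Gene(id: Int, protId: String, geneId: String)

// Hypothetical helper: collects (key, value) pairs into
// key -> set of every value seen for that key.
def toMultiMap(pairs: Seq[(Gene, Gene)]): Map[Gene, Set[Gene]] =
  pairs.groupBy(_._1).map { case (k, ps) => k -> ps.map(_._2).toSet }

// Made-up example data for the M:M case (a -> (1, 2)), (b -> (1, 2)).
val geneA = Gene(1, "protA", "geneA")
val geneB = Gene(2, "protB", "geneB")
val gene1 = Gene(11, "prot1", "gene1")
val gene2 = Gene(12, "prot2", "gene2")

val pairs = Seq(geneA -> gene1, geneA -> gene2, geneB -> gene1, geneB -> gene2)

// Genome 1 gene -> its orthologs in genome 2.
val forward = toMultiMap(pairs)
// Genome 2 gene -> its orthologs in genome 1, built by swapping each pair.
val backward = toMultiMap(pairs.map(_.swap))
```

Note that a single Map[Gene, Set[Gene]] only answers lookups in one direction, so a reverse map like backward is needed to see that a genome 2 gene has several partners.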

radumanolescu
  • Using HashMap doesn't actually capture the reality of the relationship; the quick-and-dirty solution I'm using now is Vector[List[List[Gene], List[Gene]]] as the return type of orthologyMapping() – 7kemZmani Nov 24 '16 at 03:56