
Essentially, I have a spreadsheet of calls, both incoming and outgoing, for one phone number (with associated information, like time and date), which constitutes one "network". I then have a spreadsheet of both incoming and outgoing calls for a second phone number, which constitutes a second "network." (I may also have the same for a third phone number, but I'm waiting to see on that.)

I'm interested in finding a way to measure the similarity/overlap of two or more of these call "networks" and was curious what measures/statistics, if any, exist. (If you can point to any literature or other resources that validate the method as widely used/credible, I'll be doubly appreciative!)

JSO
  • By similarity, do you mean how many incoming/outgoing phone numbers do they have in common? You need to say a bit more about what you are trying to determine. Right now it's not clear to me what the network angle buys you. Why not just calculate the proportion they have in common? – paqmo May 25 '17 at 02:08
  • You need to state how you define similarity for this problem. Is it number of calls? Is it frequency of calls? Is it proportion of numbers in common? Is it similar distribution for duration of calls? Is it type-of-call (local, international...)? Is it a mix of all the above (and more)? You need to improve your question for it to be useful. – shirowww May 29 '17 at 20:50
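The "proportion they have in common" raised in the comments above could be sketched roughly as follows. This is only one possible reading of "overlap": it reduces each spreadsheet to the set of numbers that appear in it and computes the share of numbers common to both, which is the Jaccard index. The function name `contact_overlap` and the sample phone numbers are made up for illustration, and call counts, direction, timing, and duration are all ignored.

```python
def contact_overlap(contacts_a, contacts_b):
    """Return the shared contacts and the Jaccard index of two contact lists.

    Jaccard index = |A intersect B| / |A union B|, a value between 0 and 1.
    """
    a, b = set(contacts_a), set(contacts_b)  # repeated calls collapse to one contact
    shared = a & b
    union = a | b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical data: the numbers each phone called or was called by.
calls_a = ["555-0101", "555-0102", "555-0103", "555-0102"]
calls_b = ["555-0102", "555-0103", "555-0104"]

shared, jaccard = contact_overlap(calls_a, calls_b)
print(sorted(shared), jaccard)
```

Whether this is the right measure depends on the definition of similarity the comments ask for; weighting by call frequency or duration would need a different statistic.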

1 Answer


What you seem to describe is a graph.

You could have each node of the graph be a phone number and each edge be a call.

You can perform many operations on graphs, such as determining whether two nodes are connected, either directly or indirectly through other nodes, and by which path.

You can also identify "connected components", which are groups of nodes in which every node can be reached from every other node, as well as many other operations.
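As a rough illustration of the graph idea, here is a minimal sketch in plain Python that builds a graph from a list of calls and finds its connected components with a breadth-first search. It assumes calls can be treated as undirected edges (who called whom is ignored), and the names `build_graph`, `connected_components`, and the sample call list are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(calls):
    """Build an undirected adjacency map: phone number -> set of numbers it shares a call with."""
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def connected_components(graph):
    """Return a list of sets, one per connected component, using breadth-first search."""
    seen = set()
    components = []
    for start in list(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical call records as (caller, callee) pairs.
calls = [("A", "X"), ("A", "Y"), ("B", "Y"), ("C", "Z")]
components = connected_components(build_graph(calls))
print(components)
```

In this sample, phones A and B land in the same component because both have a call with Y, while C and Z form a separate component.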

If you don't wish to roll your own solution, you can use a graph database. Neo4j is one product in this category.

arboreal84
  • Yes, I actually have a graph showing the overlap between these two sets of phone calls, but I wanted to see if there was a way to quantify the degree of similarity or overlap between them. – JSO May 24 '17 at 21:08
  • Well that seems a bit vague. You should elaborate more about what you mean by similarity or overlap. I am sure that if you define this, you can come up with a Neo4J query that can sort that out for you. – arboreal84 May 24 '17 at 21:17
  • There is this project on graph algorithms by Neo4j, https://github.com/neo4j-contrib/neo4j-graph-algorithms/, which includes algorithms such as connected components. – Tomaž Bratanič May 24 '17 at 21:38