
I have finished creating my Mongo database. It consists of two collections:

1. team
2. coach

Here is an example of the documents contained in these collections.

Here is a team document:

{
    "_id" : "Mil.74",
    "official_name" : "Associazione Calcio Milan S.p.A",
    "common_name" : "Milan",
    "country" : "Italy",
    "started_by" : {
        "day" : 16,
        "month" : 12,
        "year" : 1899
    },
    "stadium" : {
        "name" : "Giuseppe Meazza",
        "capacity" : 81277
    },
    "palmarès" : {
        "Serie A" : 18,
        "Serie B" : 2,
        "Coppa Italia" : 5,
        "Supercoppa Italiana" : 6,
        "UEFA Champions League" : 7,
        "UEFA Super Cup" : 5,
        "Cup Winners cup" : 2,
        "UEFA Intercontinental cup" : 4
    },
    "uniform" : "black and red"
}

This is a coach document:

{
    "_id" : ObjectId("556cec3b9262ab4f14165fcd"),
    "name" : "Carlo",
    "surname" : "Ancelotti",
    "age" : 55,
    "date_Of_birth" : {
        "day" : 10,
        "month" : 6,
        "year" : 1959
    },
    "place_Of_birth" : "Reggiolo",
    "nationality" : "Italian",
    "preferred_formation" : "4-2-3-1",
    "coached_Team" : [ 
        {
            "team_id" : "RMa.103",
            "in_charge" : {
                "from" : "26/june/2013",
                "to" : "25/may/2015"
            },
            "matches" : 119
        }, 
        {
            "team_id" : "PSG.00",
            "in_charge" : {
                "from" : "30/dec/2011",
                "to" : "24/june/2013"
            },
            "matches" : 77
        }, 
        {
            "team_id" : "Che.11",
            "in_charge" : {
                "from" : "01/july/2009",
                "to" : "22/may/2011"
            },
            "matches" : 109
        }, 
        {
            "team_id" : "Mil.74",
            "in_charge" : {
                "from" : "07/nov/2001",
                "to" : "31/may/2009"
            },
            "matches" : 420
        }
    ]
}

As you can see, I used a normalized model: every coach has an array of coached teams. I want to convert this Mongo database into a graph database, specifically Neo4j. My goal is to show that in a highly connected domain like this one, Neo4j performs better than Mongo. For example, the query "Find the palmarès of all teams coached by Carlo Ancelotti" requires two queries in Mongo, whereas in Neo4j it is enough to follow relationships.
I found this guide on the forum that uses Gremlin to convert a Mongo collection of documents into a Neo4j graph automatically.
The problem is that the guide covers only one collection.
So, is it possible to generate the Neo4j graph automatically from my Mongo database (with two collections), or must I create the graph "by hand"?
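
To make the two-query point concrete, here is a minimal Python sketch that uses plain in-memory lists in place of the two collections (an assumption for illustration; no MongoDB driver is involved). The function name `palmares_of_coach` is hypothetical, only a subset of the fields shown above is kept, and the shell queries in the comments are just the rough Mongo equivalents of each step.

```python
# In-memory stand-ins for the two collections (assumption: only the
# fields needed for this lookup are kept, taken from the documents above).
teams = [
    {"_id": "Mil.74", "common_name": "Milan",
     "palmarès": {"Serie A": 18, "UEFA Champions League": 7}},
]
coaches = [
    {"name": "Carlo", "surname": "Ancelotti",
     "coached_Team": [{"team_id": "RMa.103"}, {"team_id": "PSG.00"},
                      {"team_id": "Che.11"}, {"team_id": "Mil.74"}]},
]


def palmares_of_coach(name, surname):
    """Return {team common_name: palmarès} for every known team of a coach."""
    # Step 1, roughly: db.coach.findOne({name: ..., surname: ...})
    coach = next((c for c in coaches
                  if c["name"] == name and c["surname"] == surname), None)
    if coach is None:
        return {}
    team_ids = [stint["team_id"] for stint in coach["coached_Team"]]
    # Step 2, roughly: db.team.find({_id: {$in: team_ids}})
    return {t["common_name"]: t["palmarès"]
            for t in teams if t["_id"] in team_ids}


result = palmares_of_coach("Carlo", "Ancelotti")
print(result)
```

Only the Milan team document is present in this sample, so the other three team ids simply find no match in the second step.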

harry-potter

1 Answer


Gremlin is a domain-specific language for working with graphs, but it is based on Groovy, so you effectively have all the flexibility you need to do whatever you want. In other words, what you can do with one MongoDB collection you can easily do with two (or however many collections you have). That was the point of the blog post referenced in one of the other answers:

http://thinkaurelius.com/2013/02/04/polyglot-persistence-and-query-with-gremlin/

Gremlin is a great language for transforming data into graph form, whatever its source format is. I would think that you would first load all of your teams as vertices, then iterate through your coaches, creating coach vertices and edges to their related teams as you go.

I would also add that nothing is "automatic" about Gremlin. It's not as though you tell Gremlin that you have data in MongoDB and it turns it into a graph. You have to write Gremlin to tell it how you want your MongoDB data turned into a graph.
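
The load order described above (teams first, then coaches with edges to their teams) can be sketched in a few lines. This is not Gremlin; it is a plain Python illustration of the same pattern, assuming the documents have already been read from MongoDB into lists of dicts. The function name `build_graph`, the `COACHED` edge label, and the tuple-based vertex keys are all hypothetical choices made for the example.

```python
def build_graph(teams, coaches):
    """Return (vertices, edges) built from the two collections."""
    vertices = {}  # vertex key -> properties
    edges = []     # (coach key, label, team key, edge properties)

    # Step 1: every team document becomes a vertex keyed by its _id.
    for team in teams:
        vertices[("team", team["_id"])] = {"common_name": team["common_name"]}

    # Step 2: every coach becomes a vertex, plus one edge per coached team.
    for coach in coaches:
        coach_key = ("coach", str(coach["_id"]))
        vertices[coach_key] = {"name": coach["name"],
                               "surname": coach["surname"]}
        for stint in coach["coached_Team"]:
            team_key = ("team", stint["team_id"])
            if team_key not in vertices:
                continue  # team document was not loaded; skip the edge
            edges.append((coach_key, "COACHED", team_key,
                          {"matches": stint["matches"]}))
    return vertices, edges


# Sample input taken from the documents in the question (fields trimmed).
teams = [{"_id": "Mil.74", "common_name": "Milan"}]
coaches = [{
    "_id": "556cec3b9262ab4f14165fcd",
    "name": "Carlo",
    "surname": "Ancelotti",
    "coached_Team": [
        {"team_id": "RMa.103", "matches": 119},
        {"team_id": "PSG.00", "matches": 77},
        {"team_id": "Che.11", "matches": 109},
        {"team_id": "Mil.74", "matches": 420},
    ],
}]

vertices, edges = build_graph(teams, coaches)
```

Loading the teams first means each `team_id` in a coach document can be resolved to an existing vertex while the coaches are processed, which is why the order matters.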

stephen mallette
  • Ok, thanks for your answer. I thought Gremlin created the graph from Mongo -.- So if I create the graph using Neo4j v2.2.2 or using Gremlin, what are the differences? – harry-potter Jun 04 '15 at 18:49
  • Well, I think you understand at this point that either way you will have to write some code to construct your graph, irrespective of the tool you use to do it. I'm not aware of any specific tools for Neo4j that simplify reading data from MongoDB. So that pretty much leaves you with writing some Java code against the Neo4j Java API or perhaps massaging your Mongo data into something Cypher can import (i.e. CSV). With Gremlin you effectively do the same thing (i.e. write a program), but from the blog post I hope you can see the intuitive development pattern involved. – stephen mallette Jun 04 '15 at 19:42
  • You're right. Ok, I'll try to use Gremlin. I had a problem, and I opened a new topic. Can you take a look? Thanks. – harry-potter Jun 04 '15 at 20:07
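
The CSV route mentioned in the comments can be sketched as follows: flatten each coach's `coached_Team` array into one row per coach/team pair, which is the shape a relationship import typically expects. This is a minimal Python sketch under stated assumptions: the coach documents are already in memory as dicts, the function name `coach_team_csv` and the column names are hypothetical, and the resulting text would still need to be saved to a file and loaded with a Cypher statement (for example one based on `LOAD CSV WITH HEADERS`) that is not shown here.

```python
import csv
import io


def coach_team_csv(coaches):
    """Flatten coach documents into CSV text, one row per coached team."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Header row; these column names are illustrative, not prescribed.
    writer.writerow(["coach_id", "team_id", "from", "to", "matches"])
    for coach in coaches:
        for stint in coach["coached_Team"]:
            writer.writerow([
                coach["_id"],
                stint["team_id"],
                stint["in_charge"]["from"],
                stint["in_charge"]["to"],
                stint["matches"],
            ])
    return buf.getvalue()


# One stint from the coach document in the question, with _id as a string.
coaches = [{
    "_id": "556cec3b9262ab4f14165fcd",
    "coached_Team": [
        {"team_id": "Mil.74",
         "in_charge": {"from": "07/nov/2001", "to": "31/may/2009"},
         "matches": 420},
    ],
}]

csv_text = coach_team_csv(coaches)
print(csv_text)
```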