
I'm not sure if this is a Neo4j question or a Spring Data question. I'm fairly new to Neo4j, so I just want to make sure I'm doing things right. I'm using spring-data-neo4j:4.0.0.RELEASE with a neo4j-community-2.3.1 DB instance.

The situation is that I am getting more nodes than I expect back from DB queries. If I create a graph consisting of 3 different types of nodes:

(NodeA)-[:NodeAIncludesNodeB]->(NodeB)-[:NodeBIncludesNodeC]->(NodeC)

and then I run a query to get a single NodeA node, I receive the entire graph from NodeA to NodeC in the query results.

It seems as though I'm getting cached results instead of live results from the DB. I say this because if I call session.context().clear() after creating the graph, the query no longer returns the entire graph including the NodeC's, but it does still return the single NodeA along with all of its NodeB's.

I found this quote in the Spring Data Neo4J documentation (http://docs.spring.io/spring-data/neo4j/docs/current/reference/html/):

Note, however, that the Session does not ever return cached objects so there’s no risk of getting stale data on load; it always hits the database.

I created a small example application to illustrate:

Entity classes:

@NodeEntity
public class NodeA extends BaseNode {

  private String name;
  @Relationship(type = "NodeAIncludesNodeB", direction = "OUTGOING")
  private Set<NodeB> bNodes;

  public NodeA() {}

  public NodeA(String name) {
    this.name = name;
  }
  // getters, setters, equals and hashCode omitted for brevity
}

@NodeEntity
public class NodeB extends BaseNode {

  private String name;
  @Relationship(type = "NodeBIncludesNodeC", direction = "OUTGOING")
  private Set<NodeC> cNodes;

  public NodeB() {}

  public NodeB(String name) {
    this.name = name;
  }
}

@NodeEntity
public class NodeC extends BaseNode {

  private String name;

  public NodeC() {}

  public NodeC(String name) {
    this.name = name;
  }  
}

Repository:

public interface NodeARepository extends GraphRepository<NodeA> {

  public NodeA findByName(String name);

  @Query("MATCH (n:NodeA) WHERE n.name = {nodeName} RETURN n")
  public NodeA findByNameQuery(@Param("nodeName") String name);

  @Query("MATCH (n:NodeA)-[r:NodeAIncludesNodeB]->() WHERE n.name = {nodeName} RETURN r")
  public NodeA findByNameWithBNodes(@Param("nodeName") String name);

  @Query("MATCH (n:NodeA)-[r1:NodeAIncludesNodeB]->()-[r2:NodeBIncludesNodeC]->() WHERE n.name = {nodeName} RETURN r1,r2")
  public NodeA findByNameWithBAndCNodes(@Param("nodeName") String name);
}

Test Application:

@SpringBootApplication
public class ScratchApp implements CommandLineRunner {

  @Autowired
  NodeARepository nodeARep;

  @Autowired
  Session session;

  @SuppressWarnings("unused")
  public static void main(String[] args) {
    ApplicationContext ctx = SpringApplication.run(ScratchApp.class, args);

  }

  @Override
  public void run(String...strings) {

    ObjectMapper mapper = new ObjectMapper();

    NodeA nodeA = new NodeA("NodeA 1");
    NodeB nodeB1 = new NodeB("NodeB 1");
    NodeC nodeC1 = new NodeC("NodeC 1");
    NodeC nodeC2 = new NodeC("NodeC 2");
    Set<NodeC> b1CNodes = new HashSet<NodeC>();
    b1CNodes.add(nodeC1);
    b1CNodes.add(nodeC2);
    nodeB1.setcNodes(b1CNodes);
    NodeB nodeB2 = new NodeB("NodeB 2");
    NodeC nodeC3 = new NodeC("NodeC 3");
    NodeC nodeC4 = new NodeC("NodeC 4");
    Set<NodeC> b2CNodes = new HashSet<NodeC>();
    b2CNodes.add(nodeC3);
    b2CNodes.add(nodeC4);
    nodeB2.setcNodes(b2CNodes);
    Set<NodeB> aBNodes = new HashSet<NodeB>();
    aBNodes.add(nodeB1);
    aBNodes.add(nodeB2);
    nodeA.setbNodes(aBNodes);
    nodeARep.save(nodeA);
//    ((Neo4jSession)session).context().clear();

    try {
      Iterable<NodeA> allNodeAs = nodeARep.findAll();
      System.out.println(mapper.writeValueAsString(allNodeAs));
//      ((Neo4jSession)session).context().clear();

      Iterable<NodeA> allNodeAs2 = nodeARep.findAll();
      System.out.println(mapper.writeValueAsString(allNodeAs2));

      NodeA oneNodeA = nodeARep.findByName("NodeA 1");
      System.out.println(mapper.writeValueAsString(oneNodeA));

      NodeA oneNodeA2 = nodeARep.findByNameQuery("NodeA 1");
      System.out.println(mapper.writeValueAsString(oneNodeA2));

      NodeA oneNodeA3 = session.load(NodeA.class, oneNodeA.getId());
      System.out.println(mapper.writeValueAsString(oneNodeA3));
//      ((Neo4jSession)session).context().clear();

      NodeA oneNodeA4 = nodeARep.findByNameWithBNodes("NodeA 1");
      System.out.println(mapper.writeValueAsString(oneNodeA4));

      NodeA oneNodeA5 = nodeARep.findByNameWithBAndCNodes("NodeA 1");
      System.out.println(mapper.writeValueAsString(oneNodeA5));

    } catch (JsonProcessingException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
    }

  }
}

Here are the results from the test program:

[{"id":20154,"name":"NodeA 1","bNodes":[{"id":20160,"name":"NodeB 1","cNodes":[{"id":20155,"name":"NodeC 1"},{"id":20156,"name":"NodeC 2"}]},{"id":20157,"name":"NodeB 2","cNodes":[{"id":20158,"name":"NodeC 3"},{"id":20159,"name":"NodeC 4"}]}]}] 
[{"id":20154,"name":"NodeA 1","bNodes":[{"id":20160,"name":"NodeB 1","cNodes":[{"id":20155,"name":"NodeC 1"},{"id":20156,"name":"NodeC 2"}]},{"id":20157,"name":"NodeB 2","cNodes":[{"id":20158,"name":"NodeC 3"},{"id":20159,"name":"NodeC 4"}]}]}] 
{"id":20154,"name":"NodeA 1","bNodes":[{"id":20160,"name":"NodeB 1","cNodes":[{"id":20155,"name":"NodeC 1"},{"id":20156,"name":"NodeC 2"}]},{"id":20157,"name":"NodeB 2","cNodes":[{"id":20158,"name":"NodeC 3"},{"id":20159,"name":"NodeC 4"}]}]} 
{"id":20154,"name":"NodeA 1","bNodes":[{"id":20160,"name":"NodeB 1","cNodes":[{"id":20155,"name":"NodeC 1"},{"id":20156,"name":"NodeC 2"}]},{"id":20157,"name":"NodeB 2","cNodes":[{"id":20158,"name":"NodeC 3"},{"id":20159,"name":"NodeC 4"}]}]} 
{"id":20154,"name":"NodeA 1","bNodes":[{"id":20160,"name":"NodeB 1","cNodes":[{"id":20155,"name":"NodeC 1"},{"id":20156,"name":"NodeC 2"}]},{"id":20157,"name":"NodeB 2","cNodes":[{"id":20158,"name":"NodeC 3"},{"id":20159,"name":"NodeC 4"}]}]} 
{"id":20154,"name":"NodeA 1","bNodes":[{"id":20157,"name":"NodeB 2","cNodes":[{"id":20158,"name":"NodeC 3"},{"id":20159,"name":"NodeC 4"}]},{"id":20160,"name":"NodeB 1","cNodes":[{"id":20155,"name":"NodeC 1"},{"id":20156,"name":"NodeC 2"}]}]} 
{"id":20154,"name":"NodeA 1","bNodes":[{"id":20157,"name":"NodeB 2","cNodes":[{"id":20159,"name":"NodeC 4"},{"id":20158,"name":"NodeC 3"}]},{"id":20160,"name":"NodeB 1","cNodes":[{"id":20156,"name":"NodeC 2"},{"id":20155,"name":"NodeC 1"}]}]}

Note that every query returns the same result, even though I'm only requesting a single node in all but the last two queries.

If I uncomment the session.context().clear() calls here are the results:

[{"id":20161,"name":"NodeA 1","bNodes":[{"id":20164,"name":"NodeB 2","cNodes":null},{"id":20167,"name":"NodeB 1","cNodes":null}]}]
[{"id":20161,"name":"NodeA 1","bNodes":[{"id":20164,"name":"NodeB 2","cNodes":null},{"id":20167,"name":"NodeB 1","cNodes":null}]}] 
{"id":20161,"name":"NodeA 1","bNodes":[{"id":20164,"name":"NodeB 2","cNodes":null},{"id":20167,"name":"NodeB 1","cNodes":null}]} 
{"id":20161,"name":"NodeA 1","bNodes":[{"id":20164,"name":"NodeB 2","cNodes":null},{"id":20167,"name":"NodeB 1","cNodes":null}]}
{"id":20161,"name":"NodeA 1","bNodes":[{"id":20164,"name":"NodeB 2","cNodes":null},{"id":20167,"name":"NodeB 1","cNodes":null}]}
{"id":20161,"name":"NodeA 1","bNodes":[{"id":20164,"name":"NodeB 2","cNodes":null},{"id":20167,"name":"NodeB 1","cNodes":null}]}
{"id":20161,"name":"NodeA 1","bNodes":[{"id":20164,"name":"NodeB 2","cNodes":[{"id":20165,"name":"NodeC 3"},{"id":20166,"name":"NodeC 4"}]},{"id":20167,"name":"NodeB 1","cNodes":[{"id":20163,"name":"NodeC 2"},{"id":20162,"name":"NodeC 1"}]}]}

Note that the entire graph is only returned when I request it explicitly. However, I am still receiving the NodeB's with the NodeA.

I need to populate a response to a REST call and I'd rather not have to strip extraneous objects so they don't appear in the REST response. Will I have to make the call to session.context().clear() after every DB access so that I don't receive "cached" nodes? Is there a better way to make the call to receive a more fine-grained result? Can I turn off the "caching" altogether?

Luanne
Ric P.

1 Answer


This is by design: the query does hit the database to fetch fresh data, but if the entity already had related nodes in the session, those are retained. Note that some of your test methods behave differently:

Iterable<NodeA> allNodeAs = nodeARep.findAll(); //Default depth 1, so it will load related nodes from the graph, one hop away

NodeA oneNodeA = nodeARep.findByName("NodeA 1"); //Derived finder, default depth 1, same behaviour as above

NodeA oneNodeA2 = nodeARep.findByNameQuery("NodeA 1"); //Custom query, it will only return what the query asks it to.

You'll want to do a session.clear() followed by a load with depth 0 if you're using a findAll or find-by-id. A detailed explanation is available at https://jira.spring.io/browse/DATAGRAPH-642
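To make the retention behaviour concrete, here is a minimal toy model in plain Java. It is not the actual OGM implementation: ToyEntity, ToySession and merge are hypothetical names invented for this sketch, and the only assumption carried over from above is that a load refreshes properties but never removes relationships the session already knows about. With the real API the equivalent steps would presumably be session.clear() followed by session.load(NodeA.class, id, 0).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for a mapped entity: an id, a name, and the ids of related entities. */
class ToyEntity {
    final long id;
    String name;
    final Set<Long> relatedIds = new HashSet<>();

    ToyEntity(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Toy session (hypothetical): one instance per id, relationships are only ever added. */
class ToySession {
    private final Map<Long, ToyEntity> context = new HashMap<>();

    /** Simulates a load: the arguments are what the query itself fetched from the database. */
    ToyEntity merge(long id, String freshName, Set<Long> fetchedRelatedIds) {
        ToyEntity e = context.computeIfAbsent(id, k -> new ToyEntity(k, freshName));
        e.name = freshName;                      // properties are refreshed
        e.relatedIds.addAll(fetchedRelatedIds);  // already-known relationships are retained
        return e;
    }

    /** Forgets everything the session has seen so far. */
    void clear() {
        context.clear();
    }
}

class ToySessionDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession();

        // Like save(): the session now knows NodeA 1 and its two NodeBs (ids 2 and 3).
        session.merge(1L, "NodeA 1", new HashSet<>(Arrays.asList(2L, 3L)));

        // A query that fetches only the node itself still yields the retained relationships.
        ToyEntity cached = session.merge(1L, "NodeA 1", Collections.<Long>emptySet());
        System.out.println(cached.relatedIds.size()); // prints 2

        // After clear(), the same query yields only what it actually fetched.
        session.clear();
        ToyEntity fresh = session.merge(1L, "NodeA 1", Collections.<Long>emptySet());
        System.out.println(fresh.relatedIds.size()); // prints 0
    }
}
```

This mirrors the question's output: after the context is cleared, a query that returns only the node still comes back with its NodeB's if an earlier depth-1 load in the same session attached them.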

Luanne
  • Thanks, Luanne. I included all the different test methods to show that the results were the same even though the expected behavior of the test should have been different. As I said, I have to return the results of my DB queries as responses to REST calls, the consumers of the REST responses are not expecting the related nodes to be returned so I have to strip them. Calling session.clear() after every DB access isn't ideal, but I guess we'll have to live with that for now. I'd like to cast my vote now for the removal of this behavior as mentioned in the Jira link you included. :) – Ric P. Feb 05 '16 at 14:24
  • Just as a follow-up, can you tell me if the session (aka mapping context) is global for all requests to the DB? I.e. if another request makes a change to a NodeB and I request NodeA, am I getting a stale NodeB from my session with the NodeA I asked for, or do I receive the updated NodeB from the other thread? – Ric P. Feb 05 '16 at 20:16
  • The life of the session can be managed by you. See http://docs.spring.io/spring-data/neo4j/docs/4.0.0.RELEASE/reference/html/#_session_bean Usually you'd want it to live as long as a unit of work, so request- or session-scoped in a web application. To avoid data integrity problems, you'd want fresh data at the beginning of each unit of work, either by re-fetching data, or by refreshing your session (clearing it/obtaining a new one) – Luanne Feb 06 '16 at 02:21
  • Found the specific way of defining the scope of the Session bean as "request" here: https://neo4j.com/blog/spring-data-neo4j-4-1-applications/ (min 4:55 in the final video) ```@Override @Bean @Scope(value = "request", proxyMode = ScopedProxyMode.TARGET_CLASS) public Session getSession() throws Exception { ...}``` This goes in your class that extends Neo4jConfiguration. – Miguel Reyes Feb 08 '17 at 22:20