
I'm using Neo4j in a GlassFish server through a modified version of Alex Smirnov's Neo4j JCA connector. My version is available here: https://github.com/Riduidel/neo4j-connector. I'm using this connector with Neo4j 1.8. To use it, I first install the connector in my GlassFish application server, then use it from the applications that need to connect to the database.

It works fine with fresh stores. But with stores created by a previous version, I encounter weird bugs.

Typically, today I got the following stack trace:

javax.resource.spi.ResourceAllocationException: Error in allocating a connection. Cause: Failed to transition org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader@3bbd53b1 from NONE to STOPPED
...
...
.../* JCA internal exception stack */
...
...
Caused by: com.sun.appserv.connectors.internal.api.PoolingException: Failed to transition org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader@494b584c from NONE to STOPPED
 at com.sun.enterprise.resource.pool.ConnectionPool.createSingleResource(ConnectionPool.java:924)
 at com.sun.enterprise.resource.pool.ConnectionPool.createResource(ConnectionPool.java:1185)
 at com.sun.enterprise.resource.pool.datastructure.RWLockDataStructure.addResource(RWLockDataStructure.java:98)
 ... 66 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Failed to transition org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader@494b584c from NONE to STOPPED
 at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:388)
 at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:82)
 at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:116)
 at org.neo4j.kernel.InternalAbstractGraphDatabase.run(InternalAbstractGraphDatabase.java:227)
 at org.neo4j.kernel.EmbeddedGraphDatabase.<init>(EmbeddedGraphDatabase.java:79)
 at org.neo4j.kernel.EmbeddedGraphDatabase.<init>(EmbeddedGraphDatabase.java:70)
 at com.netoprise.neo4j.AbstractNeo4jManagedConnectionFactory.createDatabase(AbstractNeo4jManagedConnectionFactory.java:165)
 at com.netoprise.neo4j.AbstractNeo4jManagedConnectionFactory.createDatabase(AbstractNeo4jManagedConnectionFactory.java:127)
 at com.netoprise.neo4j.Neo4jManagedConnectionFactory.createManagedConnection(Neo4jManagedConnectionFactory.java:163)
 at com.sun.enterprise.resource.allocator.ConnectorAllocator.createResource(ConnectorAllocator.java:160)
 at com.sun.enterprise.resource.pool.ConnectionPool.createSingleResource(ConnectionPool.java:907)
 ... 68 more
Caused by: java.lang.AssertionError
 at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:265)
 at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:260)
 at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:260)
 at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:260)
 at org.neo4j.index.impl.lucene.LuceneDataSource.<init>(LuceneDataSource.java:185)
 at org.neo4j.index.lucene.LuceneIndexProvider.load(LuceneIndexProvider.java:72)
 at org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader.loadIndexImplementations(InternalAbstractGraphDatabase.java:1171)
 at org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader.init(InternalAbstractGraphDatabase.java:1143)
 at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:382)
 ... 78 more

A quick inspection reveals that this exception is linked to a "write.lock" file that cannot be deleted. My guess is that it can't be deleted because the migration is not over. How can I make sure the migration is done before using the store, without migrating it outside of GlassFish?

Is there a way to have exclusive store migrations in that context? If so, how? And would that solve my problem?
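For anyone wanting to repeat that inspection, here is a minimal sketch that lists every "write.lock" file under a store directory, using only the JDK. The class name `WriteLockFinder` and the idea of passing the store path as a command-line argument are my own assumptions for illustration. It is not part of Neo4j or of the connector, and it only reports the files without trying to delete anything.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical diagnostic helper (not part of Neo4j or the JCA connector).
final class WriteLockFinder {

    /** Returns every regular file named "write.lock" below storeDir, sorted by path. */
    static List<Path> findWriteLocks(Path storeDir) {
        try (Stream<Path> paths = Files.walk(storeDir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals("write.lock"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            // Wrapped so callers do not have to declare a checked exception.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the store directory is given as the first argument.
        Path storeDir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path lock : findWriteLocks(storeDir)) {
            System.out.println(lock + " writable=" + Files.isWritable(lock));
        }
    }
}
```

Running it against the store directory while the server is stopped, and again right after the failure, should at least show which index directories still hold a lock file.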

EDIT 1 Added exception message.

EDIT 2 All this only happens when the loaded graph was previously used with Neo4j 1.5 and is now opened with a Neo4j 1.8 connector. When the graph is created by the connector, no error happens at all.

EDIT 3 Strangely enough, this only happens as long as no debugger is attached to that code: as soon as I try to debug it, the issue stops appearing. This makes me think there may be a migration cleanup mechanism that removes the write lock once the migration is done, and that this cleanup is not performed when using my Neo4j JCA connector. Is that a valid observation?

Riduidel
  • Cleaning the write locks happens before any checks for upgrading anything. I don't see a connection with upgrade in this case. And the fact that it doesn't appear when debugging is also very strange. So after a successful start and shutdown in debug mode, when starting up again it doesn't work? – Mattias Finné Apr 11 '13 at 07:24

2 Answers

1

I am not too familiar with the JCA connector, but to be sure, I would just write a very small Java migration class that opens the database, lets it migrate, and shuts it down. Then try again with the JCA connector.

Michael Hunger
Peter Neubauer
  • Sorry, I've just added the initial exception message. – Riduidel Apr 08 '13 at 13:18
  • or use the `neo4j-shell -path path/to/db -config config-with-allow-auto-upgrade-neo4j.properties` – Michael Hunger Apr 09 '13 at 06:45
  • @MichaelHunger By doing so, I won't use the embedded Neo4j to perform my migration, but rather an external one. Right? Unfortunately, what I want is automatic migration from my Neo4j connector. I'm not saying it's a bad solution, only that it is well suited for a test. In fact, its biggest drawback is that it would require me to deploy independent Neo4j instances on each machine where my application is deployed... strange, considering that my connector contains a working Neo4j core. – Riduidel Apr 09 '13 at 08:14
  • @MichaelHunger It seems to have worked without any trouble, which leaves me more than puzzled. How is the Lucene migration performed by the Neo4j shell? And how is it different from what my connector can do? – Riduidel Apr 09 '13 at 08:20
0

After further investigation, the cause turned out not to be multiple calls to the EmbeddedGraphDatabase constructor, but multiple identical IndexProvider instances being loaded.

I use Neo4j embedded in an open-source JCA connector. In this connector, the org.neo4j.kernel.Service class is replaced by a custom one that contains a workaround for service loading with JBoss non-shared libraries. Unfortunately, in our context, this workaround causes the index provider to be loaded twice:

  1. once using the EAR classloader
  2. once using the Glassfish library classloader.

Why? Because our Neo4j instance is used both for application data AND for authentication, so the Neo4j connector jar is put in ${domain}/lib. As a consequence of classloader delegation in the application server, the EAR classloader delegates to the GlassFish library classloader and finds the LuceneIndexProvider that way. Then the GlassFish library classloader is used directly to load the same LuceneIndexProvider class.

We end up with two LuceneIndexProvider objects, both trying to migrate the Lucene index. This leads to the AssertionError: the write.lock file created by the first object would have to be deleted by the second one, which cannot do that.

I then slightly changed that very specific class so that it uses the JBoss workaround only when the default loading mechanism does not return any class (see commit here). This small change worked like a charm, so I think this issue can be considered fixed.
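To illustrate the idea of that change (this is not the actual commit), here is a minimal sketch of "use the workaround only when the default lookup finds nothing". The class `FallbackLoading` and the method `loadWithFallback` are hypothetical names. The two suppliers stand in for the default loading mechanism and the JBoss workaround, whatever their real implementation in the connector is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper illustrating the fix; names are not taken from the connector.
final class FallbackLoading {

    /**
     * Returns the providers found by the primary lookup. The fallback lookup is
     * consulted only when the primary one yields nothing, so the same provider
     * is never collected once per classloader.
     */
    static <T> List<T> loadWithFallback(Supplier<Iterable<T>> primary,
                                        Supplier<Iterable<T>> fallback) {
        List<T> found = new ArrayList<>();
        for (T provider : primary.get()) {
            found.add(provider);
        }
        if (found.isEmpty()) {
            // Only reached when the default mechanism returned no provider at all.
            for (T provider : fallback.get()) {
                found.add(provider);
            }
        }
        return found;
    }
}
```

With the JDK's own service mechanism, the primary supplier could be something like `() -> ServiceLoader.load(type)` and the fallback `() -> ServiceLoader.load(type, otherClassLoader)`. Whether that matches what the custom Service class does internally is an assumption on my part.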

Riduidel