
I have a MultiLayerNetwork that is loaded in an application deployed on a Payara server. The model loads fine, but requesting an output for an INDArray throws a RuntimeException.

Initialization seems to work fine:

org.nd4j.linalg.factory.Nd4jBackend - Loaded [CpuBackend] backend
org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 4
org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - Binary level Generic x86 optimization level AVX512
org.nd4j.nativeblas.Nd4jBlas - Number of threads used for OpenMP BLAS: 8
org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CPU]; OS: [Linux]
org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [16]; Memory: [7,1GB]
org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [OPENBLAS]

org.nd4j.linalg.cpu.nativecpu.CpuBackend - Backend build information:
 GCC: "7.5.0"
STD version: 201103L
DEFAULT_ENGINE: samediff::ENGINE_CPU
HAVE_FLATBUFFERS
HAVE_OPENBLAS

org.deeplearning4j.nn.multilayer.MultiLayerNetwork - Starting MultiLayerNetwork with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]

Requesting a prediction results in:

ERROR org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner - Failed to execute op matmul. Attempted to execute with 2 inputs, 1 outputs, 2 targs,0 bargs and 3 iargs. Inputs: [(FLOAT,[1,9],c), (FLOAT,[9,400],f)]. Outputs: [(FLOAT,[1,400],f)]. tArgs: [1.0, 0.0]. iArgs: [0, 0, 0]. bArgs: -. Op own name: "81f000d9-6011-4d1f-adc4-af57fb7d11e6" - Please see above message (printed out from c++) for a possible cause of error.

with the following stack trace:

java.lang.RuntimeException: Op [matmul] execution failed
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1561) ~[nd4j-native-1.0.0-M2.jar:?]
        at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6522) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.nd4j.linalg.api.blas.impl.BaseLevel3.gemm(BaseLevel3.java:62) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.nd4j.linalg.api.ndarray.BaseNDArray.mmuli(BaseNDArray.java:3194) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.BaseLayer.preOutputWithPreNorm(BaseLayer.java:322) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.BaseLayer.preOutput(BaseLayer.java:295) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.BaseLayer.activate(BaseLayer.java:343) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.layers.AbstractLayer.activate(AbstractLayer.java:262) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.outputOfLayerDetached(MultiLayerNetwork.java:1341) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2453) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2416) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2407) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2394) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2490) ~[deeplearning4j-nn-1.0.0-M2.jar:?]
        at com.[****].lambda$nnRegression$1(myCustomJavaClass.java:264)

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.getCustomOperations(NativeOpExecutioner.java:1365) ~[nd4j-native-1.0.0-M2.jar:?]
        at org.nd4j.linalg.api.ops.DynamicCustomOp.opHash(DynamicCustomOp.java:392) ~[nd4j-api-1.0.0-M2.jar:?]
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1900) ~[nd4j-native-1.0.0-M2.jar:?]
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1540) ~[nd4j-native-1.0.0-M2.jar:?]
        ... 139 more

I have built a minimal example (MVP) that runs the DL4J functions in a standalone application; there, both loading the model and inference work fine. I would expect the same behavior on the application server.

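import java.util.ArrayList;
import java.util.List;

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// nn is the saved model (defined elsewhere); false = do not load the updater state
MultiLayerNetwork net = MultiLayerNetwork.load(nn, false);
List<Integer> obsList = new ArrayList<>();

// nine input features for a single observation
obsList.add(1);obsList.add(18);
obsList.add(1);obsList.add(24);
obsList.add(1);obsList.add(15);
obsList.add(1);obsList.add(13);
obsList.add(2);

// wrap the observation as a 1x9 matrix
int[] obsArray = obsList.stream().mapToInt(Integer::intValue).toArray();
int[][] flat = new int[][] { obsArray };

INDArray test = Nd4j.create(flat);
INDArray y = net.output(test); // <---- THIS IS WHERE THE ERROR OCCURS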

The application is packaged as an EAR file; the DL4J dependencies are declared with scope provided in the POM.

        <dependency>
            <groupId>org.deeplearning4j</groupId>
            <artifactId>deeplearning4j-core</artifactId>
            <version>1.0.0-M2</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.nd4j</groupId>
            <artifactId>nd4j-native-platform</artifactId>
            <version>1.0.0-M2</version>
            <scope>provided</scope>
        </dependency>

The server runs CentOS 8 with Payara Server 5.193.1 #badassfish (build 275).

On the application server, the same code produces the stack trace shown above. What might cause this error?
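
Because the DL4J dependencies are declared as provided, the classes used on the server come from whatever the server's classloaders expose, which may differ from the standalone run. A minimal sketch for checking this, assuming it is called from inside the deployed application (for example from the EJB): it reports which JAR and classloader each class is resolved from, without initializing the class. The names `ClassOrigin` and `describe` are hypothetical helpers, not part of DL4J, and this is only a diagnostic aid, not a confirmed cause of the error.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

// Hypothetical diagnostic helper (not part of DL4J or ND4J).
final class ClassOrigin {

    // Returns a one-line description of where the named class is resolved from.
    static String describe(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassOrigin.class.getClassLoader();
        }
        try {
            // initialize = false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false, loader);
            ProtectionDomain pd = c.getProtectionDomain();
            CodeSource src = (pd == null) ? null : pd.getCodeSource();
            String location = (src == null || src.getLocation() == null)
                    ? "<unknown location>"
                    : src.getLocation().toString();
            return className + " loaded from " + location + " by " + c.getClassLoader();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace above.
        String[] names = {
            "org.nd4j.linalg.factory.Nd4j",
            "org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner",
            "org.deeplearning4j.nn.multilayer.MultiLayerNetwork"
        };
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

Comparing this output between the standalone run and the server would show whether both resolve the same 1.0.0-M2 JARs, or whether some classes are missing or come from a different location on the server.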

  • Could you elaborate on your situation a bit? It's hard to tell much from the stack trace. Are you bundling it in an uber jar? A full payara server? When executing it appears something ends up being missing. That array out of bounds error also looks very weird. I've never seen that before. This is likely the side effect of something special in your server. I'll need to reproduce it though. If you want please file an issue on https://github.com/deeplearning4j/deeplearning4j/issues and I can take a closer look. – Adam Gibson Feb 03 '23 at 10:10
  • Dear Adam, I am packaging the application as an EAR file, with the DL4J libraries in provided scope in the POM. The functions are called from a stateless EJB service that implements a local and a remote interface. The OS of the machine is "CentOS Linux 8". The Payara Server is version 5.193.1 #badassfish (build 275). I'll open a ticket as you suggested and try to provide an example to reproduce. I have adjusted the original question. – ThomasVoss Feb 03 '23 at 11:04
  • Yes please do. This is a bit of an odd use case. At a minimum dl4j expects certain resources to be present locally. – Adam Gibson Feb 03 '23 at 11:13
