I want to consume a stream of JSON data using Netty or Ratpack. My use case: the request body contains large JSON data (an array of JSON objects, several MBs in size). One option is to block until the complete body is received and then start processing. Instead, I want asynchronous processing: as soon as one chunk (one JSON object) is received, process it.

I came across JsonObjectDecoder in Netty, but I have had no luck using it. Here is my ChannelInitializer class:

public class ServerInitializer extends ChannelInitializer<SocketChannel> {

    @Override
    public void initChannel(SocketChannel ch) {
        ChannelPipeline p = ch.pipeline();

        p.addLast(new JsonObjectDecoder(true));

        // HttpServerCodec is a combination of HttpRequestDecoder and HttpResponseEncoder
        p.addLast(new HttpServerCodec());
        //
        // add gzip compressor for http response content
        p.addLast(new HttpContentCompressor());

        p.addLast(new HttpObjectAggregator(1048576));

        p.addLast(new ChunkedWriteHandler());

        p.addLast(new ServerHandler());
    }
}

I am sending this data:

[
    {
        "timestamp": "2016-11-14 11:08:09+0100", 
        "message": "message 120", 
        "hostname": "myhost.com", 
        "device_product": "product123", 
        "device_vendor": "vendor123", 
        "device_version": "1", 
        "severity": "High"
    },
    .....
    {
        "timestamp": "2016-11-14 11:08:09+0100", 
        "message": "message 121", 
        "hostname": "myhost.com", 
        "device_product": "product123", 
        "device_vendor": "vendor123", 
        "device_version": "1", 
        "severity": "High"
    }
]

But I am getting this error:

io.netty.handler.codec.CorruptedFrameException: invalid JSON received at byte position 0: 504f5354202f6c6f677320485454502f312e310d0a486f73743a206c6f63616c686f73743a383038300d0a436f6e6e656374696f6e3a206b6565702d616c6976650d0a436f6e74656e742d4c656e6774683a203230380d0a4163636570743a206170706c69636174696f6e2f6a736f6e0d0a506f73746d616e2d546f6b656e3a2062383064306264352d663234302d346563622d353631322d3863376139396434633934360d0a43616368652d436f6e74726f6c3a206e6f2d63616368650d0a4f726967696e3a206368726f6d652d657874656e73696f6e3a2f2f6668626a676269666c696e6a62646767656863646463626e636464646f6d6f700d0a557365722d4167656e743a204d6f7a696c6c612f352e30202857696e646f7773204e5420362e313b2057696e36343b2078363429204170706c655765624b69742f3533372e333620284b48544d4c2c206c696b65204765636b6f29204368726f6d652f35352e302e323838332e3837205361666172692f3533372e33360d0a436f6e74656e742d547970653a206170706c69636174696f6e2f6a736f6e0d0a4163636570742d456e636f64696e673a20677a69702c206465666c6174652c2062720d0a4163636570742d4c616e67756167653a20656e2d55532c656e3b713d302e382c6a613b713d302e362c66722d46523b713d302e342c66723b713d302e322c66722d43413b713d302e320d0a0d0a7b2274696d657374616d70223a2022323031362d31312d31342031313a30383a30392b30313030222c226d657373616765223a20226d65737361676520313230222c22686f73746e616d65223a20226d79686f73742e636f6d222c200a09226465766963655f70726f64756374223a202270726f64756374313233222c200a09226465766963655f76656e646f72223a202276656e646f72313233222c200a09226465766963655f76657273696f6e223a202231222c200a09227365766572697479223a202248696768220a090a097d
    at io.netty.handler.codec.json.JsonObjectDecoder.decode(JsonObjectDecoder.java:163)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:316)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
    at io.netty.channel.ChannelHandlerInvokerUtil.invokeChannelReadNow(ChannelHandlerInvokerUtil.java:84)
    at io.netty.channel.DefaultChannelHandlerInvoker.invokeChannelRead(DefaultChannelHandlerInvoker.java:153)
    at io.netty.channel.PausableChannelEventExecutor.invokeChannelRead(PausableChannelEventExecutor.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:389)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:956)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:127)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:514)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:471)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:385)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:351)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at io.netty.util.internal.chmv8.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1412)
    at io.netty.util.internal.chmv8.ForkJoinTask.doExec(ForkJoinTask.java:280)
    at io.netty.util.internal.chmv8.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:877)
    at io.netty.util.internal.chmv8.ForkJoinPool.scan(ForkJoinPool.java:1706)
    at io.netty.util.internal.chmv8.ForkJoinPool.runWorker(ForkJoinPool.java:1661)
    at io.netty.util.internal.chmv8.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:126)

I don't know what am I missing. If anyone knows a way to achieve this using Ratpack too please help me. Thanks in advance.

Md Zahid Raza

2 Answers

The problem is that the JSON decoder is the first handler in your pipeline, so it is attempting to decode the raw HTTP POST. If I take the invalid data stream from the error message you posted, parse the hex back into bytes, and create a string from it (in Groovy)...

import javax.xml.bind.DatatypeConverter;
v = "504f5354202f6c6f677320485...<snip>";
byte[] bytes = DatatypeConverter.parseHexBinary(v);
println new String(bytes)

The result is:

POST /logs HTTP/1.1 Host: localhost:8080 Connection: keep-alive Content-Length: 208 Accept: application/json Postman-Token: <TOKEN REMOVED> Cache-Control: no-cache Origin: chrome-extension://<ID REMOVED> User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36 Content-Type: application/json Accept-Encoding: gzip, deflate, br Accept-Language: en-US,en;q=0.8,ja;q=0.6,fr-FR;q=0.4,fr;q=0.2,fr-CA;q=0.2

{"timestamp": "2016-11-14 11:08:09+0100","message": "message 120","hostname": "myhost.com", "device_product": "product123", "device_vendor": "vendor123", "device_version": "1", "severity": "High" }

So you need to add these to the pipeline before the JSON decoder:

  1. HttpServerCodec
  2. HttpObjectAggregator (for large posts, the data may arrive chunked)
  3. A MessageToMessageDecoder that accepts the (Full)HttpRequest and forwards its content (as a ByteBuf).

Then the JSON decoder will receive a buffer of pure JSON bytes and start sending the parsed-out objects along the pipeline.
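A minimal sketch of step 3 above, assuming Netty 4.x (the class name HttpContentExtractor is made up for illustration):

```java
import java.util.List;

import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.MessageToMessageDecoder;
import io.netty.handler.codec.http.FullHttpRequest;

// Hypothetical handler: unwraps the aggregated request and forwards only its
// body bytes, so JsonObjectDecoder sees pure JSON instead of HTTP framing.
public class HttpContentExtractor extends MessageToMessageDecoder<FullHttpRequest> {
    @Override
    protected void decode(ChannelHandlerContext ctx, FullHttpRequest msg, List<Object> out) {
        // retain() is needed because MessageToMessageDecoder releases the
        // inbound message after decode() returns.
        out.add(msg.content().retain());
    }
}
```

With this in place, the initializer would register HttpServerCodec, then HttpObjectAggregator, then HttpContentExtractor, and only then JsonObjectDecoder followed by the business handler.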

Nicholas
  • Thanks for the response, but it will not meet my asynchronous requirement. I want to process data as a stream, but if I add HttpServerCodec, HttpObjectAggregator etc. before the JSON decoder, they will block to construct the FullHttpRequest. So it will not be actual stream processing. – Md Zahid Raza Feb 01 '17 at 18:10
  • Then don't post over HTTP. Just handle the JSON directly as a stream of bytes. HTTP doesn't stream that well. – Nicholas Feb 01 '17 at 20:34
  • And how can I do that? I am building a REST API that involves posting very large data... – Md Zahid Raza Feb 01 '17 at 20:55
To do this over an HTTP POST, you need to make sure the request is chunked. This is an approximation of the pipeline you would need:

  1. HttpServerCodec - forwards instances of HttpContent, except for the first message, which will be an HttpRequest.
  2. A MessageToMessageDecoder that accepts HttpContent instances, extracts the content ByteBuf, and forwards it.
  3. The JSON decoder.
  4. Your JSON handler.

At some point, you will get an HttpContent that is also an instance of LastHttpContent, which marks the last chunk.

The tricky part is that one of the HttpContents may end mid-object, leaving the JSON decoder with an incomplete sequence. Since JsonObjectDecoder extends ByteToMessageDecoder, which accumulates unconsumed bytes across reads, a partial object should simply wait for the next chunk rather than trigger an error, but it is worth verifying this boundary case in your version before relying on it.
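A minimal sketch of step 2 above, assuming Netty 4.x (the class name ChunkContentExtractor is invented for illustration):

```java
import java.util.List;

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.MessageToMessageDecoder;
import io.netty.handler.codec.http.HttpContent;
import io.netty.handler.codec.http.HttpObject;

// Hypothetical handler: drops the leading HttpRequest (headers only) and
// forwards the body bytes of every HttpContent chunk as it arrives, so the
// downstream JsonObjectDecoder sees a continuous JSON byte stream without
// waiting for the full request body.
public class ChunkContentExtractor extends MessageToMessageDecoder<HttpObject> {
    @Override
    protected void decode(ChannelHandlerContext ctx, HttpObject msg, List<Object> out) {
        if (msg instanceof HttpContent) {
            ByteBuf body = ((HttpContent) msg).content();
            if (body.isReadable()) {
                // retain: MessageToMessageDecoder releases msg after decode()
                out.add(body.retain());
            }
        }
        // The initial HttpRequest and an empty LastHttpContent are dropped here.
    }
}
```

The pipeline order would then be HttpServerCodec, ChunkContentExtractor, JsonObjectDecoder(true), and your JSON handler; note there is no HttpObjectAggregator, which is exactly what keeps the processing streaming.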

Nicholas