I'm trying to create a custom BodyPublisher that would serialize my JSON object. I could just serialize the JSON when I'm creating the request and use the ofByteArray method of BodyPublishers, but I would rather use a custom publisher.

public class CustomPublisher implements HttpRequest.BodyPublisher {
    private byte[] bytes;
    
    public CustomPublisher(ObjectNode jsonData) {
        ...
        // Serialize jsonData to bytes
        ...
    }
    
    @Override
    public long contentLength() {
        if(bytes == null) return 0;
        return bytes.length;
    }
    
    @Override
    public void subscribe(Flow.Subscriber<? super ByteBuffer> subscriber) {
        CustomSubscription subscription = new CustomSubscription(subscriber, bytes);
        subscriber.onSubscribe(subscription);       
    }

    private class CustomSubscription implements Flow.Subscription {
         private final Flow.Subscriber<? super ByteBuffer> subscriber;
         private boolean cancelled;
         private Iterator<Byte> byterator;

         private CustomSubscription(Flow.Subscriber<? super ByteBuffer> subscriber, byte[] bytes) {
             this.subscriber = subscriber;
             this.cancelled = false;
             List<Byte> bytelist = new ArrayList<>();
             for(byte b : bytes) {
                 bytelist.add(b);
             }
             this.byterator = bytelist.iterator();
         }

         @Override
         public void request(long n) {
             if(cancelled) return;
             if(n <= 0) {
                 subscriber.onError(new IllegalArgumentException());
             } else if(byterator.hasNext()) {
                 subscriber.onNext(ByteBuffer.wrap(new byte[]{byterator.next()}));
             } else {
                 subscriber.onComplete();
             }
         }

         @Override
         public void cancel() {
             this.cancelled = true;
         }
    }
}

This implementation works, but only if the subscription's request method is called with 1 as its argument. That is what happens when I use it with HttpRequest.

I'm pretty sure this is not the preferred or optimal way of creating a custom subscription, but I have yet to find a better way to make it work.

I would greatly appreciate it if anyone could point me to a better path.

Challe
  • In what format do you intend to publish your ObjectNode? As JSON? – VGR Aug 08 '20 at 20:29
  • @VGR The constructor uses Jackson to serialize it to a byte array, and if I have understood correctly, the HttpRequest wants a `ByteBuffer` back. – Challe Aug 09 '20 at 16:46
  • And what would be an example of the contents of that ByteBuffer? – VGR Aug 09 '20 at 18:06
  • @VGR The byte array that gets initialized in the constructor. Though I don't know if the whole byte array should be pushed into the ByteBuffer as a single byte array or if each element should be pushed one by one. Or even if that matters at all. – Challe Aug 09 '20 at 19:46
  • Don’t deliver one byte at a time, that is very wasteful and slow. Use a threshold, like `1 << 20`. And don’t ignore the argument to `request`; if it asks for three objects, you have to call `onNext` three times (or `onComplete` if you finish sending all the data before that). – VGR Aug 10 '20 at 19:00

1 Answer

You are right to avoid making a byte array out of it, as that would create memory issues for large objects.

I wouldn’t try to write a custom publisher. Rather, just take advantage of the factory method HttpRequest.BodyPublishers.ofInputStream.

HttpRequest.BodyPublisher publisher =
    HttpRequest.BodyPublishers.ofInputStream(() -> {
        PipedInputStream in = new PipedInputStream();

        // Serialize on another thread: the HTTP client reads from "in"
        // while this task writes the JSON to the connected "out".
        ForkJoinPool.commonPool().submit(() -> {
            try (PipedOutputStream out = new PipedOutputStream(in)) {
                objectMapper.writeTree(
                    objectMapper.getFactory().createGenerator(out),
                    jsonData);
            }
            return null;
        });

        return in;
    });

As you have noted, you can use HttpRequest.BodyPublishers.ofByteArray. That is fine for relatively small objects, but I program for scalability out of habit. The problem with assuming code won’t need to scale is that other developers will assume it is safe to pass large objects, without realizing the impact on performance.

Writing your own body publisher will be a lot of work. Its subscribe method is inherited from Flow.Publisher.

The documentation for the subscribe method starts with this:

Adds the given Subscriber if possible.

Each time your subscribe method is called, you need to add the Subscriber to some sort of collection, you need to create an implementation of Flow.Subscription, and you need to immediately pass it to the subscriber’s onSubscribe method. Your Subscription implementation object needs to send back one or more ByteBuffers, only when the Subscription’s request method is called, by invoking the corresponding Subscriber’s (not just any Subscriber’s) onNext method, and once you’ve sent all of the data, you must call the same Subscriber’s onComplete() method. On top of that, the Subscription implementation object needs to handle cancel requests.
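
For illustration, here is a minimal sketch of those steps, assuming the body is already available as a byte array. ChunkedBytesPublisher, ChunkedSubscription and the 16 KiB CHUNK_SIZE are hypothetical names and values, not part of any library. Instead of keeping a collection of subscribers, each subscription in this sketch simply holds a reference to its own subscriber. It delivers one chunk per requested item, honors the argument to request, and stops after cancel:

```java
import java.net.http.HttpRequest;
import java.nio.ByteBuffer;
import java.util.concurrent.Flow;

/** Hypothetical sketch: publishes a byte array in chunks, honoring request(n). */
final class ChunkedBytesPublisher implements HttpRequest.BodyPublisher {
    private static final int CHUNK_SIZE = 16 * 1024; // assumed chunk size
    private final byte[] bytes;

    ChunkedBytesPublisher(byte[] bytes) {
        this.bytes = bytes.clone();
    }

    @Override
    public long contentLength() {
        return bytes.length;
    }

    @Override
    public void subscribe(Flow.Subscriber<? super ByteBuffer> subscriber) {
        // Every subscriber gets its own subscription and its own read position.
        subscriber.onSubscribe(new ChunkedSubscription(subscriber));
    }

    private final class ChunkedSubscription implements Flow.Subscription {
        private final Flow.Subscriber<? super ByteBuffer> subscriber;
        private int position;          // index of the next byte to send
        private long demand;           // items requested but not yet delivered
        private boolean emitting;      // true while the delivery loop is running
        private volatile boolean done; // set by cancel, onError or onComplete

        ChunkedSubscription(Flow.Subscriber<? super ByteBuffer> subscriber) {
            this.subscriber = subscriber;
        }

        @Override
        public synchronized void request(long n) {
            if (done) return;
            if (n <= 0) {
                done = true;
                subscriber.onError(new IllegalArgumentException("non-positive request: " + n));
                return;
            }
            // Accumulate demand, saturating at Long.MAX_VALUE on overflow.
            demand = (demand + n < 0) ? Long.MAX_VALUE : demand + n;
            if (emitting) return; // re-entrant call from onNext: the outer loop delivers
            emitting = true;
            try {
                while (demand > 0 && !done && position < bytes.length) {
                    int length = Math.min(CHUNK_SIZE, bytes.length - position);
                    ByteBuffer chunk = ByteBuffer.wrap(bytes, position, length);
                    position += length;
                    demand--;
                    subscriber.onNext(chunk);
                }
                if (!done && position >= bytes.length) {
                    done = true;
                    subscriber.onComplete();
                }
            } finally {
                emitting = false;
            }
        }

        @Override
        public void cancel() {
            done = true;
        }
    }
}
```

Even this simplified version needs demand accounting, a re-entrancy guard and cancellation handling, which is the work the built-in publishers already do.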

You can make a lot of this easier by extending SubmissionPublisher, which is a default implementation of Flow.Publisher, and then adding a contentLength() method to it. But as the SubmissionPublisher documentation shows, you still have a fair amount of work to do, for even a minimal working implementation.

The HttpRequest.BodyPublishers.of… methods will do all of this for you. ofByteArray is okay for small objects, but ofInputStream will work for any object you could ever pass in.
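
If a dedicated publisher type is still wanted, one option is to keep the custom class but hand all of the Flow plumbing to a built-in publisher. The following is only a sketch: JsonBodyPublisher is a hypothetical name, and it takes the JSON as a String so that it needs nothing outside the JDK. With Jackson, the constructor could instead accept the ObjectNode and serialize it, for example with objectMapper.writeValueAsBytes(jsonData).

```java
import java.net.http.HttpRequest;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Flow;

/** Hypothetical wrapper: a custom type that delegates to a built-in publisher. */
final class JsonBodyPublisher implements HttpRequest.BodyPublisher {
    private final HttpRequest.BodyPublisher delegate;

    // Assumption: the JSON text is passed in as a String for this sketch.
    JsonBodyPublisher(String json) {
        this.delegate = HttpRequest.BodyPublishers.ofByteArray(
                json.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public long contentLength() {
        return delegate.contentLength();
    }

    @Override
    public void subscribe(Flow.Subscriber<? super ByteBuffer> subscriber) {
        // Demand handling, chunking and cancellation are left to the delegate.
        delegate.subscribe(subscriber);
    }
}
```

An instance can be passed wherever a BodyPublisher is expected, for example to HttpRequest.Builder.POST.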

VGR
  • Could you elaborate more on the "I wouldn't try to write a custom publisher", since this could be accomplished much more easily just by using the `BodyPublishers.ofByteArray()` method? And to clarify, the object size is fixed and not even that big. – Challe Aug 10 '20 at 05:33
  • Updated answer. Writing a BodyPublisher means implementing Flow.Publisher, which is a lot of work. You can use ofByteArray, but sooner or later another developer will assume your code is safe to use with large objects without realizing its effect on performance. – VGR Aug 10 '20 at 16:08
  • I provided a working example of the implementation with some caveats. I decided to use the `BodyPublishers.ofByteArray()` since it allows me to have, in my opinion, a much cleaner and more readable solution to my problem, and since you haven't provided any conclusive evidence that the size of the byte array would ever be a limiting factor in the use case with HttpRequest. – Challe Aug 10 '20 at 17:30
  • Conclusive evidence? What do you think a byte array with a length of 1000000000 would do to your program’s performance? If you choose to assume that only small objects will ever be serialized, that is your choice, but this is just standard scalability practice. It’s the same reason we copy a file a few bytes at a time instead of reading the entire contents into a single byte array. – VGR Aug 10 '20 at 18:55
  • The application will need to have all the bytes loaded in memory at one point or another anyway, because it's an HttpRequest. – Challe Aug 10 '20 at 18:58
  • That is not true. HTTP is a socket based protocol. You certainly can, and should, deliver the data in small portions. That is why URLConnection has getInputStream and getOutputStream methods. – VGR Aug 10 '20 at 21:15
  • 2
    Submission of the function to stream the data into the PipedInputStream to the common pool should be highlighted as risky. Typically tasks submitted to the common pool should not block, as the pool is small (CPU count -1) and if it becomes fully utilized other tasks in the system may become unstable. For instance: HttpClient uses the common pool for async IO. – root Sep 02 '21 at 21:10
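
One way to address the concern in the last comment is to run the blocking writer on an executor supplied by the caller rather than on the common pool. The sketch below makes several assumptions: PipedBodyPublishers, BodyWriter, openPipe and ofWriter are hypothetical names, and the executor is expected to have a thread free for each request body. The pipe is connected before the input stream is returned, because reading from an unconnected PipedInputStream throws an IOException.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.net.http.HttpRequest;
import java.util.concurrent.Executor;

/** Hypothetical helper: streams a body through a pipe using a caller-supplied executor. */
final class PipedBodyPublishers {
    /** Hypothetical callback that writes the request body to a stream. */
    interface BodyWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    static InputStream openPipe(BodyWriter writer, Executor executor) {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out;
        try {
            // Connect both ends before any reader can touch "in".
            out = new PipedOutputStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        executor.execute(() -> {
            try (out) {
                writer.writeTo(out);
            } catch (IOException e) {
                // Closing "out" makes the reader see end-of-stream, so a failure
                // here yields a truncated body; real code would log or propagate it.
            }
        });
        return in;
    }

    static HttpRequest.BodyPublisher ofWriter(BodyWriter writer, Executor executor) {
        // The supplier runs once per subscription, so each send gets a fresh pipe.
        return HttpRequest.BodyPublishers.ofInputStream(() -> openPipe(writer, executor));
    }
}
```

With Jackson, the writer could be something like out -> objectMapper.writeValue(out, jsonData), and the executor a dedicated pool such as Executors.newCachedThreadPool().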