
We have a use case where we have to transfer large data files from environment A to environment B over HTTP. What we want to achieve is that the sender sends the data in chunks and the receiver starts writing it to a file in chunks. Thus we decided to use MTOM.

Web Service Code:

@MTOM(enabled = true, threshold=1024)
@WebService(portName = "fileUploadPort", endpointInterface = "com.cloud.receiver.FileUploadService", serviceName = "FileUploadService")
@BindingType(value = SOAPBinding.SOAP12HTTP_MTOM_BINDING)
public class FileUploadServiceImpl implements FileUploadService {

    @Override
    public void uploadFile(FileUploader Dfile) {
        DataHandler handler = Dfile.getFile();
        // absolutePath: destination file path on the receiver (defined elsewhere)
        try (InputStream is = handler.getInputStream();
             OutputStream os = new FileOutputStream(new File(absolutePath))) {
            byte[] b = new byte[8192]; // modest copy buffer; we stream, not buffer the whole file
            int bytesRead;
            while ((bytesRead = is.read(b)) != -1) {
                os.write(b, 0, bytesRead);
            }
            os.flush();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Client Code:

public static void main(String args[]) throws Exception {

    URL url = new URL("http://localhost:8080/CloudReceiver/FileUploadService?wsdl");

    QName qname = new QName("http://receiver.cloud.com/", "FileUploadService");

    Service service = Service.create(url, qname);
    FileUploadService port = service.getPort(FileUploadService.class);
    // enable MTOM in client
    BindingProvider bp = (BindingProvider) port;
    SOAPBinding binding = (SOAPBinding) bp.getBinding();
    binding.setMTOMEnabled(true);

    FileUploader f = new FileUploader();
    DataSource source = new FileDataSource(new File("G:\\Data\\Sender\\temp.csv"));
    DataHandler dh = new DataHandler(source);
    Map<String, Object> ctxt = bp.getRequestContext();
    // Marking chunk size at client (bytes per HTTP chunk).
    ctxt.put(JAXWSProperties.HTTP_CLIENT_STREAMING_CHUNK_SIZE, 100);

    f.setFile(dh);
    port.uploadFile(f);
}

Everything works fine when we transfer less than 100 MB of data. For data files beyond 100 MB, the app server (JBoss 8.2) throws the exception below at the receiver's end.

java.io.IOException: UT000020: Connection terminated as request was larger than 104857600

I understand that this error is caused by this property in standalone.xml:

<http-listener name="default" socket-binding="http" max-post-size="104857600"/>

This means that the data is not written to the file in chunks; instead it is kept in memory and then written to the file after transmission completes.

How do we achieve writing the data to the file in chunks? We don't want memory usage to grow with the POST size. The file size can go up to the scale of 1 TB.

Environment: WildFly 8.2 (JBoss), Java 8

Sudeep
  • Hi, I solved this problem by slicing the file into byte arrays and sending the arrays in a loop; I think your client sends all chunks to the server in a single request. – George Vassilev Dec 29 '15 at 08:38
  • If we have to slice the file, then what is the need of MTOM? I feel MTOM should handle slicing... Slicing is well taken care of in the client code... – Sudeep Dec 29 '15 at 09:00
  • Can you add the code of `FileUploadService` ? – Anish B. Jun 19 '20 at 10:30
  • Hi, if you are satisfied with one of the answers, would you mind "accepting" it? This will help the community and especially reward those who help you. Thank you. – bsaverino Jul 01 '20 at 22:47

2 Answers


First of all, the limit mentioned

<http-listener name="default" socket-binding="http" max-post-size="104857600"/>

denotes the maximum size of the whole POST request that the server will accept.

Whether or not MTOM is used, and whether or not chunks are used, you send the file in just one POST request here. Using chunks does not change that: all the chunks of file data are parts of the same POST request.

So it works as expected: 100 MB+ files are too big for that limit.
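If simply accepting bigger single POSTs is acceptable for your sizes, the limit itself can be raised on the Undertow listener. A sketch via the management CLI (the 10 GB value is illustrative; listener and server names are from a default configuration and may differ in yours):

```
# jboss-cli.sh -c, against a running WildFly instance
/subsystem=undertow/server=default-server/http-listener=default:write-attribute(name=max-post-size, value=10737418240)
:reload
```

Note this only moves the ceiling; it does not change how the request is processed, and it will not realistically cover 1 TB transfers.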

Then, I'm quite sure that this conclusion

This means that the data is not written to the file in chunks; instead it is kept in memory and then written to the file after transmission completes.

is simply wrong. Keeping potentially large volumes in memory is not safe, so servers normally don't do that; they drop those volumes into temporary files instead.
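If you stay with MTOM on the JAX-WS RI (Metro), that streaming-to-disk behavior can even be made explicit. A sketch, assuming the RI's proprietary `com.sun.xml.ws.developer` classes are on the classpath (not portable JAX-WS; `absolutePath` is the receiver-side path from the question):

```java
// On the endpoint class, next to @MTOM (RI-specific annotation,
// threshold value is illustrative):
// @StreamingAttachment(parseEagerly = true, memoryThreshold = 4000000L)

// In uploadFile(): the RI may hand you a StreamingDataHandler, whose
// moveTo() relocates the (temp-file-backed) payload without ever
// holding the whole attachment in memory:
DataHandler handler = Dfile.getFile();
if (handler instanceof StreamingDataHandler) {
    StreamingDataHandler sdh = (StreamingDataHandler) handler;
    sdh.moveTo(new File(absolutePath));
    sdh.close();
}
```

This only addresses server-side memory, not the `max-post-size` limit, which still caps the request as a whole.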

user3714601

Some explanations

Sudeep: This means that the data is not written to the file in chunks; instead it is kept in memory and then written to the file after transmission completes.

--> That is definitely not true.

The following limit:

<http-listener name="default" socket-binding="http" max-post-size="104857600"/>

.. indicates that the maximum body size of a single (possibly chunked) POST request cannot exceed 104857600 bytes. This is tied to the HTTP Content-Length header and to JBoss's own capabilities (such as its buffers).

Cf. https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html: "The Content-Length entity-header field indicates the size of the entity-body, in decimal number of OCTETs, sent to the recipient or, in the case of the HEAD method, the size of the entity-body that would have been sent had the request been a GET."

I suggest performing a Wireshark analysis to convince yourself.


Sudeep: What we want to achieve is that the sender sends the data in chunks and receiver starts writing it in file in chunks. Thus we decided to use MTOM.

--> I believe this is a wrong decision.

Firstly

MTOM is not especially designed for chunking data. The SOAP Message Transmission Optimization Mechanism (MTOM) mainly defines two optimizations:

  • an abstract feature for optimizing the transmission and/or wire format of a SOAP message by selectively encoding portions of the message, while still presenting an XML Infoset to the SOAP application.
  • an Optimized MIME Multipart/Related Serialization of SOAP Messages implementing the Abstract SOAP Transmission Optimization Feature in a binding independent way. This implementation relies on the XML-binary Optimized Packaging (XOP) format. Only element content can be optimized.
  • ... knowing that attributes, non-base64-compatible character data, and data not in the canonical representation of the base64Binary datatype cannot be successfully optimized by XOP.

So in summary, MTOM is not the solution to your problem.

Secondly

The HTTP protocol is not designed to exchange very large data files. As you mentioned 1 TB (!) of data, I'd suggest a completely different strategy.

The reason is that HTTP carries a lot of overhead compared to more suitable protocols (even though the overhead ratio decreases significantly when the chain is used optimally from A to Z, i.e. with a large MTU and chunk size). The request-response model also does not fit file transfer well. Without efficient parallelization of requests and recovery methods, you may end up with serious transmission issues on very large files. And that is without considering firewalls, (reverse) proxies, and other tools that may treat this kind of traffic as abnormal or "out of bounds".

My recommendation really is to find your way to (S)FTP for large file transfers; that protocol is designed for it. Possibly consider peer-to-peer alternatives if your use case matches.

But definitely set HTTP aside for this use case, unless you decide to make slices (and hence multiple requests), use compression methods (possibly stacked), and optimize the whole transmission chain.
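If you do stay on HTTP with slices, the reading side of that loop can be sketched as below. The class and method names and the slice size are made up, and the per-slice upload call is left as a comment, since it depends on your service contract:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SliceUpload {

    // Reads the stream slice by slice; each slice would be sent as its own
    // POST request, keeping every request under the server's max-post-size.
    // Returns the number of slices "sent".
    static int sendInSlices(InputStream in, int sliceSize) throws IOException {
        byte[] buf = new byte[sliceSize];
        int slices = 0;
        int n;
        while ((n = readFully(in, buf)) > 0) {
            // hypothetical call: port.uploadSlice(Arrays.copyOf(buf, n), slices)
            slices++;
            if (n < sliceSize) {
                break; // short read: that was the last slice
            }
        }
        return slices;
    }

    // Fills buf as far as the stream allows; returns bytes actually read.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break;
            }
            off += n;
        }
        return off;
    }

    public static void main(String[] args) throws IOException {
        // 10 000 bytes in 4096-byte slices -> 3 requests (4096 + 4096 + 1808)
        int slices = sendInSlices(new ByteArrayInputStream(new byte[10_000]), 4096);
        System.out.println(slices); // prints 3
    }
}
```

With slicing you also get natural recovery points: a failed slice can be retried on its own instead of restarting a terabyte-sized transfer.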

Good luck.

bsaverino