What is the relationship between request content size and request duration?

Question

At the company I work for, all our APIs send and expect requests/responses that follow the JSON:API standard, which makes the structure of the request/response content very regular.

Because of this regularity and the fact that we can have hundreds or thousands of records in one request, I think it would be fairly doable and worthwhile to start supporting compressed requests (every record would be something like < 50% of the size of its JSON:API counterpart).

To make a well-informed judgement about whether this is actually worthwhile, I need to know more about the relationship between request size and duration, but I cannot find any good resources on this. Would anybody care to share their expertise or resources?

Bonus 1: If you were to have request performance issues, would you look at compression as a solution first, second, last?

Bonus 2: How does transmission overhead scale with size? (If I cut the size by 50%, by what percentage will the transmission overhead be cut?)

score 2 · Accepted Answer · answered Aug 28 '20 at 07:55

Request and response compression adds a time and CPU penalty on both the sender's side and the receiver's side. The time savings come from the transmission.
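Both sides of that tradeoff can be measured directly. The following is a minimal sketch, assuming gzip as the codec and a synthetic JSON:API-style payload; the field names, the record count and the helper names `build_payload` and `measure` are made up for illustration and are not taken from the question. Real payloads will compress differently, so treat the numbers it prints as a starting point only.

```python
import gzip
import json
import time


def build_payload(n_records):
    """Build a synthetic JSON:API-style document (hypothetical fields)."""
    return {
        "data": [
            {
                "type": "articles",
                "id": str(i),
                "attributes": {"title": f"Article {i}", "views": i * 3},
            }
            for i in range(n_records)
        ]
    }


def measure(n_records):
    """Return raw/compressed sizes and the CPU-side cost of gzip."""
    raw = json.dumps(build_payload(n_records)).encode("utf-8")

    start = time.perf_counter()
    compressed = gzip.compress(raw)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    restored = gzip.decompress(compressed)
    decompress_s = time.perf_counter() - start

    # Sanity check: the round trip must be lossless.
    assert restored == raw

    return {
        "raw_bytes": len(raw),
        "compressed_bytes": len(compressed),
        "ratio": len(compressed) / len(raw),
        "compress_s": compress_s,
        "decompress_s": decompress_s,
    }


result = measure(1000)
print(result)
```

Running this with a sample of real request bodies instead of `build_payload` gives the size reduction and the per-request CPU cost that the rest of the decision depends on.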

How the tradeoff weighs out depends a lot on the customers of the API: when they make requests, how much they request, what they request, where they are located, and the type and capabilities of their device and OS.

If the data is static (e.g. a REST query such as apihost/resource/idxx returning a static resource), web-standard approaches such as caching of static resources by clients and proxies can help.

If the data is dynamic, there are architectural patterns that could be used.

If the data is huge (e.g. big scientific data sets or video), you will almost always find it served statically, with a metadata service providing the dynamic layer. For example, an MPEG-DASH or HLS stream is just a collection of files.

I would choose compression as a last option relative to the other architectural options.

There are also implementation optimizations that should come before compressing requests and responses. For example:

Are your services using all the resources at their disposal (cores, memory, I/O)?
Does the architecture allow scale-up and scale-out, and can the problem be handled effectively that way (remember the penalties on the client side due to compression)?
Can you use queueing, caching or other mechanisms to make things appear faster?

If you have explored all of these, concluded that your system is optimal, and are now looking at the most granular unit of service where data volume is an issue, by all means go after compression. Keep in mind that you also need to budget compute resources for compression on the server side (for a fixed workload).

Your Bonus 2 question on transmission overhead vs. size is really a question about bandwidth and latency. Bandwidth determines how much you can push through the pipe. Latency governs the perceived response time. Whether the payload is 10 bytes or 10 MB, latency is bounded by the round-trip time, and a client on the other side of the world whose traffic crosses many hops will see a larger latency than a client only one or two hops away. So one solution may be to distribute the servers and place them closer to your clients around the world rather than to compress the data. That is another reason why compression isn't the first thing to look at.
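As a rough illustration of why halving the size does not halve the duration, here is a deliberately simplified model in which a request costs one round trip plus the time to push the bytes through the link. It ignores TCP slow start, TLS handshakes, server processing and packet loss, and the function name `estimated_duration_s` and all numbers are hypothetical, not measurements.

```python
def estimated_duration_s(size_bytes, bandwidth_bps, rtt_s):
    """Simplified model: one round trip plus time to transmit the payload."""
    return rtt_s + (size_bytes * 8) / bandwidth_bps


# Hypothetical client: 10 Mbit/s link, 200 ms round-trip time.
BANDWIDTH_BPS = 10_000_000
RTT_S = 0.200

full = estimated_duration_s(100_000, BANDWIDTH_BPS, RTT_S)  # 100 KB payload
half = estimated_duration_s(50_000, BANDWIDTH_BPS, RTT_S)   # same payload at 50%

# Fraction of the total duration saved by halving the payload.
reduction = 1 - half / full
print(f"full={full:.3f}s half={half:.3f}s reduction={reduction:.1%}")
```

Under these assumed numbers the payload shrinks by 50% but the estimated duration drops by only about 14%, because the round-trip time is a fixed cost. The closer the client and the larger the payload, the more of the size reduction shows up in the duration.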

Baseline your performance and benchmark your experiments for a representative user base.

I understand your company has standardized on JSON for APIs. But if you are going down the path of compression because it is bubbling up as a business/tech issue, you may also want to question that fundamental choice. Another valid approach might be to provide an alternative API using a more efficient data representation such as Protocol Buffers (https://developers.google.com/protocol-buffers). This will remove a lot of the JSON overhead and speed things up. See: https://dzone.com/articles/is-protobuf-5x-faster-than-json. You can run the JSON and the protobuf implementations side by side and drive the migration. - vvg, Aug 28 '20 at 08:05

score 1 · Answer 2 · answered Aug 14 '20 at 12:57

I think what you are weighing here is the speed of your processor (CPU) against the speed of your network connection.

A network connection can be affected by things like distance, signal strength, DNS provider, etc., whereas your computer hardware is limited only by how much power you've put into it.

I'd wager that compressing your data before sending it would result in shorter response times, yes, but the gain is probably going to be very small. If you are sending JSON, the text usually isn't all that large to begin with, so you would probably only see a change in performance at the millisecond level.
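How small the gain is depends mostly on the link speed. A back-of-the-envelope sketch, assuming the only effects are transfer time saved and CPU time spent: the helper `net_saving_s` and every number below are hypothetical and should be replaced with your own measurements.

```python
def net_saving_s(raw_bytes, compressed_bytes, cpu_s, bandwidth_bps):
    """Transfer time saved by compression minus the CPU time it costs."""
    saved_transfer_s = (raw_bytes - compressed_bytes) * 8 / bandwidth_bps
    return saved_transfer_s - cpu_s


# Hypothetical request: 200 KB of JSON shrinking to 40 KB,
# with 5 ms of combined compress + decompress time.
RAW, COMPRESSED, CPU_S = 200_000, 40_000, 0.005

slow_link = net_saving_s(RAW, COMPRESSED, CPU_S, 10_000_000)     # 10 Mbit/s
fast_link = net_saving_s(RAW, COMPRESSED, CPU_S, 1_000_000_000)  # 1 Gbit/s

print(f"10 Mbit/s: {slow_link * 1000:+.2f} ms")
print(f"1 Gbit/s:  {fast_link * 1000:+.2f} ms")
```

With these assumed values compression saves time on the slow link and costs time on the fast one, so the answer for your API depends on where your clients sit between those two cases.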

If that's what you are looking for, I'd go ahead and implement it, set some timing before and after, and check your results.
