
I am running a Django app and I am trying to send a fairly large API POST request to my production server.

Now the problem is that what takes 4 seconds locally takes 1 minute in production. That might not sound like much, but I am planning on sending this request hundreds of times, and every second counts.

I zeroed in on the problem and I think it might be an nginx configuration problem, but I can't be certain. Here is my troubleshooting process with the relevant code:

I have a big dictionary, dataset_dict = {1: 1234, 2: 1244 ... 525600: 124345662}, so roughly half a million entries.

I send it and measure the time of my POST request:

    import datetime

    import requests

    dataset_dict = {1: 1234, 2: 1244 ... 525600: 124345662}

    data = {
        "element": name_element,
        "description": "description of element",
        "type_data": "datatype",
        "data_json": dataset_dict,
    }

    start = datetime.datetime.now()
    requests.post(url="myendpoint", json=data)
    end = datetime.datetime.now()
    runtime = end - start
    print("time-post-request:", runtime)

This takes 4 seconds locally and 50 seconds in production.
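One way to shrink the transfer itself would be to gzip the JSON body before sending it, since half a million integer pairs compress very well. This is only a sketch under an assumption: the server side must decompress the body itself (nginx does not gunzip request bodies by default, so e.g. a small Django middleware would have to call gzip.decompress()).

```python
import gzip
import json

import requests  # third-party, already used in the question


def post_compressed(url, payload):
    # Sketch: gzip the JSON body before sending. The receiving side is
    # assumed to decompress it -- nginx does NOT gunzip request bodies
    # by default.
    body = gzip.compress(json.dumps(payload).encode("utf-8"))
    return requests.post(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",
        },
    )


# Size comparison for a payload shaped like the one above (values made up):
payload = {"data_json": {i: i * 2 for i in range(525600)}}
raw = json.dumps(payload).encode("utf-8")
print("raw bytes:", len(raw), "gzipped bytes:", len(gzip.compress(raw)))
```

If the bottleneck really is the uplink, cutting the bytes on the wire by a large factor would shrink the gap accordingly.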

So I kept going and measured the time of the server code alone, meaning only the code that is executed in my view. I use raw SQL to achieve maximum performance:

    import datetime
    import json

    from django.db import connection
    from rest_framework import status
    from rest_framework.response import Response

    start_time = datetime.datetime.now()
    cursor = connection.cursor()
    data_json = json.dumps(request.data["data_json"])

    ##......code shortened for clarity

    # NOTE: building the statement with str.format() is vulnerable to SQL
    # injection -- see the comments below about parameter placeholders.
    cursor.execute(
        "INSERT INTO sql_table(data_json) VALUES ('{}')".format(data_json)
    )
    end_time = datetime.datetime.now()
    runtime = end_time - start_time

    print("success, time needed", runtime)

    msg = {"detail": "Created successfully"}
    return Response(msg, status=status.HTTP_201_CREATED)

This server-side code needs 3 seconds locally and only 2 seconds in production.
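As an aside, the raw SQL above interpolates request data straight into the statement with str.format(), which is an SQL-injection risk. A parameterized sketch (table and column names taken from the snippet above; the cursor is whatever Django's connection.cursor() returns):

```python
import json


def insert_payload(cursor, request_data):
    # Driver-side parameter binding: psycopg2 (and hence Django's cursor)
    # uses the %s placeholder style and escapes/quotes the value itself.
    # Only values can be parameterized -- identifiers such as table or
    # column names cannot be passed as placeholders.
    data_json = json.dumps(request_data["data_json"])
    cursor.execute(
        "INSERT INTO sql_table (data_json) VALUES (%s)",
        [data_json],
    )


# In the Django view this would be called roughly as:
#   with connection.cursor() as cursor:
#       insert_payload(cursor, request.data)
```

Parameter binding costs essentially nothing here, so it does not change the timing story.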

So my question now is: where do the 56 seconds go?

I am inferring that I can rule out PostgreSQL settings, since the insert itself seems to be quite fast.

Nginx settings seemed like a good place to start looking, so I monitored the server log and got:

    [warn] 28#28: *9388 a client request body is buffered to a temporary file /var/cache/nginx/client_temp/0000000060

In the official documentation I read:

If the request body size is more than the buffer size, then the entire (or partial) request body is written into a temporary file

So I assume I lose time because nginx writes the request to disk. I raised client_body_buffer_size to 4G, ran it again, and it takes almost the same amount of time (55 seconds), and I still get the warning... Should I go even higher?

Are there any other screws I could turn to raise performance and bring the creation time down to what I get locally? Shouldn't production servers generally be faster than local dev servers? Maybe it's the internet connection?
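On that last point, a back-of-the-envelope calculation shows that a modest uplink alone could plausibly account for the whole gap. Both numbers below are illustrative, not measured; the real payload size is len() of the serialized JSON, and the real upload speed could come from the curl --upload-file test suggested in the comments.

```python
# Back-of-the-envelope check: can the uplink alone explain ~50 seconds?
payload_bytes = 10_000_000  # ~10 MB serialized dictionary (assumed size)

for mbit_per_s in (2, 10, 100):
    seconds = payload_bytes * 8 / (mbit_per_s * 1_000_000)
    print(f"{mbit_per_s:>3} Mbit/s uplink -> ~{seconds:.0f} s transfer")
```

At 2 Mbit/s a 10 MB body takes roughly 40 seconds just to transmit, which is the same order of magnitude as the unexplained time.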

So my main question: how could I increase performance so that data creation completes in a time comparable to my local runs (a few seconds)?

Specs:

Linux server 
RAM: 16GB 
CPUS: 4

I will post my nginx settings below. I am really no expert with nginx, so any help on how I could increase performance for this use case is highly appreciated, even if it doesn't solve the problem.

Nginx:

    worker_processes auto;

    events {
        worker_connections 1024;
        use epoll;
        multi_accept on;
    }


    http {

        ...
        sendfile        on;
        tcp_nopush      on;
        tcp_nodelay     on;
        send_timeout       90;
        keepalive_timeout  90;
        fastcgi_read_timeout 120;
        proxy_read_timeout 120;
        fastcgi_buffers 8 128k;
        fastcgi_buffer_size 128k;
        client_body_timeout 120;
        client_body_buffer_size 4G;
        client_header_buffer_size 1k;
        large_client_header_buffers 4 8k;
        client_header_timeout 120;
        client_max_body_size 5G;

        reset_timedout_connection on;
        types_hash_max_size 2048;
        server_tokens off;

        gzip  on;
        gzip_static on;
        gzip_min_length 512;

    }
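If the aim is simply to stop nginx spooling the request body to a temporary file before proxying, there is also the proxy_request_buffering directive (available since nginx 1.7.11), which streams the body to the upstream as it arrives. This is a sketch; whether it helps depends on how the upstream (gunicorn/uwsgi) handles slowly arriving request bodies:

```nginx
# Inside the server/location block that proxies to Django: pass the
# request body to the upstream immediately instead of buffering it
# (in memory or in a temp file) first.
proxy_request_buffering off;
```

Note also that client_body_buffer_size is allocated per request, so a value like 4G on a 16 GB machine is risky; the temp-file warning only disappears once the body actually fits in that buffer.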

Any more info needed, I'll be happy to post it.

Micromegas
    You could write your request (about 10MiB, correct?) to a file and upload it with a different client (try `curl --upload-file request --output response --progress-bar https://myendpoint`) to get a better understanding of whether the actual time spent transmitting the massive dictionary even matters here. – anx Nov 25 '20 at 21:24
    Also, your "raw SQL for performance" is quite likely vulnerable to SQL injection (given the naive `format`)... There's a reason you shouldn't do raw SQL with user controlled input, especially when your connector almost certainly has support for parameters. – Ginnungagap Nov 26 '20 at 07:30
  • Thanks @Ginnungagap! So are you saying I shouldn't use raw SQL at all in this case or is there a way to avoid this problem? What do you mean with my "connector almost certainly has support for parameters"? – Micromegas Nov 26 '20 at 13:33
  • Thank you @anx. The original request is in file format and uploading it to a cloud provider is quite fast (like 33 times the size of the dict takes about 2 minutes to upload) – Micromegas Nov 26 '20 at 13:35
    @Micromegas You are passing a string including the parameters to `cursor.execute` - but you [should be passing the query with placeholders, and the parameters in a separate argument, and let *Django* correctly escape & quote.](https://docs.djangoproject.com/en/3.1/topics/db/sql/#executing-custom-sql-directly) – anx Nov 26 '20 at 13:40
  • @Micromegas "like the size" & "a provider" makes me think you had trouble reproducing the exact query. Try writing out the request to a file and repeating it as-is outside its current context, doing the same step with reduced complexity is the way to go to figure out which assumption is off or which part is broken. – anx Nov 26 '20 at 13:51
  • Thanks again. So you mean like this: `cursor.execute( "INSERT INTO sql_table(%s) VALUES (%s)", [name_of_col, data_json]) )` ?? – Micromegas Nov 26 '20 at 13:53
  • @Micromegas look into prepared statements – Christophe Roussy Dec 09 '20 at 08:29

0 Answers