I'm building my web server in Rust using warp and tokio. Here's what I'm doing:
- I'm creating three tokio runtimes and running one async function on each; they all communicate with each other over channels. I'm doing this because the server is for model inference, and each part is responsible for one thing: preprocessing and batching, running inference with a TF model, and the HTTP server itself.
- The response handler for each request passes the data it receives, through an mpsc transmitter (a clone of which is given to every handler), to a function running on another runtime. It also passes a oneshot Sender that the other runtime uses to send the results back to the response handler. (A simplified sketch of this round trip is right after this list.)
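In essence, each request goes through this round trip. This is only a simplified, self-contained sketch of the pattern, not the real code: it uses tokio's mpsc for brevity, whereas the real code uses a crossbeam channel between the runtimes, and all the names are illustrative.

use tokio::sync::{mpsc, oneshot};

#[tokio::main]
async fn main() {
    // channel from the "handler" side to the "inference" side;
    // each work item carries its own oneshot Sender for the reply
    let (tx, mut rx) = mpsc::unbounded_channel::<(String, oneshot::Sender<String>)>();

    // "inference" task: answers each work item on the oneshot that came with it
    tokio::spawn(async move {
        while let Some((image_id, reply)) = rx.recv().await {
            let _ = reply.send(format!("result for {}", image_id));
        }
    });

    // "handler" side: send the work item plus a oneshot Sender, then await the Receiver
    let (reply_tx, reply_rx) = oneshot::channel();
    tx.send(("some-image-id".to_string(), reply_tx)).unwrap();
    println!("{}", reply_rx.await.unwrap());
}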
This works fine under moderate load, but under heavy load (50 threads, 100 loops), the oneshot receiver in the response handler seems to get dropped, and the server is unable to return the result. (A rough sketch of how I generate this load is below.)
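For reference, the load generator is roughly equivalent to this sketch: 50 threads, each hitting the endpoint 100 times in a loop. This is not my exact harness, and it assumes a separate client crate with reqwest and its "blocking" feature enabled.

// Hypothetical load generator: 50 threads x 100 sequential requests each.
// Assumes reqwest = { version = "0.11", features = ["blocking"] }.
fn main() {
    let handles: Vec<_> = (0..50)
        .map(|t| {
            std::thread::spawn(move || {
                for i in 0..100 {
                    let url = format!("http://localhost:3030/imageId/img-{}-{}", t, i);
                    match reqwest::blocking::get(url) {
                        Ok(resp) => {
                            if !resp.status().is_success() {
                                eprintln!("non-success status: {}", resp.status());
                            }
                        }
                        Err(e) => eprintln!("request failed: {}", e),
                    }
                }
            })
        })
        .collect();
    for handle in handles {
        let _ = handle.join();
    }
}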
I've attached a minimal reproducible example below:
#![allow(non_snake_case)]
#[macro_use]
extern crate log;
extern crate chrono;

use crossbeam_channel::{unbounded, Receiver, Sender};
use serde::{Deserialize, Serialize};
use std::time::Duration;
use tokio::runtime::Runtime;
use tokio::sync::oneshot;
use warp::{Filter, Rejection, Reply};

#[derive(Debug, Clone, Deserialize, Serialize)]
struct ServerResponse {
    message: String,
}

impl warp::Reply for ServerResponse {
    fn into_response(self) -> warp::reply::Response {
        warp::reply::json(&self).into_response()
    }
}

#[derive(Debug)]
struct HandlerData {
    image_id: String,
    send_results_oneshot: oneshot::Sender<ServerResponse>,
}

fn main() {
    env_logger::init();
    let (tx_batch_data, rx_batch_data) = unbounded::<Vec<HandlerData>>();
    let server_thread = std::thread::spawn(move || match Runtime::new() {
        Ok(rt) => {
            rt.block_on(server(tx_batch_data));
        }
        Err(err) => error!("Error initializing runtime for server : {}", err),
    });
    let inference_and_cleanup_thread = std::thread::spawn(move || match Runtime::new() {
        Ok(rt) => {
            rt.block_on(inference_and_cleanup(&rx_batch_data));
        }
        Err(err) => error!("Error initializing runtime for inference : {}", err),
    });
    let _ = server_thread.join();
    let _ = inference_and_cleanup_thread.join();
}

async fn server(tx_handler_data: Sender<Vec<HandlerData>>) {
    let endpoint = warp::path!("imageId" / String)
        .and(warp::any().map(move || tx_handler_data.clone()))
        .and(warp::any().map(move || oneshot::channel::<ServerResponse>()))
        .and_then(response_handler);
    warp::serve(endpoint).run(([0, 0, 0, 0], 3030)).await;
}
async fn response_handler(
    image_id: String,
    tx_handler_data: Sender<Vec<HandlerData>>,
    (send_results_oneshot, get_results): (
        oneshot::Sender<ServerResponse>,
        oneshot::Receiver<ServerResponse>,
    ),
) -> Result<impl Reply, Rejection> {
    // send the request data, together with the oneshot Sender for the results,
    // to the batching/preprocessing task
    tx_handler_data
        .send(vec![HandlerData {
            image_id: image_id.clone(),
            send_results_oneshot,
        }])
        .unwrap_or_else(|e| {
            error!(
                "Error while sending the data from response handler! : {}",
                e
            )
        });
    // wait for the results to come back on the oneshot receiver
    let result: ServerResponse = get_results.await.unwrap_or_else(|e| {
        error!(
            "Error getting results from oneshot in response handler: {:?}",
            e
        );
        // dummy value
        ServerResponse {
            message: "from error handler".to_string(),
        }
    });
    Ok(result)
}
async fn inference_and_cleanup(rx_batch_data: &Receiver<Vec<HandlerData>>) {
    loop {
        // poll the crossbeam channel for the next batch without blocking
        let batch_received: Option<Vec<HandlerData>> = rx_batch_data.try_recv().ok();
        if let Some(batch) = batch_received {
            info!("Got a batch of size {}", batch.len());
            // simulate model inference time for the whole batch
            tokio::time::sleep(Duration::from_millis(500)).await;
            for ele in batch {
                // tokio::time::delay_for(Duration::from_millis(10)).await;
                let oneshot = ele.send_results_oneshot;
                if !oneshot.is_closed() {
                    oneshot
                        .send(ServerResponse {
                            message: "worked successfully".to_string(),
                        })
                        .unwrap_or_else(|e| {
                            error!("Error while sending back results via oneshot : {:?}", e);
                        });
                } else {
                    error!("Didn't send anything, the oneshot receiver was closed");
                }
            }
        }
    }
}
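For completeness, the dependencies are roughly these (versions approximate):

[dependencies]
warp = "0.3"
tokio = { version = "1", features = ["full"] }
crossbeam-channel = "0.5"
serde = { version = "1", features = ["derive"] }
log = "0.4"
env_logger = "0.9"
chrono = "0.4"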
Under load, I keep getting "Didn't send anything, the oneshot receiver was closed" in the logs.

What's going on? Is it something to do with the way this is architected, or with how warp is handling the requests? I'd appreciate any help.