
I have a task (downloading something from the Web) that runs regularly, with a 10 min pause between runs.

If my program notices that the data is outdated, it should run the download task immediately, unless it is already running. If the download task ran out-of-time (outside the regular schedule), the next run should come 10 min after that out-of-time run, so that all future runs and pauses shift later in time.

How do I do this with Tokio?

I made a library to run a sequence of tasks, but my attempt to use it for this problem failed.

mod tasks_with_regular_pauses;

use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use tokio::spawn;
use tokio::sync::mpsc::{channel, Receiver, Sender};
use tokio::sync::Mutex;
use tokio::task::JoinHandle;
use tokio_interruptible_future::{
    interruptible, interruptible_sendable, interruptible_straight, InterruptError,
};

pub type TaskItem = Pin<Box<dyn Future<Output = ()> + Send>>;

/// Execute futures from a stream of futures in order in a Tokio task. Not tested code.
pub struct TaskQueue {
    tx: Sender<TaskItem>,
    rx: Arc<Mutex<Receiver<TaskItem>>>,
}

impl TaskQueue {
    pub fn new() -> Self {
        let (tx, rx) = channel(1);
        Self {
            tx,
            rx: Arc::new(Mutex::new(rx)),
        }
    }
    async fn _task(this: Arc<Mutex<Self>>) {
        // let mut rx = ReceiverStream::new(rx);
        loop {
            let this2 = this.clone();
            let fut = {
                // block to shorten locks lifetime
                let obj = this2.lock().await;
                let rx = obj.rx.clone();
                let mut rx = rx.lock().await;
                rx.recv().await
            };
            if let Some(fut) = fut {
                fut.await;
            } else {
                break;
            }
        }
    }
    pub fn spawn(
        this: Arc<Mutex<Self>>,
        notify_interrupt: async_channel::Receiver<()>,
    ) -> JoinHandle<Result<(), InterruptError>> {
        spawn(interruptible_straight(notify_interrupt, async move {
            Self::_task(this).await;
            Ok(())
        }))
    }
    pub async fn push_task(&self, fut: TaskItem) {
        let _ = self.tx.send(fut).await;
    }
}
porton
  • *unless it is already running* — what happens if... [(`time-0`, scheduled download begins), (`time-1`, outdated information is detected), (`time-2`, download finishes)] however, the download started **before** the server got the desired updated information and you still end up with old data? – Shepmaster Feb 15 '22 at 16:47
  • @porton, please clarify what do you mean by happened out-of-time (hopefully with an example)? How do you know that the data is outdated? You mention that you need to pause for 10 min regularly, but your code doesn't have it, could you update your example to include that? – battlmonstr Feb 15 '22 at 17:20
  • @Shepmaster I don't understand your question. – porton Feb 16 '22 at 19:05
  • @battlmonstr I download a list of blockchain nodes to connect to, with intervals between downloads 10 min. Sometimes a node is unresponsive (e.g. HTTP connection timed out), in this case I remove the node from the list of used nodes. When all or almost all nodes happen to be removed from the list, I start out-of-time download because without having nodes my software cannot function. but if I downloaded it out-of-time, I want to wait 10 min since the end of this out-of-time download, not since the end of the last regular download. – porton Feb 16 '22 at 19:07

1 Answer


I'd recommend using select! instead of interruptible futures to detect one of 3 conditions in your loop:

  • the download task has finished
  • the "data is outdated" signal has arrived
  • the data-expiry timeout has elapsed

"The data is outdated" signal can be conveyed using a dedicated channel.

select! allows waiting for futures (like downloading and timeouts), and reading from channels at the same time. See the tutorial for examples of that.

Solution sketch:

use tokio::{select, time::{sleep, Duration}};

loop {
    // it is time to download
    let download_future = ...; // make your URL request
    let download_result = download_future.await;

    // if the outdated signal was raised while the download
    // was in progress, ignore it by draining the receiver
    while outdated_data_signal_receiver.try_recv().is_ok() {}

    // send results upstream for processing
    // (assuming a tokio::sync::mpsc::Sender, whose send() must be awaited)
    let _ = download_results_sender.send(download_result).await;

    // wait to re-download; both arms fall through to the next iteration
    select! {
        // after a 10 min pause
        _ = sleep(Duration::from_secs(10 * 60)) => {},
        // or on an external signal
        _ = outdated_data_signal_receiver.recv() => {},
    }
}

This logic can be simplified further by using the timeout primitive:

use tokio::time::{timeout, Duration};

loop {
    // it is time to download
    let download_future = ...; // make your URL request
    let download_result = download_future.await;

    // if the outdated signal was raised while the download
    // was in progress, ignore it by draining the receiver
    while outdated_data_signal_receiver.try_recv().is_ok() {}

    // send results upstream for processing
    // (assuming a tokio::sync::mpsc::Sender, whose send() must be awaited)
    let _ = download_results_sender.send(download_result).await;

    // re-download on a signal or after the timeout (whichever comes first)
    let _ = timeout(Duration::from_secs(10 * 60), outdated_data_signal_receiver.recv()).await;
}
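
To try the timing behaviour without any dependencies, the same loop can be sketched with the standard library. Note that this is not Tokio code: it swaps in a plain thread and std::sync::mpsc, with recv_timeout standing in for timeout(..., receiver.recv()). The names run_downloader and demo, and the counter used as a fake download, are made up for illustration.

```rust
use std::sync::mpsc::{channel, sync_channel, Receiver, RecvTimeoutError, Sender};
use std::thread;
use std::time::Duration;

/// Runs `download` in a loop, pausing for `pause` after each run unless an
/// "outdated" signal arrives first. Returns when either channel is closed.
fn run_downloader<T>(
    mut download: impl FnMut() -> T,
    outdated_rx: Receiver<()>,
    results_tx: Sender<T>,
    pause: Duration,
) {
    loop {
        let result = download();

        // Signals raised while the download was running are stale: drop them.
        while outdated_rx.try_recv().is_ok() {}

        // Stop if nobody is listening for results any more.
        if results_tx.send(result).is_err() {
            return;
        }

        // Wait for a signal or for the pause to elapse, whichever comes first.
        // The pause is counted from the end of the latest download.
        match outdated_rx.recv_timeout(pause) {
            Ok(()) | Err(RecvTimeoutError::Timeout) => {}
            // All signal senders are gone: stop instead of looping forever.
            Err(RecvTimeoutError::Disconnected) => return,
        }
    }
}

/// Hypothetical driver: one scheduled download, then one forced by a signal.
fn demo() -> Vec<u32> {
    // Capacity 1 is enough: several pending signals mean the same thing as one.
    let (outdated_tx, outdated_rx) = sync_channel::<()>(1);
    let (results_tx, results_rx) = channel::<u32>();

    let worker = thread::spawn(move || {
        let mut run = 0;
        // Fake download: returns the number of the current run.
        let download = move || {
            run += 1;
            run
        };
        run_downloader(download, outdated_rx, results_tx, Duration::from_secs(600));
    });

    let first = results_rx.recv().unwrap();
    // The data was found to be outdated: request an out-of-time download.
    outdated_tx.try_send(()).unwrap();
    let second = results_rx.recv().unwrap();

    drop(outdated_tx); // closing the signal channel ends the loop
    worker.join().unwrap();
    vec![first, second]
}

fn main() {
    println!("{:?}", demo());
}
```

Running it prints [1, 2]: the second download starts as soon as the signal is sent, not after the 10 min pause. The Disconnected arm matters in the Tokio version too, since recv() on a tokio::sync::mpsc::Receiver returns None once every sender has been dropped, and without a check the loop would then download continuously.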
battlmonstr
  • I will accept your answer: It is so simple, and I behaved like a bad mathematician writing very complex code. Apparently, my desire to apply the "hammer" (my interruptible futures) caused me to dig this "hole" in a very complex way, and the solution is easy. – porton Feb 16 '22 at 19:10