
It is easy to start a process from a specific directory with Lwt using the functions Sys.getcwd, Lwt_unix.chdir and Lwt_process.exec:

  1. Use Sys.getcwd to save the current working directory
  2. Use Lwt_unix.chdir to change to the specific directory
  3. Use Lwt_process.exec to start the external process
  4. Use Lwt_unix.chdir to change back to the saved working directory
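
As a rough sketch, the four steps could be written like this. The name `exec_in_dir_naive` is made up for illustration, and `Sys.getcwd` is the standard library call that reads the current working directory:

```ocaml
open Lwt.Infix

(* Naive version of the four steps above (hypothetical helper name).
   It is racy: other Lwt threads may run while the directory is changed. *)
let exec_in_dir_naive dir command =
  (* 1. Save the current working directory. *)
  let saved = Sys.getcwd () in
  (* 2. Change to the target directory. *)
  Lwt_unix.chdir dir >>= fun () ->
  (* 3. Start the external process and wait for it to exit. *)
  Lwt_process.exec command >>= fun status ->
  (* 4. Change back to the saved directory. *)
  Lwt_unix.chdir saved >>= fun () ->
  Lwt.return status
```

Every `>>=` is a point where the scheduler may switch to another thread.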

This logic is flawed, because the scheduler may run another thread after the first call to Lwt_unix.chdir or while Lwt_process.exec is waiting for the process, and that thread would then run in the specific directory rather than in the saved current directory. Is it possible to easily start a process from a specific directory with Lwt without introducing a race condition such as the one I describe?

Michaël Le Barbier

1 Answer


You can protect your current working directory with a synchronization primitive such as Lwt_mutex. But there is a caveat. Suppose you have this chain:

lock dir_guard >> chdir dir >> exec proc >> chdir dir' >> unlock dir_guard

This prevents any other thread from changing the directory for the whole time the process proc is running, which may be overcautious and unnecessary. The following code doesn't have this problem:

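open Lwt.Infix

let exec_in_folder guard dir proc =
  (* Hold the lock only while changing directory and starting the process. *)
  Lwt_mutex.with_lock guard (fun () ->
    Lwt_unix.chdir dir >>= fun () ->
    Lwt.return (Lwt_process.exec proc))
  >>= fun proc_t -> proc_t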

But this code has an issue: it is correct only if the process is started atomically, i.e., if no rescheduling can happen during process startup that would let another thread interfere and change the current folder. To prove that it is atomic, you can either read the sources or implement your own process starter that provides such a guarantee. If you read the code, you will find that the process is created with the spawn function, which forks immediately, without yielding to other threads. So yes, this code is correct.
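
For completeness, here is a sketch of a variant that also saves and restores the working directory. It relies on the same assumption as above, namely that `Lwt_process.exec` forks before it returns, and it only helps if every other directory change in the program takes the same mutex. The names `cwd_guard` and `exec_in_dir` are made up for this example:

```ocaml
open Lwt.Infix

(* Hypothetical program-wide mutex protecting the working directory. *)
let cwd_guard = Lwt_mutex.create ()

(* Hypothetical helper: run [command] from [dir], holding the lock only
   while the process is being started, not while it runs. *)
let exec_in_dir dir command =
  Lwt_mutex.with_lock cwd_guard (fun () ->
    let saved = Sys.getcwd () in
    Lwt_unix.chdir dir >>= fun () ->
    Lwt.finalize
      (* Assumption: exec forks right away; its still-pending status
         promise is wrapped in return so the lock does not wait for exit. *)
      (fun () -> Lwt.return (Lwt_process.exec command))
      (* Restore the saved directory even if starting the process fails. *)
      (fun () -> Lwt_unix.chdir saved))
  >>= fun status -> status
```

Using `Lwt.finalize` means the directory is restored before the lock is released, whether or not the process could be started.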

ivg
  • Thanks for your detailed answer. It seems your `exec_in_folder` function “forgets” to save and restore the cwd before the function is started. I guess the full program needs to collaborate and use a global mutex to protect the cwd. It would be nice to have a *local* solution, *i.e.* a solution which can be seen to be correct just by looking at the current portion of code. – I think this should be reported as an issue in **Lwt**, what about you? – Michaël Le Barbier Jun 16 '15 at 22:07
  • yep, I've left details out of scope, to keep it succinct. I don't feel that there is any issue with lwt (except that they could document `process` better). Working dir is just a hidden global variable. But what can be done in lwt is that they can add a `?working_dir` option to the `exec` function, as the guys from Janestreet did in their `Async` library. Underneath the hood, they just change cwd in a blocking way, so no race condition is possible. P.S. @Michael, sorry for answering late. Didn't mention your comment. – ivg Jun 21 '15 at 19:23
  • My bad, I forgot to mention you. :) I opened an issue for this in https://github.com/ocsigen/lwt/issues/163 I think I will prepare a PR in the coming weeks – I have a *tiiiight* schedule right now. :) Since it has been almost a week since you answered, every one had the chance to take it, so I will accept yours. – Michaël Le Barbier Jun 21 '15 at 19:31