I am trying to parallelize my code using the rayon crate. The program reads each file, processes it, and writes the processed output.
I want to record the result of processing each file, so I keep an Arc<Mutex<Vec<anyhow::Result<()>>>>, which I lock in order to push the anyhow::Result<()> returned for each file.
use rayon::prelude::*;
use std::sync::{Arc, Mutex};

// (setup of input_folder, output_folder and regex_vec omitted)
fn main() -> anyhow::Result<()> {
    let (mut files, _) = utils::get_files_from_folder(input_folder)?;
    let results = Arc::new(Mutex::new(Vec::<anyhow::Result<()>>::new()));
    files.par_iter_mut().for_each(|path| {
        if let Some(extension) = path.extension() {
            if extension == "txt" {
                // processing done here
                let result = redact::redact_txt_and_write_json(path, &regex_vec, &output_folder);
                // lock the mutex and push done here
                results.lock().expect("`results` cannot be locked").push(result);
            } else {
                eprintln!(
                    "{}INVALID EXTENSION: {} - Not yet implemented",
                    *RED_ERROR_STRING,
                    extension.to_string_lossy(),
                );
                std::process::exit(1);
            }
        } else {
            eprintln!("{}EXTENSION not found", *RED_ERROR_STRING);
            std::process::exit(1);
        }
    }); // end of for_each
    println!("{:?}", results.as_ref());
    Ok(())
}
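Stripped of the file handling, the locking pattern above reduces to the minimal sketch below. It is an illustration under assumptions, not my actual program: scoped std threads stand in for rayon, a String error stands in for anyhow::Error, and `process` and `process_all` are hypothetical names.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the per-file work: succeeds for ".txt" names only.
fn process(name: &str) -> Result<(), String> {
    if name.ends_with(".txt") {
        Ok(())
    } else {
        Err(format!("unsupported extension: {}", name))
    }
}

// Runs `process` on each name in its own scoped thread and records every
// outcome in a shared vector guarded by a mutex.
fn process_all(names: &[&str]) -> Vec<Result<(), String>> {
    let results = Arc::new(Mutex::new(Vec::<Result<(), String>>::new()));
    thread::scope(|scope| {
        for &name in names {
            let results = Arc::clone(&results);
            scope.spawn(move || {
                // The work itself runs without holding the lock.
                let outcome = process(name);
                // The lock is held only for the duration of the push.
                results
                    .lock()
                    .expect("`results` cannot be locked")
                    .push(outcome);
            });
        }
    });
    // Every scoped thread has been joined here; move the vector out.
    let collected = std::mem::take(&mut *results.lock().expect("`results` cannot be locked"));
    collected
}
```

Calling process_all(&["a.txt", "b.pdf", "c.txt"]) returns one entry per input. The order of the entries depends on thread scheduling, so only the counts of Ok and Err values are predictable.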
My question is: why does the version with locking apparently take so much longer than the version without it?
With locking:
Finished dev [unoptimized + debuginfo] target(s) in 1m 34s
Without locking:
Finished dev [unoptimized + debuginfo] target(s) in 0.30s