Description
I'm writing a Rust program that includes unarchiving a .tar.gz file. I followed the conventional approach, using the crates tar and flate2:
// the enclosing function returns Result<(), Box<dyn Error>>
let file = std::fs::File::open(some_path)?;
let decoder = flate2::read::GzDecoder::new(file);
let mut arc = tar::Archive::new(decoder);
for entry in arc.entries()? {
    let mut file = entry?; // <- where it errs
    // ...
}
I then downloaded the target file from a source, say http://example.com/file.tar.gz, and the program returned an error:
numeric field did not have utf-8 text: _����~�h when getting cksum for �
I searched for the error message online, but none of the results looked like my case. I did remember to decompress the file using GzDecoder, and the file didn't seem corrupted: I could just double-click it and the system archive utility would unarchive it successfully. As I kept looking for where it could go wrong, I was puzzled by what I found.
Problem
Initially, I downloaded the .tar.gz file using Firefox. When I downloaded it with Edge or cURL instead, my program raised no error. I compared the checksums of the files downloaded via the different methods: the Edge and cURL files had the same checksum, while the Firefox file's checksum was different.
I wonder how the download method could be the source of the problem. Even so, the system archive utility seemed undisturbed by the difference. I also wonder if I can modify my code to avoid this problem.
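One way to narrow this down, without assuming why the downloads differ, is to look at the leading bytes of each downloaded file and see what format they suggest. The sketch below uses only the standard library. It relies on two facts: gzip streams begin with the bytes 0x1f 0x8b, and ustar-format tar headers carry the text "ustar" at byte offset 257. The names `Sniffed`, `sniff`, and `sniff_file` are hypothetical helpers made up for illustration; they are not part of tar or flate2. Note that very old tar archives have no "ustar" marker, so `Unknown` does not prove a file is not a tar.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Gzip streams start with the two magic bytes 0x1f 0x8b.
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// What the leading bytes of a file suggest about its format.
/// (Hypothetical helper type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum Sniffed {
    Gzip,
    Tar,
    Unknown,
}

/// Classify a buffer holding the first bytes of a file.
pub fn sniff(head: &[u8]) -> Sniffed {
    if head.starts_with(&GZIP_MAGIC) {
        Sniffed::Gzip
    } else if head.len() >= 262 && &head[257..262] == b"ustar" {
        // ustar headers (POSIX and GNU variants) store "ustar" at offset 257.
        Sniffed::Tar
    } else {
        Sniffed::Unknown
    }
}

/// Read at most the first 512 bytes (one tar header block) and classify them.
pub fn sniff_file(path: &Path) -> io::Result<Sniffed> {
    let mut head = Vec::with_capacity(512);
    File::open(path)?.take(512).read_to_end(&mut head)?;
    Ok(sniff(&head))
}
```

Calling `sniff_file` on the Firefox download and on the cURL download shows whether the two files even start out in the same format. Applying `sniff` to the first bytes read from the `GzDecoder` shows what is left after one layer of decompression, which is what `tar::Archive` actually receives.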