
I'm looking for help with the correct syntax or Rust approach. My use case: I have a generic struct FileData with a field called provider. The provider type must implement AsRef<[u8]> so that data may come from static bytes, heap-allocated memory, a memory-mapped file, and possibly other sources. I have a couple of methods to create a FileData and they seem to be working well. But there is one:

// ERROR: This is the line that I do not get right 
pub fn from_file<P: AsRef<Path>>(filename: P, mmap: bool) -> Result<FileData<T>, Box<dyn Error>> {
    if mmap == true {
        return FileData::mmap_file(filename)
    } else {
        return FileData::read_file(filename)
    }
}

which I can't get right. The method always returns a FileData, but depending on the 'mmap' argument, T is different: it can be either Box<[u8]> or Mmap.

I searched for similar questions and articles, but could not find one that matches my situation, e.g. (1), (2), (3).

#[derive(Debug)]
pub struct FileData<T: AsRef<[u8]>> {
    pub filename: String,              
    pub provider: T,                   // data block, file read, mmap, and potentially more
    pub fsize: u64,                    
    pub mmap: bool,                    
}

impl FileData<&[u8]> {
    /// Useful for testing. Create a FileData builder based on some bytes. 
    #[allow(dead_code)]
    pub fn from_bytes(data: &'static [u8]) -> Self {
        FileData {
            filename: String::new(),
            provider: data,
            fsize: data.len() as _,
            mmap: false,
        }
    }
}

pub fn path_to_string<P: AsRef<Path>>(filename: P) -> String {
    return String::from(filename.as_ref().to_str().unwrap_or_default());
}

pub fn file_size(file: &File) -> Result<u64, Box<dyn Error>> {
    Ok(file.metadata()?.len())
}

impl FileData<Box<[u8]>> {
    /// Read the full file content into memory, which will be allocated on the heap.
    #[allow(dead_code)]
    pub fn read_file<P: AsRef<Path>>(filename: P) -> Result<Self, Box<dyn Error>> {
        let mut file = File::open(&filename)?;
        let fsize = file_size(&file)?;

        let mut provider = vec![0_u8; fsize as usize].into_boxed_slice();
        // read_exact fails on a short read; a single read() may return fewer bytes
        file.read_exact(&mut provider)?;

        Ok(FileData {
            filename: path_to_string(&filename),
            provider: provider,
            fsize: fsize,
            mmap: false,
        })
    }
}

impl FileData<Mmap> {
    /// Memory Map the file content
    #[allow(dead_code)]
    pub fn mmap_file<P: AsRef<Path>>(filename: P) -> Result<Self, Box<dyn Error>> {
        let file = File::open(&filename)?;
        let fsize = file_size(&file)?;
        let provider = unsafe { MmapOptions::new().map(&file)? };

        Ok(FileData {
            filename: path_to_string(&filename),
            provider: provider,
            fsize: fsize,
            mmap: true,
        })
    }
}

impl<T: AsRef<[u8]>> FileData<T> {
    #[allow(dead_code)]
    pub fn from_file<P: AsRef<Path>>(filename: P, mmap: bool) -> Result<FileData<T>, Box<dyn Error>> {
        if mmap == true {
            return FileData::mmap_file(filename)
        } else {
            return FileData::read_file(filename)
        }
    }

    pub fn as_ref(&self) -> &[u8] {
        return self.provider.as_ref()
    }
}

The error message is:

error[E0308]: mismatched types
  --> src\data_files\file_data.rs:87:20
   |
83 | impl<T: AsRef<[u8]>> FileData<T> {
   |      - this type parameter
84 |     #[allow(dead_code)]
85 |     pub fn from_file<P: AsRef<Path>>(filename: P, mmap: bool) -> Result<FileData<T>, Box<dyn Error>> {
   |                                                                  ----------------------------------- expected `std::result::Result<file_data::FileData<T>, std::boxed::Box<(dyn std::error::Error + 'static)>>` because of return type
86 |         if mmap == true {
87 |             return FileData::mmap_file(filename)
   |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter `T`, found struct `Mmap`
   |
   = note: expected enum `std::result::Result<file_data::FileData<T>, _>`
              found enum `std::result::Result<file_data::FileData<Mmap>, _>`
pretzelhammer
Juergen
  • _"they seem to be working well. But there is one...which I don't get right."_ -- what error do you get? You didn't really explain what isn't working. – Peter Hall Jan 28 '21 at 12:45
  • Thanks, I updated it slightly. The compiler error messages are different, depending on how I try to fix the method signature. More precisely, the return type of the method. – Juergen Jan 28 '21 at 12:55

1 Answer


Generics give the caller the right to decide what the return type of the function should be. Right now your function, the callee, is deciding the return type, which is why you're getting compiler errors.
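
As a minimal illustration of that rule (the function name `make` is made up for this sketch and is not part of the question's code), the caller picks `T` at each call site, so the function body cannot substitute a concrete type of its own:

```rust
// Hypothetical helper for illustration only: the caller chooses T.
fn make<T: Default>() -> T {
    // The body must work for every T the caller could pick,
    // so it cannot return, say, a String when the caller asked for i32.
    T::default()
}

fn main() {
    let n: i32 = make(); // caller picked i32 via the annotation
    let s = make::<String>(); // caller picked String via turbofish
    println!("{} {:?}", n, s);
}
```

The same reasoning applies to `from_file`: inside `impl<T> FileData<T>`, returning `FileData<Mmap>` is rejected because the caller may have asked for a different `T`.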

You can refactor the code to give the right back to the caller by implementing an additional trait, IntoFileData, and then adding that as a trait bound to your generic FileData<T> implementation. Simplified commented example:

use memmap::Mmap;
use memmap::MmapOptions;
use std::error::Error;
use std::fs::File;
use std::io::Read;
use std::path::Path;

// simplified FileData for brevity
struct FileData<T: AsRef<[u8]>> {
    provider: T,
}

// new trait for converting types into FileData
trait IntoFileData<T: AsRef<[u8]>> {
    fn from_path(path: &Path) -> Result<FileData<T>, Box<dyn Error>>;
}

impl IntoFileData<Box<[u8]>> for Box<[u8]> {
    fn from_path(path: &Path) -> Result<FileData<Box<[u8]>>, Box<dyn Error>> {
        let mut file = File::open(path)?;
        let size = file.metadata()?.len();

        let mut provider = vec![0_u8; size as usize].into_boxed_slice();
        // read_exact fails on a short read; a single read() may return fewer bytes
        file.read_exact(&mut provider)?;

        Ok(FileData { provider })
    }
}

impl IntoFileData<Mmap> for Mmap {
    fn from_path(path: &Path) -> Result<FileData<Mmap>, Box<dyn Error>> {
        let file = File::open(path)?;
        let provider = unsafe { MmapOptions::new().map(&file)? };

        Ok(FileData { provider })
    }
}

// this signature gives the caller the right to choose the type of FileData
impl<T: AsRef<[u8]> + IntoFileData<T>> FileData<T> {
    fn from_path(path: &Path) -> Result<FileData<T>, Box<dyn Error>> {
        T::from_path(path)
    }
}

fn example(path: &Path) {
    // caller asks for and gets file data as Box<[u8]>
    let file_data: FileData<Box<[u8]>> = FileData::from_path(path).unwrap();

    // caller asks for and gets file data as Mmap
    let file_data: FileData<Mmap> = FileData::from_path(path).unwrap();
}



If you want the callee to decide the return type, the concrete type must not appear in the signature, for example by returning a trait object. Simplified commented example:

use memmap::Mmap;
use memmap::MmapOptions;
use std::error::Error;
use std::fs::File;
use std::io::Read;
use std::path::Path;

// simplified FileData for brevity
struct FileData {
    provider: Box<dyn AsRef<[u8]>>,
}

fn vec_from_path(path: &Path) -> Result<FileData, Box<dyn Error>> {
    let mut file = File::open(path)?;
    let size = file.metadata()?.len();

    let mut provider = vec![0_u8; size as usize];
    // read_exact fails on a short read; a single read() may return fewer bytes
    file.read_exact(&mut provider)?;

    Ok(FileData {
        provider: Box::new(provider),
    })
}

fn mmap_from_path(path: &Path) -> Result<FileData, Box<dyn Error>> {
    let file = File::open(path)?;
    let provider = unsafe { MmapOptions::new().map(&file)? };

    Ok(FileData {
        provider: Box::new(provider),
    })
}

impl FileData {
    fn from_path(path: &Path, mmap: bool) -> Result<FileData, Box<dyn Error>> {
        if mmap {
            mmap_from_path(path)
        } else {
            vec_from_path(path)
        }
    }
}

fn example(path: &Path) {
    // file data could be vec or mmap, callee decides
    let file_data = FileData::from_path(path, true).unwrap();
    let file_data = FileData::from_path(path, false).unwrap();
}
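
Another option, not part of the answer above, is to wrap the supported providers in an enum, which also lets the callee decide while keeping static dispatch. The sketch below is only an illustration under stated assumptions: the names `Provider`, `Heap`, and `Static` are hypothetical, and `Static` stands in for a memory-mapped variant (something like `Mapped(Mmap)`) so that the example compiles with the standard library alone.

```rust
// Hypothetical enum wrapping every supported backing store.
// `Static` stands in for a `Mapped(Mmap)` variant so this sketch needs only std.
enum Provider {
    Heap(Box<[u8]>),
    Static(&'static [u8]),
}

impl AsRef<[u8]> for Provider {
    fn as_ref(&self) -> &[u8] {
        match self {
            Provider::Heap(b) => &b[..],
            Provider::Static(s) => *s,
        }
    }
}

// Non-generic FileData: the concrete storage is hidden inside the enum.
struct FileData {
    provider: Provider,
}

impl FileData {
    // The callee picks the variant at runtime; the signature stays the same.
    fn from_bytes(data: &'static [u8], copy_to_heap: bool) -> FileData {
        let provider = if copy_to_heap {
            Provider::Heap(data.to_vec().into_boxed_slice())
        } else {
            Provider::Static(data)
        };
        FileData { provider }
    }
}

fn main() {
    let a = FileData::from_bytes(b"abc", true);
    let b = FileData::from_bytes(b"abc", false);
    // Both variants expose the same bytes through AsRef<[u8]>.
    assert_eq!(a.provider.as_ref(), b.provider.as_ref());
}
```

Compared with `Box<dyn AsRef<[u8]>>`, the enum avoids the extra allocation and dynamic dispatch, at the cost of having to list every provider type up front.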


pretzelhammer
  • Thanks a lot for the clarification about generics and giving power to the caller. I'm not sure it solves my problem, though. Let's assume I have a filename and, depending on some file characteristics (e.g. needs decompression, below a certain size, etc.), the file should be loaded either via mmap (preferred) or by reading the content (possibly decompressing on the fly). The library user (caller) should not need to worry about this internal logic. Provided I understand you correctly, I need to find an approach without generics. Any thoughts on how to achieve this? Maybe a trait. – Juergen Jan 28 '21 at 13:29
  • @Juergen you can achieve this using trait objects. I've updated my answer with an example solution using trait objects. – pretzelhammer Jan 28 '21 at 13:41