tl;dr: In Rust, I want a struct to provide an Iterator over a contained iterable struct member, but to interrogate the iterated values before returning them.

General Overview

  1. A "wrapper" struct contains a "specialized container".
  2. The "wrapper" should implement the Iterator function next.
  3. Within next, the "wrapper" will call next on the "specialized container".
  4. The "wrapper" will analyze the value returned from the "specialized container".
  5. The "wrapper" will return the value to the caller.
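
The five steps above can be sketched with the standard library alone. This is only a minimal illustration, assuming the inner iterator yields String values; the names Wrapper, seen, and total_len are hypothetical and not part of any crate.

```rust
/// Hypothetical wrapper: owns any inner iterator and records
/// statistics about each item that passes through it.
struct Wrapper<I> {
    inner: I,         // the "specialized container" iterator; its state lives here
    seen: usize,      // example statistic: number of items returned so far
    total_len: usize, // example statistic: summed length of all items
}

impl<I> Wrapper<I> {
    fn new(inner: I) -> Self {
        Wrapper { inner, seen: 0, total_len: 0 }
    }
}

impl<I: Iterator<Item = String>> Iterator for Wrapper<I> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Step 3: delegate to the inner iterator (returns None when it is exhausted).
        let item = self.inner.next()?;
        // Step 4: analyze the value.
        self.seen += 1;
        self.total_len += item.len();
        // Step 5: hand the value to the caller.
        Some(item)
    }
}

fn main() {
    let data = vec!["a".to_string(), "bcd".to_string()];
    let mut wrapper = Wrapper::new(data.into_iter());
    while let Some(item) = wrapper.next() {
        println!("{item}");
    }
    println!("seen={} total_len={}", wrapper.seen, wrapper.total_len);
}
```

Because the inner iterator is stored as a field, its position is kept between calls to next, so nothing has to be recreated.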

Specific Example

My specific case deals with the Rust crate evtx. I would like to do the following:

  1. my "wrapper" struct, RecordsReader, contains an EvtxParser.
  2. the RecordsReader implements the Iterator functions into_iter and next, which return a std::result::Result<SerializedEvtxRecord<String>, EvtxError>
  3. within function RecordsReader::next
    a. it calls self.evtxparser.records(), which creates an iterator
    b. the result of self.evtxparser.records().next() is a SerializedEvtxRecord<String>. That instance is analyzed, and various statistics about it are stored within the RecordsReader instance.
    c. the SerializedEvtxRecord<String> instance is then returned to the caller of recordsreader.next().
    d. the state of the iterator created by self.evtxparser.records() is maintained between calls to recordsreader.next(). (how to do this part?)
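
One possible way to keep the inner iterator's state (step d) is to create it once in into_iter and store it in the iterator struct, while borrowing the statistics field separately. The sketch below uses only the standard library: FakeParser is a hypothetical stand-in for EvtxParser, and it assumes that records() returns an iterator that mutably borrows the parser. Stats and RecordsReaderIter are likewise made-up names.

```rust
// Hypothetical stand-in for the parser; the real evtx API may differ.
struct FakeParser {
    lines: Vec<String>,
}

impl FakeParser {
    // Assumption: like eparser.records(), this borrows the parser mutably.
    fn records(&mut self) -> impl Iterator<Item = Result<String, String>> + '_ {
        self.lines.iter().map(|l| {
            if l.is_empty() { Err("empty record".to_string()) } else { Ok(l.clone()) }
        })
    }
}

#[derive(Default)]
struct Stats {
    ok: usize,
    err: usize,
}

struct RecordsReader {
    parser: FakeParser,
    stats: Stats,
}

struct RecordsReaderIter<'a> {
    // Created once; keeps its position between calls to next().
    inner: Box<dyn Iterator<Item = Result<String, String>> + 'a>,
    stats: &'a mut Stats,
}

impl<'a> Iterator for RecordsReaderIter<'a> {
    type Item = Result<String, String>;

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.inner.next()?;
        // Record statistics before returning the item.
        match &item {
            Ok(_) => self.stats.ok += 1,
            Err(_) => self.stats.err += 1,
        }
        Some(item)
    }
}

impl<'a> IntoIterator for &'a mut RecordsReader {
    type Item = Result<String, String>;
    type IntoIter = RecordsReaderIter<'a>;

    fn into_iter(self) -> Self::IntoIter {
        // Two distinct fields are borrowed mutably at the same time.
        RecordsReaderIter {
            inner: Box::new(self.parser.records()),
            stats: &mut self.stats,
        }
    }
}

fn main() {
    let mut reader = RecordsReader {
        parser: FakeParser { lines: vec!["one".into(), String::new(), "three".into()] },
        stats: Stats::default(),
    };
    for record in &mut reader {
        match record {
            Ok(r) => eprintln!("record {r}"),
            Err(e) => eprintln!("ERROR: {e}"),
        }
    }
    eprintln!("ok={} err={}", reader.stats.ok, reader.stats.err);
}
```

Boxing the inner iterator avoids having to name its concrete type; the statistics become readable on the reader again once the loop, and with it the borrow, has ended.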

Partial Rust code (Rust playground (will not compile because the evtx crate is not available)):

use ::evtx::{
    err::EvtxError,
    EvtxParser,
    SerializedEvtxRecord,
};
use std::fs::File;
use std::path::PathBuf;

pub type EvtxRS = SerializedEvtxRecord<String>;
pub type ResultEvtxRS = std::result::Result<EvtxRS, EvtxError>;
pub type EParser = EvtxParser<File>;


struct RecordsReader {
    pub eparser: EParser,
}
pub struct RecordsReaderIterator<'a> {
    recordsreader: &'a mut RecordsReader,
    index: usize, // hack iterator state
}
impl<'a> Iterator for RecordsReaderIterator<'a> {
    type Item = ResultEvtxRS;
    fn next(&mut self) -> Option<ResultEvtxRS> {
        // ----------------------------------------------
        // HOW TO AVOID RECREATING THE self.recordsreader.eparser.records()
        // ON EVERY CALL OF next() ?
        // Or, at least, make next() into *O(n)* time?
        // ----------------------------------------------
        // Hack iterator approach; O(n^2)
        let result = 
            self.recordsreader.eparser.records().nth(self.index);
        self.index += 1;
        // ... store other statistics about `result` before returning it ...
        result
    }
}
impl<'a> IntoIterator for &'a mut RecordsReader {
    type Item = ResultEvtxRS;
    type IntoIter = RecordsReaderIterator<'a>;
    fn into_iter(self) -> Self::IntoIter {
        RecordsReaderIterator {
            recordsreader: self,
            index: 0,
        }
    }
}

fn main() {
    let pathb = PathBuf::from("file.evtx");
    let mut eparser: EParser = EParser::from_path(pathb).unwrap();
    let mut recordsreader: RecordsReader = RecordsReader {
        eparser,
    };
    for recordopt in recordsreader.into_iter() {
        match recordopt {
            ResultEvtxRS::Ok(record) => {
                eprintln!("record[{:?}] {:?}", record.event_record_id, record.timestamp);
            }
            ResultEvtxRS::Err(err) => {
                eprintln!("ERROR: {}", err);
            }
        }
    }
}

The prior code functions BUT next() is O(n^2) time! That's because the state of the self.recordsreader.eparser.records() iterator cannot be saved. That "inner iterator" must be recreated on every call to next().
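
To make the cost concrete, here is a small standard-library sketch that counts how many times an inner iterator is advanced under both approaches. Counting, steps_with_restart, and steps_with_stored are hypothetical names for illustration only; the inner iterator simply yields the numbers 0..len.

```rust
use std::cell::Cell;

// Hypothetical inner iterator that counts every element it yields.
struct Counting<'a> {
    pos: usize,
    len: usize,
    steps: &'a Cell<usize>,
}

impl<'a> Counting<'a> {
    fn new(len: usize, steps: &'a Cell<usize>) -> Self {
        Counting { pos: 0, len, steps }
    }
}

impl Iterator for Counting<'_> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.pos >= self.len {
            return None;
        }
        self.steps.set(self.steps.get() + 1);
        self.pos += 1;
        Some(self.pos - 1)
    }
}

// Mimics the question's approach: rebuild the iterator and skip to `index` each time.
fn steps_with_restart(len: usize) -> usize {
    let steps = Cell::new(0);
    let mut index = 0;
    while Counting::new(len, &steps).nth(index).is_some() {
        index += 1;
    }
    steps.get()
}

// Keeps one iterator alive and advances it once per call.
fn steps_with_stored(len: usize) -> usize {
    let steps = Cell::new(0);
    let mut inner = Counting::new(len, &steps);
    while inner.next().is_some() {}
    steps.get()
}

fn main() {
    let len = 1_000;
    println!("restart: {} inner steps", steps_with_restart(len));
    println!("stored:  {} inner steps", steps_with_stored(len));
}
```

With the restart approach the number of inner steps grows roughly with the square of the number of records, while the stored iterator advances once per record.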

How can I implement an Iterator function for RecordsReader that does not have to reset the "inner iterator" state of eparser.records() on every call?
Barring that question, how can next() become O(n) time?


Similar Questions

I read the Answers to these Questions. As far as I could determine, they are similar but not quite the same as this Question. (I tried to implement many of these Answers but was unable to.)

JamesThomasMoon
  • Should you not just store the result of calling `recordsreader.eparser.records()` in the iterator? Why are you recreating the iterator every invocation? `RecordsReaderIterator::recordsreader` should just be an iterator, then you delegate to its `next()`. This is exactly what `.map()` does... which you could probably use here instead of this construction. – cdhowie Mar 21 '23 at 09:05
  • note: `next()` already is O(n), you want calling `next()` n times to be O(n) which happens if `next()` is O(1) – cafce25 Mar 21 '23 at 11:53
  • @cdhowie "_Should you not just store the result of calling recordsreader.eparser.records() in the iterator?_" I played around with trying to do this; many permutations of this code. I couldn't figure out how to store the iterator instance. Every permutation refused to compile. I could have included all those failed attempts in my Question but it was too much to explain. "_Why are you recreating the iterator every invocation?_" That's what I'm trying to avoid doing, hence the Question. "_what .map() does..._" I'll try using `map`. Thanks! – JamesThomasMoon Mar 21 '23 at 20:42
  • @cafce25 "_next() already is O(n), you want calling next() n times to be O(n) which happens if next() is O(1)_". Yes, to clarify, I couldn't figure out how to only call `next` for _O(n)_ times. The example I provided is known to be a bad implementation. My Question is, how to fix that? – JamesThomasMoon Mar 21 '23 at 20:44

1 Answer

Easiest is to just use Iterator::inspect:

use ::evtx::{
    err::EvtxError,
    EvtxParser,
    SerializedEvtxRecord,
};
use std::fs::File;
use std::path::PathBuf;

pub type EvtxRS = SerializedEvtxRecord<String>;
pub type ResultEvtxRS = std::result::Result<EvtxRS, EvtxError>;
pub type EParser = EvtxParser<File>;

fn main() {
    let pathb = PathBuf::from("file.evtx");
    let mut eparser: EParser = EParser::from_path(pathb).unwrap();
    for recordopt in eparser.records()
            .inspect(|record| todo!("Add record to statistics")) {
        match recordopt {
            ResultEvtxRS::Ok(record) => {
                eprintln!("record[{:?}] {:?}", record.event_record_id, record.timestamp);
            }
            ResultEvtxRS::Err(err) => {
                eprintln!("ERROR: {}", err);
            }
        }
    }
}
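
As a runnable illustration of the same idea without the evtx crate, the sketch below tallies Ok and Err items from inside the inspect closure. The function tally and the Vec of Result<String, String> values are hypothetical stand-ins for the parser's record stream.

```rust
// Hypothetical helper: counts Ok and Err records while the caller's loop consumes them.
fn tally(records: Vec<Result<String, String>>) -> (usize, usize) {
    let (mut ok, mut err) = (0usize, 0usize);
    // inspect() sees a reference to each item just before the loop body receives it.
    for record in records.into_iter().inspect(|r| match r {
        Ok(_) => ok += 1,
        Err(_) => err += 1,
    }) {
        match record {
            Ok(r) => eprintln!("record {r}"),
            Err(e) => eprintln!("ERROR: {e}"),
        }
    }
    // The closure's borrows of `ok` and `err` end with the loop.
    (ok, err)
}

fn main() {
    let records = vec![
        Ok("first".to_string()),
        Err("bad chunk".to_string()),
        Ok("second".to_string()),
    ];
    let (ok, err) = tally(records);
    eprintln!("ok={ok} err={err}");
}
```

The closure passed to inspect is an FnMut, so it can update captured counters; the iterator itself is created once and advanced once per record.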
Jmb