How do I implement an iterator from a vector of std::Rc> smart pointers?

Question

I'm trying to understand how to work with interior mutability. This question is strongly related to my previous question.

I have a generic struct Port<T> that owns a Vec<T>. We can "chain" port B to port A so, when reading the content of port A, we are able to read the content of port B. However, this chaining is hidden to port A's reader. That is why I implemented the iter(&self) method:

use std::rc::Rc;

pub struct Port<T> {
    values: Vec<T>,
    ports: Vec<Rc<Port<T>>>,
}

impl <T> Port<T> {
    pub fn new() -> Self {
        Self { values: vec![], ports: vec![] }
    }

    pub fn add_value(&mut self, value: T) {
        self.values.push(value);
    }

    pub fn is_empty(&self) -> bool {
        self.values.is_empty() && self.ports.is_empty()
    }

    pub fn chain_port(&mut self, port: Rc<Port<T>>) {
        if !port.is_empty() {
            self.ports.push(port)
        }
    }

    pub fn iter(&self) -> impl Iterator<Item = &T> {
        self.values.iter().chain(
            self.ports.iter()
                .flat_map(|p| Box::new(p.iter()) as Box<dyn Iterator<Item = &T>>)
        )
    }

    pub fn clear(&mut self) {
        self.values.clear();
        self.ports.clear();
    }
}

The application has the following pseudo-code behavior:

create ports
loop:
- fill ports with values
- chain ports
- iterate over ports' values
- clear ports

The main function should look like this:

fn main() {
    let mut port_a = Rc::new(Port::new());
    let mut port_b = Rc::new(Port::new());

    loop {
        port_a.add_value(1);
        port_b.add_value(2);

        port_a.chain_port(port_b.clone());

        for val in port_a.iter() {
            // read data
        };

        port_a.clear();
        port_b.clear();
    }
}

However, the compiler complains:

error[E0596]: cannot borrow data in an `Rc` as mutable
  --> src/modeling/port.rs:46:9
   |
46 |         port_a.add_value(1);
   |         ^^^^^^ cannot borrow as mutable
   |
   = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `Rc<Port<i32>>`

I've been reading several posts etc., and it seems that I need to work with Rc<RefCell<Port<T>>> to be able to mutate the ports. I changed the implementation of Port<T>:

use std::cell::RefCell;
use std::rc::Rc;

pub struct Port<T> {
    values: Vec<T>,
    ports: Vec<Rc<RefCell<Port<T>>>>,
}

impl<T> Port<T> {

    // snip

    pub fn chain_port(&mut self, port: Rc<RefCell<Port<T>>>) {
        if !port.borrow().is_empty() {
            self.ports.push(port)
        }
    }

    pub fn iter(&self) -> impl Iterator<Item = &T> {
        self.values.iter().chain(
            self.ports
                .iter()
                .flat_map(|p| Box::new(p.borrow().iter()) as Box<dyn Iterator<Item = &T>>),
        )
    }

    // snip
}

This does not compile either:

error[E0515]: cannot return value referencing temporary value
  --> src/modeling/port.rs:35:31
   |
35 |                 .flat_map(|p| Box::new(p.borrow().iter()) as Box<dyn Iterator<Item = &T>>),
   |                               ^^^^^^^^^----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |                               |        |
   |                               |        temporary value created here
   |                               returns a value referencing data owned by the current function

I think I know what the problem is: p.borrow() returns a reference to the port being chained. We use that reference to create the iterator, but as soon as the function is done, the reference goes out of scope and the iterator is no longer valid.

I have no clue on how to deal with this. I managed to implement the following unsafe method:

    pub fn iter(&self) -> impl Iterator<Item = &T> {
        self.values.iter().chain(self.ports.iter().flat_map(|p| {
            Box::new(unsafe { (&*p.as_ref().as_ptr()).iter() }) as Box<dyn Iterator<Item = &T>>
        }))
    }

While this works, it uses unsafe code, and there must be a safe workaround.

I set a playground for more details of my application. The application compiles and outputs the expected result (but uses unsafe code).

Why must main look like this? Can't you make a `new()` port_a, port_b each iteration, and save the manual `clean()` call? If not, why? That would already fix all immediate ownership issues - because they are inherent to that formulation. To get rid of the remaining complexity perhaps rewrite the recursive iter into an iterative one. — L. Riemer, Jun 09 '21 at 15:08
The idea is that we have two different "entities" that behave independently. Each entity has a `Port` (or an `Rc>>`), and the entity containing `port_b` can send messages to the entity containing `port_a`. Of course, this is a very simple case. Usually, you will have tens of entities, each of them with 2 or 3 ports, and how ports are "chained" also depends on the use case. Is it efficient to create/destroy the ports on every iteration? — Román Cárdenas, Jun 09 '21 at 15:15
You already do most of the work by cleaning up the structs after each iteration manually. There is no use to carrying the structs across iterations, but a significant (in this case fatal) burden to ownership reasoning. — L. Riemer, Jun 09 '21 at 15:19

L. Riemer · Answer 1 · 2021-06-09T17:21:11.707

You can't modify anything behind an Rc, that's correct. While this might be solved with a RefCell, you don't want to go down that road. You might come into a situation where you'd need to enforce a specific clean() order or similar horrors.

More important: your main is fundamentally flawed, ownership-wise. Take these lines:

let mut port_a = Port::new();
let mut port_b = Port::new();

loop {
    // Creates an iummutable borrow of port_b with same lifetime as port_a!
    port_a.chain_port(port_b);

    // ...

    // A mutable borrow of port_b.
    // But the immutable borrow from above persists across iterations.
    port_b.clear();
    // Or, even if you do fancy shenanigans at least until this line.
    port_a.clear();
}

To overcome this, just constrain the ports lifetime to one iteration. You currently manually clean them up anyway, so that's already what you're doing conceptually.

Also, I got rid of that recursive iteration, just to simplify things a little more.

#[derive(Clone)]
pub struct Port<'a, T> {
    values: Vec<T>,
    ports: Vec<&'a Port<'a, T>>,
}

impl<'a, T> Port<'a, T> {
    pub fn new() -> Self {
        Self {
            values: vec![],
            ports: vec![],
        }
    }

    pub fn add_value(&mut self, value: T) {
        self.values.push(value);
    }

    pub fn is_empty(&self) -> bool {
        self.values.is_empty() && self.ports.is_empty()
    }

    pub fn chain_port(&mut self, port: &'a Port<T>) {
        if !port.is_empty() {
            self.ports.push(&port)
        }
    }

    pub fn iter(&self) -> impl Iterator<Item = &T> {
        let mut port_stack: Vec<&Port<T>> = vec![self];

        // Sensible estimate I guess.
        let mut values: Vec<&T> = Vec::with_capacity(self.values.len() * (self.ports.len() + 1));

        while let Some(port) = port_stack.pop() {
            values.append(&mut port.values.iter().collect());
            port_stack.extend(port.ports.iter());
        }

        values.into_iter()
    }
}

fn main() {
    loop {
        let mut port_a = Port::new();
        let mut port_b = Port::new();

        port_a.add_value(1);
        port_b.add_value(2);

        port_a.chain_port(&port_b);
 
        print!("values in port_a: [ ");
        for val in port_a.iter() {
            print!("{} ", val);
        }
        println!("]");
    }
}

I'll write a quick prototype of my application following your suggestions. Let's see how it goes — Román Cárdenas, Jun 09 '21 at 16:01
I implemented a new version of my application mixing your idea with the answer to my previous question. However, I changed significantly the code (I implemented the `Entity` struct). Should I post this as an alternative answer or should I edit my question? I created a [new playground](https://play.rust-lang.org) with my new code. I'd appreciate your opinion. Thanks in advance! — Román Cárdenas, Jun 11 '21 at 08:04
* Entities have input and output ports. * Input ports are owned by the entities, and exist forever * Output ports are created/filled/chained on every iteration * I didn't strictly follow your suggestion because you iterate and consume all the chained ports for creating an iterator of a port. That may degrade performance, isn't it? — Román Cárdenas, Jun 11 '21 at 08:06
I could take a look, if the link were working. If you have another question, open another question. If you have minor edits, edit the question. If you just actually had an entirely different problem, I see no point in posting an alternative answer - it wouldn't benefit future readers coming here by searching. — L. Riemer, Jun 11 '21 at 10:52
Concerning performance: it depends. If you have two vectors of a million elements, the first option might be better. If you have a million vectors of two elements, I could see mine performing better. Not that there would be any significant difference for such tiny quantities. — L. Riemer, Jun 11 '21 at 10:54

How do I implement an iterator from a vector of std::Rc> smart pointers?

1 Answers1

Linked