2

I'm trying to write this following code for a server:

use std::io::{BufReader, BufWriter};
use std::net::TcpStream;

struct User<'a> {
    stream: Arc<TcpStream>,
    reader: BufReader<&'a TcpStream>,
    writer: BufWriter<&'a TcpStream>,
}

fn accept_socket(users: &mut Vec<User>, stream: Arc<TcpStream>) {
    let stream_clone = stream.clone();
    let user = User {
        stream: stream_clone,
        reader: BufReader::new(stream_clone.as_ref()),
        writer: BufWriter::new(stream_clone.as_ref()),
    };
    
    users.push(user);
}

The stream is behind an Arc because it is shared across threads. The BufReader and BufWriter point to the User's own Arc, but the compiler complains that the reference stream_clone.as_ref() does not live long enough, even though it obviously does (it points to the Arc, which isn't dropped as long as the User is alive). How do I get the compiler to accept this code?

cry0genic
  • 544
  • 3
  • 15

3 Answers3

4

Self-referential structs are a no-go. Rust has no way of updating the address in the references if the struct is moved since moving is always a simple bit copy. Unlike C++ with its move constructors, there's no way to attach behavior to moves.

What you can do instead is store Arcs inside the reader and writer so they share ownership of the TcpStream.

struct User {
    stream: Arc<TcpStream>,
    reader: BufReader<IoArc<TcpStream>>,
    writer: BufWriter<IoArc<TcpStream>>,
}

The tricky part is that Arc doesn't implement Read and Write. You'll need a newtype that does (IoArc, above). Yoshua Wuyts wrote about this problem:

One of those patterns is perhaps lesser known but integral to std’s functioning: impl Read/Write for &Type. What this means is that if you have a reference to an IO type, such as File or TcpStream, you’re still able to call Read and Write methods thanks to some interior mutability tricks.

The implication of this is also that if you want to share a std::fs::File between multiple threads you don’t need to use an expensive Arc<Mutex<File>> because an Arc<File> suffices.

You might expect that if we wrap an IO type T in an Arc that it would implement Clone + Read + Write. But in reality it only implements Clone + Deref<T>... However, there's an escape hatch here: we can create a wrapper type around Arc<T> that implements Read + Write by dereferencing &T internally.

Here is his solution:

/// A variant of `Arc` that delegates IO traits if available on `&T`.
#[derive(Debug)]
pub struct IoArc<T>(Arc<T>);

impl<T> IoArc<T> {
    /// Create a new instance of IoArc.
    pub fn new(data: T) -> Self {
        Self(Arc::new(data))
    }
}

impl<T> Read for IoArc<T>
where
    for<'a> &'a T: Read,
{
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        (&mut &*self.0).read(buf)
    }
}

impl<T> Write for IoArc<T>
where
    for<'a> &'a T: Write,
{
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        (&mut &*self.0).write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        (&mut &*self.0).flush()
    }
}

MIT license

IoArc is available in the io_arc crate, though it is short enough to implement yourself if you don't want to pull in the dependency.

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
1

Simple answer: You can't.

In Rust, every type is implicitly movable by memcpy. So if your type stores references to itself, it would break as soon as the move happens; the references would be dangling.

More complex answer: You can't, unless you use Pin, unsafe and raw pointers.

But I'm pretty sure that using Arc for everything is the way to go instead.

Arc<TcpStream> does not implement Read or Write

You could just write a very thin wrapper struct around Arc<TcpStream> which implements Read and Write. It should be fairly easy.

Edit: Take a look at @JohnKugelman's anwser for such a wrapper.

Finomnis
  • 18,094
  • 1
  • 20
  • 27
0

Your code doesn't compile because you are using a reference that the compiler attributed as being good only while the function is running -- even if the Arc holding it may live much longer.

But what would happen in the following scenario:

  • a User holds the last Arc to TcpStream
  • User is dropped, but the stream field is dropped first ?

You should be careful when dropping -- re-arranjing the declaration order deals with that: fields are dropped in the order they were declared.

That being said, there is indeed a recommended way to achieve self references -- it is documented in the Pin module:

https://doc.rust-lang.org/std/pin/index.html#example-self-referential-struct

But you don't need it here, since you are not using a reference to your own struct -- and the "pinning" is played by Arc in your code.

Even so, if you want to go through the unsafe route, you'll need to tell the compiler to turn off some of its checks, as you "know what you are doing".

There is little to no benefit in going unsafe here (see John Kugelman's answer for a safe version), but, anyway, here it is, since it avoids creating & destroying 2 Arcs per connection, which might save you some dozens or hunderds of nano-seconds.

Without further ado, here is a version -- closer to your original code -- that compiles:

(but, please, notice the compiler is no longer proving this code is correct)

use std::io::{BufReader, BufWriter};
use std::net::TcpStream;
use std::sync::Arc;

struct User {
    reader: BufReader<&'static TcpStream>,
    writer: BufWriter<&'static TcpStream>,
    // notice the field order is important here:
    // `stream` must be dropped after `reader` and `writer` or else they
    // would hold invalid references. Dropping is done in the same order as declaration.
    stream: Arc<TcpStream>,
}

fn accept_socket(users: &mut Vec<User>, stream: Arc<TcpStream>) {
    let stream_ref = unsafe { &*Arc::as_ptr(&stream) };
    let user = User {
        reader: BufReader::new(stream_ref),
        writer: BufWriter::new(stream_ref),
        stream,
    };

    users.push(user);
}

fn main() {
    let stream = Arc::new(TcpStream::connect("google.com:80").unwrap());
    let mut users = Vec::with_capacity(16);  // giving a big enough capacity may squeeze a little bit more performance -- avoids reallocations
    accept_socket(&mut users, stream);
}
zertyz
  • 601
  • 6
  • 9
  • I see many problems here. Your use of `Arc::into_raw` without a corresponding `Arc::from_raw` will cause a memory leak. Calling `drop(self)` that way does nothing. *"makes sure the Arc is valid even if the stream field is deleted first"* - struct drop order is the same as the declared order, so if you order `stream` after `reader` and `writer`, you can be sure it is dropped last. Your `User` struct still has a lifetime when it shouldn't; you probably need an intermediary that implements `Read` and `Write` through a `*const TcpStream` to avoid that. – kmdreko Sep 02 '23 at 00:12
  • Thanks for the feedback. I've made some modifications that should address the topics you pointed out: Arc now has no leaks, Drop was fixed and references were simplified to 'static. I'd appreciate if you could evaluate this again, as I am considering using something similar to this to do a micro optimization in my own code. I still left the Arcs & references to the inner types as I want to try to avoid using pointers. Do you think it would be safer? – zertyz Sep 02 '23 at 00:57
  • What you are doing in the `Drop` implementation is still nonsense; the `arc_pointer` does nothing to preserve `stream` since it will be destroyed before the fields of `User` are. But fortunately you don't need the `Drop` implementation at all since you are now dropping fields in the right order. To avoid the strong-count issue from `into_raw`, just don't use that method; use `Arc::as_ptr` instead. As for using it yourself for micro optimizations, this has *very* little advantage over John Kugelman's solution. – kmdreko Sep 02 '23 at 01:09
  • Wonderful! Much simpler now! Thank you! – zertyz Sep 02 '23 at 01:30
  • Nice, I *think* this is fine (you still have the responsibility to never expose the reader and writer as `'static` because they're not) but it'd be good to verify `unsafe` code with `miri` (though this exact rendition of code can't be ran under miri since it tries to use OS utilities, but a variant should be made that can). – kmdreko Sep 02 '23 at 01:41