I am trying to copy a socket and send it to a different process in Python.

The socket is created in Rust and shared as a Python object through PyO3.

Here is the shared socket code:


use pyo3::prelude::*;

use socket2::{Domain, Protocol, Socket, Type};
use std::net::SocketAddr;

#[pyclass]
#[derive(Debug)]
pub struct SocketHeld {
    pub socket: Socket,
}

#[pymethods]
impl SocketHeld {
    #[new]
    pub fn new(address: String, port: i32) -> PyResult<SocketHeld> {
        let socket = Socket::new(Domain::IPV4, Type::STREAM, Some(Protocol::TCP))?;
        println!("{}", address);
        let address: SocketAddr = address.parse()?;
        socket.set_reuse_address(true)?;
        //socket.set_reuse_port(true)?;
        socket.bind(&address.into())?;
        socket.listen(1024)?;

        Ok(SocketHeld { socket })
    }

    pub fn try_clone(&self) -> PyResult<SocketHeld> {
        let copied = self.socket.try_clone()?;
        Ok(SocketHeld { socket: copied })
    }
}

impl SocketHeld {
    pub fn get_socket(&self) -> Socket {
        self.socket.try_clone().unwrap()
    }
}


Below is the Python code, where I am trying to start two different processes. I tried the native multiprocessing library, the multiprocess fork, and even the pathos library.



    def start(self, url="127.0.0.1", port=5000):
        """
        [Starts the server]

        :param port [int]: [reperesents the port number at which the server is listening]
        """
        socket = SocketHeld(f"0.0.0.0:{port}", port)
        if not self.dev:
            from pathos.pools import ProcessPool
            pool = ProcessPool(nodes=2)
            # spawned_process(url, port, self.routes, socket.try_clone(), f"Process {1}")
            pool.map(spawned_process, [(url, port, self.routes, socket.try_clone(), f"Process {1}"), (url, port, self.routes, socket.try_clone(), f"Process {2}")])
            # for i in range(2):
            #     copied = socket.try_clone()
            #     p = Pool().map(
            #         spawned_process,
            #         args=(self.routes, copied, f"Process {i}"),
            #     )
            #     p.start()

            # input("Press Cntrl + C to stop \n")
            # self.server.start(url, port)
        else:
            ...

However, I still get an error that the object cannot be serialized.

I get the following error:


Traceback (most recent call last):
  File "integration_tests/base_routes.py", line 75, in <module>
    app.start(port=5000, url='0.0.0.0')
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/robyn/__init__.py", line 95, in start
    pool.map(spawned_process, [(url, port, self.routes, socket.try_clone(), f"Process {1}"), (url, port, self.routes, socket.try_clone(), f"Process {2}")])
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/pathos/multiprocessing.py", line 139, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/pool.py", line 771, in get
    raise self._value
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/pool.py", line 537, in _handle_tasks
    put(task)
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/connection.py", line 209, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/reduction.py", line 54, in dumps
    cls(buf, protocol, *args, **kwds).dump(obj)
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/dill/_dill.py", line 498, in dump
    StockPickler.dump(self, obj)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 485, in dump
    self.save(obj)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 899, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 884, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 884, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 884, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 884, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 899, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 576, in save
    rv = reduce(self.proto)
TypeError: cannot pickle 'builtins.SocketHeld' object


Am I going wrong conceptually somewhere here? What is the solution for this?

PS:

I am trying to start a server runtime in each process.


def spawned_process(url, port, handlers, socket, name):
    import asyncio
    import uvloop

    uvloop.install()
    loop = uvloop.new_event_loop()
    asyncio.set_event_loop(loop)

    print(handlers)
    server = Server()

    for i in handlers:
        route_type, endpoint, handler, is_async, number_of_params = i
        print(i)
        server.add_route(route_type, endpoint, handler, is_async, number_of_params)

    print(socket, name)
    server.start(url, port, socket, name)
    loop.run_forever()

  • I don't think you can send a socket to another process (you could to another thread). Did you actually try to do the same in Python with the socket library itself? – Netwave Oct 31 '21 at 16:04
  • @Netwave, I did not yet. Since the majority of the codebase is in Rust, I was trying to stick to it. Also, I don't understand how it would be any different conceptually, since the socket is being cloned fine. I thought this should work. – Sanskar Jethi Oct 31 '21 at 16:08

1 Answer

Sockets are basically just a process-relative reference to some OS kernel structure. Since pickling involves only the user-space part of this reference and not the kernel structure, sockets cannot simply be pickled and restored in some other process, on a different machine, etc.
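
You can see the same failure with a plain stdlib socket; its default pickling hook refuses outright:

import pickle
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
pickle.dumps(s)  # raises TypeError: cannot pickle 'socket' object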

On UNIX systems, file descriptors can be passed between processes via UNIX domain sockets, which creates another reference to the same kernel structure in the receiving process. This will not work with SSL sockets though, since there is user-space state attached to such a socket which is part of neither the file descriptor nor the Python-specific pickling.
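
Here is a minimal sketch of that fd-passing approach, assuming UNIX, using the multiprocessing.reduction.send_handle/recv_handle helpers (they pass the fd over a UNIX domain socket via SCM_RIGHTS); the worker function and the address are just for illustration:

import multiprocessing
import socket
from multiprocessing import reduction

def worker(conn):
    # Receive a duplicate of the parent's listening fd and wrap it
    # back into a socket object in this process.
    fd = reduction.recv_handle(conn)
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM, fileno=fd)
    print("child got", listener.getsockname())
    listener.close()

if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 5000))
    listener.listen(1024)

    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=worker, args=(child_conn,))
    p.start()
    # Hand the listening socket's fd to the child over the pipe.
    reduction.send_handle(parent_conn, listener.fileno(), p.pid)
    p.join()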

  • What approach should I take instead? – Sanskar Jethi Oct 31 '21 at 16:10
  • @SanskarJethi, maybe some producer/consumer pattern will do. But we lack context on what you are trying to do. – Netwave Oct 31 '21 at 16:14
  • @SanskarJethi: You only ask about problem Y of an [XY problem](https://en.wikipedia.org/wiki/XY_problem). This problem Y cannot be solved in a generic way. You therefore might need to rethink your approach to solving problem X. We cannot help here since we don't know X. – Steffen Ullrich Oct 31 '21 at 16:17
  • Sorry for the limited info. I am trying to run an execution runtime in each process; basically, I am trying to read a TCP socket from multiple processes. I have updated the code snippet in the description; maybe that will be more helpful. – Sanskar Jethi Oct 31 '21 at 16:20
  • Giving it another thought, maybe queues would work, as I am copying sockets in the main process anyway and I don't need to share a single socket across the processes. – Sanskar Jethi Oct 31 '21 at 16:37
  • @SanskarJethi: The OS is unknown, but at least on UNIX several server models are implemented that don't need explicit sharing of file descriptors between processes; they use the fact that forked children inherit the parent's file descriptors. In the forking model a new child is forked for each connection, and the connected file descriptor is shared at fork time. In the pre-forking model the listening file descriptor is shared and each child runs its own accept loop (see the sketch after these comments). – Steffen Ullrich Oct 31 '21 at 16:54
  • Sockets _can_ be pickled, cf. https://github.com/python/cpython/blob/ef25febcf2ede92a03c5ea00a13e167e0b5cb274/Lib/multiprocessing/reduction.py#L230, as Linux supports copying a socket to another process. – auxsvr Apr 25 '23 at 18:43
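
A minimal pre-forking sketch along the lines of the last two comments, assuming UNIX with the fork start method (multiprocessing's pickler knows how to duplicate plain stdlib sockets, per the reduction.py link above); the handler body and the port are just for illustration:

import multiprocessing
import socket

def accept_loop(listener, name):
    # Each worker runs its own accept loop on the shared listening socket.
    while True:
        client, addr = listener.accept()
        client.sendall(f"handled by {name}\n".encode())
        client.close()

if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("0.0.0.0", 5000))
    listener.listen(1024)

    workers = [
        multiprocessing.Process(target=accept_loop, args=(listener, f"Process {i}"))
        for i in range(2)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()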