If I understand you correctly, you want to use WebRTC, which is primarily a browser-targeted feature, without a browser :-)
I could imagine that "emulating" the browser behaviour can be done simply by implementing the necessary API in your own code, either directly inside Rhino or a similar engine, or by controlling the interface that handles the media streams directly in native code.
So what has to be done is to implement the WebRTC API, which controls capturing the A/V from the input devices and sending it to the other side. As I understand it, this would be a node with no UI, such as an embedded Ethernet camera with a mic that serves as the A/V capture device in a conference room.
I am afraid that it could be quite a piece of work, as the main part is the media and connection handling.
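
To give a rough idea of the shape such a node could take, here is a minimal sketch. It is not a WebRTC implementation: `Frame`, `FakeCamera` and `HeadlessPeer` are hypothetical names I made up for illustration, the "camera" only returns canned data, and the transport is a plain callback. The real work (signaling, ICE, DTLS-SRTP, codecs) would sit behind `connect()` and `send`, which is exactly the part I expect to be expensive.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    """One captured chunk of audio or video (hypothetical type)."""
    kind: str       # "audio" or "video"
    payload: bytes


class FakeCamera:
    """Stands in for the native capture code of the device (assumption)."""

    def __init__(self, frames: List[Frame]) -> None:
        self._frames = list(frames)

    def read(self) -> List[Frame]:
        # Hand over everything captured so far and clear the buffer.
        captured, self._frames = self._frames, []
        return captured


@dataclass
class HeadlessPeer:
    """Plays the role a browser's peer connection would play (assumption)."""
    source: FakeCamera
    send: Callable[[Frame], None]   # stands in for the network transport
    state: str = "new"
    sent: int = 0

    def connect(self) -> None:
        # A real node would do signaling, ICE and DTLS-SRTP setup here.
        self.state = "connected"

    def pump(self) -> int:
        """Forward all pending captured frames to the remote side."""
        if self.state != "connected":
            raise RuntimeError("connect() must be called before pump()")
        frames = self.source.read()
        for frame in frames:
            self.send(frame)
        self.sent += len(frames)
        return len(frames)
```

On the device you would call `connect()` once and then `pump()` in a loop, with `send` wired to whatever native media stack you end up using. No UI is involved anywhere, which matches the conference room camera case.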