
Bringing this over from softwareengineering. I was told this question might be a better fit for Stack Overflow.

I am sending a stream of video data to another peer and want to reassemble that data on the other end and use it as the source of a video element. I record the data using the npm package RecordRTC, which gives me a Blob of data every second.

I send it over a WebRTC Data Channel and initially tried to reassemble the data using the MediaSource API, but it turns out that MediaSource doesn't support data with a mimetype of `video/webm;codecs=vp8,pcm`. Are there any thoughts on how to reassemble this stream? Is it possible to modify the MediaSource API?
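One way to see the problem is to probe `MediaSource.isTypeSupported()` before trying to append anything. This is only a sketch: `pickSupportedMime` is a made-up helper name, and the support checker is passed in as a parameter so the function can run outside a browser.

```javascript
// Return the first mime string the runtime claims it can play, or null.
// The checker is injected (rather than calling MediaSource directly) so
// the helper can be exercised outside a browser.
function pickSupportedMime(candidates, isSupported) {
  for (const mime of candidates) {
    if (isSupported(mime)) return mime;
  }
  return null;
}

// Browser-only usage (guarded so the snippet is harmless elsewhere):
if (typeof MediaSource !== 'undefined') {
  const mime = pickSupportedMime(
    ['video/webm;codecs=vp8,pcm', 'video/webm;codecs=vp8,opus'],
    (m) => MediaSource.isTypeSupported(m)
  );
  // vp8+pcm is expected to be rejected, so this likely falls back to opus
  console.log('usable mime:', mime);
}
```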

My only requirement for this stream is that the audio be encoded as PCM, but if you have any thoughts or questions, please let me know!

P.S. I thought opinion-based questions weren't for Stack Overflow, so that's why I posted there first.

Malcolm
  • Why don't you simply pass the raw MediaStream through your webRTC server? – Kaiido Dec 01 '20 at 01:14
  • @Kaiido because webrtc doesn't support that mimetype. I specifically need wav as the audio codec but pcm works since the only difference is the magic number. The idea is to bypass that with an RTC Datachannel and reassembling the data on the other end so to be able to use a custom mimetype – Malcolm Dec 01 '20 at 01:35
  • Then, why do you think you need PCM? If it's to be read by the other peer, passing the raw MediaStream is the best you can do. If you were hoping for better quality because "PCM is lossless", then know that the data RecordRTC produces comes from that stream anyway (which is probably encoded in ogg), so this is only generating a bigger version of **the same** data. – Kaiido Dec 01 '20 at 01:47
  • 1
    I need a lossless audio codec for my applications specific needs. WebRTC's default audio codec is Opus but since it is lossy it can't be used for my applications needs. SDP munging is a viable option but again WebRTC doesn't support pcm from what I understand. – Malcolm Dec 01 '20 at 02:04
  • And what is "your applications specific needs"? Can't you do the decoding to PCM on the fly where you need that PCM data? The Web Audio API can do it for you (it's what RecordRTC does in the first place). So you send the raw MediaStream through webRTC, you let the peer read this stream directly using the video's `srcObject` and only "for your applications specific needs" you decode that stream to PCM. – Kaiido Dec 01 '20 at 02:09
  • 1
    I've been wondering this - is the raw mediastream returned from something like getusermedia what format is that in? I specifically need lossless audio because I am feeding that data into a DAW like logic pro, fla studio, garageband, etc for live mixing and recording. RecordRTC gives me the options needed for things like sample rate, audio bits per second, etc.. – Malcolm Dec 01 '20 at 02:20
  • You can check it using https://developer.mozilla.org/en-US/docs/Web/API/RTCRtpSender/getCapabilities; generally it's Opus. – Kaiido Dec 01 '20 at 02:42

1 Answer


The easiest way to handle this is to proxy the stream through a server where you can return the stream as an HTTP response. Then, you can do something as simple as:

<video src="https://example.com/your-stream"></video>

The downside of course is that now you have to cover the bandwidth cost, since the connection is no longer peer-to-peer.

It would be nice if you could use a Service Worker and have it return a faked HTTP response built from the data you're receiving from the peer. Unfortunately, the browser vendors have crippled Service Workers by disabling them when the user reloads the page or uses a private browsing mode. (It seems they assumed Service Workers were only useful for caching.)
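For what it's worth, within those limits the Service Worker version would look roughly like this (a sketch only: it assumes the page forwards each data-channel chunk to the worker with `postMessage`, and the registration of `sw.js` is omitted):

```javascript
// Build a streaming HTTP response for the video element from a ReadableStream.
// Factored out as a plain function so it can be exercised outside a worker.
function makeStreamResponse(stream) {
  return new Response(stream, {
    headers: { 'Content-Type': 'video/webm' },
  });
}

// Worker-only part (guarded so the snippet is inert elsewhere):
if (typeof self !== 'undefined' && 'ServiceWorkerGlobalScope' in globalThis) {
  let controller;
  const body = new ReadableStream({ start(c) { controller = c; } });

  // Answer the <video> element's fetch with the fabricated response.
  self.addEventListener('fetch', (event) => {
    if (new URL(event.request.url).pathname === '/your-stream') {
      event.respondWith(makeStreamResponse(body));
    }
  });

  // Chunks forwarded by the page via postMessage feed the stream.
  self.addEventListener('message', (event) => {
    controller.enqueue(new Uint8Array(event.data));
  });
}
```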

Also, a note on WebRTC... what you're doing is fine. You don't want to use the normal WebRTC media streams, as not only are they lossily compressed, but they will also drop segments to prioritize staying realtime over quality. That doesn't sound like what you want.

> I've been wondering this - what format is the raw MediaStream returned from something like getUserMedia in?

The MediaStream is the raw data, but it isn't accessible directly. If you attach the MediaStream to a Web Audio API graph, whatever format the sound card captured in is converted to 32-bit floating point PCM. At this point, you can use a script processor node to capture the raw PCM data.
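A rough sketch of that capture path follows. Note that `ScriptProcessorNode` is deprecated in favor of `AudioWorklet`, but it is the simpler illustration; the float-to-int16 conversion shown is what you'd typically do before handing samples to a WAV writer or DAW-facing pipe.

```javascript
// Convert Web Audio's 32-bit float samples (range [-1, 1]) into 16-bit
// signed PCM, the layout most WAV/DAW tooling expects.
function floatTo16BitPCM(float32Samples) {
  const out = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i])); // clamp
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

// Browser-only part (guarded): tap the microphone MediaStream with a
// ScriptProcessorNode, which is essentially what RecordRTC-style
// recorders do internally.
if (typeof AudioContext !== 'undefined') {
  const ctx = new AudioContext();
  navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
    const source = ctx.createMediaStreamSource(stream);
    const tap = ctx.createScriptProcessor(4096, 1, 1);
    tap.onaudioprocess = (e) => {
      const pcm = floatTo16BitPCM(e.inputBuffer.getChannelData(0));
      // ship pcm.buffer over your data channel here
    };
    source.connect(tap);
    tap.connect(ctx.destination); // some browsers only run connected nodes
  });
}
```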

Brad
  • Is there a way to append a new ArrayBuffer if I went the route of the HTTP proxy? For the second part regarding Service Workers, it should be OK, since if a user refreshed the page the video chat would drop anyway. As for getUserMedia and attaching it to a Web Audio graph: is there a way to do that for the entire video stream and only modify the audio settings? If it were just audio, wouldn't I have to split the audio and video tracks and then put them back together? Latency is my biggest fear in the project right now and I'm trying to minimize it as much as possible – Malcolm Dec 01 '20 at 18:27
  • @Malcolm Your server side would just constantly receive data from the sending client (presumably via Web Socket), and would continue to send data via HTTP to the receiving end. – Brad Dec 01 '20 at 18:29
  • @Malcolm Service Workers are still a problem... as in, the feature is disabled if the user refreshes the page because the assumption is that service workers are only for caching, and refreshing is to reload without cache. – Brad Dec 01 '20 at 18:29
  • @Malcolm Yes, there is no *good* way to split and rejoin the audio/video. You can, and you can make a new MediaStream out of it again, but synchronization can be a problem because the inherent timestamps are now gone. – Brad Dec 01 '20 at 18:30
  • @Malcolm RE:Latency, you have to pick what you want... quality or low latency. You can't really have both. :-) You're either going to have to drop samples to keep realtime, or keep them all, buffer a decent amount, and keep that buffer full as you stream. You'll have to pick the best balance. If latency is an issue, you need WebRTC straight-up, and you'll want to pick a high bitrate for the Opus stream with your munging. – Brad Dec 01 '20 at 18:31
  • Thank you so much Brad! I'm going to mark this as the correct answer. Is it ok if I DM you with a couple more Q's by any chance? – Malcolm Dec 01 '20 at 20:48
  • @Malcolm I am available for paid consulting if you're interested. You can contact me at brad@audiopump.co. Otherwise, if your questions are broadly applicable, you can continue posting on Stack Overflow and I will usually answer here. – Brad Dec 01 '20 at 20:55