How should the client side handle so many audio streams in a large conference?

Question

In an SFU audio conference platform, the media server simply routes audio packets. Let's say that on the client side I keep an audio packet queue for each participant present (the list is updated by the signaling server), and at a fixed rate I dequeue from every queue, process the packets, pick the top 4-6 voice packets, and mix them for playback. If a sequence number is missing for some participant, I send a NACK and wait up to some threshold time before dequeuing that participant's queue (to maintain the voice flow).
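To make the client-side loop concrete, here is a minimal sketch of the dequeue, select, and mix step described above. It is only an illustration under assumptions: the `Packet` type, its `level` field (a loudness measure used to rank speakers), and the `pick_and_mix` function are hypothetical names, packets are assumed to be already decoded to float PCM, and the NACK/wait logic is left out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # RTP-style sequence number
    level: float    # hypothetical loudness measure, higher means louder
    samples: list   # decoded PCM samples as floats in [-1.0, 1.0]


def pick_and_mix(queues, top_n=4):
    """Dequeue one packet per participant, keep the top_n loudest, mix them."""
    heads = []
    for pid, queue in queues.items():
        if queue:  # skip participants with nothing buffered this tick
            heads.append((pid, queue.popleft()))

    # Rank by loudness and keep only the top_n speakers.
    heads.sort(key=lambda item: item[1].level, reverse=True)
    chosen = heads[:top_n]
    if not chosen:
        return [], []

    # Sum the chosen frames sample by sample, then clamp to avoid overflow.
    frame_len = min(len(pkt.samples) for _, pkt in chosen)
    mixed = [0.0] * frame_len
    for _, pkt in chosen:
        for i in range(frame_len):
            mixed[i] += pkt.samples[i]
    mixed = [max(-1.0, min(1.0, s)) for s in mixed]
    return [pid for pid, _ in chosen], mixed


queues = {
    "a": deque([Packet(10, 0.9, [0.75, 0.25])]),
    "b": deque([Packet(20, 0.1, [0.25, 0.25])]),
    "c": deque([Packet(30, 0.5, [0.5, -0.5])]),
}
speakers, frame = pick_and_mix(queues, top_n=2)
print(speakers, frame)
```

Calling this once per playout tick costs work proportional to the number of participants, which is the scaling concern raised next.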

But to make this solution scalable, I have to do the dequeuing and top 4-6 selection on the media server side and send the result to everyone. Now, on the client side, if a participant's sequence number is missing, I cannot tell whether the packet was actually lost or simply did not make it into the server's top 4-6 voice packets (I only need to send a NACK and wait if the packet was actually lost).
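The ambiguity can be reproduced with a small simulation. This is a sketch under assumptions, not a description of any particular SFU: `forward_top_n` and `missing_seqs` are hypothetical helpers, the server is assumed to forward packets with their original sequence numbers, and packets are assumed to arrive in order.

```python
def forward_top_n(tick_packets, top_n=4):
    """Server side: keep the top_n loudest (pid, seq, level) tuples of one tick."""
    return sorted(tick_packets, key=lambda p: p[2], reverse=True)[:top_n]


def missing_seqs(last_seq, seq):
    """Client side: sequence numbers skipped between last_seq and seq.

    Uses 16-bit wraparound arithmetic; a gap of half the range or more is
    treated as a late or reordered packet and reports nothing missing.
    """
    gap = (seq - last_seq) & 0xFFFF
    if gap >= 0x8000:
        return []
    return [(last_seq + i) & 0xFFFF for i in range(1, gap)]


# Three ticks, two senders, and the server forwards only the loudest one.
ticks = [
    [("a", 10, 0.9), ("b", 50, 0.5)],
    [("a", 11, 0.1), ("b", 51, 0.5)],  # "a" is quiet here and gets dropped
    [("a", 12, 0.9), ("b", 52, 0.5)],
]
last_seen = {}
for tick in ticks:
    for pid, seq, _level in forward_top_n(tick, top_n=1):
        if pid in last_seen:
            # No packet was lost on the network, yet a gap is still reported.
            print(pid, "gap:", missing_seqs(last_seen[pid], seq))
        last_seen[pid] = seq
```

In this run nothing is lost in transit, yet the client still sees a gap for participant `a`, which is exactly the case where a NACK would be wasted.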

How can I handle this use case efficiently? Any suggestions on the number of streams to mix, or anything else, would be highly appreciated.

Maybe you need an MCU, not an SFU? In that case you mix all signals on the server side and don't need a complex state machine on the client side. — Ivan, Jun 21 '22 at 10:54
You can also implement muting at the application level. Then, if the server knows a client is muted, it can exclude that client from mixing. — Ivan, Jun 21 '22 at 10:56


0 Answers