Discord with 3-Million Concurrent Voice Users!

We will sketch how Discord's voice stack differs between the web browser and the cross-platform (native) app.

Discord uses a client-server architecture rather than peer-to-peer networking.

  • Client-server here means that every participant (client) sends its audio and video to a Discord media server, which forwards it to the other participants.

  • Peer-to-peer, by contrast, means that each pair of participants connects directly, so every client sends its stream to every other client.
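To make the difference concrete, here is a minimal sketch that counts how many upstream copies of a stream each participant has to send under each model. The function names are made up for this illustration, and the counts assume a full mesh for peer-to-peer and a single forwarding server for client-server.

```cpp
#include <cassert>
#include <cstddef>

// Peer-to-peer (full mesh): each participant uploads one copy of its
// stream to every other participant.
std::size_t meshUploadsPerClient(std::size_t participants) {
    assert(participants >= 2);  // a call needs at least two people
    return participants - 1;
}

// Client-server: each participant uploads a single copy to the media
// server, which forwards it to everyone else.
std::size_t serverUploadsPerClient(std::size_t participants) {
    assert(participants >= 2);
    return 1;
}

// Total number of network connections in a full mesh: one per pair.
std::size_t meshConnections(std::size_t participants) {
    assert(participants >= 2);
    return participants * (participants - 1) / 2;
}

// Total number of connections with a central server: one per participant.
std::size_t serverConnections(std::size_t participants) {
    assert(participants >= 2);
    return participants;
}
```

For a 10-person call, `meshUploadsPerClient(10)` is 9 while `serverUploadsPerClient(10)` is 1, which is why routing through a server scales better for large voice channels.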


Traffic control efficiency:

In the browser, Discord relies on WebRTC (the set of APIs and protocols that adds real-time communication to web apps) as the browser provides it. The native apps instead use a C++ media engine built on top of the WebRTC native library. In short, the app version is where Discord can customize how WebRTC is used.

Why is implementing their own version better?

  1. The WebRTC native library exposes a lower-level API for creating send and receive streams, so only a minimal amount of information is needed to set up a connection.
  2. They can implement their own media controls rather than depend on global system audio settings.
  3. They can access raw audio data to run their own voice activity detection and to share both game audio and video.
  4. They can reduce bandwidth and CPU consumption during periods of silence.
  5. They can send extra information (such as priority speaker) along with the audio/video packets.
  6. They encrypt media with Salsa20 rather than the DTLS/SRTP that WebRTC uses by default.
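Points 3 and 4 can be illustrated with a very simple energy-based voice activity check. This is only a sketch: the source does not describe Discord's actual detector, and the names `frameRms` and `shouldSendFrame` as well as the 0.01 threshold are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Root-mean-square level of one frame of 16-bit PCM samples,
// scaled so that full-scale audio is roughly 1.0.
double frameRms(const std::vector<std::int16_t>& frame) {
    if (frame.empty()) return 0.0;
    double sumSquares = 0.0;
    for (std::int16_t sample : frame) {
        const double x = sample / 32768.0;
        sumSquares += x * x;
    }
    return std::sqrt(sumSquares / static_cast<double>(frame.size()));
}

// Hypothetical gate: frames quieter than the threshold are treated as
// silence and skipped, so nothing is encoded or sent for them.
bool shouldSendFrame(const std::vector<std::int16_t>& frame,
                     double threshold = 0.01) {
    assert(threshold >= 0.0);
    return frameRms(frame) >= threshold;
}
```

A frame of all zeros is skipped, while a frame of samples at amplitude 8000 (about a quarter of full scale) is sent. Real detectors are more elaborate, but the bandwidth and CPU saving comes from the same idea: do no work for silent frames.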

ASSIGNING VOICE SERVER:

When you start a call or stream, Discord assigns the least utilized voice server in the given region.

  • Only a minimal amount of information is exchanged when joining a channel, such as the encryption keys and the codec.

The snippet below shows how little is needed to create an audio send stream: an SSRC, a payload type, a transport, and an encoder factory.

// Creates and starts an outgoing Opus audio stream on an existing webrtc::Call.
webrtc::AudioSendStream* createAudioSendStream(
  uint32_t ssrc,
  uint8_t payloadType,
  webrtc::Transport* transport,
  rtc::scoped_refptr<webrtc::AudioEncoderFactory> audioEncoderFactory,
  webrtc::Call* call)
{
    // Outgoing packets are handed to the supplied transport.
    webrtc::AudioSendStream::Config config{transport};
    // The SSRC identifies this stream in the RTP session.
    config.rtp.ssrc = ssrc;
    // Attach the audio level to each packet as an RTP header extension (id 1).
    config.rtp.extensions = {{"urn:ietf:params:rtp-hdrext:ssrc-audio-level", 1}};
    config.encoder_factory = audioEncoderFactory;
    // Opus at 48 kHz with 2 channels.
    const webrtc::SdpAudioFormat kOpusFormat = {"opus", 48000, 2};
    config.send_codec_spec =
      webrtc::AudioSendStream::Config::SendCodecSpec(payloadType, kOpusFormat);
    webrtc::AudioSendStream* audioStream = call->CreateAudioSendStream(config);
    audioStream->Start();
    return audioStream;
}

If you see a loading indicator while starting a stream, Discord is still assigning the least utilized voice server. As soon as you see "Voice connected", your client and the voice server have successfully exchanged UDP messages (UDP is the low-latency transport protocol that carries the media).
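Picking the least utilized voice server in a region could be sketched as follows. The `VoiceServer` struct, its fields, and the region names are hypothetical; the source does not show how Discord measures utilization or stores its server list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one voice server.
struct VoiceServer {
    std::string id;
    std::string region;
    double utilization;  // 0.0 = idle, 1.0 = fully loaded
};

// Returns the least utilized server in the requested region,
// or nullptr when the region has no servers.
const VoiceServer* pickVoiceServer(const std::vector<VoiceServer>& servers,
                                   const std::string& region) {
    const VoiceServer* best = nullptr;
    for (const VoiceServer& server : servers) {
        assert(server.utilization >= 0.0 && server.utilization <= 1.0);
        if (server.region != region) continue;
        if (best == nullptr || server.utilization < best->utilization) {
            best = &server;
        }
    }
    return best;
}
```

The returned pointer refers into the `servers` vector, so it is only valid while that vector is alive and unmodified.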

Streaming video:

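For this, it uses Scalable Video Coding (SVC), which encodes a video stream in layers: a base layer that can be decoded on its own, plus enhancement layers that add resolution, frame rate, or quality. A receiver that gets only some of the layers can still reconstruct a lower-quality version of the same video.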
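One consequence of layered encoding is that a forwarding server can send each receiver only as many layers as its connection can carry. The sketch below shows that selection step; the `Layer` struct, the function name, and the bitrates are illustrative assumptions, not Discord's actual logic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One SVC layer. cumulativeKbps is the bitrate needed to receive this
// layer together with all the layers below it.
struct Layer {
    int height;          // resulting video height in pixels
    int cumulativeKbps;
};

// Returns how many layers (starting from the base layer) fit into the
// receiver's bandwidth. The base layer is always forwarded.
std::size_t layersToForward(const std::vector<Layer>& layers,
                            int receiverKbps) {
    assert(!layers.empty());
    std::size_t count = 1;
    for (std::size_t i = 1; i < layers.size(); ++i) {
        if (layers[i].cumulativeKbps > receiverKbps) break;
        count = i + 1;
    }
    return count;
}
```

With layers at 150, 500, and 1500 kbps (cumulative), a receiver with 600 kbps gets two layers and a receiver with 2000 kbps gets all three, without the sender having to encode the video more than once.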

Well, you can expect a lot more advancements in the future :)

"There's more and more to discover and upgrade backend tech to meet the standards that are as efficient without compromising user security."