EP06 - Discord Architecture

How Discord handles 50-person voice chats. Learn about UDP vs TCP, Selective Forwarding Units (SFU), Elixir/BEAM VM, client-side noise suppression (Krisp), and bandwidth optimization strategies.

My internet connection is a potato held together with string. How is Discord not lagging?

Because Discord is not a phone call. It's a radio broadcast.

Kurumi! Do you live in my walls?

Anyway, how is Discord able to do all of this without lagging? On a Zoom call with 10 people, someone always sounds like a walkie-talkie.

Because Zoom wants to be perfect. Discord just wants to be fast.

Discord uses **UDP**.

UDP??

Most of the internet runs on TCP. Picture a delivery truck that needs a signature for every package. It guarantees every packet of your voice arrives. If one gets lost, everything behind it waits until it is resent.

That's why a call over TCP freezes and then suddenly speeds up. The truck was waiting for a signature.

So... the UDP truck is faster?

Much faster. It doesn't care if a package gets dropped. For voice, losing a few milliseconds of audio is better than a 2-second delay. It prioritizes speed over perfection.
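To make the trade-off concrete, here is a minimal sketch of a receiver that never waits for a lost packet. It is not Discord's code: the 20 ms frame length, the `play_out` helper, and the silence placeholder are all assumptions for illustration.

```python
from typing import Dict, List

FRAME_MS = 20  # assumed frame length; 20 ms is a common choice for voice codecs
SILENCE = b"\x00" * 4  # stand-in for one frame of silence


def play_out(received: Dict[int, bytes], first_seq: int, last_seq: int) -> List[bytes]:
    """Build the playback timeline without ever waiting for a missing packet."""
    timeline = []
    for seq in range(first_seq, last_seq + 1):
        # A lost packet becomes one short gap instead of stalling everything after it.
        timeline.append(received.get(seq, SILENCE))
    return timeline


# Packets 0..4 were sent; packet 2 was dropped on the way.
arrived = {0: b"aaaa", 1: b"bbbb", 3: b"dddd", 4: b"eeee"}
frames = play_out(arrived, 0, 4)
lost_ms = frames.count(SILENCE) * FRAME_MS
print(f"played {len(frames)} frames, lost {lost_ms} ms of audio")
```

With TCP-style delivery, frames 3 and 4 would sit in a queue until frame 2 was resent. Here the listener hears one tiny gap and the conversation keeps moving.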

Okay, so it's a fast truck. But how do 50 people send packages to each other at once?

That's the second trick. You aren't sending your voice to 49 people.

You are sending it to **one**.

That one is Discord. Its voice system is built on WebRTC, and every voice server runs a **Selective Forwarding Unit (SFU)**.

You send one clean stream to the Discord server. The server acts as a giant copy machine and broadcasts it to everyone else.

It's a copy machine. Upload once, download many. That saves your bandwidth.
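A rough sketch of why the copy machine matters, assuming a 50-person room. The function names are made up for this example. The key property of an SFU is that it forwards packets as they are, without mixing or re-encoding them.

```python
from typing import Dict, List


def mesh_uploads(participants: int) -> int:
    """Streams each client uploads when everyone sends directly to everyone."""
    return participants - 1


def sfu_uploads(participants: int) -> int:
    """Streams each client uploads when a server does the copying."""
    return 1 if participants > 1 else 0


def forward(sender: str, packet: bytes, room: List[str]) -> Dict[str, bytes]:
    """What the SFU does per packet: copy it to every participant except the sender."""
    return {member: packet for member in room if member != sender}


room = [f"user{i}" for i in range(50)]
copies = forward("user0", b"voice-frame", room)
print(mesh_uploads(50), sfu_uploads(50), len(copies))  # prints "49 1 49"
```

Your potato connection uploads 1 stream instead of 49. The 49 copies happen on the server, where bandwidth is cheap and plentiful.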

Okay, but what about the noise? My friend is typing like a woodpecker on speed, but I can barely hear it. Just his voice.

Ah. It's called **Krisp**. It's an integrated AI.

It's a machine learning model that was trained to know the difference between a human voice and... everything else.

It cleans your audio *before* it even gets sent to the UDP truck.

The AI runs on Discord's servers?

No. It runs on your computer. Discord offloads the expensive work of noise filtering to your CPU.

They save a fortune in server processing costs by making your PC do the labor.

Alright, now the messages. How do these GIFs and memes show up instantly for everyone in the server?

Every online user has a persistent connection to a Gateway server. It's a WebSocket, a live tunnel for real-time events.

So my own personal mailman?

Exactly. The channel itself is managed by a Guild Server, written in Elixir. Elixir is built for millions of cheap, concurrent processes. Perfect for a chat room with 100,000 members.
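Discord's real Guild Servers are Elixir processes, but the shape of the idea can be sketched in Python with `asyncio`. Everything here is an assumption for illustration: the `Guild` class, the queues standing in for WebSocket sessions, and the event string.

```python
import asyncio
from typing import Dict, List


class Guild:
    """One lightweight task per guild, loosely mirroring one Elixir process per guild."""

    def __init__(self) -> None:
        self.inbox: asyncio.Queue = asyncio.Queue()
        self.sessions: Dict[str, asyncio.Queue] = {}

    def connect(self, user: str) -> asyncio.Queue:
        # Each session queue stands in for one user's WebSocket connection.
        self.sessions[user] = asyncio.Queue()
        return self.sessions[user]

    async def run(self) -> None:
        while True:
            event = await self.inbox.get()
            if event is None:  # shutdown signal
                break
            for session in self.sessions.values():
                session.put_nowait(event)  # fan out to every connected session


async def demo() -> List[str]:
    guild = Guild()
    alice, bob = guild.connect("alice"), guild.connect("bob")
    worker = asyncio.create_task(guild.run())
    await guild.inbox.put("MESSAGE_CREATE: cat.gif")
    await guild.inbox.put(None)
    await worker
    return [alice.get_nowait(), bob.get_nowait()]


delivered = asyncio.run(demo())
print(delivered)
```

One message goes into the guild's inbox and comes out on every open session. On the BEAM VM each of those guilds is its own cheap process, so a busy server does not slow down a quiet one.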

Okay. UDP, SFUs, client-side AI, Elixir...

But I am currently sending a high-definition video stream across the planet to everyone in my server, for free.

How is Discord not bankrupt yet?

They would be if they weren't smart. But they are ruthless about not sending packets.

But... they have to send packets. That's what streaming is!

First, your voice. When you paused just now to take a breath, what did your PC send?

My... voice?

Nothing.

Voice Activity Detection (VAD). The client is smart enough to know when you're not talking. It just stops sending audio. In a normal back-and-forth conversation, that alone can cut audio bandwidth roughly in half.
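The simplest possible version of this idea is an energy threshold: measure how loud each frame is and skip the quiet ones. Real VAD is considerably smarter than this, and the `THRESHOLD` value and sample data are invented for the sketch.

```python
import math
from typing import List

THRESHOLD = 500.0  # assumed RMS cutoff for 16-bit samples; purely illustrative


def rms(frame: List[int]) -> float:
    """Root-mean-square energy of one frame of PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def frames_to_send(frames: List[List[int]]) -> List[List[int]]:
    """Keep only frames loud enough to count as speech; quiet ones are never sent."""
    return [f for f in frames if rms(f) >= THRESHOLD]


speech = [4000, -3500, 3800, -4200]
breath = [12, -9, 15, -11]
captured = [speech, breath, breath, speech]
sent = frames_to_send(captured)
print(f"sent {len(sent)} of {len(captured)} frames")  # prints "sent 2 of 4 frames"
```

Half the captured frames never leave your machine. Multiply that by every quiet person in a 50-person channel.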

Now the video. You think you're sending one 720p stream. You're not. You're sending three.

It's called Simulcast. Your client encodes the same video at three different qualities at once.

All three go to the Discord SFU. The SFU acts as an operator.

It looks at your five friends.

'Friend A' is watching full-screen on a fiber connection? He gets the 720p stream.

'Friend B' is on her phone with two bars of signal? She gets the 240p stream so it doesn't buffer.

And 'Friend C', who has your stream minimized while he plays his own game?

It just... cuts him off?

Until he maximizes the window again. Why waste bandwidth sending video to a window nobody is looking at?
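The operator's decision can be sketched as a small function. The 720p and 240p layers come from the conversation; the 480p middle layer, the bitrate numbers, and the `Viewer` fields are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional

# (layer name, kbps needed), highest quality first. Bitrates are illustrative guesses.
LAYERS = [("720p", 2500), ("480p", 1000), ("240p", 300)]


@dataclass
class Viewer:
    name: str
    bandwidth_kbps: int
    visible: bool  # False when the stream is minimized or hidden


def pick_layer(viewer: Viewer) -> Optional[str]:
    """Choose which simulcast layer to forward, or None to forward nothing."""
    if not viewer.visible:
        return None  # nobody is watching, so send no video at all
    for name, needed in LAYERS:
        if viewer.bandwidth_kbps >= needed:
            return name
    return LAYERS[-1][0]  # very weak link: fall back to the lowest layer


friends = [
    Viewer("Friend A", 50_000, True),   # full-screen on fiber
    Viewer("Friend B", 400, True),      # phone with two bars
    Viewer("Friend C", 50_000, False),  # stream minimized
]
choices = {f.name: pick_layer(f) for f in friends}
print(choices)
```

The SFU never re-encodes anything. It just picks which of your three uploads to copy to each viewer, which is far cheaper than transcoding on the server.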

TLDR. They offload noise filtering to you. They don't send your audio when you're quiet. They don't send video to people who aren't watching.

They took a process that should be expensive and, through a thousand tiny cuts, bled the cost dry.

So my 'free' stream isn't free. It's just ultra-light?

They engineered the cost down from dollars to fractions of a penny. The OpEx for a free user is so low that a single Nitro subscription can cover a whole crowd of them.