What Does Low Latency Really Mean and 7 More Questions About Live Streaming at Scale

Jerod Venema Nov 19, 2021 5:12:15 PM
The WWE ThunderDome

The following are my responses from a recent Streaming Media panel discussion on “Latency & Live Streaming at Scale.”

Low latency, ultra low latency, super-low latency, near real time streaming — What do all of these terms mean and what’s the difference between them?

Low latency means something different to everyone, and as delivery gets faster and faster, the definition changes over time. It used to be that low latency meant sub-30 seconds, then 10 seconds, and then 5 seconds. We try to reserve the term “real time” for delivery that is sub-second, as opposed to merely a few seconds.

We run into questions about these terms all the time at LiveSwitch because our customers always require at least some component of their application to be truly real time, delivering video from presenter to viewer in 200 milliseconds or less. When people ask, “Do you do low latency?” I say, “Well, what do you mean by that? Do you really need sub-second, or are you OK with a 30-second delayed feed?”

Is low latency always a top priority for ensuring a high quality experience?

Absolutely not. It depends on what you’re trying to do. The moment the experience you’re creating includes bidirectional communication, it becomes important. If people are watching a feed and interacting live, real time is critical: if that interaction doesn’t happen in real time, the experience becomes disjointed and suffers terribly. All of the augmented reality (AR) and virtual reality (VR) experiences that are really succeeding now are interactive, and the moment you have that interactivity, you need truly real-time latency.

If it’s a truly one-way experience, a one-way feed, low latency doesn’t matter. But those one-way experiences are in less demand because of the growing appetite for social features alongside live streaming.

There are so many protocols related to low latency and live streaming. Is there one protocol that people can count on for most use cases or do you really need a bunch of different protocols for a bunch of different use cases?

I spend my life in the sub-second world, and in that world right now, WebRTC is king, the protocol of choice. With Google, Mozilla, Apple, and Microsoft backing WebRTC and shipping it as the out-of-the-box standard in all of their browsers, it has naturally seen a lot of adoption.
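To make “out of the box” concrete, here is a minimal sketch of a browser-side WebRTC publisher using only standard APIs; no plugin or download is involved. The `sendToPeer` callback is a placeholder for your signaling transport, which WebRTC deliberately leaves up to the application.

```typescript
// Minimal browser-side WebRTC setup using only standard APIs.
// sendToPeer is a placeholder: WebRTC does not define signaling,
// so the app must carry offers/answers/candidates itself.
async function startCall(sendToPeer: (msg: string) => void): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
  });

  // Capture camera and microphone and attach the tracks.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  // Trickle ICE candidates to the remote peer as they are gathered.
  pc.onicecandidate = (e) => {
    if (e.candidate) sendToPeer(JSON.stringify({ candidate: e.candidate }));
  };

  // Create and send the session description (the SDP offer).
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToPeer(JSON.stringify({ sdp: pc.localDescription }));

  return pc;
}
```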

However, we provide SDKs for AR/VR, and that’s an area I don’t think has really solidified yet. There could be some other protocol that becomes the standard in that world. We have WebRTC working in that environment, but that doesn’t mean something won’t come along that is better or differently suited for it.

A lot of the other protocols are better suited to one-way delivery than to interactive streaming.

For live streaming, can we require different clients for end users for different use cases?

It depends on your customer. We had a lot of customers in the telehealth space choose our platform for one reason: no download is required. In some cases, they had an app they were trying to get to customers for remote telehealth, and consumers couldn’t manage to download it. The demographic there is typically older, or facing language and/or technical barriers. They forgot their app store password, or they couldn’t get it to work for some other reason. The ability to simply “click” and be in a session was fundamentally critical to the whole workflow. Those customers are very different from a gamer, for example, for whom a download is absolutely something they would spend time on. It’s just different.

How do you address the scale part of live streaming at low latency?

For us, that’s a problem we’ve solved internally; at LiveSwitch, we do very well at scale.

How do you deal with other scaling issues, even with other protocols?

A lot of the time, the answer is more infrastructure, or global infrastructure: changing your ingestion and adding some redistribution (e.g., 200 milliseconds point-to-point, but accepting 250 milliseconds if we rebroadcast out to cascaded servers).
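As a rough illustration of that trade-off, using the example figures above, cascading adds one rebroadcast hop to the latency budget in exchange for far greater fan-out. The hop names and numbers below are illustrative only.

```typescript
// Illustrative latency budget for a direct vs. cascaded (origin -> edge)
// topology. Hop names and figures are hypothetical examples.
interface Hop { name: string; latencyMs: number }

const directPath: Hop[] = [
  { name: "presenter -> media server", latencyMs: 100 },
  { name: "media server -> viewer", latencyMs: 100 },
]; // ~200 ms point-to-point

const cascadedPath: Hop[] = [
  { name: "presenter -> origin", latencyMs: 100 },
  { name: "origin -> edge (rebroadcast)", latencyMs: 50 },
  { name: "edge -> viewer", latencyMs: 100 },
]; // ~250 ms, but each edge server fans out to many more viewers

const totalMs = (path: Hop[]): number =>
  path.reduce((sum, hop) => sum + hop.latencyMs, 0);

console.log(totalMs(directPath), totalMs(cascadedPath)); // 200 250
```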

Capacity planning definitely comes into it too. Ideally, you want a little bit of predictive capability: you should know whether you expect 10,000 people or 100,000 people to show up for an event. If 100,000 people are coming, you might want extra capacity on hand to deal with that influx. Or, if you see a ramp-up start to happen, predict its rate. The biggest challenge tends to be the initial shock point when you have a hard start.

Fortunately, what we found when we started doing very large-scale events and broadcasts is that attendance doesn’t usually come in with a massive spike; there is a ramp-up and a ramp-down. Rarely would 10,000 people show up in the same second. However, it was on us at a software level to figure out how to pre-allocate resources so that if 10,000 people do show up, signaling could take place for all of them and we could make the feed available to them in that sub-second time frame.
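As a hypothetical sketch of what that pre-allocation logic can look like, the snippet below projects attendance from the observed join rate and provisions viewer slots with headroom for a hard start. Every name and threshold here is illustrative, not LiveSwitch’s actual API.

```typescript
// Hypothetical ramp-based pre-allocation: project attendance from the
// observed join rate and provision capacity ahead of it. All names and
// numbers are illustrative assumptions, not a real product API.
interface CapacityPlan {
  expectedViewers: number; // advertised estimate for the event
  headroomFactor: number;  // extra margin for the hard-start shock
}

function projectedPeak(joinsPerSecond: number, secondsUntilStart: number, current: number): number {
  // Naive linear projection of the ramp-up; a real system would smooth this.
  return current + joinsPerSecond * secondsUntilStart;
}

function slotsToPreAllocate(plan: CapacityPlan, projected: number): number {
  // Provision for whichever is larger, the estimate or the observed ramp,
  // then add headroom so signaling capacity is ready before viewers arrive.
  return Math.ceil(Math.max(plan.expectedViewers, projected) * plan.headroomFactor);
}

const plan: CapacityPlan = { expectedViewers: 100_000, headroomFactor: 1.25 };
const peak = projectedPeak(500, 120, 20_000); // 500 joins/s, 2 min to start
console.log(slotsToPreAllocate(plan, peak));  // 125000
```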

Basically, the challenges that come with scaling often need to be addressed through a combination of software design, capacity planning and predictive scaling.

How do you plan for different end user hardware/tech setups?

When you have people joining in from the public, there is a lot you can’t control; you have no idea what devices they will be joining from. You could have someone with a giant 40-inch Mac screen and a 10 Gbps downlink, and someone else on 4G on an Android phone. Can we deliver the same experience to these people? Should we even be trying to? How do you deal with one ingestion feed and splitting it out to so many people while still trying to do that at 200 ms? There are a lot of user experience and scaling challenges there, not just from a technology standpoint but from a capacity and resource planning standpoint. The best suggestion I have is that if you have an event and expect a million people, estimate that X percent will be end users on one kind of setup, Y percent on another, and so on, and plan from there.
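The answer above doesn’t name a specific mechanism, but one standard WebRTC technique for serving that 40-inch screen and that 4G phone from a single ingestion feed is simulcast: the presenter uploads several quality layers at once, and the media server forwards whichever layer each viewer’s device and network can handle. A minimal sketch:

```typescript
// Simulcast sketch: publish one video track as three quality layers so a
// media server can pick the right one per viewer. This is standard WebRTC,
// but the exact layer count and bitrates here are illustrative choices.
async function publishWithSimulcast(pc: RTCPeerConnection): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const [videoTrack] = stream.getVideoTracks();

  pc.addTransceiver(videoTrack, {
    direction: "sendonly",
    sendEncodings: [
      { rid: "f", maxBitrate: 2_500_000 },                         // full resolution
      { rid: "h", maxBitrate: 700_000, scaleResolutionDownBy: 2 }, // half resolution
      { rid: "q", maxBitrate: 200_000, scaleResolutionDownBy: 4 }, // quarter resolution
    ],
  });
}
```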

What is one thing you think the industry has learned in terms of improving low latency with streaming experiences?

The biggest thing I’d say has happened within the last year or so is that a lot of the big players, like Microsoft, Akamai, and Adobe, who have been doing some form of streaming for a long time, have acknowledged the appeal of ultra-low latency in creating an inherently better experience for end users and have realized that people are willing to pay for better experiences.

Margins in the historic low-latency world were not big, but when you create an interactive experience and add that bidirectional component, people’s willingness to pay suddenly goes up. The tech giants have realized they can make a lot more money if they go down this path, and they can still use their existing infrastructure and existing global deployments for something bigger. They’re seeing that in creating that interactive, two-way experience for the end user, there is money to be made.

The full panel discussion on latency and live streaming at scale can be viewed here. Comment here if you have any related questions.