Optimizing StreamFX in your OBS Scenes

Performance is important, and even more so in live streaming. Every streamer and content creator hates seeing the FPS counter dip below the configured frame rate - especially when it drops by a lot. But what can you actually do about it as a streamer or content creator?

First, I'll preface all of this by saying that this is by no means a complete guide. It is based entirely on performance profiling I've done on my own systems, which all vary in hardware but saw the same improvements after applying the fixes. These are not guaranteed to solve every issue, but they will help reduce them - and perhaps even allow an old laptop to stream at 60 FPS.

All of the measurements from here on out were taken with NVIDIA Nsight Graphics, mostly on the following hardware configurations: AMD Ryzen 9 3950X with an NVIDIA GeForce RTX 2080 Ti, AMD Ryzen 7 3600 with an NVIDIA GeForce GTX 1650 Super, Intel i5-7300HQ with an NVIDIA GeForce GTX 1050 Ti, and Intel i5-4690 with a GTX 1650 Super.

So let's get started with optimizing your setup!

Blur and the common Pitfalls

The Blur filter is a relatively simple case and also one of the most common pitfalls. Solving many of the performance problems that come from using it is easy and can be done in less than 30 minutes - so what are they?

Blur Masking vs Cropping

The biggest cause of performance problems is applying a Blur filter to the entire source without applying a Crop/Pad filter first, and then relying on the Masking option to limit the visible area. The problem is that Masking runs after the blur happens, not before - it can't reduce work that was already done.

When all you want to do is blur a small part of the screen, applying a blur filter to the entire thing is super wasteful. An alternative to this is to duplicate the source using Source Mirror, then apply a Crop/Pad filter, and then apply the Blur. This reduces the impact of blurring by a lot.

Note that applying a Crop/Pad after the Blur also acts the same as Masking does. The same applies to the cropping provided by the scene editor.
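To make the ordering argument concrete, here is a minimal sketch that only counts how many pixels the blur has to touch depending on where the crop sits. The source and region sizes are hypothetical values picked for illustration, and the model ignores everything except pixel count.

```python
# Hypothetical sizes, chosen only for illustration.
SOURCE = (1920, 1080)   # full source, in pixels
REGION = (400, 300)     # the small area that should end up blurred


def blurred_pixels(filter_order):
    """Count how many pixels the blur has to process for a given filter order."""
    width, height = SOURCE
    for name in filter_order:
        if name == "crop":
            # Everything after a crop only sees the cropped region.
            width, height = REGION
        elif name == "blur":
            return width * height
    return 0


# Blur first, crop (or mask) afterwards: the blur still sees the full source.
blur_then_crop = blurred_pixels(["blur", "crop"])
# Crop first: the blur only sees the small region.
crop_then_blur = blurred_pixels(["crop", "blur"])

print(blur_then_crop)   # 2073600
print(crop_then_blur)   # 120000
```

With these made-up numbers the blur processes roughly 17 times fewer pixels when the crop comes first.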

Wrong Blur Types for Large Size Blurs

But what if you do want to blur everything? In that case the above doesn't help you, but you can help yourself in another way. Often there are more optimized variants of a blur available, which may reduce the impact by half or even more.

For example you can completely replace a Gaussian Area Blur with a Dual-Filtering Blur and the latter will be significantly faster. Just look at this table to see replacements that work up to 5 times faster on any hardware:

Original Blur | Replacement Blur | Up to x% faster
Box Area Blur | Box Linear Area Blur | ~200%
Box Directional Blur | Box Linear Directional Blur | ~200%
Gaussian Area Blur | Dual-Filtering Blur | ~500%
Gaussian Area Blur | Gaussian Linear Area Blur (not identical to Gaussian Blur) | ~200%
Gaussian Directional Blur | Gaussian Linear Directional Blur (not identical to Gaussian Blur) | ~200%
Possible replacement blurs that take significantly less CPU and GPU time.
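One way to picture where a gain of about 2x for the Linear variants could come from is the classic linear sampling trick. The sketch below assumes that these variants rely on the GPU's linear texture filtering so that one fetch covers two neighbouring texels; this article does not confirm that, so treat it as an illustration of the general technique and not as a description of the plugin's code. The fetch, box_point and box_linear functions are hypothetical helpers on a one-dimensional row of values.

```python
def fetch(texels, x):
    """1D texture fetch with linear filtering at a (possibly fractional) index."""
    left = int(x)
    frac = x - left
    if frac == 0.0:
        return texels[left]
    return texels[left] * (1.0 - frac) + texels[left + 1] * frac


def box_point(texels, start, width):
    """Box blur using one fetch per texel: `width` fetches."""
    return sum(fetch(texels, start + i) for i in range(width)) / width


def box_linear(texels, start, width):
    """Box blur using one fetch between each texel pair: `width / 2` fetches.

    Assumes an even `width`.
    """
    fetches = [fetch(texels, start + i + 0.5) for i in range(0, width, 2)]
    return sum(fetches) / len(fetches)


row = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(box_point(row, 0, 4))   # 1.5
print(box_linear(row, 0, 4))  # 1.5
```

Both functions return the same average, but the second one needs half as many fetches, which is the kind of saving the table hints at.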

Full Resolution Blurring is Wasteful

The final optimization you can do for your blurs is to scale the input to them down. This is a trick that Games and Web Browsers have been doing for years in order to do shadows, glows, and similar blur based effects in real time. As an example, let's start with a 2560x1440 source that you want to blur with a 64px wide Box Area blur.

This is super expensive to do - even on modern hardware - and that's never good for reaching a specific framerate target. But there's something we can do: downscaling! By putting a Scaling/Aspect Ratio filter before the Blur filter, setting its Scale Filtering to Bilinear and the size to 1280x720 (exactly 50% of the original in each dimension), we can now reduce the blur size to just 32px.

This can be repeated until you're no longer happy with the blur quality - in my case I stop at around 8px width. By doing this, any of the Area or Directional Blurs can approach the time savings of Dual-Filtering - as long as the direction is aligned with the pixel grid.
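The effect of repeated halving can be estimated with a deliberately simplified cost model. It assumes a naive area blur that reads a kernel x kernel window for every output pixel and treats the 64px size as the kernel width. Both are assumptions for illustration only: the real filter may be implemented differently, and the cost of the scaling passes themselves is ignored.

```python
def area_blur_reads(width, height, kernel):
    """Texture reads for a naive area blur: one kernel x kernel window per pixel."""
    return width * height * kernel * kernel


# Start from the example above and halve the size and the blur width each step.
steps = []
width, height, kernel = 2560, 1440, 64
while kernel >= 8:
    steps.append((width, height, kernel, area_blur_reads(width, height, kernel)))
    width, height, kernel = width // 2, height // 2, kernel // 2

baseline = steps[0][3]
for w, h, k, reads in steps:
    print(f"{w}x{h} at {k}px: {reads:,} reads (1/{baseline // reads} of full size)")
```

In this model every halving step divides the number of reads by 16, because both the pixel count and the window area shrink by a factor of 4.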

Real-Time Shader Optimizations

Shaders are one of the features in StreamFX that allow you to do so much cool stuff - and at the same time mess everything up. The following is a list of things you should do:

  • Avoid the use of integers unless absolutely necessary. Integers have a significant overhead in pixel and vertex shaders, and you should always opt for floats instead. While the math might end up slightly more complicated, it will run faster than integer math.
  • Prefer unsigned integers over signed integers. Unsigned integers have a smaller overhead than signed integers, but they are still on the list of things to avoid using. If you don't need an integer value to be less than zero, use uint!
  • Manually unroll loops. Automatic unrolling often produces functional but inefficient code, which can be avoided by unrolling manually. Ideally, put the body of the loop into an inline bool myfunction(...params...) {...code...} function that returns true if the loop should be interrupted - this makes unrolling easy.
  • Render at a lower resolution. Not many shaders actually need to be rendering at 100% of the parent's resolution - many actually look perfectly fine at 75% or even 50%. Some can even look nearly identical at 25% - experiment with this to see what works for you.
  • Group mathematical operations by what they do. Multiply next to multiply operations, additions next to additions, subtractions next to subtractions. This helps the shader transpiler generate more efficient code, and can bring performance boosts of up to 10%.
  • Avoid excessive use of if, for and while. Ideally you want your arguments to be known at the time of compiling, but that is not always possible. So in those cases you should keep your branching to the minimum possible - either by manually unrolling loops or by adding techniques to select features.
  • Don't calculate everything in the pixel shader. Not all calculations need to be done there. For example, calculating UVs in the vertex shader and using that value directly as the input to a texture sampling command allows the compiler to move the sample to a better location. This can get you around 20% extra performance.
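For the "render at a lower resolution" point in the list above, it helps to remember that the scale applies to both axes, so the number of pixels the shader runs for shrinks with the square of the scale. A small sketch, assuming a hypothetical 1920x1080 parent:

```python
BASE_WIDTH, BASE_HEIGHT = 1920, 1080  # hypothetical parent resolution


def pixel_share(scale):
    """Fraction of the parent's pixels that remain at a given per-axis scale."""
    scaled = round(BASE_WIDTH * scale) * round(BASE_HEIGHT * scale)
    return scaled / (BASE_WIDTH * BASE_HEIGHT)


share = {scale: pixel_share(scale) for scale in (1.0, 0.75, 0.5, 0.25)}
for scale, fraction in share.items():
    print(f"{scale:.0%} scale -> {fraction:.2%} of the pixels")
```

At 75% the shader already runs for only about 56% of the pixels, and at 50% for a quarter of them.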

Other Improvements

Update OBS Studio and StreamFX often!

Many of the reported performance problems come from using outdated or even ancient versions of OBS Studio and the plugin. Updating both to a recent version usually resolves these instantly, because newer versions have received more optimizations.

Reduce SDF Effects Texture Size

Dynamic generation of Signed Distance Fields (SDF from here on) is incredibly expensive, with a single SDF Effects filter on a 512x512 source taking as many resources as a Box Area Blur at 64px width. Its impact can be reduced by enabling the Advanced Options property and then lowering the SDF Texture Scale.

Most sources look fine with the SDF texture scale set at 12.5%, others might need a little more, but almost nothing actually requires a 100% sized SDF texture. Which scale setting you end up using is up to you, but beware of scaling artifacts.

Avoid duplicating Sources

While this is technically advice for anyone using OBS Studio, it also applies to StreamFX. Many of your filter graphs will probably have some overlapping elements, and you can drastically reduce the rendering impact by reusing results.

For example if you have a Video Capture Device source with Chroma/Color Key and Color Grading, and want to have two different filters going from there, it is a better option to use Source Mirror to mirror the source. Especially for costly filters, such as Blur and SDF Effects, this is a very efficient way to solve a performance problem.
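As a rough mental model, compare paying for the shared part of the chain once against paying for it once per copy. The per-filter costs below are made-up units, the names filter_a and filter_b stand in for the two different filters, and the sketch assumes that the mirrored result is rendered once per frame and then reused.

```python
# Hypothetical per-frame costs in arbitrary units, for illustration only.
COST = {
    "video_capture": 1.0,
    "chroma_key": 2.0,
    "color_grading": 1.5,
    "filter_a": 6.0,
    "filter_b": 8.0,
}

shared = ["video_capture", "chroma_key", "color_grading"]
branches = [["filter_a"], ["filter_b"]]


def duplicated_cost(shared, branches):
    """Every copy of the source runs the shared filters again."""
    return sum(sum(COST[name] for name in shared + branch) for branch in branches)


def mirrored_cost(shared, branches):
    """The shared filters run once; only the branch filters run per copy."""
    shared_once = sum(COST[name] for name in shared)
    return shared_once + sum(COST[name] for branch in branches for name in branch)


print(duplicated_cost(shared, branches))  # 23.0
print(mirrored_cost(shared, branches))    # 18.5
```

The more expensive the shared part is, the bigger the difference gets, which is why this matters most for filters like Blur and SDF Effects.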

Anything else?

And that's it. When you apply all the fixes mentioned here you should see a decrease in GPU usage, which for some may be massive, while for others it might be very small. But even a small GPU usage decrease can allow you to hit a slightly higher framerate target. Maybe with this you'll be able to go from 30 to 60 fps, or go from 720p to 1080p.

In my case switching out a few of the blurs with Dual-Filtering allowed me to record and stream at 1440p144 instead of 1440p30, which means that I've more than tripled the available GPU time just with a single fix.

So what are you waiting for? Delve into your scene setup and look for things that you can optimize!
