Usually when mixing HDR rendering with MSAA, the super-simplified pipeline looks something like this:

Render the scene with MSAA

Compute lighting with sub-sample accuracy where it's needed

Resolve MSAA texture to a normal HDR texture

Postprocessing

Tone mapping

Bloom

However, we actually want to do the tone mapping per sample, not on the already resolved pixel. If we don't, we get this:

This is a triangle rendered with 4xMSAA. The corners have the colors (0, 0, 0), (1, 1, 1) and (50, 50, 50), respectively. Notice how the anti-aliasing looks pretty good near the left edge of the triangle, while on the right the effect is completely lost. That's because the resolve averages samples close to (50, 50, 50) with samples close to (0, 0, 0) (the black background), so the average is still very high even if just a single sample was (50, 50, 50). In a way, the bright samples completely take over the pixel. Tone mapping afterwards doesn't help either, since even values like (10, 10, 10) get mapped close to white. In short, we completely lose the anti-aliasing effect.
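To put numbers on that, here is a small CPU-side sketch in Python (not shader code) of a single edge pixel. It assumes the simple curve x / (x + 1) as the tone mapping function and a pixel where exactly one of the four samples hits the bright corner; both choices are illustrative assumptions, not taken from the test program.

```python
def reinhard(x):
    """Simple tone mapping curve x / (x + 1), assumed here for illustration."""
    return x / (x + 1.0)

# One 4xMSAA edge pixel: a single sample hits the bright part of the
# triangle, the other three see the black background.
samples = [50.0, 0.0, 0.0, 0.0]

# Standard resolve: average the HDR samples first, tone map afterwards.
resolved_hdr = sum(samples) / len(samples)
standard = reinhard(resolved_hdr)

# Per-sample tone mapping: tone map every sample, then average.
per_sample = sum(reinhard(s) for s in samples) / len(samples)

print(f"resolved HDR value:      {resolved_hdr:.3f}")
print(f"tone map after resolve:  {standard:.3f}")
print(f"tone map before resolve: {per_sample:.3f}")
```

With 25% coverage, the standard resolve ends up almost white, while tone mapping before averaging gives a value close to a quarter of the bright color, which is what an anti-aliased edge should look like.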

We need to do the tone mapping per sample, but that's not practical in a real game. Running post-processing effects like depth of field and motion blur on an MSAA render target would be incredibly expensive. We also can't move those two effects to after tone mapping, since much of their look depends on working with HDR values.

Yesterday I had an idea that "solves" this. The idea is that we need to do the resolve with tone mapped colors, but afterwards we also want the post-processing to take advantage of HDR. Therefore I tone mapped each sample, averaged them together and then simply ran the averaged value through the inverse of the tone mapping function to get back an HDR value. When the tone mapping is later redone as usual after post-processing, the result will be perfectly anti-aliased tone mapped values. The post-processing may change the values in between, which can bring back some aliasing, but I'm assuming that the blur will hide it. I haven't tested that yet.
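The steps in that paragraph can be sketched on the CPU in Python. This is a minimal illustration, not the actual resolve shader: the function names are made up for the example, and the curve x / (x + 1) with its inverse y / (1 - y) is assumed as a stand-in for whatever tone mapping function is really in use.

```python
def tone_map(x):
    # Stand-in curve (an assumption for this sketch): x / (x + 1).
    return x / (x + 1.0)

def inverse_tone_map(y):
    # Solving y = x / (x + 1) for x gives x = y / (1 - y), valid for 0 <= y < 1.
    return y / (1.0 - y)

def resolve_tone_mapped(samples):
    """Tone map each sample, average, then map the average back to HDR."""
    mean_ldr = sum(tone_map(s) for s in samples) / len(samples)
    return inverse_tone_map(mean_ldr)

# Edge pixel: one bright sample, three black background samples.
edge_pixel = [50.0, 0.0, 0.0, 0.0]
hdr_value = resolve_tone_mapped(edge_pixel)   # value stored in the resolved HDR texture
final_value = tone_map(hdr_value)             # tone mapping redone after post-processing

# Interior pixel: all samples equal, so the round trip should change nothing.
flat_pixel = [3.0, 3.0, 3.0, 3.0]
flat_value = resolve_tone_mapped(flat_pixel)

print(hdr_value, final_value, flat_value)
```

The resolved texture still holds HDR values, so effects that run before the final tone mapping keep working on HDR data, and pixels whose samples all agree come out unchanged.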

For the simplest tone mapping function out there,

color = color / (color + 1);

called Reinhard tone mapping, calculating the inverse was very simple (color / (1 - color)) and the result was extremely satisfying:

Standard resolve:

Inverse tone mapping resolve:

The result is simply perfect. However, Reinhard's function is rarely used: it does the job, but it doesn't look very good, since it desaturates colors and washes out the blacks. More advanced functions can cause a few problems. The tone mapping function used by Uncharted 2, described in John Hable's blog post, seemed to be a respectable candidate: http://filmicgames.com/archives/75

The function looks like this:

float A = 0.15;
float B = 0.50;
float C = 0.10;
float D = 0.20;
float E = 0.02;
float F = 0.30;
float W = 11.2;

vec3 toneMap(vec3 x)
{
    return ((x*(A*x+C*B)+D*E)/(x*(A*x+B)+D*F))-E/F;
}

First I just ran the function through an equation solver to get the inverse function, which horribly enough looks like this:

vec3 inverseToneMap(vec3 x)
{
    return (sqrt((4*x-4*x*x)*A*D*F*F*F+(-4*x*A*D*E+B*B*C*C-2*x*B*B*C+x*x*B*B)*F*F+(2*x*B*B-2*B*B*C)*E*F+B*B*E*E)+(B*C-x*B)*F-B*E)/((2*x-2)*A*F+2*A*E);
}

Although most of the constant subexpressions will be precomputed by the shader compiler, that's still quite a few operations. Luckily, the resolve shader is heavily bandwidth limited thanks to the HDR texture and multiple samples, so increasing the amount of computation won't have a very big impact on performance.

The results were a bit disappointing.

Normal resolve:

Inverse tone mapping resolve:

Oops. It looked like there wasn't enough floating-point precision to rebuild the HDR values after tone mapping. As color intensity approaches infinity, the tone mapped color approaches the curve's upper limit, a little below 1.0. Floating-point values have lots of precision close to 0.0 but relatively poor precision close to 1.0, which is exactly where the bright samples end up. If we instead work with the inverted tone mapped color (1.0 minus the tone mapped value) during the resolve, much more precision is available where it matters.

vec3 toneMap(vec3 x)
{
    // Previous version:
    //return ((x*(A*x+C*B)+D*E)/(x*(A*x+B)+D*F)-E/F);
    // Inverted version (1.0 - previous), so bright values end up close to 0.0:
    return 1.0-((x*(A*x+C*B)+D*E)/(x*(A*x+B)+D*F)-E/F);
}

vec3 inverseToneMap(vec3 x)
{
    // Previous version:
    //return (sqrt((4*x-4*x*x)*A*D*F*F*F+(-4*x*A*D*E+B*B*C*C-2*x*B*B*C+x*x*B*B)*F*F+(2*x*B*B-2*B*B*C)*E*F+B*B*E*E)+(B*C-x*B)*F-B*E)/((2*x-2)*A*F+2*A*E);
    // Inverse of the inverted version above:
    return (sqrt((4*x-4*x*x)*A*D*F*F*F+((4*x-4)*A*D*E+B*B*C*C+(2*x-2)*B*B*C+(x*x-2*x+1)*B*B)*F*F+((2-2*x)*B*B-2*B*B*C)*E*F+B*B*E*E)+((1-x)*B-B*C)*F+B*E)/(2*x*A*F-2*A*E);
}

The result is perfect!
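As a sanity check, both inverse formulas can be evaluated outside the shader. The sketch below is a literal Python transcription of the two GLSL versions in double precision, applied to scalars instead of vec3; the function names are made up for the example. Solving the tone mapping equation for x gives a quadratic with one positive and one negative root, and in this transcription the first formula lands on the negative root while the second recovers the input. So the choice of root may be playing a part here in addition to precision. That is an observation about this sketch only and has not been checked against the original shader.

```python
import math

# Constants from the Uncharted 2 curve as used in the post.
A, B, C, D, E, F = 0.15, 0.50, 0.10, 0.20, 0.02, 0.30

def tone_map(x):
    return (x * (A * x + C * B) + D * E) / (x * (A * x + B) + D * F) - E / F

def inverse_first(y):
    """First inverse from the post, transcribed term by term."""
    disc = ((4*y - 4*y*y) * A*D*F*F*F
            + (-4*y*A*D*E + B*B*C*C - 2*y*B*B*C + y*y*B*B) * F*F
            + (2*y*B*B - 2*B*B*C) * E*F
            + B*B*E*E)
    return (math.sqrt(disc) + (B*C - y*B)*F - B*E) / ((2*y - 2)*A*F + 2*A*E)

def tone_map_inverted(x):
    """Second version from the post: 1.0 minus the tone mapped value."""
    return 1.0 - tone_map(x)

def inverse_second(z):
    """Second inverse from the post; expects z = tone_map_inverted(x)."""
    disc = ((4*z - 4*z*z) * A*D*F*F*F
            + ((4*z - 4)*A*D*E + B*B*C*C + (2*z - 2)*B*B*C + (z*z - 2*z + 1)*B*B) * F*F
            + ((2 - 2*z)*B*B - 2*B*B*C) * E*F
            + B*B*E*E)
    return (math.sqrt(disc) + ((1 - z)*B - B*C)*F + B*E) / (2*z*A*F - 2*A*E)

test_values = (0.5, 4.0, 50.0)
for x in test_values:
    first = inverse_first(tone_map(x))
    second = inverse_second(tone_map_inverted(x))
    print(f"x = {x:5.1f}   first form: {first:9.5f}   second form: {second:9.5f}")
```

A round trip like this is a cheap way to validate an inverse that came out of an equation solver before putting it into a shader.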

Performance:

First of all, ignore the FPS values on the pictures! They're affected by all kinds of YouTube videos and other instances of the test program running simultaneously.

I tested the different resolve algorithms at 1920x1080 with 4xMSAA on a GTX 295 (SLI disabled) which performance-wise lies somewhere in between a GTX 260 and a GTX 275. These numbers include the cost of rendering the triangle, resolving the MSAA texture and tone mapping the final resolve.

Hardware blitting resolve: 906 FPS (1.104 ms)

Custom shader resolve (no tone mapping): 850 FPS (1.176 ms)

Custom shader resolve (inverse Reinhard): 770 FPS (1.299 ms)

Custom shader resolve (inverse Uncharted 2): 485 FPS (2.062 ms)

The numbers would look a lot better if I could get SLI working. Anyway, a GTX 680 has around 150% better performance compared to a GTX 295 with SLI enabled, so expect around 250% better performance with today's high-end hardware compared to this benchmark (around 0.6ms for the inverse Uncharted 2 resolve).
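The frame times and the rough estimate above can be reproduced with a few lines of Python. The FPS figures are the ones listed above; the 3.5x factor is my reading of "250% better performance" and is an assumption, as is using the plain custom shader resolve as the baseline for the extra cost.

```python
# FPS figures quoted above (1920x1080, 4xMSAA, GTX 295 with SLI disabled).
fps = {
    "hardware blit": 906.0,
    "custom shader, no tone mapping": 850.0,
    "custom shader, inverse Reinhard": 770.0,
    "custom shader, inverse Uncharted 2": 485.0,
}

# Frame time in milliseconds is simply 1000 / FPS.
frame_ms = {name: 1000.0 / value for name, value in fps.items()}

# Extra cost of each variant relative to the plain custom shader resolve.
baseline_ms = frame_ms["custom shader, no tone mapping"]
extra_ms = {name: ms - baseline_ms for name, ms in frame_ms.items()}

# Assumption: "250% better performance" is read as 3.5x the throughput.
ASSUMED_SPEEDUP = 3.5
estimated_ms = frame_ms["custom shader, inverse Uncharted 2"] / ASSUMED_SPEEDUP

for name, ms in frame_ms.items():
    print(f"{name:36s} {ms:6.3f} ms  ({extra_ms[name]:+.3f} ms vs. plain custom resolve)")
print(f"estimated inverse Uncharted 2 frame on faster hardware: {estimated_ms:.2f} ms")
```

Keep in mind that these frame times include rendering the triangle and the final tone mapping pass, so the differences between rows say more about the resolve cost than the absolute numbers do.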

And finally, here's the source code for the GLSL resolve fragment shader. Note that this shader is fed with unnormalized texture coordinates.

http://pastie.org/5777050

And for reference, here's the source code for the tone mapping shader, which uses normalized texture coordinates.

http://pastie.org/5777160

Now get out there and fix your AA resolves! I'm looking at you, BF3!

PS: After Googling a bit, I found out that this trick has already been invented. Figures. -_-'

