Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor postshader handling to apply for softgpu / D3D9 #12883

Merged
merged 33 commits into from
May 14, 2020

Conversation

unknownbrackets
Copy link
Collaborator

@unknownbrackets unknownbrackets commented May 11, 2020

This splits out the FramebufferManagerCommon class, creating a PresentationCommon to handle things like post-processing shaders, Google Cardboard, and anything else with the final display.

With that split out, it's now used everywhere and more consistent. Benefits:

  • Software renderer now supports Google Cardboard and post-processing shaders (fixes Post-processing shaders don't get applied in the software renderer #12445.)
  • Google Cardboard now supports render rotation.
  • Direct3D 9 now supports (some) post-processing shaders. All ours work if you have PS 3.0 hardware.
  • There are a bit less differences between backends in post-processing shader handling, so features like chaining shaders will be easier to add now.

Also some cleanup and code centralization.

Some remaining concerns:

  • No idea if this actually works with DarkStalkers.
  • Flipped buffers are still a bit messy.
  • Some minor optimizations might need reapplying (currently does 2x triangles rather than a strip, for example.)
  • OpenGL (non-Core) and GLES may not report errors as well as before.
  • Vulkan now uses the thin3d texture allocator for draw pixels...

-[Unknown]

@LunaMoo
Copy link
Collaborator

LunaMoo commented May 11, 2020

Doesn't work with DarkStalkers even after disabling it's compat hacks, through it does work with some other games with software renderer.

@LunaMoo
Copy link
Collaborator

LunaMoo commented May 11, 2020

As for shaders failing in D3d9 I tried to figure out what fails as the error is very generic and only says it failed to build and it seems that calling texture2D with texcoord changed by different values more than 9 times or so are causing the failure. If I try to call the function many times, but with same value, it doesn't fail, so I'm guessing the compiler optimizes that situation, but it can't optimize the calls to different coordinates. Not sure how, but maybe this could be detected and instead of generic failure error, show something like "This effect doesn't work under D3D9"?

Edit: actually there's seems to be more to it, previously I tried by modifying bloom shader which calls it a lot, but I tried to reproduce it with simplest shader possible and there's something more to it, or at least my simplest attempt could still be optimized by drivers while it could not for bloom shader.

Edit2: Actually the limit was much greater it seems, maybe I had some other effect active as I while I based it on bloom I did it by modifying my custom shader with other effects possibly active, the simplest effect to reproduce D3d9 problem:

#ifdef GL_ES
precision mediump float;
precision mediump int;
#endif

uniform sampler2D sampler0;
varying vec2 v_texcoord0;

void main() {
	vec3 color = texture2D(sampler0, v_texcoord0.xy+0.1).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.2).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.3).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.4).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.5).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.6).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.7).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.8).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+0.9).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+1.0).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+1.1).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+1.2).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.0).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.1).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.2).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.3).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.4).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.5).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.6).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.7).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.8).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+2.9).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.0).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.1).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.2).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.3).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.4).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.5).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.6).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.7).xyz;
	color += texture2D(sampler0, v_texcoord0.xy+3.8).xyz;
//	color += texture2D(sampler0, v_texcoord0.xy+3.8).xyz; // repeated coord is optimized and will not cause issue
//	color += texture2D(sampler0, v_texcoord0.xy+3.9).xyz; // uncomment to cause issue

	gl_FragColor.rgb = color;
	gl_FragColor.a = 1.0;
}

Meh if we're using basic d3d9 it's limited to ps 2.0 and it's limitations are very poor ~ https://en.wikipedia.org/wiki/High-Level_Shading_Language

@unknownbrackets
Copy link
Collaborator Author

Yeah, we indeed use ps 2.0:

const char *profile = stage_ == ShaderStage::FRAGMENT ? "ps_2_0" : "vs_2_0";

Updated this with some of the things implemented, like the pixel changes. Now it should work with homebrew etc. and the RB swizzle thing might be better.

-[Unknown]

@LunaMoo
Copy link
Collaborator

LunaMoo commented May 12, 2020

In Darkstalkers menu using post process shaders now causes:
Clipboard02
and in-game it's just black screen. Previously it just wasn't showing any change. Dump can reproduce it when using software renderer:
ULUS10005_0001.zip

Also in other games using 5xBR shows artifacts in software renderer, kind of like the coords, pixel or texel size was wrong or at least less accurate.

Edit: And yeah upgrade to PS 3.0 solved the broken shaders in D3d9, they will still fail if someone is limited to 2.0, but at GPU's limited to that, complex post process effects would be too heavy anyway.

@unknownbrackets
Copy link
Collaborator Author

Ah, the no-16 bit support path was just broken, even before these changes. It should work now at least for the menu.

-[Unknown]

@unknownbrackets unknownbrackets marked this pull request as ready for review May 12, 2020 02:27
@LunaMoo
Copy link
Collaborator

LunaMoo commented May 12, 2020

In-game still black screen and on the boot screen it shows enlarged upper left of the screen blinking, unfortunately non of that shows in dumps, but I noticed it's caused by the (speed)hack used for this game, without it, it works correctly, in fact better than in other games now as it's accurate there:).

The lack of accuracy in other games persist through, try using any 5xBR variant on this dump with software renderer:
ULES00469_0001.zip

@unknownbrackets
Copy link
Collaborator Author

unknownbrackets commented May 12, 2020

Hm, I get artifacts (but much less) even on OpenGL (non-software) with that dump. Also, for some reason Direct3D 11 is not showing anything.. hmm.

Latest master has this artifact in all backends, and Direct3D 11 works fine (with artifact.)

The formula for pixelDelta in softgpu for that dump is:

u_pixelDelta.x = 1/w * 480/bufferWidth
w = 1280.000000 # Was showing this at 1280x1080
1/w = 0.000781
bufferWidth = 512 (size of texture)
480/bufferWidth = 0.9375
--
u_pixelDelta.x  = 0.000781 * 0.9375 = 0.0007321875

But, that seems wrong? What does the buffer (texture) width have to do with pixelDelta?

In hardware mode, fb textures will usually but not always be a multiple of 480 wide, and bufferWidth will almost always be 480. Maybe this just wasn't noticed, but was always wrong? It looks like hardware if I set it to 480.

But #8016 says I'm wrong and that this is very intentional. The buffer is definitely 512 wide, though... note that the 481 thing is fixed separately.

-[Unknown]

@LunaMoo
Copy link
Collaborator

LunaMoo commented May 12, 2020

I can improve the xBR in software renderer by changing the one u_pixelDelta to u_texelDelta, this will break it for hardware renderers so not sure what happens there.

I also have a version which uses only u_pixelDelta to work in opposite and much more wasteful way to make use of the increased rendering res as well as support custom stretch values to perfectly match ports from older consoles which had different pixel sizes and a lot of blur introduced, that seems to work fine, but it was made for SFA3Max weird pixel sizes and needs to be tweaked.

@hrydgard
Copy link
Owner

Whoa this change grew :)

I'll try to review it tomorrow but looking very promising. The u_pixelDelta thing looks suspicious indeed.

@unknownbrackets
Copy link
Collaborator Author

unknownbrackets commented May 12, 2020

Heh, sorry. If needed, I can break out the Draw changes and maybe some of the non-postshader stuff.

I haven't written any post shaders, but from the names, here's what I'd imagine those variables to do:

Name Imagination Actual Actual @ output res
pixelDelta Value to add to gl_Position to get to next dest pixel, i.e. 2.0 / targetSize Half this value Half this value scaled down by the padding in the texture
texelDelta Value to add to texture2D(param) to get the next texel, i.e. 1.0 / bufferSize 1.0 / renderResolutionSize (buffer may not be 480 * res) Same
time Values that increase each frame Same Same
video 1 if there are videos Same Same

I suspect maybe texelDelta is wrong, and this motivated us to make pixelDelta also wrong to compensate?

-[Unknown]

@unknownbrackets
Copy link
Collaborator Author

See d6d1fe6 - this makes the uniforms match my expectation. I think it should merge separately, because the current behavior is in line with what it's been doing. I don't know what this would break...

-[Unknown]

@hrydgard
Copy link
Owner

hrydgard commented May 13, 2020

Looks right - we could merge that one first actually. I think it's more likely to fix stuff than to break stuff, actually.

@hrydgard
Copy link
Owner

There's a conflict, but tried building directly from your branch and it crashes in GPUCommon::ReInitialize during SoftGPU::SoftGPU, calling textureCache_->Clear(true); We need to initialize textureCache_ to nullptr, and check for it here to make sure we don't do anything with it if it's not set. Dunno if that really was introduced here though...

Anyway, that should be trivial, and I'll look through code now.

Copy link
Owner

@hrydgard hrydgard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really a pretty great refactoring, and nice fixes to thin3d stuff!

if (preferredPixelsFormat_ == Draw::DataFormat::B8G8R8A8_UNORM)
ConvertRGB565ToBGRA8888(dst, src16, width);
else
ConvertRGBA565ToRGBA8888(dst, src16, width);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, just noticed, we should rename ConvertRGBA565ToRGBA8888 to ConvertRGB565ToRGBA8888 (removed an A).

SetViewport2D(0, 0, pixelWidth_, pixelHeight_);
DrawActiveTexture(x, y, w, h, (float)pixelWidth_, (float)pixelHeight_, u0, v0, u1, v1, uvRotation, flags);
// DrawActiveTexture reverses these, probably to match "up".
if (GetGPUBackend() == GPUBackend::DIRECT3D9 || GetGPUBackend() == GPUBackend::DIRECT3D11) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll clean up this later by fixing the shaders.

@hrydgard
Copy link
Owner

DarkStalkers is not working as-is (glitches, blackness) but I'll help fix that up once this is ready (which it very nearly is).

Mainly happened when we had wide textures and split them up between GPUs.
Only some things moved over so far.

FramebufferCommon does too much, we want to share it with softgpu without
all the buffer management stuff.
Now this works on softgpu as well.

Some hacks for backend differences...
This is also relevant where softgpu is forced, i.e. Darkstalkers.
This way they are not always "thin3d" when using Draw.
Only because we might have float a[2]; float b[2];...
Currently, Vulkan is not working properly and direct (RAM -> output) is
not hooked up.  But in general, it works.
Since they work now - at least, some of the shaders do.
I'm not sure we need it, but having a param used only on one backend is
confusing.
This was causing a lost device.
Was silently ignoring them.  Caused stretch in postshaders.
So that it can post-process correctly.
This makes it easier to do things with it.
Also centralize the pixel texture code while at it.
So that postshaders can work.  Downside is, if we write a UI shader that
requires 2.0, we may not notice... but I think the risk is not that high.
This also fixes rendering on Windows 7 Direct3D 11.
This may have been uploading twice before, so there may be a small
performance improvement in some games now.
These will be used when a game does multiple transfers from RAM to the
screen, rather than one big one.  Wasn't clearing some state, though.
This cleans up the postshader update code.
@unknownbrackets
Copy link
Collaborator Author

Rebased - yeah, I must've broke softgpu in centralizing DestroyAllFBOs(), oops. Fixed now.

Not sure what's happening in Darkstalkers - it should be right, unless something is wrong with the UVs?

-[Unknown]

@LunaMoo
Copy link
Collaborator

LunaMoo commented May 14, 2020

DarkStalkersPresentHack is causing the failure, it works without it just fine.

Althrough when I disabled it and tried to take a screenshot I noticed one other bad thing about software renderer - the screenshots will always be x1 res, so if I want to show results of upscaling shader I'd have to take screenshot externally without using the built-in function.

@unknownbrackets
Copy link
Collaborator Author

unknownbrackets commented May 14, 2020

I know, but outside the smaller UVs, that code path is very similar...

Tried again to force the path, it draws fine in for example Ridge Racer... not sure how I could've made the hack not trigger like it did before?

-[Unknown]

This allows the callback to say "I didn't copy, please copy for me."  This
helps when additional conversions will be applied during the copy, or a
copy can be skipped.

Also, fixed a couple cases of the wrong mip level being used.
@hrydgard
Copy link
Owner

I'm not sure either, but I'll look into it tonight. It looks like it should work.

@LunaMoo
Copy link
Collaborator

LunaMoo commented May 14, 2020

Weird thing is it bugs out even with my empty effect that I used in the past just for adding the 60fps flag. So just setting texture2D(sampler0, v_texcoord0.xy) into output color is enough to break things and it breaks differently on different screens as well.

@hrydgard
Copy link
Owner

Alright, I found the problem. In DarkStalkers we directly blit from a specific address where the game draws the buffer to be displayed. The function ConvertTextureDescFrom16 did not take that into account but directly used displayFramebuf_, so added a parameter to override. Will push that after merging - which I will do now.

@hrydgard hrydgard merged commit cbb9b32 into hrydgard:master May 14, 2020
@hrydgard
Copy link
Owner

The fix commit, for reference: 864d138

@unknownbrackets unknownbrackets deleted the softgpu-postshader branch May 14, 2020 21:53
@unknownbrackets unknownbrackets added this to the v1.10.0 milestone Jun 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Post-processing shaders don't get applied in the software renderer
3 participants