VertexDecoder: Minor optimization for x86/64 CPUs not supporting SSE4. #18476

Merged 1 commit on Dec 5, 2023

Conversation

hrydgard
Owner

@hrydgard hrydgard commented Dec 5, 2023

I just saw this while looking into other stuff.

If we use the unpack instruction on a register with itself, each widened element holds the value in both its upper and lower halves. That means we don't need the left shift: a single arithmetic right shift is enough to do the sign extension.
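To illustrate the idea, here is a minimal sketch using SSE2 intrinsics. It is not the code from this commit: the function names are hypothetical, and it assumes 8-bit values widened to 16-bit (the same pattern should apply to 16-bit to 32-bit with `_mm_unpacklo_epi16` and `_mm_srai_epi32`). The "shifts" variant is only one plausible form of the unpack/shift-left/shift-right sequence described above. On SSE4.1 the whole thing is a single `_mm_cvtepi8_epi16`, which is why this only matters for older CPUs.

```cpp
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Assumed "before" pattern: widen with zero, then shift left and
// arithmetic-shift right to replicate the sign bit.
static inline __m128i SignExtend8To16_Shifts(__m128i v) {
    __m128i r = _mm_unpacklo_epi8(v, _mm_setzero_si128());  // each 16-bit lane: 0x00BB
    r = _mm_slli_epi16(r, 8);                               // 0xBB00
    return _mm_srai_epi16(r, 8);                            // sign-extended BB
}

// "After" pattern: unpacking the register with itself already places the
// byte in the upper half of each lane, so the left shift can be dropped.
static inline __m128i SignExtend8To16_SelfUnpack(__m128i v) {
    __m128i r = _mm_unpacklo_epi8(v, v);  // each 16-bit lane: 0xBBBB
    return _mm_srai_epi16(r, 8);          // sign-extended BB
}

// Hypothetical helpers: sign-extend the low 8 bytes of `in` into `out`.
static void SignExtendLow8_Shifts(const int8_t in[16], int16_t out[8]) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(in));
    _mm_storeu_si128(reinterpret_cast<__m128i *>(out), SignExtend8To16_Shifts(v));
}

static void SignExtendLow8_SelfUnpack(const int8_t in[16], int16_t out[8]) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(in));
    _mm_storeu_si128(reinterpret_cast<__m128i *>(out), SignExtend8To16_SelfUnpack(v));
}
```

Both variants should produce identical results; the second saves one shift per unpack and does not need a zeroed register.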

@hrydgard hrydgard added the GE emulation Backend-independent GPU issues label Dec 5, 2023
@hrydgard hrydgard added this to the v1.17.0 milestone Dec 5, 2023
@hrydgard hrydgard changed the title VertexDecoder: Minor optimization for CPUs not supporting SSE4. VertexDecoder: Minor optimization for x86/64 CPUs not supporting SSE4. Dec 5, 2023
@hrydgard hrydgard merged commit 73d3de7 into master Dec 5, 2023
18 checks passed
@hrydgard hrydgard deleted the vertex-decoder-sse2-opt branch December 5, 2023 09:30