sad_32x32 and 64x64 AVX2 has poor cache locality #3247
Comments
Could you please add how you determined that so willing people can repeat the exercise? :)
Yes, this was measured using valgrind, specifically in this case
This might not be the SAD itself really but rather the nature of e.g. motion compensation. Is this specific to AVX2?
This applies at least to the HBD ASM; I have not tested against LBD. Benchmarking shows a large number of cache read misses. Noting this as a possible area for performance improvement.
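For a rough sense of why larger blocks could be harder on the cache, here is a back-of-the-envelope sketch that counts how many distinct cache lines a strided block read touches. It is only arithmetic, not a measurement and not the project's code. Everything in it is an assumption for illustration: the function name `cache_lines_touched`, a 64-byte cache line, 2 bytes per pixel for HBD, and the 4096-byte stride.

```python
def cache_lines_touched(width, height, stride_bytes, base_addr=0,
                        bytes_per_pixel=2, line_size=64):
    """Count distinct cache lines covered by a width x height block read.

    Assumptions (not from the issue): 64-byte cache lines, 2 bytes per
    pixel (HBD), and rows laid out stride_bytes apart in memory.
    """
    lines = set()
    for row in range(height):
        start = base_addr + row * stride_bytes          # first byte of the row
        end = start + width * bytes_per_pixel - 1       # last byte of the row
        lines.update(range(start // line_size, end // line_size + 1))
    return len(lines)


if __name__ == "__main__":
    stride = 4096  # hypothetical plane stride in bytes
    for size in (32, 64):
        for offset in (0, 2):  # aligned vs. shifted by one HBD pixel
            n = cache_lines_touched(size, size, stride, base_addr=offset)
            print(f"{size}x{size}, base offset {offset}: {n} cache lines")
```

Under these assumptions a block that does not start on a cache-line boundary, which is the common case for a motion-compensated reference, touches an extra line per row, and every row lands in a different part of memory because of the stride. That would be consistent with the suggestion above that the misses may come from the access pattern rather than from the SAD kernel itself, but it does not confirm it.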