Question about performance comparison #323

Open
shkatebi97 opened this issue Nov 15, 2022 · 1 comment

Comments


shkatebi97 commented Nov 15, 2022

Hi.
I was looking for a performance comparison between ruy and OpenBLAS and I came across this.
But when I benchmark ruy (for almost any shape, single-threaded, on a Raspberry Pi 4), my results are far below the reported ones.
For example, for the 512x512x512 Int8 benchmark, I only get ~10 GOPs, but the Excel sheet reports 40 GOPs.
I know the Raspberry Pi 4 CPU tops out at 1.5 GHz while the Pixel 4 reaches 2.84 GHz, but that does not explain a 30 GOPs gap.
So I thought it might be better to ask it here.
How did you measure GOPs for ruy?
I calculate GOPs with the formula ((2 * N * K * M * iterations) / time) / 10e+9, where time is the sum of the execution times of ruy::Mul over all iterations. I pack the RHS matrix beforehand.
Am I doing anything wrong?
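
For reference, here is a minimal sketch of the measurement described above, not ruy's own benchmark code. The names `Gops` and `TimeKernel` are hypothetical, and `kernel` is a stand-in for whatever wraps the ruy::Mul call. The sketch assumes each multiply-accumulate counts as 2 operations and divides by 1e9; note that the literal `10e+9` equals 1e10, so using it as written would report a value 10 times too low.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>

// Billions of operations per second for an (M x K) * (K x N) matrix multiply,
// counting each multiply-accumulate as 2 operations (assumption).
double Gops(std::int64_t m, std::int64_t k, std::int64_t n,
            std::int64_t iterations, double total_seconds) {
  const double ops = 2.0 * static_cast<double>(m) * static_cast<double>(k) *
                     static_cast<double>(n) * static_cast<double>(iterations);
  return ops / total_seconds / 1e9;  // 1e9, not 10e+9 (which is 1e10)
}

// Sums the wall-clock time of each call to `kernel`, a hypothetical
// stand-in for the matrix multiplication being measured.
double TimeKernel(const std::function<void()>& kernel, int iterations) {
  double total_seconds = 0.0;
  for (int i = 0; i < iterations; ++i) {
    const auto start = std::chrono::steady_clock::now();
    kernel();
    const auto end = std::chrono::steady_clock::now();
    total_seconds += std::chrono::duration<double>(end - start).count();
  }
  return total_seconds;
}
```

With this, the reported figure would be `Gops(512, 512, 512, iterations, TimeKernel(kernel, iterations))`.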

@WilliamTambellini

Same question here: is there any comparison with oneDNN?
