I've noticed that QuaRot and other KV cache papers report perplexity, but it is unclear to me how a quantized KV cache is used during the perplexity calculation. Do you have a detailed write-up of how you calculate perplexity (ppl) with the quantized KV cache? Thanks.