Why does vLLM use a custom all-reduce method? #6159

Answered by simon-mo
SamKG asked this question in Q&A
See the performance results in #2192. In certain cases, the custom topology drastically boosts performance compared to NCCL's implementation. vLLM still uses NCCL in the majority of cases.
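
To make the "certain cases" idea concrete, here is a minimal sketch of a dispatcher that picks a custom all-reduce path only under narrow conditions and otherwise falls back to NCCL. Everything in it is an assumption for illustration: the `AllReduceRequest` type, the `choose_backend` function, the single-node and small-message conditions, and the 1 MiB cutoff are hypothetical and are not taken from vLLM's code or from #2192.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllReduceRequest:
    """Hypothetical description of one all-reduce call."""
    num_bytes: int      # size of the tensor being reduced
    world_size: int     # number of GPUs in the group
    single_node: bool   # True if all GPUs sit in one machine


# Illustrative cutoff only; not a value taken from vLLM.
SMALL_MESSAGE_BYTES = 1 << 20  # 1 MiB


def choose_backend(req: AllReduceRequest,
                   small_bytes: int = SMALL_MESSAGE_BYTES) -> str:
    """Return "custom" for the narrow assumed cases, otherwise "nccl"."""
    if req.world_size < 2:
        # A single GPU has nothing to reduce with.
        return "none"
    if req.single_node and req.num_bytes <= small_bytes:
        # Assumed sweet spot: small messages within one machine.
        return "custom"
    # Default path for everything else.
    return "nccl"
```

With these assumed conditions, a 4 KiB reduction across 4 GPUs on one machine would take the custom path, while a 64 MiB reduction or any multi-node group would go through NCCL, which matches the answer's point that NCCL remains the default.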

Answer selected by SamKG