Pull requests: triton-inference-server/tensorrtllm_backend

Update llama.md
#604 opened Sep 25, 2024 by surprisedPikachu007
Add missing kv_cache-related metrics
#592 opened Sep 3, 2024 by Pernekhan
[Bugfix] Fix the thread lock when a user inputs the same id
#585 opened Aug 27, 2024 by GGBond8488
Replace subprocess.Popen with subprocess.run (label: triaged; see the sketch after this list)
#452 opened May 14, 2024 by rlempka
Fix whitespace error in streaming mode
#423 opened Apr 19, 2024 by enochlev
Update end_to_end_test.py
#409 opened Apr 14, 2024 by r0cketdyne
fix: add foreground argument
#343 opened Feb 21, 2024 by pfldy2850
Expose verbose as a param in the launch triton script
#295 opened Jan 12, 2024 by ekagra-ranjan
Add example of tensorrt-llm usage
#225 opened Dec 15, 2023 by Pernekhan
Wrap long command lines in README.md
#134 opened Nov 15, 2023 by wangkuiyi
Draft PR about non-streaming output
#95 opened Nov 3, 2023 by BasicCoder
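
PR #452 above proposes replacing subprocess.Popen with subprocess.run. As a minimal sketch of that pattern, with a placeholder command rather than the PR's actual diff:

    import subprocess

    # Before: Popen starts the process but leaves waiting and
    # return-code handling to the caller.
    proc = subprocess.Popen(["echo", "hello"])  # placeholder command
    proc.wait()
    if proc.returncode != 0:
        raise RuntimeError(f"command failed with code {proc.returncode}")

    # After: subprocess.run blocks until the process exits and can
    # raise CalledProcessError automatically on a non-zero exit code.
    subprocess.run(["echo", "hello"], check=True)  # placeholder command

subprocess.run also returns a CompletedProcess object, so the exit status and (with capture_output=True) the process output are available without the manual bookkeeping Popen requires.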