Pull requests: vllm-project/vllm

Pull requests list

[Bugfix] [AMD] [ROCm] pd disagg for rocm fix
Labels: rocm
#21724 opened Jul 28, 2025 by seungrokj
3 of 4 tasks
fix the mxfp4 packed qk weight loading issue for llama4
Labels: llama
#21722 opened Jul 28, 2025 by xuebwang Draft
4 tasks
[Frontend] Add LLM.reward specified to reward models
Labels: documentation, frontend
#21720 opened Jul 28, 2025 by noooop
4 tasks
[Model] Prioritize Transformers fallback over suffix matching
Labels: multi-modality, new-model, ready
#21719 opened Jul 28, 2025 by DarkLight1337
1 of 4 tasks
[Feat] Support Flashinfer TRT-LLM FP8-query/output Attention Kernel
Labels: llama, performance, rocm, v1
#21716 opened Jul 28, 2025 by elvischenv Draft
4 tasks
update flashinfer to v0.2.9rc2
Labels: ci/build
#21701 opened Jul 28, 2025 by weireweire
3 of 4 tasks
[Benchmark] Support ready check timeout in vllm bench serve
Labels: performance, ready
#21696 opened Jul 28, 2025 by yeqcharlotte
3 of 4 tasks
[Misc] Add unit tests for chunked local attention
Labels: v1
#21692 opened Jul 27, 2025 by sarckk
3 of 4 tasks
Deprecate V0
Labels: ci/build, v1
#21690 opened Jul 27, 2025 by WoosukKwon
Migrate InternVLImageInputs and InternVLVideoInputs to TensorSchema
Labels: ready
#21684 opened Jul 27, 2025 by bbeckca
[Model] [Draft PR] Add support for SmallThinker model series
Labels: documentation, new-model
#21670 opened Jul 27, 2025 by SorryMaker2022 Draft
4 tasks
Introduce RayPPCommunicator for ray-based PP
#21660 opened Jul 26, 2025 by ruisearch42
3 of 4 tasks
[Misc] refactor code return slice without brackets
Labels: frontend, llama, new-model, performance, structured-output, tool-calling, tpu, v1
#21649 opened Jul 26, 2025 by andyxning
4 tasks