Issues: vllm-project/vllm
[Usage]: How to save log to local path and control log rate? (usage) #9146 · opened Oct 8, 2024 by lzcchl
[Bug]: vllm much slower on long context inputs when using --enable-lora even when lora is not used (bug) #9143 · opened Oct 8, 2024 by badrjd
[Bug]: free_seq is invoked multiple times unnecessarily when one request is finished (bug) #9142 · opened Oct 8, 2024 by tongping
[Bug]: out-of-bound in attention.cu (bug) #9136 · opened Oct 7, 2024 by molly-ting
[Usage]: Issue running inference on a model across multiple nodes (usage) #9134 · opened Oct 7, 2024 by kogans1107
[Bug]: Unable to use --enable-lora on latest vllm docker container (v0.6.2) (bug) #9133 · opened Oct 7, 2024 by noelo
[Bug]: assert len(indices) == len(inputs) with Qwen/Qwen2-VL-2B-Instruct (bug) #9128 · opened Oct 7, 2024 by sayakpaul
[Bug]: Error Encountered in vLLM Benchmarking with Input Length greater than 8192 in Llama 3.1 405B Model (bug) #9127 · opened Oct 7, 2024 by Bihan
[Usage]: Not getting the inference metrics in the API response (usage) #9126 · opened Oct 7, 2024 by vverma01232
[Bug]: BlockSpaceManagerV1.get_common_computed_block_ids returns empty string, causing msgspec decode failure (bug) #9122 · opened Oct 7, 2024 by amberOoO
[Bug]: Unsupported base layer: QKVParallelLinear when loading lora to a quantized model (bug) #9120 · opened Oct 7, 2024 by fahadh4ilyas
[Bug]: Installation from latest commit (version wrong) (bug) #9119 · opened Oct 7, 2024 by johnnynunez
[Bug]: Issue running vLLM OpenAI server as a non-root user in K8s (bug) #9118 · opened Oct 7, 2024 by luhurfth
[Bug]: In v0.6.2, when tp=1, TPOT becomes very slow for batch sizes of around 10 (did not happen in v0.5.5) (bug) #9113 · opened Oct 7, 2024 by ashgold
[Feature]: LLMEngine and ModelConfig explicitly require path or HF model id, but no InferenceClient class for locally running VLLM server (feature request) #9110 · opened Oct 6, 2024 by DanielViglione
[RFC]: hide continuous batching complexity through forward context (RFC) #9098 · opened Oct 5, 2024 by youkaichao
[Bug]: vllm serve Exception in ASGI application (bug) #9096 · opened Oct 5, 2024 by SpaceHunterInf
[Bug]: VLLM Model Fails on Kubernetes with "CUDA error: operation not permitted when stream is capturing" (bug) #9094 · opened Oct 5, 2024 by CREESTL
[Installation]: cannot install vllm with openvino backend (installation) #9092 · opened Oct 5, 2024 by guanxiang
[Bug]: vLLM MQLLMEngine Timeout - Json Schema (bug) #9082 · opened Oct 4, 2024 by wrisigo
[Doc]: Clear documentation about function / tool calling with examples (documentation) #9074 · opened Oct 4, 2024 by greg2705
[Misc]: Need to understand support for torch.compile in Q4 roadmap (misc) #9072 · opened Oct 4, 2024 by amd-abhikulk
[Usage]: Benchmarking Issues: Low Success Rate and Tensor Parallel Size Constraints on 8x AMD MI300x GPUs (rocm, usage) #9070 · opened Oct 4, 2024 by Bihan