../aten/src/ATen/native/cuda/Indexing.cu:1236: indexSelectSmallIndex: block: [40,0,0], thread: [26,0,0] Assertion srcIndex < srcSelectDimSize failed.
#473
client:
The pload also has the image in bytes form, e.g.:
For this image:
client code:
Setting stream to False or True doesn't help. url is just 'http://xxx.xxx.xxx.xxx:80/generate'. The conv_chatml_direct was used to construct the above. I get the same problem if I remove the sampling parameters.
This model works perfectly well on the original llava worker-server-gradio setup, but has many issues with sglang, including no response or total failure on the server. This isn't a random issue: it happens repeatedly, and it is rare that things work.
Other models like llama3 work perfectly fine with the same code.
The error:
Related? #461
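For context, the assertion in the title is the bounds check in the CUDA index-select kernel: it fires when an index is not smaller than the size of the dimension being selected from. Below is a minimal sketch in plain Python of the same check done on the host side. It assumes (not confirmed in this issue) that the offending indices are token ids looked up in an embedding table. The helper name `find_out_of_range_ids` and the sizes used are hypothetical and for illustration only.

```python
def find_out_of_range_ids(token_ids, vocab_size):
    """Return (position, id) pairs that would fail `srcIndex < srcSelectDimSize`.

    Hypothetical helper: `vocab_size` stands in for the size of the
    selected dimension (e.g. the number of rows in an embedding table).
    Negative ids are flagged too, as a stricter host-side sanity check.
    """
    return [
        (pos, tok)
        for pos, tok in enumerate(token_ids)
        if not 0 <= tok < vocab_size
    ]


# Illustrative values only, not taken from the failing request.
vocab_size = 32000
token_ids = [1, 319, 32000, 15043]

# Position 2 holds id 32000, which is not < 32000.
print(find_out_of_range_ids(token_ids, vocab_size))
```

Running a check like this on the tokenized prompt before it reaches the GPU would show whether any id falls outside the table, for example if extra image or chat-template tokens were added to the tokenizer but not to the model's embeddings.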