Inference Endpoints?
#11, opened by iamrobotbear
Does anyone know the correct configuration for deploying this on Hugging Face Inference Endpoints? The deployment fails during warmup with the following log:
Exit code: 1. Reason: self._call_impl(*args, **kwargs)\n File \"/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py\", line 1747, in _call_impl\n return forward_call(*args, **kwargs)\n File \"/usr/src/server/text_generation_server/layers/rotary.py\", line 57, in forward\n rotary_emb.apply_rotary(q1, q2, cos, sin, q1, q2, False)\nRuntimeError: The size of tensor a (16) must match the size of tensor b (32) at non-singleton dimension 0"},"target":"text_generation_launcher"}
{"timestamp":"2025-03-04T13:27:40.781124Z","level":"ERROR","message":"Server error: The size of tensor a (16) must match the size of tensor b (32) at non-singleton dimension 0","target":"text_generation_router_v3::client","filename":"backends/v3/src/client/mod.rs","line_number":45,"span":{"name":"warmup"},"spans":[{"max_batch_size":"None","max_input_length":"None","max_prefill_tokens":4096,"max_total_tokens":"None","name":"warmup"},{"name":"warmup"}]}
Error: Backend(Warmup(Generation("The size of tensor a (16) must match the size of tensor b (32) at non-singleton dimension 0")))
{"timestamp":"2025-03-04T13:27:40.865075Z","level":"ERROR","fields":{"message":"Webserver Crashed"},"target":"text_generation_launcher"}
{"timestamp":"2025-03-04T13:27:40.865107Z","level":"INFO","fields":{"message":"Shutting down shards"},"target":"text_generation_launcher"}
{"timestamp":"2025-03-04T13:27:40.931987Z","level":"INFO","fields":{"message":"Terminating shard"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
{"timestamp":"2025-03-04T13:27:40.932184Z","level":"INFO","fields":{"message":"Waiting for shard to gracefully shutdown"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
{"timestamp":"2025-03-04T13:27:41.332741Z","level":"INFO","fields":{"message":"shard terminated"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
Error: WebserverFailed
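
For anyone skimming the log above, here is a minimal Python sketch that pulls out only the ERROR-level messages. It is not part of any Hugging Face tooling; the function name `error_messages` is made up for illustration, and it assumes each JSON entry sits on its own line and carries its message either at the top level or under `fields`, as the entries above do.

```python
import json


def error_messages(log_lines):
    """Return the messages of ERROR-level entries found in JSON log lines."""
    messages = []
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip plain-text lines such as "Error: WebserverFailed"
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed entries
        if entry.get("level") != "ERROR":
            continue
        # Some entries carry the message at the top level, others under "fields".
        message = entry.get("message") or entry.get("fields", {}).get("message")
        if message:
            messages.append(message)
    return messages


# Sample lines copied from the log above.
sample = [
    '{"timestamp":"2025-03-04T13:27:40.865075Z","level":"ERROR","fields":{"message":"Webserver Crashed"},"target":"text_generation_launcher"}',
    '{"timestamp":"2025-03-04T13:27:40.865107Z","level":"INFO","fields":{"message":"Shutting down shards"},"target":"text_generation_launcher"}',
    "Error: WebserverFailed",
]
print(error_messages(sample))
```

Run against the full log, this should surface the tensor size mismatch raised during warmup, which is the first failure; the "Webserver Crashed" entry only follows from it.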