Template error
#4
by coder543 - opened
With a recent llama-server build, I can no longer load this model.
Error:
llama_kv_cache: CUDA0 KV buffer size = 49152.00 MiB
llama_kv_cache: size = 49152.00 MiB (262144 cells, 48 layers, 1/1 seqs), K (f16): 24576.00 MiB, V (f16): 24576.00 MiB
sched_reserve: reserving ...
sched_reserve: CUDA0 compute buffer size = 804.01 MiB
sched_reserve: CUDA_Host compute buffer size = 522.01 MiB
sched_reserve: graph nodes = 1495
sched_reserve: graph splits = 2
sched_reserve: reserve took 122.97 ms, sched copies = 1
srv load_model: initializing slots, n_slots = 1
slot load_model: id 0 | task -1 | new slot, n_ctx = 262144
srv load_model: prompt cache is enabled, size limit: 8192 MiB
srv load_model: use `--cache-ram 0` to disable the prompt cache
srv load_model: for more info see https://github.com/ggml-org/llama.cpp/pull/16391
srv init: init: chat template parsing error:
------------
While executing FilterExpression at line 119, column 75 in source:
...'] is not none and message['tool_calls']|length > 0 -%}↵ {{ '\n<tool_...
^
Error: Unknown (built-in) filter 'length' for type Undefined (hint: 'tool_calls')
srv init: init: please consider disabling jinja via --no-jinja, or use a custom chat template via --chat-template
srv init: init: for example: --no-jinja --chat-template chatml
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
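For reference, the workaround the log itself suggests (disabling Jinja and falling back to a built-in template) would look something like this; a sketch, assuming `llama-server` is on PATH and the model file is in the current directory:

```shell
# Disable Jinja template parsing and fall back to the built-in chatml
# template, per the hint printed in the error output above.
# Note: this sidesteps the error but likely breaks tool-calling support,
# since the model's own template is no longer applied.
llama-server \
  -m apriel-1.6-15b-thinker-q8_0.gguf \
  --no-jinja \
  --chat-template chatml
```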
File:
$ sha256sum apriel-1.6-15b-thinker-q8_0.gguf
1feee4edd5ee37d5a7b434904d88a33cfd4708e08a395d7e39b88de78fa89c87 apriel-1.6-15b-thinker-q8_0.gguf
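For what it's worth, the error seems to come from the template applying the `length` filter to `message['tool_calls']` on messages where that key is absent, so the parser sees `length` applied to an Undefined value. A common guard in chat templates is an explicit `defined` test before the `none` check; a sketch of how the failing expression at line 119 might read, assuming the surrounding template structure is otherwise unchanged:

```jinja
{%- if message['tool_calls'] is defined and message['tool_calls'] is not none and message['tool_calls']|length > 0 -%}
```

Whether the right fix is in the model's bundled template or in llama.cpp's template engine is unclear from the error alone.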