Template error

#4
by coder543 - opened

With a recent build of llama-server, I can no longer load this model.

Error:

llama_kv_cache:      CUDA0 KV buffer size = 49152.00 MiB
llama_kv_cache: size = 49152.00 MiB (262144 cells,  48 layers,  1/1 seqs), K (f16): 24576.00 MiB, V (f16): 24576.00 MiB
sched_reserve: reserving ...
sched_reserve:      CUDA0 compute buffer size =   804.01 MiB
sched_reserve:  CUDA_Host compute buffer size =   522.01 MiB
sched_reserve: graph nodes  = 1495
sched_reserve: graph splits = 2
sched_reserve: reserve took 122.97 ms, sched copies = 1
srv    load_model: initializing slots, n_slots = 1
slot   load_model: id  0 | task -1 | new slot, n_ctx = 262144
srv    load_model: prompt cache is enabled, size limit: 8192 MiB
srv    load_model: use `--cache-ram 0` to disable the prompt cache
srv    load_model: for more info see https://github.com/ggml-org/llama.cpp/pull/16391
srv          init: init: chat template parsing error: 
------------
While executing FilterExpression at line 119, column 75 in source:
...'] is not none and message['tool_calls']|length > 0 -%}↵            {{ '\n<tool_...
                                           ^
Error: Unknown (built-in) filter 'length' for type Undefined (hint: 'tool_calls')
srv          init: init: please consider disabling jinja via --no-jinja, or use a custom chat template via --chat-template
srv          init: init: for example: --no-jinja --chat-template chatml
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
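As a workaround, the log itself suggests bypassing the bundled Jinja template. A minimal invocation following that hint (the model filename is the one from this repo; other flags are omitted for brevity) would be:

```shell
# Workaround taken from the server log's own suggestion:
# disable Jinja templating and fall back to the built-in chatml template.
llama-server \
  --model apriel-1.6-15b-thinker-q8_0.gguf \
  --no-jinja \
  --chat-template chatml
```

Note this sidesteps the parsing error rather than fixing the template, so tool-calling formatting from the original template is lost.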

File:

$ sha256sum apriel-1.6-15b-thinker-q8_0.gguf 
1feee4edd5ee37d5a7b434904d88a33cfd4708e08a395d7e39b88de78fa89c87  apriel-1.6-15b-thinker-q8_0.gguf
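For what it's worth, the error points at the template applying `|length` to `message['tool_calls']` when that key is undefined for a given message. A common guard (sketched here against the quoted line 119; I have not verified this against the full template) is to test `is defined` first so the `and` short-circuits before the filter runs:

```jinja
{#- hypothetical guard: confirm the key exists before filtering it -#}
{%- if message['tool_calls'] is defined and message['tool_calls'] is not none and message['tool_calls']|length > 0 -%}
```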