Can we control the level of thinking?
#170 · opened by kalashshah19
I am using llama-server directly from llama.cpp. Is there a parameter or some other way to set the level of thinking, like Low, Medium, or High? Many times the model thinks too much, uses up all the remaining tokens in the context length, and never returns the final message.
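
For example, something like this is what I have in mind (purely hypothetical sketch; the `reasoning_effort` field here is borrowed from the OpenAI Chat Completions API, and I'm not sure llama-server supports anything equivalent):

```python
import requests

# Hypothetical example of the kind of control I'm looking for:
# an OpenAI-style "reasoning_effort" field ("low" / "medium" / "high")
# sent to llama-server's OpenAI-compatible endpoint.
# I don't know whether llama-server actually honors such a parameter.
response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "my-model",  # placeholder model name
        "messages": [
            {"role": "user", "content": "Explain quicksort briefly."}
        ],
        "reasoning_effort": "low",  # the knob I'm hoping exists
        "max_tokens": 1024,
    },
    timeout=600,
)
print(response.json()["choices"][0]["message"]["content"])
```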