This gives gibberish nonsense in text-generation-server

by Websteria - opened Apr 11, 2023

Apr 11, 2023

I have not been able to get the quantized version to make any sense. The HF version works great, just a bit slow.

Websteria changed discussion title from This gives gibberish in text-generation-server to This gives gibberish nonsense in text-generation-server Apr 11, 2023

TheBloke

Owner Apr 11, 2023

Please read the README.md. You need to either update text-generation-webui's GPTQ-for-LLaMa to the latest version or use the file koala-13B-4bit-128g.no-act-order.ooba.pt instead.

Hypersniper

Apr 15, 2023

I am up to date with the latest files for text-generation-webui and GPTQ-for-LLaMa, and I can confirm that I also get gibberish on the 7B and 13B quantized versions.

TheBloke

Owner Apr 15, 2023

When you say you're up-to-date, are you sure you're using the right GPTQ-for-LLaMa version? It needs to be the qwopqwop repo, not the oobabooga fork.

If updating GPTQ-for-LLaMa does not work for you, just use koala-13B-4bit-128g.no-act-order.ooba.pt. Remove any other .pt/.safetensors files from your model directory so that only koala-13B-4bit-128g.no-act-order.ooba.pt remains. That file will work with any version of GPTQ-for-LLaMa.
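
For anyone who would rather script the cleanup, here is a minimal sketch in Python. It is only an illustration, not part of text-generation-webui or GPTQ-for-LLaMa: the function name `isolate_weights`, the backup folder naming, and the example path in the comment are all assumptions. Instead of deleting the extra weight files, it moves them into a sibling folder so they can be restored later.

```python
from pathlib import Path
import shutil

# The file named in this thread; every other weight file gets moved away.
KEEP = "koala-13B-4bit-128g.no-act-order.ooba.pt"
WEIGHT_SUFFIXES = {".pt", ".safetensors"}


def isolate_weights(model_dir, keep=KEEP, backup_dir=None):
    """Move every .pt/.safetensors file except `keep` out of `model_dir`.

    Hypothetical helper: returns the names of the files that were moved.
    Non-weight files (config, tokenizer, etc.) are left untouched.
    """
    model_dir = Path(model_dir)
    if not (model_dir / keep).is_file():
        # Refuse to move anything if the file we want to keep is missing.
        raise FileNotFoundError(f"{keep} not found in {model_dir}")

    # Assumed default: a sibling folder next to the model directory.
    backup = (
        Path(backup_dir)
        if backup_dir is not None
        else model_dir.parent / (model_dir.name + "-unused-weights")
    )

    moved = []
    for path in sorted(model_dir.iterdir()):
        is_weight = path.is_file() and path.suffix in WEIGHT_SUFFIXES
        if is_weight and path.name != keep:
            backup.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(backup / path.name))
            moved.append(path.name)
    return moved


# Example call (the path is a placeholder; use your own model directory):
# isolate_weights("models/your-koala-model-folder")
```

Moving rather than deleting is a deliberate choice here: if you later update GPTQ-for-LLaMa and want to try the other files again, you can just move them back.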

Websteria

Apr 15, 2023

This fixes my problem. Thank you!!!

Websteria changed discussion status to closed Apr 15, 2023
