Add batch size 4 configurations for Llama 1B and 3B models 3b6312a verified dacorvo HF Staff committed on Jun 25, 2025
Added TinyLlama as requested by Jim burtoft d9640f4 verified dacorvo HF Staff committed on May 12, 2025
Rename inference-cache-config/llama-3.1-8B.json to inference-cache-config/llama.json 14844a0 verified dacorvo HF Staff committed on Sep 26, 2024
Rename inference-cache-config/llama.json to inference-cache-config/llama2.json f06a55a verified dacorvo HF Staff committed on Apr 19, 2024
Added Llama-70b batch_size 4 to inference cache 593822e verified dacorvo HF Staff committed on Mar 8, 2024