Training Domain Draft Models for Speculative Decoding: Best Practices and Insights Paper • 2503.07807 • Published Mar 10
On the Tool Manipulation Capability of Open-source Large Language Models Paper • 2305.16504 • Published May 25, 2023 • 2
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models Paper • 2510.04618 • Published Oct 6 • 123
Mini-QwQ: an edge-device-friendly reasoning model distilled from QwQ-32B. Links: kz919/QwQ-0.5B-Distilled-SFT, kz919/QwQ-0.5B-Distilled-SFT-gguf, kz919/Mini-QwQ
Cautious Optimizers: Improving Training with One Line of Code Paper • 2411.16085 • Published Nov 25, 2024 • 21
Just for the meme. But the clear lesson I learnt from building these demos is that the more powerful the underlying base model is, the closer you will get to GPT4o1. CoT is nothing more than inducing the latent reasoning capability of the model. kz919/GPT4-O1-Proximas
https://huggingface.co/spaces/kz919/Llama3.1-Instruct-O1
"It's Sunday night, fancy a game?" https://kz919-can-you-beat-405b-in-chess.hf.space/ Built with the one and only SN fast API: https://sambanova.ai/fast-api?api_ref=907266
Good lord... Spent almost a day debugging this, and it turned out to be a gradio update that was incompatible with the new fastapi. https://discuss.huggingface.co/t/huggingface-space-failed-after-working-initially/105514/8 Finally got it back online! Come chat with your favorite anime characters here: kz919/Persona-AI
Spent a few minutes building an alternative to Character AI on top of llama3.1 405B through SambaNova's super fast inference API.
Space: kz919/Persona-AI
API referral link: https://sambanova.ai/fast-api?api_ref=907266