SANU AI v0.1

Nepal's First Agentic AI Assistant

SANU (Smart Agentic Neural Unit) is a bilingual AI assistant that speaks Nepali and English fluently — built by Nepalis, for Nepal, for the world.

What is SANU AI?

SANU AI is a fine-tuned language model designed to understand Nepal's unique context:

Bilingual: Fluent in both Nepali and English, including Romanized Nepali (like "bro kasto cha?")
Nepal Knowledge: Trained on Nepal-specific topics — taxes, NEPSE, government services, culture, festivals
For Every Nepali: From students to professionals, farmers to IT workers, children to elders
Culturally Aware: Understands Dashain, Tihar, momo culture, Kathmandu traffic, and more
8 Additional Languages: A small set of samples in Maithili, Bhojpuri, Newari, Tamang, Tharu, Gurung, Sherpa, and Rajbanshi

Model Details

Property	Value
Base Model	Qwen/Qwen2.5-7B-Instruct
Method	QLoRA (4-bit NF4, double quantization)
LoRA Rank	r=16, alpha=16, dropout=0.05
Target Modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Training Data	290 bilingual samples (hand-crafted + synthetic)
Epochs	3
Final Loss	1.3724
Training Time	68.9 minutes on Kaggle P100
Trainable Params	~160M / 7.6B (2.1%)
Budget	$0 (free Kaggle GPU)
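To make the adapter size in the table easier to reason about, here is a rough sketch of how a LoRA parameter count can be estimated from layer shapes. The shapes below (hidden size 3584, MLP size 18944, 512-wide key/value projections, 28 layers) are assumptions taken from the published Qwen2.5-7B configuration and are not stated in this card. The function name `lora_param_count` is purely illustrative.

```python
# Illustrative arithmetic only. The layer shapes are assumed from the
# published Qwen2.5-7B configuration and are not stated in this model card.
HIDDEN = 3584          # assumed hidden size
INTERMEDIATE = 18944   # assumed MLP intermediate size
KV_DIM = 512           # assumed key/value projection width (4 KV heads x 128)
NUM_LAYERS = 28        # assumed number of transformer layers

# (in_features, out_features) for each target module listed in the table
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank: int) -> int:
    """Count LoRA weights: each adapted matrix gets A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS


for rank in (16, 64):
    print(f"r={rank}: {lora_param_count(rank):,} adapter parameters")
```

Under these assumed shapes, rank 16 works out to about 40M adapter parameters and rank 64 to about 161M. The ~160M figure in the table may therefore reflect a different rank or counting method than this sketch assumes. PEFT's `print_trainable_parameters()` on the loaded model is the authoritative check.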

Quick Start

Option 1: Use with Ollama (Recommended)

Download the GGUF version: SANU-AI-7B-v0.1-GGUF
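Once the GGUF file has been imported into a local Ollama install, the model can also be queried from Python through Ollama's local REST endpoint. The sketch below is a minimal example using only the standard library. The model name `sanu-ai` is a hypothetical placeholder for whatever name you gave the model locally, and the helper names `build_chat_payload` and `chat` are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local chat endpoint
MODEL_NAME = "sanu-ai"  # hypothetical: replace with the name of your local model


def build_chat_payload(user_message: str, model: str = MODEL_NAME) -> dict:
    """Build a non-streaming chat request body for Ollama."""
    return {
        "model": model,
        "stream": False,  # ask for one JSON response instead of a token stream
        "messages": [
            {"role": "system", "content": "You are SANU AI, Nepal's first agentic AI assistant."},
            {"role": "user", "content": user_message},
        ],
    }


def chat(user_message: str) -> str:
    """Send one message to a running Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(user_message)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["message"]["content"]
```

With the Ollama server running, `chat("bro kasto cha?")` should return SANU's reply as a string. Nothing is sent until `chat` is called.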

Option 2: Use with Python

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base + adapter
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Haubaa/SANU-AI-7B-v0.1")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Chat with SANU
messages = [
    {"role": "system", "content": "You are SANU AI, Nepal's first agentic AI assistant."},
    {"role": "user", "content": "bro NEPSE ma invest garna ke garne?"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

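# Sample with temperature and nucleus (top-p) filtering; do_sample=True makes sampling explicit
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)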
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
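For readers curious what `apply_chat_template` produces in the snippet above, the sketch below approximates the ChatML-style layout that Qwen2.5 instruct models use. It is a simplified illustration, not the tokenizer's real template, which also handles tool definitions and a default system prompt. The function name `render_chatml` is hypothetical.

```python
def render_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Approximate the ChatML layout: one <|im_start|>role ... <|im_end|> block per message."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are SANU AI, Nepal's first agentic AI assistant."},
    {"role": "user", "content": "bro NEPSE ma invest garna ke garne?"},
]
prompt = render_chatml(messages)
print(prompt)
```

Because the prompt ends with an open assistant turn, `generate` returns the prompt tokens followed by the reply. That is why the original snippet slices off the first `inputs["input_ids"].shape[1]` tokens before decoding.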

Training Data Categories

Category	Samples	Description
SANU Identity	20+	"Who are you?" in Nepali/English
Nepal Knowledge	40+	Tax, NEPSE, government, geography
Children/Education	15	ABCs, counting, stories, animals
Family/Parenting	7	Screen time, pregnancy, teen safety
Professional	6	Doctor, engineer, lawyer, teacher
Emotional Support	9	Depression, crisis, migrant workers
Citizen Lifecycle	17	Baby to elderly, farmer to IT professional
Diverse Citizens	16	Dalit, deaf, LGBTQ+, orphan, journalist
Viral/Funny	8	Momo debates, traffic, NEPSE memes
Multi-language	8	Maithili, Newari, Tamang, Sherpa, etc.
Agentic/Tool Use	30+	Function calling, multi-step reasoning
Synthetic (API)	100+	Generated via Groq API
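As a quick consistency check on the table above, the per-category counts can be summed. Entries written with a "+" are treated as lower bounds, so the result is a minimum and not an exact total. The dictionary name is illustrative.

```python
# Per-category sample counts from the table; "N+" entries are taken as lower bounds
CATEGORY_MINIMUMS = {
    "SANU Identity": 20,
    "Nepal Knowledge": 40,
    "Children/Education": 15,
    "Family/Parenting": 7,
    "Professional": 6,
    "Emotional Support": 9,
    "Citizen Lifecycle": 17,
    "Diverse Citizens": 16,
    "Viral/Funny": 8,
    "Multi-language": 8,
    "Agentic/Tool Use": 30,
    "Synthetic (API)": 100,
}

REPORTED_TOTAL = 290  # total sample count stated under Model Details

total_minimum = sum(CATEGORY_MINIMUMS.values())
print(f"Minimum from table: {total_minimum}, reported total: {REPORTED_TOTAL}")
```

The lower bounds add up to 276, which is consistent with the reported total of 290 once the "+" categories are allowed a few extra samples each.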

Limitations

Phase 1 MVP: Trained on only 290 samples; it covers core identity and Nepal basics but is not comprehensive
Knowledge Cutoff: The fine-tuning data reflects the 2024-2025 Nepal context
Not Medical/Legal Advice: Always consult professionals for critical decisions
May Hallucinate: Like all LLMs, SANU can generate incorrect information

Roadmap

Phase	Status	Goal
Phase 1 — Lite	Complete	290 samples, GGUF on Ollama
Phase 2 — Core	Next	10K+ samples, improved accuracy
Phase 3 — Pro	Planned	50K+ samples, tool calling, RAG
Phase 4 — Enterprise	Planned	Multi-modal, voice, deployment

Acknowledgements

Qwen Team for the excellent base model
Kaggle for free P100 GPU access
Hugging Face for model hosting
Every Nepali who dreams of technology made for us, by us

License

Apache 2.0 — free for commercial and personal use.

Built in Nepal, for Nepal, for the world.

Haubaa | SANU AI Project
