Turbo AI 1.5B — Flagship Edge Email Model ⚡️
At TurboMail, we believe your personal data — especially your emails — should never leave your device.
Turbo AI 1.5B is the flagship edge model from TurboAI Labs, designed specifically for private, offline email workflows.
It is built for fast local inference and optimized for practical inbox productivity tasks such as summarization, professional drafting, intent parsing, and task extraction.
Turbo AI 1.5B powers the local-first AI vision behind TurboMail: useful AI assistance without sending private email content to external servers.
Why Turbo AI?
Most AI email tools depend on cloud inference, which means private email content may leave the user's device.
Turbo AI is built for a different future: local-first AI for email.
With Turbo AI, email workflows run directly on the user's machine, which keeps content private, avoids network round trips, and works offline.
Key Benefits
Absolute Privacy
Email content can be processed locally without sending private threads to external AI servers.
Fast Local Inference
Turbo AI 1.5B is lightweight enough for practical on-device use.
Offline Access
Summarize, draft, and extract tasks from emails even without an internet connection.
Purpose-Built for Email
Optimized for inbox workflows, professional communication, and local productivity.
Edge-Ready
Designed for desktop apps, local AI assistants, and privacy-first software.
Model Details
| Field | Value |
|---|---|
| Model Name | Turbo AI 1.5B |
| Organization | TurboAI Labs |
| Product | TurboMail |
| Model Type | Local edge language model |
| Format | GGUF |
| Quantization | Q4_K_M |
| Primary Use Case | Local email productivity |
| Inference Target | Desktops, edge devices, and local-first apps |
| Core Focus | Email summarization, drafting, task extraction, and intent parsing |
Intended Use Cases
Turbo AI 1.5B is designed for:
- Email summarization
- Draft reply generation
- Intent parsing
- Task extraction
- Inbox triage
- Follow-up detection
- Local productivity workflows
- Offline AI assistant experiences
- Privacy-first email applications
Example workflows:
- Summarizing long email threads
- Creating professional reply drafts
- Extracting action items from emails
- Detecting whether an email needs a response
- Identifying deadlines, meetings, and follow-ups
- Helping users manage their inbox locally
Example Prompts
Email Summarization
Summarize this email in 3 bullet points:
Hi Sam, just checking whether you are available for a quick call tomorrow afternoon.
We want to discuss the onboarding timeline, the internship documents, and the first sprint tasks.
Draft Reply
Write a polite professional reply to this email:
Hi Sam, are you available tomorrow after 2 PM for a quick onboarding call?
We would like to go over the project timeline and your first sprint tasks.
Task Extraction
Extract action items from this email:
Please send your updated resume by Friday, confirm your start date, and share your GitHub username with the engineering team.
Intent Parsing
Classify the intent of this email:
Can you please review the attached document and send your feedback by tomorrow morning?
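The four prompt styles above share one shape: an instruction line followed by the email text. A minimal sketch of a helper that assembles them is shown here; the task keys and the `build_prompt` function are hypothetical names chosen for illustration and are not part of any Turbo AI API.

```python
# Hypothetical helper: task keys and function name are illustrative only.
# The instruction strings are copied from the example prompts above.
TASK_INSTRUCTIONS = {
    "summarize": "Summarize this email in 3 bullet points:",
    "draft_reply": "Write a polite professional reply to this email:",
    "extract_tasks": "Extract action items from this email:",
    "classify_intent": "Classify the intent of this email:",
}


def build_prompt(task: str, email_body: str) -> str:
    """Return the instruction line for `task` followed by the email text."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task!r}")
    return f"{TASK_INSTRUCTIONS[task]}\n{email_body.strip()}"


if __name__ == "__main__":
    email = "Can you please review the attached document and send your feedback by tomorrow morning?"
    print(build_prompt("classify_intent", email))
```

Keeping the instructions in one table makes it easy to adjust wording per task without touching the calling code.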
Run Locally
llama.cpp
llama-cli -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M --jinja
llama.cpp Server
llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M
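Once `llama-server` is running, it can be queried over its OpenAI-compatible chat completions endpoint. The sketch below uses only the Python standard library and assumes the server is listening on its default address, `http://localhost:8080`. The function names are hypothetical, and the `model` value is an assumption that a single-model server may simply ignore.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
BASE_URL = "http://localhost:8080/v1"


def build_payload(user_text: str, temperature: float = 0.4) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        # Assumed identifier; adjust it if your server expects another name.
        "model": "TurboAiLabs/turbo-ai-1.5b:Q4_K_M",
        "messages": [{"role": "user", "content": user_text}],
        "temperature": temperature,
    }


def ask_local_server(user_text: str, base_url: str = BASE_URL) -> str:
    """POST the prompt to the local server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(user_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

With the server up, `ask_local_server("Summarize this email: ...")` should return the model's reply as a string. No email content leaves the machine, because the request goes to `localhost`.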
Ollama
ollama run hf.co/TurboAiLabs/turbo-ai-1.5b:Q4_K_M
Use with Python
Install:
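pip install llama-cpp-python huggingface-hub
Example:
from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (cached after the first run) and load it.
# Llama.from_pretrained needs the huggingface-hub package installed above.
llm = Llama.from_pretrained(
    repo_id="TurboAiLabs/turbo-ai-1.5b",
    filename="turbo-ai-1.5b.Q4_K_M.gguf",
)

# Send a system prompt plus the user's request; a low temperature keeps output focused.
response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are Turbo AI, a helpful local email assistant. Keep responses concise, professional, and useful.",
        },
        {
            "role": "user",
            "content": "Summarize this email: Please confirm your availability for tomorrow's onboarding call and send your GitHub username.",
        },
    ],
    temperature=0.4,
)

# The reply text is in the first choice of the OpenAI-style response dict.
print(response["choices"][0]["message"]["content"])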
Recommended System Prompt
You are Turbo AI, a local-first email assistant.
Help users summarize emails, draft replies, extract action items, classify intent, and understand email context.
Be concise, professional, privacy-aware, and avoid adding details that are not present in the email.
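As a rough sketch, the recommended system prompt can be stored once and prepended to every request. `SYSTEM_PROMPT` and `build_messages` are hypothetical names used only for this illustration.

```python
# The prompt text is the recommended system prompt from this model card.
SYSTEM_PROMPT = (
    "You are Turbo AI, a local-first email assistant.\n"
    "Help users summarize emails, draft replies, extract action items, "
    "classify intent, and understand email context.\n"
    "Be concise, professional, privacy-aware, and avoid adding details "
    "that are not present in the email."
)


def build_messages(user_text: str) -> list:
    """Return a chat message list with the system prompt first."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]
```

The result can be passed as `messages=build_messages(...)` to `llm.create_chat_completion` in place of a hand-written message list.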
Example Output
Input:
Extract action items from this email:
Hi Sam, please send your updated resume by Friday, confirm your internship start date, and share your GitHub username with the engineering team.
Expected output:
Action items:
1. Send updated resume by Friday.
2. Confirm internship start date.
3. Share GitHub username with the engineering team.
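If the model answers in the numbered format shown above, an application can turn the text into a list for display or for a task manager. The sketch below assumes that format; real output may vary, so an empty result should be treated as "could not parse" rather than "no tasks". `parse_action_items` is a hypothetical helper name.

```python
import re

# Matches lines such as "1. Send updated resume by Friday." or "2) Confirm date".
_ITEM_PATTERN = re.compile(r"^\s*\d+[.)]\s+(.*\S)\s*$")


def parse_action_items(model_output: str) -> list:
    """Collect the text of each numbered line, ignoring headers and blanks."""
    items = []
    for line in model_output.splitlines():
        match = _ITEM_PATTERN.match(line)
        if match:
            items.append(match.group(1))
    return items


example_output = """Action items:
1. Send updated resume by Friday.
2. Confirm internship start date.
3. Share GitHub username with the engineering team."""

for item in parse_action_items(example_output):
    print("-", item)
```

Because generated output can be incomplete or incorrect, the parsed items are best shown to the user for confirmation before anything is acted on.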
About TurboMail
TurboMail is a local-first desktop email client focused on privacy, speed, and AI-powered productivity.
TurboMail is designed around a simple belief: personal email data should stay on the user's device whenever possible.
Turbo AI models support this mission by enabling fast, private, local AI workflows for email.
About TurboAI Labs
TurboAI Labs builds optimized edge AI models for local-first productivity applications.
Our focus is on small, fast, practical models that can run close to the user — on desktops, personal devices, and privacy-first software environments.
Turbo AI 1.5B is our flagship model for local email intelligence.
Limitations
Turbo AI 1.5B is a compact edge model optimized for local email workflows. It may not perform as well as larger cloud models on broad reasoning, advanced coding, complex mathematics, or highly specialized tasks.
Users should review generated summaries, replies, extracted tasks, and classifications before taking action.
Disclaimer
This model is intended for productivity assistance and local email workflows. Generated outputs may be incomplete or incorrect. Always verify important information before sending emails, making decisions, or taking action.
Citation
If you use Turbo AI 1.5B, please link back to:
https://huggingface.co/TurboAiLabs/turbo-ai-1.5b
Built by TurboAI Labs for TurboMail ⚡️
