Libraries

How to use TurboAiLabs/turbo-ai-1.5b with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="TurboAiLabs/turbo-ai-1.5b",
	filename="turbo-ai-1.5b.Q4_K_M.gguf",
)

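response = llm.create_chat_completion(
	messages=[
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

# The reply text is in the first choice's message.
print(response["choices"][0]["message"]["content"])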

llama.cpp

How to use TurboAiLabs/turbo-ai-1.5b with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Use Docker

docker model run hf.co/TurboAiLabs/turbo-ai-1.5b:Q4_K_M

vLLM

How to use TurboAiLabs/turbo-ai-1.5b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "TurboAiLabs/turbo-ai-1.5b"
# In a second terminal, call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TurboAiLabs/turbo-ai-1.5b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use TurboAiLabs/turbo-ai-1.5b with Ollama:
```
ollama run hf.co/TurboAiLabs/turbo-ai-1.5b:Q4_K_M
```

Unsloth Studio

How to use TurboAiLabs/turbo-ai-1.5b with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for TurboAiLabs/turbo-ai-1.5b to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for TurboAiLabs/turbo-ai-1.5b to start chatting

Using HuggingFace Spaces for Unsloth

No setup is required. Open https://huggingface.co/spaces/unsloth/studio in your browser and search for TurboAiLabs/turbo-ai-1.5b to start chatting.

Pi

How to use TurboAiLabs/turbo-ai-1.5b with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "TurboAiLabs/turbo-ai-1.5b:Q4_K_M"
        }
      ]
    }
  }
}
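
If `~/.pi/agent/models.json` already lists other providers, pasting the snippet over the file would drop them. The sketch below merges the entry instead. The helper name `add_llama_cpp_provider` is hypothetical, and the field names are taken only from the JSON above; nothing else about Pi's config schema is assumed.

```python
import json
from pathlib import Path


def add_llama_cpp_provider(config_path, model_id, base_url="http://localhost:8080/v1"):
    """Merge a `llama-cpp` provider entry into a Pi models.json file.

    Providers already present in the file are left untouched.
    Field names mirror the JSON snippet above (an assumption about the schema).
    """
    path = Path(config_path).expanduser()
    # Start from the existing config if there is one, else from an empty dict.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("providers", {})["llama-cpp"] = {
        "baseUrl": base_url,
        "api": "openai-completions",
        "apiKey": "none",
        "models": [{"id": model_id}],
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_llama_cpp_provider("~/.pi/agent/models.json", "TurboAiLabs/turbo-ai-1.5b:Q4_K_M")` writes the same entry as the manual step.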

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use TurboAiLabs/turbo-ai-1.5b with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use TurboAiLabs/turbo-ai-1.5b with Docker Model Runner:
```
docker model run hf.co/TurboAiLabs/turbo-ai-1.5b:Q4_K_M
```

Lemonade

How to use TurboAiLabs/turbo-ai-1.5b with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Run and chat with the model

lemonade run user.turbo-ai-1.5b-Q4_K_M

List all available models

lemonade list

Turbo AI 1.5B — Flagship Edge Email Model ⚡️

At TurboMail, we believe your personal data — especially your emails — should never leave your device.

Turbo AI 1.5B is the flagship edge model from TurboAI Labs, designed specifically for private, offline email workflows.

It is built for fast local inference and optimized for practical inbox productivity tasks such as summarization, professional drafting, intent parsing, and task extraction.

Turbo AI 1.5B powers the local-first AI vision behind TurboMail: useful AI assistance without sending private email content to external servers.

Why Turbo AI?

Most AI email tools depend on cloud inference, which means private email content may leave the user's device.

Turbo AI is built for a different future: local-first AI for email.

With Turbo AI, email workflows can run directly on the user's machine, improving privacy, speed, and offline access.

Key Benefits

Absolute Privacy: Email content can be processed locally without sending private threads to external AI servers.

Fast Local Inference: Turbo AI 1.5B ships as a Q4_K_M-quantized GGUF file, which keeps it lightweight enough for practical on-device use.

Offline Access: Summarize, draft, and extract tasks from emails even without an internet connection.

Purpose-Built for Email: Optimized for inbox workflows, professional communication, and local productivity.

Edge-Ready: Designed for desktop apps, local AI assistants, and privacy-first software.

Model Details

| Field | Value |
| --- | --- |
| Model Name | Turbo AI 1.5B |
| Organization | TurboAI Labs |
| Product | TurboMail |
| Model Type | Local edge language model |
| Format | GGUF |
| Quantization | Q4_K_M |
| Primary Use Case | Local email productivity |
| Inference Target | Desktops, edge devices, and local-first apps |
| Core Focus | Email summarization, drafting, task extraction, and intent parsing |

Intended Use Cases

Turbo AI 1.5B is designed for:

Email summarization
Draft reply generation
Intent parsing
Task extraction
Inbox triage
Follow-up detection
Local productivity workflows
Offline AI assistant experiences
Privacy-first email applications

Example workflows:

Summarizing long email threads
Creating professional reply drafts
Extracting action items from emails
Detecting whether an email needs a response
Identifying deadlines, meetings, and follow-ups
Helping users manage their inbox locally

Example Prompts

Email Summarization

Summarize this email in 3 bullet points:

Hi Sam, just checking whether you are available for a quick call tomorrow afternoon.
We want to discuss the onboarding timeline, the internship documents, and the first sprint tasks.

Draft Reply

Write a polite professional reply to this email:

Hi Sam, are you available tomorrow after 2 PM for a quick onboarding call?
We would like to go over the project timeline and your first sprint tasks.

Task Extraction

Extract action items from this email:

Please send your updated resume by Friday, confirm your start date, and share your GitHub username with the engineering team.

Intent Parsing

Classify the intent of this email:

Can you please review the attached document and send your feedback by tomorrow morning?
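
The four prompts above share one shape: an instruction line, a blank line, then the email body. A minimal sketch that captures this as reusable templates follows. The names `TASK_INSTRUCTIONS` and `build_prompt` and the task keys are hypothetical, chosen for illustration; only the instruction strings come from the examples.

```python
# Instruction lines copied from the example prompts above.
# The dictionary keys are illustrative names, not part of the model.
TASK_INSTRUCTIONS = {
    "summarize": "Summarize this email in 3 bullet points:",
    "reply": "Write a polite professional reply to this email:",
    "extract_tasks": "Extract action items from this email:",
    "classify_intent": "Classify the intent of this email:",
}


def build_prompt(task, email_text):
    """Return the user prompt for a task: instruction, blank line, email body."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task!r}")
    return f"{TASK_INSTRUCTIONS[task]}\n\n{email_text.strip()}"
```

The returned string can be used as the `content` of a `user` message in any of the chat interfaces described in this document.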

Run Locally

llama.cpp

llama-cli -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M --jinja

llama.cpp Server

llama-server -hf TurboAiLabs/turbo-ai-1.5b:Q4_K_M
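
Once `llama-server` is running, it can be called over its OpenAI-compatible HTTP API. The sketch below uses only the Python standard library and assumes the server is listening on its default address, `http://localhost:8080`. The helper names `build_chat_request` and `chat` are hypothetical, and the `temperature` value is just an example setting.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "TurboAiLabs/turbo-ai-1.5b:Q4_K_M"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.4,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Summarize this email: ..."))` prints the reply. Splitting request construction from sending keeps the payload easy to inspect without a live server.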

Ollama

ollama run hf.co/TurboAiLabs/turbo-ai-1.5b:Q4_K_M

Use with Python

Install:

pip install llama-cpp-python

Example:

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="TurboAiLabs/turbo-ai-1.5b",
    filename="turbo-ai-1.5b.Q4_K_M.gguf",
)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are Turbo AI, a helpful local email assistant. Keep responses concise, professional, and useful."
        },
        {
            "role": "user",
            "content": "Summarize this email: Please confirm your availability for tomorrow's onboarding call and send your GitHub username."
        }
    ],
    temperature=0.4,
)

print(response["choices"][0]["message"]["content"])
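
For longer drafts it can help to show text as it is generated. `create_chat_completion` accepts `stream=True`, in which case it yields OpenAI-style chunks instead of a single response. The sketch below joins the text deltas from such a stream; the helper `collect_stream` is a hypothetical name, not part of llama-cpp-python.

```python
def collect_stream(chunks, on_token=None):
    """Join the text deltas from a streamed chat completion.

    `chunks` is any iterable of OpenAI-style chunk dicts, such as the
    iterator returned by `llm.create_chat_completion(..., stream=True)`.
    `on_token`, if given, is called with each piece of text as it arrives.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:  # role-only and final chunks carry no content
            parts.append(text)
            if on_token is not None:
                on_token(text)
    return "".join(parts)
```

A possible call, reusing `llm` from the example above: `collect_stream(llm.create_chat_completion(messages=messages, stream=True), on_token=lambda t: print(t, end="", flush=True))`, where `messages` is the same list of system and user messages.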

Recommended System Prompt

You are Turbo AI, a local-first email assistant.

Help users summarize emails, draft replies, extract action items, classify intent, and understand email context.

Be concise, professional, privacy-aware, and avoid adding details that are not present in the email.

Example Output

Input:

Extract action items from this email:

Hi Sam, please send your updated resume by Friday, confirm your internship start date, and share your GitHub username with the engineering team.

Expected output:

Action items:
1. Send updated resume by Friday.
2. Confirm internship start date.
3. Share GitHub username with the engineering team.
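
An application usually needs the action items as data rather than text. The sketch below pulls list lines out of a reply shaped like the expected output above. It assumes the model returns a numbered or bulleted list, which is not guaranteed, so an empty result should be treated as "could not parse" rather than "no tasks". The name `parse_action_items` is hypothetical.

```python
import re

# Matches list lines such as "1. Send resume", "2) Confirm date", or "- Share username".
_ITEM_RE = re.compile(r"^\s*(?:\d+[.)]|[-*])\s+(.*\S)\s*$")


def parse_action_items(text):
    """Return the list items found in a model reply, in their original order."""
    items = []
    for line in text.splitlines():
        match = _ITEM_RE.match(line)
        if match:
            items.append(match.group(1))
    return items
```

Header lines such as "Action items:" are skipped because they carry no list marker.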

About TurboMail

TurboMail is a local-first desktop email client focused on privacy, speed, and AI-powered productivity.

TurboMail is designed around a simple belief: personal email data should stay on the user's device whenever possible.

Turbo AI models support this mission by enabling fast, private, local AI workflows for email.

About TurboAI Labs

TurboAI Labs builds optimized edge AI models for local-first productivity applications.

Our focus is on small, fast, practical models that can run close to the user — on desktops, personal devices, and privacy-first software environments.

Turbo AI 1.5B is our flagship model for local email intelligence.

Limitations

Turbo AI 1.5B is a compact edge model optimized for local email workflows. It may not perform as well as larger cloud models on broad reasoning, advanced coding, complex mathematics, or highly specialized tasks.

Users should review generated summaries, replies, extracted tasks, and classifications before taking action.

Disclaimer

This model is intended for productivity assistance and local email workflows. Generated outputs may be incomplete or incorrect. Always verify important information before sending emails, making decisions, or taking action.

Citation

If you use Turbo AI 1.5B, please link back to:

https://huggingface.co/TurboAiLabs/turbo-ai-1.5b

Built by TurboAI Labs for TurboMail ⚡️

GGUF

Model size: 2B params
Architecture: qwen2
Hardware compatibility: 4-bit
