---
language:
- en
tags:
- text-detoxification
- text2text-generation
- detoxification
- content-moderation
- toxicity-reduction
- llama
- gguf
- minibase
license: apache-2.0
datasets:
- paradetox
metrics:
- toxicity-reduction
- semantic-similarity
- fluency
- latency
model-index:
- name: Detoxify-Small
  results:
  - task:
      type: text-detoxification
      name: Toxicity Reduction
    dataset:
      type: paradetox
      name: ParaDetox
      config: toxic-neutral
      split: test
    metrics:
    - type: toxicity-reduction
      value: 0.032
      name: Average Toxicity Reduction
    - type: semantic-similarity
      value: 0.471
      name: Semantic to Expected
    - type: fluency
      value: 0.919
      name: Text Fluency
    - type: latency
      value: 66.4
      name: Average Latency (ms)
---

# Detoxify-Small 🤗

<div align="center">

**A compact (~138 MB) and efficient text detoxification model that removes toxicity while preserving meaning.**

[Hugging Face](https://huggingface.co/) • [License](LICENSE) • [Join our Discord](https://discord.com/invite/BrJn4D2Guh)

*Built by [Minibase](https://minibase.ai) - Train and deploy small AI models from your browser.*
*Browse all of the models and datasets available on the [Minibase Marketplace](https://minibase.ai/wiki/Special:Marketplace).*

</div>

## 📋 Model Summary

**Minibase-Detoxify-Small** is a compact language model fine-tuned specifically for text detoxification. It takes toxic or inappropriate text as input and generates a cleaned, non-toxic version while preserving the original meaning and intent as much as possible.

### Key Features
- ⚡ **Fast Inference**: ~66 ms average response time
- 🎯 **High Fluency**: 91.9% well-formed output text
- 🧹 **Effective Detoxification**: cuts average toxicity scores by more than half (0.051 → 0.020 on ParaDetox)
- 💾 **Compact Size**: only 138 MB (GGUF-quantized)
- 🔒 **Privacy-First**: runs locally; no data is sent to external servers

## 🚀 Quick Start

### Local Inference (Recommended)

1. **Install llama.cpp** (if not already installed):
```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make
```

2. **Download and run the model**:
```bash
# Download model files
wget https://huggingface.co/minibase/detoxify-small/resolve/main/model.gguf
wget https://huggingface.co/minibase/detoxify-small/resolve/main/run_server.sh

# Make executable and run
chmod +x run_server.sh
./run_server.sh
```

3. **Make API calls**:
```python
import requests

# Detoxify text
response = requests.post("http://127.0.0.1:8000/completion", json={
    "prompt": "Instruction: Rewrite the provided text to remove the toxicity.\n\nInput: This is fucking terrible!\n\nResponse: ",
    "max_tokens": 200,
    "temperature": 0.7
})

result = response.json()
print(result["content"])  # "This is really terrible!"
```

### Python Client

```python
from detoxify_inference import DetoxifyClient

# Initialize client
client = DetoxifyClient()

# Detoxify text
toxic_text = "This product is fucking amazing, no bullshit!"
clean_text = client.detoxify_text(toxic_text)

print(clean_text)  # "This product is really amazing, no kidding!"
```

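If the bundled `detoxify_inference` module is not on your path, a minimal stand-in that talks to the same local server can be written with the standard library alone. This is a sketch, not the shipped client: the class and method names mirror the example above, and the `{"content": ...}` response shape is taken from the earlier API example.

```python
import json
import urllib.request

PROMPT_TEMPLATE = (
    "Instruction: Rewrite the provided text to remove the toxicity.\n\n"
    "Input: {text}\n\nResponse: "
)


class DetoxifyClient:
    """Minimal stand-in client wrapping the local llama.cpp server."""

    def __init__(self, base_url: str = "http://127.0.0.1:8000"):
        self.base_url = base_url

    @staticmethod
    def build_prompt(text: str) -> str:
        # Same instruction format used throughout this card.
        return PROMPT_TEMPLATE.format(text=text)

    def detoxify_text(self, text: str,
                      max_tokens: int = 200,
                      temperature: float = 0.7) -> str:
        payload = json.dumps({
            "prompt": self.build_prompt(text),
            "max_tokens": max_tokens,
            "temperature": temperature,
        }).encode("utf-8")
        req = urllib.request.Request(
            f"{self.base_url}/completion",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["content"].strip()
```

The only non-standard dependency this removes is `requests`; behavior should otherwise match the examples above.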
## 📊 Benchmarks & Performance

### ParaDetox Dataset Results (1,008 samples)

| Metric | Score | Description |
|--------|-------|-------------|
| **Toxicity Reduction** | 0.051 → 0.020 | Average toxicity score cut by more than 50% |
| **Semantic to Expected** | 0.471 (47.1%) | Similarity to human expert rewrites |
| **Semantic to Original** | 0.625 (62.5%) | How much original meaning is preserved |
| **Fluency** | 0.919 (91.9%) | Quality of generated text structure |
| **Latency** | 66.4 ms | Average response time |
| **Throughput** | ~15 req/sec | Estimated requests per second |

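The headline toxicity numbers can be sanity-checked with two lines of arithmetic: a drop from 0.051 to 0.020 is an absolute reduction of about 0.031 and a relative reduction of roughly 61%, consistent with the "more than 50%" claim.

```python
original_toxicity = 0.051  # average toxicity before detoxification
final_toxicity = 0.020     # average toxicity after detoxification

absolute_reduction = original_toxicity - final_toxicity
relative_reduction = absolute_reduction / original_toxicity

print(f"absolute: {absolute_reduction:.3f}, relative: {relative_reduction:.1%}")
# absolute: 0.031, relative: 60.8%
```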
### Dataset Breakdown

#### General Toxic Content (1,000 samples)
- **Semantic Preservation**: 62.7%
- **Fluency**: 91.9%

### Comparison with Baselines

| Model | Semantic Similarity | Toxicity Reduction | Fluency |
|-------|---------------------|--------------------|---------|
| **Detoxify-Small** | **0.471** | **0.032** | **0.919** |
| BART-base (ParaDetox) | 0.750 | ~0.15 | ~0.85 |
| Human Performance | 0.850 | ~0.25 | ~0.95 |

## 🏗️ Technical Details

### Model Architecture
- **Architecture**: LlamaForCausalLM
- **Parameters**: 49,152 (extremely compact)
- **Context Window**: 1,024 tokens
- **Quantization**: GGUF (4-bit)
- **File Size**: 138 MB
- **Memory Requirements**: 8 GB RAM minimum, 16 GB recommended

### Training Details
- **Base Model**: Custom-trained Llama architecture
- **Fine-tuning Dataset**: Curated toxic-neutral parallel pairs
- **Training Objective**: Instruction-following for detoxification
- **Optimization**: Quantized for edge deployment

### System Requirements
- **OS**: Linux, macOS, Windows
- **RAM**: 8 GB minimum, 16 GB recommended
- **Storage**: 200 MB free space
- **Dependencies**: llama.cpp, Python 3.7+

## 📝 Usage Examples

### Basic Detoxification
```python
# Input: "This is fucking awesome!"
# Output: "This is really awesome!"

# Input: "You stupid idiot, get out of my way!"
# Output: "You silly person, please move aside!"
```

### API Integration
```python
import requests

def detoxify_text(text: str) -> str:
    """Detoxify text using the local Detoxify-Small server."""
    prompt = f"Instruction: Rewrite the provided text to remove the toxicity.\n\nInput: {text}\n\nResponse: "

    response = requests.post("http://127.0.0.1:8000/completion", json={
        "prompt": prompt,
        "max_tokens": 200,
        "temperature": 0.7
    })

    return response.json()["content"]

# Usage
toxic_comment = "This product sucks donkey balls!"
clean_comment = detoxify_text(toxic_comment)
print(clean_comment)  # "This product is not very good!"
```

### Batch Processing
```python
import asyncio
import aiohttp

async def detoxify_batch(texts: list) -> list:
    """Process multiple texts concurrently."""
    async with aiohttp.ClientSession() as session:
        tasks = []
        for text in texts:
            prompt = f"Instruction: Rewrite the provided text to remove the toxicity.\n\nInput: {text}\n\nResponse: "
            payload = {
                "prompt": prompt,
                "max_tokens": 200,
                "temperature": 0.7
            }
            tasks.append(session.post("http://127.0.0.1:8000/completion", json=payload))

        responses = await asyncio.gather(*tasks)
        return [await resp.json() for resp in responses]

# Process multiple comments
comments = [
    "This is fucking brilliant!",
    "You stupid moron!",
    "What the hell is wrong with you?"
]

clean_comments = asyncio.run(detoxify_batch(comments))
```

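A llama.cpp server handles requests with limited parallelism unless configured otherwise, so firing a very large batch all at once mostly just queues work. One simple mitigation is to submit the batch in bounded chunks; the helper below is plain Python and assumes nothing about the server.

```python
from typing import Iterator, List

def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield successive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Example: process a large list of comments 4 at a time,
# reusing detoxify_batch() from the snippet above:
#
# for batch in chunked(comments, 4):
#     results = asyncio.run(detoxify_batch(batch))
```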
## 🔧 Advanced Configuration

### Server Configuration
```bash
# GPU acceleration (on macOS, Metal is used automatically when llama.cpp is built with it)
llama-server \
  -m model.gguf \
  --host 127.0.0.1 \
  --port 8000 \
  --n-gpu-layers 35

# CPU-only (lower memory usage)
llama-server \
  -m model.gguf \
  --host 127.0.0.1 \
  --port 8000 \
  --n-gpu-layers 0 \
  --threads 8

# Custom context window
llama-server \
  -m model.gguf \
  --ctx-size 2048 \
  --host 127.0.0.1 \
  --port 8000
```

### Temperature Settings
- **Low (0.1-0.3)**: Conservative detoxification, minimal changes
- **Medium (0.4-0.7)**: Balanced approach (recommended)
- **High (0.8-1.0)**: Creative detoxification, more aggressive changes

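The bands above can be captured in a small helper so callers choose a setting by intent rather than by remembering the numbers. The band names and the representative values below simply mirror the list; they are not part of the model's API.

```python
def detox_temperature(style: str) -> float:
    """Map an intent to a temperature within the recommended bands above."""
    bands = {
        "conservative": 0.2,  # minimal changes (0.1-0.3)
        "balanced": 0.7,      # recommended default (0.4-0.7)
        "creative": 0.9,      # more aggressive rewrites (0.8-1.0)
    }
    try:
        return bands[style]
    except KeyError:
        raise ValueError(f"unknown style {style!r}; choose from {sorted(bands)}")
```

This value can be passed straight into the `temperature` field of the API payloads shown earlier.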
## 🚨 Limitations & Biases

### Current Limitations
- **Vocabulary Scope**: Trained primarily on English toxic content
- **Context Awareness**: May not detect sarcasm or cultural context
- **Length Constraints**: Limited to the 1,024-token context window
- **Domain Specificity**: Optimized for general web content

### Potential Biases
- **Cultural Context**: May not handle culture-specific expressions
- **Dialect Variations**: Limited exposure to regional dialects
- **Emerging Slang**: May not recognize the newest internet slang

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

### Development Setup
```bash
# Clone the repository
git clone https://github.com/minibase-ai/detoxify-small
cd detoxify-small

# Install dependencies
pip install -r requirements.txt

# Run tests
python -m pytest tests/
```

## 📚 Citation

If you use Detoxify-Small in your research, please cite:

```bibtex
@misc{detoxify-small-2025,
  title={Detoxify-Small: A Compact Text Detoxification Model},
  author={Minibase AI Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/minibase/detoxify-small}
}
```

## 📞 Contact & Community

- **Website**: [minibase.ai](https://minibase.ai)
- **Discord Community**: [Join our Discord](https://discord.com/invite/BrJn4D2Guh)
- **Bug Reports & Feature Requests**: [Reach us on Discord](https://discord.com/invite/BrJn4D2Guh)
- **Email**: hello@minibase.ai

### Support
- 📖 **Documentation**: [help.minibase.ai](https://help.minibase.ai)
- 💬 **Community Forum**: [Join our Discord Community](https://discord.com/invite/BrJn4D2Guh)

## 📄 License

This model is released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).

## 🙏 Acknowledgments

- **ParaDetox Dataset**: Used for benchmarking and evaluation
- **llama.cpp**: For efficient local inference
- **Hugging Face**: For model hosting and community
- **Our amazing community**: For feedback and contributions

---

<div align="center">

**Built with ❤️ by the Minibase team**

*Making AI more accessible for everyone*

[📖 Minibase Help Center](https://help.minibase.ai) • [💬 Join our Discord](https://discord.com/invite/BrJn4D2Guh)

</div>