# Toucan SFT 350M LoRA
LoRA adapter for the LFM2 350M model, fine-tuned on the Toucan-1.5M SFT dataset.
## Model Details
- Base Model: unsloth/LFM2-350M-unsloth-bnb-4bit
- Fine-tuning Dataset: Agent-Ark/Toucan-1.5M (SFT split)
- Training Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 64
- LoRA Alpha: 128
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
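The adapter settings above map onto PEFT's `LoraConfig` roughly as follows. This is a minimal sketch, not the actual training configuration; dropout and bias values are assumptions, since the card only specifies rank, alpha, and target modules.

```python
from peft import LoraConfig

# Sketch of the adapter configuration described above.
lora_config = LoraConfig(
    r=64,                      # LoRA rank
    lora_alpha=128,            # LoRA alpha (scaling = alpha / r = 2.0)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,          # assumed; not stated in the card
    bias="none",               # assumed; not stated in the card
    task_type="CAUSAL_LM",
)
```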
## Training Configuration
- Epochs: 1
- Batch Size: 8 per device
- Gradient Accumulation: 16 steps
- Effective Batch Size: 128
- Learning Rate: 3e-4
- Max Sequence Length: 4096 (with packing)
- Optimizer: paged_adamw_8bit
- Precision: bfloat16
- Optimizations: Flash Attention 2, Gradient Checkpointing, TF32
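For reference, the hyperparameters above correspond roughly to the following `transformers.TrainingArguments`. This is a sketch, not the actual `train_toucan_a100.py` script; the output directory and logging settings are placeholders, and Flash Attention 2 is enabled at model-load time rather than here. Packing and the 4096-token limit are handled by the SFT trainer (see Training Details below).

```python
from transformers import TrainingArguments

# Rough mapping of the training configuration above.
# Effective batch size = 8 (per device) * 16 (grad accumulation) = 128.
training_args = TrainingArguments(
    output_dir="toucan-sft-350m-lora",   # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    optim="paged_adamw_8bit",
    bf16=True,
    tf32=True,
    gradient_checkpointing=True,
    logging_steps=10,                    # assumed
)
```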
## Evaluation Results

### Perplexity on the Toucan Dataset
| Model | Perplexity | Loss | Perplexity Reduction |
|---|---|---|---|
| Base Model | 12.33 | 2.51 | - |
| Fine-tuned Model | 5.52 | 1.71 | 55.21% |
**Evaluation Details:**
- Samples: 200 per model
- Total tokens: 82,517 per model
- The fine-tuned model achieves a 55.21% reduction in perplexity, indicating significantly better domain adaptation.
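Perplexity here is the exponential of the mean token-level cross-entropy loss (e.g. exp(1.71) ≈ 5.5). The exact evaluation script is not part of this card; a minimal sketch of how such a number can be computed over a set of held-out Toucan texts might look like this:

```python
import math
import torch

@torch.no_grad()
def perplexity(model, tokenizer, texts, max_length=4096, device="cuda"):
    """Compute corpus perplexity as exp(total NLL / total predicted tokens)."""
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt",
                        truncation=True, max_length=max_length).to(device)
        # Using the inputs as labels gives the mean cross-entropy over predicted tokens.
        out = model(**enc, labels=enc["input_ids"])
        n_tokens = enc["input_ids"].shape[1] - 1  # tokens actually predicted
        total_nll += out.loss.item() * n_tokens
        total_tokens += n_tokens
    return math.exp(total_nll / total_tokens)
```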
### Standard Benchmarks
| Benchmark | Base Model | Fine-tuned Model | Change |
|---|---|---|---|
| GSM8K (flexible-match accuracy) | 0.130 | 0.130 | +0.000 (+0.0%) |
| HellaSwag (normalized accuracy) | 0.430 | 0.430 | +0.000 (+0.0%) |
Note: Benchmarks were evaluated on 100 samples each to keep evaluation time short.
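The card does not state which harness produced these numbers. A common way to reproduce this kind of comparison is EleutherAI's lm-evaluation-harness; a hypothetical invocation for the adapter is sketched below, where the `peft` model argument, task names, and the 100-sample limit are assumptions about how the run could be set up.

```python
import lm_eval

# Hypothetical reproduction with lm-evaluation-harness; not taken from the card.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=unsloth/LFM2-350M-unsloth-bnb-4bit,"
        "peft=ethanker/toucan-sft-350m-lora"
    ),
    tasks=["gsm8k", "hellaswag"],
    limit=100,   # 100 samples per task, matching the note above
)
print(results["results"])
```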
## Key Findings

- **Significant perplexity improvement:** The fine-tuned model shows a 55.21% reduction in perplexity on the Toucan dataset, indicating successful domain adaptation.
- **Domain-specific training:** Fine-tuned on 119,287 examples from the Toucan-1.5M SFT dataset, targeting tool-use and agentic interactions.
- **Efficient training:** LoRA with rank 64 enables efficient fine-tuning while maintaining model quality.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/LFM2-350M-unsloth-bnb-4bit",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(model, "ethanker/toucan-sft-350m-lora")

# Load the tokenizer for inference
tokenizer = AutoTokenizer.from_pretrained("unsloth/LFM2-350M-unsloth-bnb-4bit")
```
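A minimal generation example continuing from the snippet above. The chat-template usage assumes the base tokenizer ships one; the prompt and sampling settings are placeholders.

```python
# Build a chat-formatted prompt (assumes the tokenizer provides a chat template).
messages = [{"role": "user", "content": "What tools would you use to check the weather?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response (sampling settings are placeholders).
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```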
## Training Details
The model was trained using:
- Training Script: `train_toucan_a100.py`
- Hardware: NVIDIA A100 80GB PCIe
- Training Time: ~8 minutes per epoch
- Dataset Size: 119,287 examples
- Sequence Packing: Enabled (2-3x speedup)
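Sequence packing and the 4096-token limit are typically handled by the SFT trainer rather than `TrainingArguments`. A sketch of how the pieces could be wired together with TRL's `SFTTrainer` is below; the dataset split name is an assumption, argument names such as `packing` and `max_seq_length` vary across TRL versions (newer releases move them onto `SFTConfig`), and depending on the dataset schema a `formatting_func` or text field may also be required.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTTrainer

# Split name is an assumption; the card only says the SFT split of
# Agent-Ark/Toucan-1.5M (119,287 examples) was used.
dataset = load_dataset("Agent-Ark/Toucan-1.5M", split="train")

# Base model, loaded as in the Usage section.
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/LFM2-350M-unsloth-bnb-4bit", device_map="auto"
)

# Packing concatenates short examples into full 4096-token sequences, which is
# where the quoted 2-3x speedup comes from.
trainer = SFTTrainer(
    model=model,
    args=training_args,        # the TrainingArguments sketched above
    train_dataset=dataset,
    peft_config=lora_config,   # the LoraConfig sketched under Model Details
    packing=True,
    max_seq_length=4096,
)
trainer.train()
```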
## Citation

```bibtex
@misc{toucan-sft-350m-lora,
  title={Toucan SFT 350M LoRA - Fine-tuned LFM2 Model for Tool-Use},
  author={Ethan},
  year={2025},
  howpublished={\url{https://huggingface.co/ethanker/toucan-sft-350m-lora}}
}
```