# Toucan SFT 350M LoRA
LoRA adapter for the LFM2 350M model, fine-tuned on the Toucan-1.5M SFT dataset.
## Model Details
- Base Model: unsloth/LFM2-350M-unsloth-bnb-4bit
- Fine-tuning Dataset: Agent-Ark/Toucan-1.5M (SFT split)
- Training Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 64
- LoRA Alpha: 128
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
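The adapter settings above map onto PEFT's `LoraConfig` roughly as follows. This is a minimal sketch, not the actual training configuration; dropout and bias values are assumptions, since the card only specifies rank, alpha, and target modules.

```python
from peft import LoraConfig

# Sketch of the adapter configuration described above.
lora_config = LoraConfig(
    r=64,                      # LoRA rank
    lora_alpha=128,            # LoRA alpha (scaling = alpha / r = 2.0)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,          # assumed; not stated in the card
    bias="none",               # assumed; not stated in the card
    task_type="CAUSAL_LM",
)
```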
## Training Configuration
- Epochs: 1
- Batch Size: 8 per device
- Gradient Accumulation: 16 steps
- Effective Batch Size: 128
- Learning Rate: 3e-4
- Max Sequence Length: 4096 (with packing)
- Optimizer: paged_adamw_8bit
- Precision: bfloat16
- Optimizations: Flash Attention 2, Gradient Checkpointing, TF32
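For reference, the hyperparameters above correspond roughly to the following `transformers.TrainingArguments`. This is a sketch, not the actual `train_toucan_a100.py` script; the output directory and logging settings are placeholders, and Flash Attention 2 is enabled at model-load time rather than here. Packing and the 4096-token limit are handled by the SFT trainer (see Training Details below).

```python
from transformers import TrainingArguments

# Rough mapping of the training configuration above.
# Effective batch size = 8 (per device) * 16 (grad accumulation) = 128.
training_args = TrainingArguments(
    output_dir="toucan-sft-350m-lora",   # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    optim="paged_adamw_8bit",
    bf16=True,
    tf32=True,
    gradient_checkpointing=True,
    logging_steps=10,                    # assumed
)
```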
## Evaluation Results

### Perplexity on the Toucan Dataset
| Model | Perplexity | Loss | Perplexity Reduction |
|---|---|---|---|
| Base Model | 12.33 | 2.51 | - |
| Fine-tuned Model | 5.52 | 1.71 | 55.21% |
**Evaluation Details:**
- Samples: 200 per model
- Total tokens: 82,517 per model
- The fine-tuned model achieves a 55.21% reduction in perplexity, indicating significantly better domain adaptation.
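Perplexity here is the exponential of the mean token-level cross-entropy loss (e.g. exp(1.71) ≈ 5.5). The exact evaluation script is not part of this card; a minimal sketch of how such a number can be computed over a set of held-out Toucan texts might look like this:

```python
import math
import torch

@torch.no_grad()
def perplexity(model, tokenizer, texts, max_length=4096, device="cuda"):
    """Compute corpus perplexity as exp(total NLL / total predicted tokens)."""
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt",
                        truncation=True, max_length=max_length).to(device)
        # Using the inputs as labels gives the mean cross-entropy over predicted tokens.
        out = model(**enc, labels=enc["input_ids"])
        n_tokens = enc["input_ids"].shape[1] - 1  # tokens actually predicted
        total_nll += out.loss.item() * n_tokens
        total_tokens += n_tokens
    return math.exp(total_nll / total_tokens)
```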
### Standard Benchmarks
| Benchmark | Base Model | Fine-tuned Model | Change |
|---|---|---|---|
| GSM8K (flexible-match accuracy) | 0.130 | 0.130 | +0.000 (+0.0%) |
| HellaSwag (normalized accuracy) | 0.430 | 0.430 | +0.000 (+0.0%) |
Note: Benchmarks were evaluated on 100 samples each to keep evaluation time short.
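The card does not state which harness produced these numbers. A common way to reproduce this kind of comparison is EleutherAI's lm-evaluation-harness; a hypothetical invocation for the adapter is sketched below, where the `peft` model argument, task names, and the 100-sample limit are assumptions about how the run could be set up.

```python
import lm_eval

# Hypothetical reproduction with lm-evaluation-harness; not taken from the card.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=unsloth/LFM2-350M-unsloth-bnb-4bit,"
        "peft=ethanker/toucan-sft-350m-lora"
    ),
    tasks=["gsm8k", "hellaswag"],
    limit=100,   # 100 samples per task, matching the note above
)
print(results["results"])
```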
## Key Findings

- **Significant perplexity improvement:** The fine-tuned model shows a 55.21% reduction in perplexity on the Toucan dataset, indicating successful domain adaptation.
- **Domain-specific training:** Fine-tuned on 119,287 examples from the Toucan-1.5M SFT dataset, targeting tool-use and agentic interactions.
- **Efficient training:** LoRA with rank 64 enables efficient fine-tuning while maintaining model quality.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/LFM2-350M-unsloth-bnb-4bit",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(model, "ethanker/toucan-sft-350m-lora")

# Load the tokenizer for inference
tokenizer = AutoTokenizer.from_pretrained("unsloth/LFM2-350M-unsloth-bnb-4bit")
```
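A minimal generation example continuing from the snippet above. The chat-template usage assumes the base tokenizer ships one; the prompt and sampling settings are placeholders.

```python
# Build a chat-formatted prompt (assumes the tokenizer provides a chat template).
messages = [{"role": "user", "content": "What tools would you use to check the weather?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response (sampling settings are placeholders).
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```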
## Training Details
The model was trained using:
- Training Script: `train_toucan_a100.py`
- Hardware: NVIDIA A100 80GB PCIe
- Training Time: ~8 minutes per epoch
- Dataset Size: 119,287 examples
- Sequence Packing: Enabled (2-3x speedup)
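Sequence packing and the 4096-token limit are typically handled by the SFT trainer rather than `TrainingArguments`. A sketch of how the pieces could be wired together with TRL's `SFTTrainer` is below; the dataset split name is an assumption, argument names such as `packing` and `max_seq_length` vary across TRL versions (newer releases move them onto `SFTConfig`), and depending on the dataset schema a `formatting_func` or text field may also be required.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTTrainer

# Split name is an assumption; the card only says the SFT split of
# Agent-Ark/Toucan-1.5M (119,287 examples) was used.
dataset = load_dataset("Agent-Ark/Toucan-1.5M", split="train")

# Base model, loaded as in the Usage section.
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/LFM2-350M-unsloth-bnb-4bit", device_map="auto"
)

# Packing concatenates short examples into full 4096-token sequences, which is
# where the quoted 2-3x speedup comes from.
trainer = SFTTrainer(
    model=model,
    args=training_args,        # the TrainingArguments sketched above
    train_dataset=dataset,
    peft_config=lora_config,   # the LoraConfig sketched under Model Details
    packing=True,
    max_seq_length=4096,
)
trainer.train()
```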
## Citation

```bibtex
@misc{toucan-sft-350m-lora,
  title={Toucan SFT 350M LoRA - Fine-tuned LFM2 Model for Tool-Use},
  author={Ethan},
  year={2025},
  howpublished={\url{https://huggingface.co/ethanker/toucan-sft-350m-lora}}
}
```