Toucan SFT 350M LoRA

A fine-tuned LoRA adapter for the LFM2 350M model, trained on the Toucan-1.5M SFT dataset.

Model Details

Training Configuration

  • Epochs: 1
  • Batch Size: 8 per device
  • Gradient Accumulation: 16 steps
  • Effective Batch Size: 128
  • Learning Rate: 3e-4
  • Max Sequence Length: 4096 (with packing)
  • Optimizer: paged_adamw_8bit
  • Precision: bfloat16
  • Optimizations: Flash Attention 2, Gradient Checkpointing, TF32
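
The training script itself (train_toucan_a100.py, see Training Details below) is not included in this repository, so the snippet below is only a sketch of how the hyperparameters above could be expressed, assuming TRL's SFTConfig; the choice of tooling and the exact argument names are assumptions.

# Sketch of the training configuration above using TRL's SFTConfig.
# The use of TRL and the exact argument names are assumptions, not the original script.
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="toucan-sft-350m-lora",   # illustrative output path
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,      # 8 x 16 = 128 effective batch size
    learning_rate=3e-4,
    max_seq_length=4096,                 # renamed to max_length in newer TRL releases
    packing=True,                        # sequence packing
    optim="paged_adamw_8bit",
    bf16=True,                           # bfloat16 precision
    tf32=True,
    gradient_checkpointing=True,
)

# Flash Attention 2 is enabled when loading the base model, e.g.
# AutoModelForCausalLM.from_pretrained(..., attn_implementation="flash_attention_2").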

Evaluation Results

Perplexity on Toucan Dataset

Model              Perplexity   Loss   Improvement
Base Model         12.33        2.51   -
Fine-tuned Model   5.52         1.71   55.21%

Evaluation Details:

  • Samples: 200 per model
  • Total tokens: 82,517 per model
  • The fine-tuned model achieves a 55.21% reduction in perplexity, indicating significantly better domain adaptation.
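
Perplexity here is consistent with exp(mean token loss): exp(2.51) ≈ 12.3 and exp(1.71) ≈ 5.5. The exact evaluation script is not part of this card; the following is a minimal sketch of that computation, with the sample handling left illustrative.

# Minimal perplexity sketch: perplexity = exp(mean cross-entropy loss over held-out tokens).
# The texts, truncation length, and per-sample token weighting are illustrative.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/LFM2-350M-unsloth-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)
model.eval()

def perplexity(texts):
    total_loss, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096).to(model.device)
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])  # mean loss over this sample's tokens
        n = enc["input_ids"].numel()
        total_loss += out.loss.item() * n
        total_tokens += n
    return math.exp(total_loss / total_tokens)

# ppl = perplexity(list_of_evaluation_texts)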

Standard Benchmarks

Benchmark                          Base Model   Fine-tuned Model   Change
GSM8K (accuracy, flexible)         0.130        0.130              +0.000 (+0.0%)
HellaSwag (accuracy, normalized)   0.430        0.430              +0.000 (+0.0%)

Note: Each benchmark was evaluated on 100 samples to keep evaluation time short.
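
The metric names (flexible accuracy for GSM8K, normalized accuracy for HellaSwag) match EleutherAI's lm-evaluation-harness, but the card does not name the evaluation tool; the snippet below assumes that harness and its Python API.

# Hedged sketch of the 100-sample benchmark runs with lm-evaluation-harness (assumed tooling).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=unsloth/LFM2-350M-unsloth-bnb-4bit,"
        "peft=ethanker/toucan-sft-350m-lora,"   # remove this line to score the base model
        "dtype=bfloat16"
    ),
    tasks=["gsm8k", "hellaswag"],
    limit=100,   # 100 samples per task, as in the table above
)
print(results["results"])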

Key Findings

  1. Significant Perplexity Improvement: The fine-tuned model shows a 55.21% reduction in perplexity on the Toucan dataset, indicating successful domain adaptation.

  2. Domain-Specific Training: Fine-tuned on 119K examples from the Toucan-1.5M SFT dataset, tuning the model for tool use and agentic interactions.

  3. Efficient Training: Using LoRA with rank 64 allows for efficient fine-tuning while maintaining model quality.
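
Only the LoRA rank (64) is stated on this card; the sketch below fills in the remaining adapter settings with assumed values (alpha, dropout, target modules) purely for illustration.

# Sketch of a rank-64 LoRA configuration with PEFT. Only r=64 comes from this card;
# lora_alpha, lora_dropout, and target_modules are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("unsloth/LFM2-350M-unsloth-bnb-4bit")
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,                 # assumed; often set to 1-2x the rank
    lora_dropout=0.05,              # assumed
    target_modules="all-linear",    # assumed; attaches adapters to every linear layer
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable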

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/LFM2-350M-unsloth-bnb-4bit",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "ethanker/toucan-sft-350m-lora")

# Use model for inference
tokenizer = AutoTokenizer.from_pretrained("unsloth/LFM2-350M-unsloth-bnb-4bit")
# ...
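
To round out the snippet above, here is an illustrative generation example; it assumes the tokenizer ships a chat template, and the prompt and generation settings are placeholders.

# Illustrative continuation of the snippet above: a single chat turn.
messages = [{"role": "user", "content": "Which tools would you call to check the weather in Paris?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))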

Training Details

The model was trained using:

  • Training Script: train_toucan_a100.py
  • Hardware: NVIDIA A100 80GB PCIe
  • Training Time: ~8 minutes per epoch
  • Dataset Size: 119,287 examples
  • Sequence Packing: Enabled (2-3x speedup)
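
For context, a hedged sketch of the data selection implied above (119,287 examples drawn from Toucan-1.5M); the Hugging Face dataset id used here is an assumption, not taken from this card.

# Hedged data-preparation sketch: subsample ~119K examples from the Toucan-1.5M SFT dataset.
# The dataset id below is assumed; packing itself is handled by the trainer (packing=True).
from datasets import load_dataset

dataset = load_dataset("Agent-Ark/Toucan-1.5M", split="train")   # assumed dataset id
dataset = dataset.shuffle(seed=42).select(range(119_287))        # 119,287 training examples
print(dataset)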

Citation

@misc{toucan-sft-350m-lora,
  title={Toucan SFT 350M LoRA - Fine-tuned LFM2 Model for Tool-Use},
  author={Ethan},
  year={2025},
  howpublished={\url{https://huggingface.co/ethanker/toucan-sft-350m-lora}}
}