---
base_model: unsloth/gemma-3n-e4b-it-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - gemma3n
license: apache-2.0
language:
  - en
datasets:
  - chimbiwide/pippa_filtered
pipeline_tag: image-text-to-text
---

# Gemma3NPC-filtered-v2

Another one of those test models

We were chatting and just decided: "What would happen if we used a higher learning rate and ran 3 epochs? 🤓"

So here it is: the second generation of filtered Gemma3NPC, made with the bare minimum effort of typing 6 characters and a little patience (3 hours).

Again, our training notebook is on GitHub.


## Training Parameters

| Parameter | Gemma3NPC-Filtered | v2 |
| --- | --- | --- |
| Learning rate | 2e-5 | 4e-5 |
| Warmup steps | 150 | 100 |
| Gradient clipping | 0.5 | 1.0 |
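
For reference, here is a minimal sketch of how the v2 values might be passed to TRL's `SFTConfig`. The learning rate, warmup, gradient clipping, epoch count, and logging cadence come from this card; the output path and save cadence are illustrative assumptions, not our exact notebook settings:

```python
from trl import SFTConfig

config = SFTConfig(
    output_dir="gemma3npc-filtered-v2",  # hypothetical output path
    learning_rate=4e-5,                  # v2: up from 2e-5
    warmup_steps=100,                    # v2: down from 150
    max_grad_norm=1.0,                   # gradient clipping, v2: up from 0.5
    num_train_epochs=3,                  # the "3 epochs" experiment
    logging_steps=5,                     # loss logged every 5 steps
    save_steps=50,                       # checkpoint cadence is an assumption
)
```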

## Training Loss

This time, we accidentally stopped training when it reached step 200. When we resumed, the run restarted from scratch but appears to have picked up from the last checkpoint.
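
With a Hugging Face `Trainer`-style API, resuming from the newest checkpoint is a one-liner. A hedged sketch, assuming `trainer` is the `SFTTrainer` already constructed in the notebook:

```python
# Resume from the most recent checkpoint in output_dir instead of starting over.
# `trainer` is assumed to be the SFTTrainer built earlier in the notebook.
trainer.train(resume_from_checkpoint=True)
```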

Here is a graph of the training loss, logged every 5 steps.

*(training loss chart)*


## Next Steps

- Our top priority now is gathering more datasets, such as SODA and some real video game data.
- We might switch to a new base model (Qwen?), since the Gemma 3n license is a little restrictive.
- We want to try methods other than SFT to improve performance, such as GRPO.