---
title: Math Conjecture Trainer
sdk: gradio
sdk_version: "6.6.0"
python_version: "3.10"
app_file: app.py
pinned: false
emoji: 🧮

---

# Math Conjecture Command Center

An autonomous training app for Lean-aware and math-reasoning base models. It runs multi-stage curriculum fine-tuning on the Space GPU, evaluates the result after training, and publishes adapters, checkpoints, and run reports to your Hugging Face model repository only when they pass the quality gate.

The UI uses a dark graphite command-center theme with green/cyan telemetry accents, higher-contrast cards, and mobile-friendly stacking. The existing training controls are unchanged.

This Space is the training app for `maths-conjuncture-solutions` and is wired to:

- dataset: `NorthernTribe-Research/math-conjecture-training-corpus`
- model repo: `NorthernTribe-Research/math-conjecture-model`

## End-to-end flow

1. Download the released parquet splits (`train`, `validation`, `test`).
2. Build runtime config from `configs/math_conjecture_sota.yaml` (default answered-conjecture solver profile).
3. Run 4-stage curriculum LoRA fine-tuning with `scripts/train_sota.py`.
4. Run post-train evaluation (`pass@1`, `pass@k`, exact/boxed, family metrics).
5. Apply quality gate thresholds before hub push.
6. Emit `training_summary.json` + `post_eval_report.json` and stream live telemetry in UI.
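
The `pass@1` and `pass@k` metrics in step 4 can be illustrated with a small sketch. The source does not state which estimator `scripts/train_sota.py` uses; the code below assumes the commonly used unbiased estimator `1 - C(n-c, k) / C(n, k)`, where `n` samples are drawn per problem and `c` of them are correct. The function names are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset contains a hit.
        return 1.0
    # Probability that a random size-k subset contains no correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given one (n, c) pair per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With `k = 1` the estimator reduces to the fraction of correct samples, `c / n`.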

## Access posture

Credentials and publish permissions come from the deployment's runtime settings, such as Space secrets.

## Operations controls

- `Autonomous Mode`: enabled by default; applies full-stage training/eval/gate/publish profile automatically.
- `Continuous Auto-Restart`: enabled by default; after each cycle ends, the next training cycle starts automatically until cancelled.
- `Run Evaluation After Training`: toggles post-train eval in runtime config.
- `Enforce Quality Gate`: enables/disables promotion gate checks.
- `Gate Min pass@1`, `Gate Min pass@k`, `Gate Min Rows`: runtime gate thresholds.
- `Telemetry Deck`: real-time stage progression, runtime posture, training-loss graph (sparkline), artifact index, and recent-run outcomes.
- `Run Log (Live Stream + Summary JSON)`: unified panel for line-by-line runtime stream, heartbeats, and structured run summary.
- `Validation Mode (No Training)`: validates the pipeline with `--dry-run` instead of training.
- `Force Dataset Redownload`: bypasses cached parquet files.
- `Abort Active Run`: cancels the active subprocess tree.
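
As a rough illustration of how the three gate thresholds and the `Enforce Quality Gate` toggle could combine, here is a minimal sketch. It is not the app's implementation: the metric keys (`pass_at_1`, `pass_at_k`, `evaluated_rows`), the `GateThresholds` class, and `evaluate_gate` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    """Hypothetical container for the three runtime gate thresholds."""
    min_pass_at_1: float
    min_pass_at_k: float
    min_rows: int


def evaluate_gate(
    metrics: dict,
    thresholds: GateThresholds,
    enforce: bool = True,
) -> tuple[bool, list[str]]:
    """Return (passed, failure_reasons) for one post-eval metrics dict."""
    if not enforce:
        # Gate disabled: nothing is checked.
        return True, []

    failures: list[str] = []
    rows = metrics.get("evaluated_rows", 0)  # assumed key name
    if rows < thresholds.min_rows:
        failures.append(f"rows {rows} < {thresholds.min_rows}")

    p1 = metrics.get("pass_at_1", 0.0)  # assumed key name
    if p1 < thresholds.min_pass_at_1:
        failures.append(f"pass@1 {p1:.3f} < {thresholds.min_pass_at_1:.3f}")

    pk = metrics.get("pass_at_k", 0.0)  # assumed key name
    if pk < thresholds.min_pass_at_k:
        failures.append(f"pass@k {pk:.3f} < {thresholds.min_pass_at_k:.3f}")

    return not failures, failures
```

Returning the failure reasons alongside the boolean makes it straightforward to surface why a run was not promoted.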

## Artifacts

- runtime config: `workspace/runtime/math_conjecture_sota.runtime.yaml`
- run output root: `workspace/runs/math-conjecture-sota`
- final adapter: `workspace/runs/math-conjecture-sota/final_adapter`
- training summary: `workspace/runs/math-conjecture-sota/training_summary.json`
- post-eval report: `workspace/runs/math-conjecture-sota/post_eval_report.json`
- run history index: `workspace/runtime/run_history.json`
- per-run records: `workspace/runtime/run_records/<run_label>.json`
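
To check which of these artifacts exist after a run, a small helper along the following lines may be useful. It is a sketch only: it assumes the paths above are relative to a root directory you pass in, and it checks the `run_records` directory rather than a specific `<run_label>.json` file. `ARTIFACTS` and `missing_artifacts` are hypothetical names.

```python
from pathlib import Path

# Paths copied from the list above; per-run records are checked at directory level.
ARTIFACTS = {
    "runtime config": "workspace/runtime/math_conjecture_sota.runtime.yaml",
    "run output root": "workspace/runs/math-conjecture-sota",
    "final adapter": "workspace/runs/math-conjecture-sota/final_adapter",
    "training summary": "workspace/runs/math-conjecture-sota/training_summary.json",
    "post-eval report": "workspace/runs/math-conjecture-sota/post_eval_report.json",
    "run history index": "workspace/runtime/run_history.json",
    "per-run records": "workspace/runtime/run_records",
}


def missing_artifacts(root: Path = Path(".")) -> list[str]:
    """Return the labels of artifacts that do not exist under root."""
    return [
        label
        for label, relative in ARTIFACTS.items()
        if not (root / relative).exists()
    ]
```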

## Notes

- Full training runs on GPU when available and automatically falls back to CPU mode when CUDA is unavailable.
- Default SOTA profile uses `Qwen/Qwen2.5-Math-7B-Instruct` for answered-conjecture solving and Lean/formal-proof alignment.
- Alternate SOTA profile `configs/qwen25_math_sota.yaml` targets `Qwen/Qwen2.5-Math-7B-Instruct`.
- The app automatically handles Gradio copy-button compatibility across versions.
- The interface is optimized to stay legible on mobile, with telemetry cards collapsing into a single-column stack.

## Production hardening

Before deployment, run:

```bash
python scripts/preflight_check.py
python -m unittest discover -s tests -v
```

Optional deeper validation (runs stage-1 dry-run training check):

```bash
python scripts/preflight_check.py --run-training-dry-run
```

Continuous mode now includes two production fail-safes:

- restart cooldown between cycles (default `15s`)
- circuit breaker that stops after consecutive non-success cycles (default `3`)
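
The interaction between the cooldown and the circuit breaker can be sketched as a loop. This is an illustrative model under stated assumptions, not the app's code: `run_continuous`, `run_cycle`, `should_cancel`, and the return strings are hypothetical, and the sketch assumes a successful cycle resets the failure counter.

```python
import time
from typing import Callable


def run_continuous(
    run_cycle: Callable[[], bool],
    restart_delay_seconds: float = 15,
    max_consecutive_failures: int = 3,
    should_cancel: Callable[[], bool] = lambda: False,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Repeat training cycles until cancelled or the circuit breaker trips."""
    consecutive_failures = 0
    while True:
        if should_cancel():
            return "cancelled"

        succeeded = run_cycle()
        if succeeded:
            consecutive_failures = 0  # assumed: success resets the streak
        else:
            consecutive_failures += 1

        if consecutive_failures >= max_consecutive_failures:
            # Circuit breaker: stop restarting after repeated failures.
            return "circuit_open"

        # Cooldown before the next cycle starts.
        sleep(restart_delay_seconds)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.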

Environment overrides:

- `CONTINUOUS_RESTART_DELAY_SECONDS` (integer, `>=0`)
- `CONTINUOUS_MAX_CONSECUTIVE_FAILURES` (integer, `>=1`)
- `APP_LOG_MAX_CHARS` (integer, `>=20000`; default `200000`)
- `RUN_HISTORY_LIMIT` (integer, `>=5`; default `80`)
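
One way to read these overrides is shown below. The variable names, minimums, and defaults come from the lists above; the handling of invalid input is an assumption, since the source does not say whether out-of-range values are clamped or rejected. This sketch falls back to the default on non-integer input and clamps values below the minimum. `read_int_env` and `load_settings` are hypothetical helpers.

```python
import os
from typing import Mapping, Optional


def read_int_env(
    name: str,
    default: int,
    minimum: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Read an integer override, falling back to default and clamping to minimum."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        value = int(raw.strip())
    except ValueError:
        return default  # assumed behavior for non-integer input
    return max(value, minimum)  # assumed behavior for values below the minimum


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the four documented overrides into one dict."""
    return {
        "restart_delay_seconds": read_int_env(
            "CONTINUOUS_RESTART_DELAY_SECONDS", 15, 0, env),
        "max_consecutive_failures": read_int_env(
            "CONTINUOUS_MAX_CONSECUTIVE_FAILURES", 3, 1, env),
        "log_max_chars": read_int_env("APP_LOG_MAX_CHARS", 200000, 20000, env),
        "run_history_limit": read_int_env("RUN_HISTORY_LIMIT", 80, 5, env),
    }
```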

Recommended runtime secrets posture:

- set `HF_TOKEN` / `HUGGINGFACE_HUB_TOKEN` from Space secrets
- avoid storing long-lived API tokens in repository files
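
A minimal sketch of resolving the token from the environment at runtime follows. The lookup order (`HF_TOKEN` first, then `HUGGINGFACE_HUB_TOKEN`) is an assumption, and `resolve_hub_token` is a hypothetical helper rather than part of the app.

```python
import os
from typing import Mapping, Optional


def resolve_hub_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty token found in the environment, else None."""
    env = os.environ if env is None else env
    for name in ("HF_TOKEN", "HUGGINGFACE_HUB_TOKEN"):  # assumed precedence
        value = (env.get(name) or "").strip()
        if value:
            return value
    return None
```

Returning `None` rather than raising lets the caller decide whether a missing token should block publishing or only disable it.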

Detailed deployment/rollback steps are documented in `PRODUCTION.md`.

## Validation record

- Details of the latest verification and hardening run are recorded in `VALIDATION_LOG.md`.