A newer version of the Gradio SDK is available: 6.11.0
title: Math Conjecture Trainer
sdk: gradio
sdk_version: 6.6.0
python_version: '3.10'
app_file: app.py
pinned: false
emoji: 🧮
Math Conjecture Command Center
An autonomous training app for Lean-aware and math-reasoning base models that runs multi-stage curriculum fine-tuning on Space GPU, executes post-training quality evaluation, and publishes only qualified adapters, checkpoints, and run reports to your Hugging Face model repository.
The UI now uses a dark graphite command-center aesthetic with tactical green/cyan telemetry accents, higher-contrast cards, and mobile-friendly stacking while preserving the existing training controls.
This Space is the training app for maths-conjuncture-solutions and is wired to:
- dataset:
NorthernTribe-Research/math-conjecture-training-corpus - model repo:
NorthernTribe-Research/math-conjecture-model
End-to-end flow
- Download released parquet splits (
train/validation/test). - Build runtime config from
configs/math_conjecture_sota.yaml(default answered-conjecture solver profile). - Run 4-stage curriculum LoRA fine-tuning with
scripts/train_sota.py. - Run post-train evaluation (
pass@1,pass@k, exact/boxed, family metrics). - Apply quality gate thresholds before hub push.
- Emit
training_summary.json+post_eval_report.jsonand stream live telemetry in UI.
Access posture
Credentials and publish permissions are handled by deployment runtime settings.
Operations controls
Autonomous Mode: enabled by default; applies full-stage training/eval/gate/publish profile automatically.Continuous Auto-Restart: enabled by default; after each cycle ends, the next training cycle starts automatically until cancelled.Run Evaluation After Training: toggles post-train eval in runtime config.Enforce Quality Gate: enables/disables promotion gate checks.Gate Min pass@1,Gate Min pass@k,Gate Min Rows: runtime gate thresholds.Telemetry Deck: real-time stage progression, runtime posture, training-loss graph (sparkline), artifact index, and recent-run outcomes.Run Log (Live Stream + Summary JSON): unified panel for line-by-line runtime stream, heartbeats, and structured run summary.Validation Mode (No Training): validates pipeline with--dry-run.Force Dataset Redownload: bypasses cached parquet files.Abort Active Run: cancels active subprocess tree.
Artifacts
- runtime config:
workspace/runtime/math_conjecture_sota.runtime.yaml - run output root:
workspace/runs/math-conjecture-sota - final adapter:
workspace/runs/math-conjecture-sota/final_adapter - training summary:
workspace/runs/math-conjecture-sota/training_summary.json - post-eval report:
workspace/runs/math-conjecture-sota/post_eval_report.json - run history index:
workspace/runtime/run_history.json - per-run records:
workspace/runtime/run_records/<run_label>.json
Notes
- Full training runs on GPU when available and automatically falls back to CPU mode when CUDA is unavailable.
- Default SOTA profile uses
Qwen/Qwen2.5-Math-7B-Instructfor answered-conjecture solving and Lean/formal-proof alignment. - Alternate SOTA profile
configs/qwen25_math_sota.yamltargetsQwen/Qwen2.5-Math-7B-Instruct. - App handles Gradio copy-button compatibility across versions automatically.
- The interface is optimized to stay legible on mobile, with telemetry cards collapsing into a single-column stack.
Production Hardening
Before deployment, run:
python scripts/preflight_check.py
python -m unittest discover -s tests -v
Optional deeper validation (runs stage-1 dry-run training check):
python scripts/preflight_check.py --run-training-dry-run
Continuous mode now includes two production fail-safes:
- restart cooldown between cycles (default
15s) - circuit breaker that stops after consecutive non-success cycles (default
3)
Environment overrides:
CONTINUOUS_RESTART_DELAY_SECONDS(integer,>=0)CONTINUOUS_MAX_CONSECUTIVE_FAILURES(integer,>=1)APP_LOG_MAX_CHARS(integer,>=20000; default200000)RUN_HISTORY_LIMIT(integer,>=5; default80)
Recommended runtime secrets posture:
- set
HF_TOKEN/HUGGINGFACE_HUB_TOKENfrom Space secrets - avoid storing long-lived API tokens in repository files
Detailed deployment/rollback steps are documented in PRODUCTION.md.
Validation Record
- Latest verification and hardening run details are recorded in
VALIDATION_LOG.md.