Switch Space trainer defaults to math_conjecture_sota profile and remove DeepSeek references
9a4f619 verified | title: Math Conjecture Trainer | |
| sdk: gradio | |
| sdk_version: "6.6.0" | |
| python_version: "3.10" | |
| app_file: app.py | |
| pinned: false | |
| emoji: 🧮 | |
| # Math Conjecture Command Center | |
| An autonomous training app for Lean-aware and math-reasoning base models that runs multi-stage curriculum fine-tuning on Space GPU, executes post-training quality evaluation, and publishes only qualified adapters, checkpoints, and run reports to your Hugging Face model repository. | |
| The UI now uses a dark graphite command-center aesthetic with tactical green/cyan telemetry accents, higher-contrast cards, and mobile-friendly stacking while preserving the existing training controls. | |
| This Space is the training app for `maths-conjuncture-solutions` and is wired to: | |
| - dataset: `NorthernTribe-Research/math-conjecture-training-corpus` | |
| - model repo: `NorthernTribe-Research/math-conjecture-model` | |
| ## End-to-end flow | |
| 1. Download released parquet splits (`train/validation/test`). | |
| 2. Build runtime config from `configs/math_conjecture_sota.yaml` (default answered-conjecture solver profile). | |
| 3. Run 4-stage curriculum LoRA fine-tuning with `scripts/train_sota.py`. | |
| 4. Run post-train evaluation (`pass@1`, `pass@k`, exact/boxed, family metrics). | |
| 5. Apply quality gate thresholds before hub push. | |
| 6. Emit `training_summary.json` + `post_eval_report.json` and stream live telemetry in UI. | |
| ## Access posture | |
| Credentials and publish permissions are handled by deployment runtime settings. | |
| ## Operations controls | |
| - `Autonomous Mode`: enabled by default; applies full-stage training/eval/gate/publish profile automatically. | |
| - `Continuous Auto-Restart`: enabled by default; after each cycle ends, the next training cycle starts automatically until cancelled. | |
| - `Run Evaluation After Training`: toggles post-train eval in runtime config. | |
| - `Enforce Quality Gate`: enables/disables promotion gate checks. | |
| - `Gate Min pass@1`, `Gate Min pass@k`, `Gate Min Rows`: runtime gate thresholds. | |
| - `Telemetry Deck`: real-time stage progression, runtime posture, training-loss graph (sparkline), artifact index, and recent-run outcomes. | |
| - `Run Log (Live Stream + Summary JSON)`: unified panel for line-by-line runtime stream, heartbeats, and structured run summary. | |
| - `Validation Mode (No Training)`: validates pipeline with `--dry-run`. | |
| - `Force Dataset Redownload`: bypasses cached parquet files. | |
| - `Abort Active Run`: cancels active subprocess tree. | |
| ## Artifacts | |
| - runtime config: `workspace/runtime/math_conjecture_sota.runtime.yaml` | |
| - run output root: `workspace/runs/math-conjecture-sota` | |
| - final adapter: `workspace/runs/math-conjecture-sota/final_adapter` | |
| - training summary: `workspace/runs/math-conjecture-sota/training_summary.json` | |
| - post-eval report: `workspace/runs/math-conjecture-sota/post_eval_report.json` | |
| - run history index: `workspace/runtime/run_history.json` | |
| - per-run records: `workspace/runtime/run_records/<run_label>.json` | |
| ## Notes | |
| - Full training runs on GPU when available and automatically falls back to CPU mode when CUDA is unavailable. | |
| - Default SOTA profile uses `Qwen/Qwen2.5-Math-7B-Instruct` for answered-conjecture solving and Lean/formal-proof alignment. | |
| - Alternate SOTA profile `configs/qwen25_math_sota.yaml` targets `Qwen/Qwen2.5-Math-7B-Instruct`. | |
| - App handles Gradio copy-button compatibility across versions automatically. | |
| - The interface is optimized to stay legible on mobile, with telemetry cards collapsing into a single-column stack. | |
| ## Production Hardening | |
| Before deployment, run: | |
| ```bash | |
| python scripts/preflight_check.py | |
| python -m unittest discover -s tests -v | |
| ``` | |
| Optional deeper validation (runs stage-1 dry-run training check): | |
| ```bash | |
| python scripts/preflight_check.py --run-training-dry-run | |
| ``` | |
| Continuous mode now includes two production fail-safes: | |
| - restart cooldown between cycles (default `15s`) | |
| - circuit breaker that stops after consecutive non-success cycles (default `3`) | |
| Environment overrides: | |
| - `CONTINUOUS_RESTART_DELAY_SECONDS` (integer, `>=0`) | |
| - `CONTINUOUS_MAX_CONSECUTIVE_FAILURES` (integer, `>=1`) | |
| - `APP_LOG_MAX_CHARS` (integer, `>=20000`; default `200000`) | |
| - `RUN_HISTORY_LIMIT` (integer, `>=5`; default `80`) | |
| Recommended runtime secrets posture: | |
| - set `HF_TOKEN` / `HUGGINGFACE_HUB_TOKEN` from Space secrets | |
| - avoid storing long-lived API tokens in repository files | |
| Detailed deployment/rollback steps are documented in `PRODUCTION.md`. | |
| ## Validation Record | |
| - Latest verification and hardening run details are recorded in `VALIDATION_LOG.md`. | |