Commit History

Fix QKV visualization for Mistral/Devstral architecture
4ec134b

gary-boon Claude Opus 4.5 committed on

Add future considerations doc for response size optimization
3e67ea2

gary-boon Claude Opus 4.5 committed on

Fix: Import time module at top level for SSE events
15a862b

gary-boon Claude Opus 4.5 committed on

Add SSE streaming endpoint for real-time analysis progress
172a186

gary-boon Claude Opus 4.5 committed on

feat: Include token metadata in analysis response
ee0f6c9

gary-boon Claude Opus 4.5 committed on

feat: Implement tier-based model filtering by device type
6bf9f5c

gary-boon Claude Opus 4.5 committed on

Fix: Add attn_implementation="eager" to model switch function
f94a7ae

gary-boon Claude Opus 4.5 committed on

Add Phase 5: Performance optimizations to phased plan
383a328

gary-boon Claude Opus 4.5 committed on

Add tokenSections boundaries and update system prompt
c6f4cc5

gary-boon Claude Opus 4.5 committed on

Fix: Handle MistralCommonTokenizer pad_token setter
e20ccaf

gary-boon Claude Opus 4.5 committed on

Integrate mistral-common for correct Devstral tokenization
ed06dcb

gary-boon Claude Opus 4.5 committed on

Remove mistral_common to fix dependency conflict
3d9d9ee

gary-boon Claude Opus 4.5 committed on

Use mistral_common for proper Devstral prompt formatting
3e80769

gary-boon Claude Opus 4.5 committed on

Add system prompt support for instruction-tuned models
2860768

gary-boon Claude Opus 4.5 committed on

fix: Simpler prompt format and temperature=0 for Devstral
76020ee

gary-boon Claude Opus 4.5 committed on

fix: Sanitize JSON response for NaN/Inf float values
99f6209

gary-boon Claude Opus 4.5 committed on

fix: Check chat_template is set before using apply_chat_template
474927d

gary-boon Claude Opus 4.5 committed on

fix: Add chat template support for Devstral instruct model
8d85da8

gary-boon Claude Opus 4.5 committed on

fix: Convert bfloat16 to float32 for numpy compatibility
cb6f39c

gary-boon Claude Opus 4.5 committed on

fix: Use eager attention for output_attentions support
5333b21

gary-boon Claude Opus 4.5 committed on

fix: Skip heavy ML deps in CI security checks
ba27c0c

gary-boon Claude Opus 4.5 committed on

fix: Update torch to 2.3+ for transformers compatibility
1b73605

gary-boon Claude Opus 4.5 committed on

fix: Update transformers for Devstral support
b788304

gary-boon Claude Opus 4.5 committed on

docs: Mark GPU HF Space Devstral deployment complete
65c6e2e

gary-boon Claude Opus 4.5 committed on

docs: Update phased plan with Phase 2/2b/2c completion status
688efad

gary-boon Claude Opus 4.5 committed on

Add vocabSize to modelInfo response
499afba

gary-boon Claude Opus 4.5 committed on

Update .env.spark.example: TORCH_DTYPE now auto-detected
543454f

gary-boon Claude Opus 4.5 committed on

Add recommended_dtype to model configs
62525b2

gary-boon Claude Opus 4.5 committed on

Phase 2: Add Devstral backend support
9080f28

gary-boon Claude Opus 4.5 committed on

Update plan: Phase 1 paused due to GB10 GPU support
e694533

gary-boon Claude Opus 4.5 committed on

Add DEVICE env var to force CPU mode on DGX Spark
5f122aa

gary-boon Claude Opus 4.5 committed on

Use NGC PyTorch 24.08 for Python 3.10 compatibility
a2875a2

gary-boon Claude Opus 4.5 committed on

Use NVIDIA NGC PyTorch container for GB10 support
a4cfbff

gary-boon Claude Opus 4.5 committed on

Try PyTorch nightly for GB10/sm_121 GPU support
a009a49

gary-boon Claude Opus 4.5 committed on

Make zarr/numcodecs imports optional for ARM64 compatibility
6435a75

gary-boon Claude Opus 4.5 committed on

Skip zarr/numcodecs in Spark build (ARM64 incompatible)
d129e37

gary-boon Claude Opus 4.5 committed on

Fix numcodecs ARM64 compatibility in Dockerfile.spark
772fc80

gary-boon Claude Opus 4.5 committed on

Fix Dockerfile.spark for CUDA 13.0 compatibility
a4927aa

gary-boon Claude Opus 4.5 committed on

Fix Dockerfile.spark for ARM64 architecture (DGX Spark)
9d00d33

gary-boon Claude Opus 4.5 committed on

Add GPU-enabled Dockerfile for Spark
9377cd8

gary-boon Claude Opus 4.5 committed on

Fix Dockerfile: add build-essential for numcodecs compilation
3b5c3ac

gary-boon Claude Opus 4.5 committed on

Phase 1: DGX Spark infrastructure
a2bd186

gary-boon Claude Opus 4.5 committed on

Add Devstral + DGX Spark implementation plan
ab4534a

gary-boon Claude Opus 4.5 committed on

Make QKV hook robust against shape mismatches
343dd57

gary-boon Claude committed on

Fix research attention endpoint model compatibility
f5ba954

gary-boon Claude committed on

Fix zarr/numcodecs version compatibility
9e9dc34

gary-boon Claude committed on

Add zarr to requirements.txt for storage module
f54e3f9

gary-boon Claude committed on

Add research attention analysis endpoint with real CodeGen tokenization
8f63685

gary-boon Claude committed on

Add research attention analysis endpoints with Q/K/V extraction
37ed739

gary-boon Claude committed on

Fix ablation study for Code Llama compatibility
cd300ee

gary-boon Claude committed on