search-reranker-broad-policy-v7

Broad-policy serving release for gov search.

This is a serving-policy release over temsa/search-reranker-broad-policy-v6:

  • same raw weights
  • same q8 ONNX artifact family
  • stronger latest-turn dominance on newline-separated chat turns
  • stronger budget-year/topic matching
  • stronger penalties for generic news when the query asks for a specific topic

Recommended serving profiles

Broad policy queries:

  • policy: gov_broad_v1
  • backend: onnx
  • threads: 10
  • max length: 176

Office-holder identity queries:

  • policy: office_holder_v1
  • backend: onnx
  • threads: 10
  • max length: 224
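
The two recommended profiles can be written down as plain configuration. The sketch below is illustrative only: `ServingProfile`, `PROFILES`, and `profile_for` are hypothetical names, not part of the released bundle, and the fallback to the broad profile for unknown query kinds is an assumption. Only the field values are taken from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingProfile:
    """One recommended serving configuration (hypothetical container)."""
    policy: str
    backend: str
    threads: int
    max_length: int


# Values copied from the recommended serving profiles above.
PROFILES = {
    "broad": ServingProfile("gov_broad_v1", "onnx", 10, 176),
    "office_holder": ServingProfile("office_holder_v1", "onnx", 10, 224),
}


def profile_for(query_kind: str) -> ServingProfile:
    """Return the profile for a query kind.

    Assumption: unknown kinds fall back to the broad profile.
    """
    return PROFILES.get(query_kind, PROFILES["broad"])
```

How a query is classified as broad or office-holder is not described here, so the caller is expected to supply `query_kind`.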

Why v7 exists

Fresh unseen holdouts uncovered a real serving gap on:

  • budget YYYY queries for unseen welfare topics
  • Gaelic latest-news queries
  • newline-separated multi-turn queries where the final turn asks for the latest news

v7 fixes those cases in the bundled serving policy without changing the model weights.
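
To make the affected query shapes concrete, here is a minimal sketch of how one might pick out the final turn of a newline-separated query and a "budget YYYY" year. This is not the bundled policy's logic, which is not shown in this card; `latest_turn`, `budget_year`, and the regular expression are hypothetical helpers for illustration.

```python
import re
from typing import Optional

# Assumption: a "budget YYYY" query contains the word "budget" followed by a
# four-digit year starting with 19 or 20.
_BUDGET_YEAR = re.compile(r"\bbudget\s+((?:19|20)\d{2})\b", re.IGNORECASE)


def latest_turn(query: str) -> str:
    """Return the last non-empty line of a newline-separated multi-turn query."""
    turns = [turn.strip() for turn in query.splitlines() if turn.strip()]
    return turns[-1] if turns else ""


def budget_year(query: str) -> Optional[int]:
    """Return the year of a 'budget YYYY' mention, or None if there is none."""
    match = _BUDGET_YEAR.search(query)
    return int(match.group(1)) if match else None
```

For example, `latest_turn("housing grants\nwhat is the latest news?")` yields the second line, which is the turn the v7 policy is described as weighting more strongly.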

Key q8 results

| Suite | v7 |
|---|---|
| Fresh policy holdout v6 MRR@10 (176) | 0.9375 |
| Corrected policy holdout v5 MRR@10 (176) | 1.0000 |
| Legacy policy all v2 MRR@10 (176) | 0.9196 |
| Office valid MRR@10 (224) | 1.0000 |
| Office holdout v3 MRR@10 (224) | 1.0000 |
| Office holdout v4 MRR@10 (224) | 0.9659 |
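
For readers unfamiliar with the metric, MRR@10 is the mean over queries of the reciprocal rank of the first relevant result within the top 10, counting 0 when none appears. The sketch below follows that standard definition; `mrr_at_k` is a hypothetical helper, and whether the local evaluation harness handles ties or missing labels the same way is an assumption.

```python
from typing import Sequence


def mrr_at_k(ranked_relevance: Sequence[Sequence[bool]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant item within the top k.

    ranked_relevance holds one sequence per query: relevance flags of the
    reranked candidates, best-scored first.
    """
    if not ranked_relevance:
        return 0.0
    total = 0.0
    for flags in ranked_relevance:
        for rank, is_relevant in enumerate(flags[:k], start=1):
            if is_relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_relevance)
```

With three queries whose first relevant hits are at rank 1, rank 2, and nowhere in the top 10, the result is (1 + 0.5 + 0) / 3 = 0.5.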

Representative throughput:

  • fresh policy holdout v6 (176): 15.65 qps
  • office valid (224): 23.76 qps
  • office holdout v3 (224): 23.18 qps
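
Throughput figures like these are queries per second. The card does not state how they were measured (warm-up, batching, and candidate count per query are unknown), so the following is only a rough sketch of a wall-clock measurement; `measure_qps` and its `rerank` callable are hypothetical.

```python
import time
from typing import Callable, Iterable


def measure_qps(rerank: Callable[[str], object], queries: Iterable[str]) -> float:
    """Run rerank once per query and return queries per wall-clock second."""
    query_list = list(queries)
    start = time.perf_counter()
    for query in query_list:
        rerank(query)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return len(query_list) / elapsed if elapsed > 0 else float("inf")
```

Numbers obtained this way depend heavily on hardware and thread settings, so they are comparable only within the same machine and serving profile.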

Portfolio comparison

Updated 2026-03-20 from local reranker reports only.

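This section is a side-by-side view of the public temsa rerankers. A cell is left out or summarized when the local report set contains no trustworthy non-quantized benchmark for the same public path. In the path cells, lN is the max length and tN the thread count (for example, l176/t10), and each result is given as score / throughput.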

General bilingual rerankers

| Repo | Primary role | Non-quantized path | Quantized path | Extra trustworthy signals |
|---|---|---|---|---|
| temsa/search-reranker-broad-v1 | Broad final-stage reranker | ONNX fp32 l256: proxy 0.9490 / 2.08 qps | Sibling q8 l160: proxy 0.9458 / 8.41 qps | office 0.8056; hard-k10 0.9815; holdout-a03 0.8759 |
| temsa/search-reranker-broad-v1-qint8 | Broad CPU q8 sibling | See temsa/search-reranker-broad-v1 | ONNX q8 l160: proxy 0.9458 / 8.41 qps | office 0.8056; hard-k10 0.9815; holdout-a03 0.8759 |
| temsa/search-reranker-irishgov-l6-fast-v1 | Fast stage-1 / cascade prefilter | PT l192: proxy 0.9192 / 7.99 qps | ONNX q8 l128: proxy 0.9218 / 27.25 qps | office 1.0000; holdout-a03 0.9259 |
| temsa/search-reranker-irishgov-l6-fast-v2 | Fast K=10 policy serving release | Same raw checkpoint family as v1; q8 with corrected temporal + office policy is the recommended path | ONNX q8 l160/t10: corrected holdout-v5 1.0000 / 47.68 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9375 |
| temsa/search-reranker-irishgov-l6-fast-v3 | Fastest public K=10 serving profile | Same raw checkpoint family as v2; the value is the shorter recommended serving length on corrected gates | ONNX q8 l128/t10: corrected holdout-v5 1.0000 / 48.27 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9444 |
| temsa/search-reranker-irishgov-l6-fast-v4 | Current best fast K=10 reranker | Margin-MSE style continuation over v3; q8 per-channel is the recommended deployed artifact | ONNX q8 per-channel l128/t10: corrected holdout-v5 1.0000 / 50.78 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9444 |
| temsa/search-reranker-irishgov-l6-fast-v5 | Fast K=10 serving release with shorter broad-policy route | Same raw checkpoint family as v4; the value is a shorter broad-policy serving profile while keeping the office route unchanged | ONNX q8 per-channel gov_broad_v1 l104/t10: corrected holdout-v5 1.0000 / 55.77 qps | fresh-holdout-v6 0.8917; office-holdout-v3 1.0000; office-valid 1.0000 |
| temsa/search-reranker-irishgov-l5-k10-v1 | Fast K=10 direct reranker | PT l160: proxy 0.8800 / 1.32 qps | ONNX q8 l140: proxy 0.8872 / 28.72 qps | office 1.0000; hard-k10 0.9815; holdout-a03 0.9630 |
| temsa/search-reranker-irishgov-l5-k10-v2 | Current K=10 successor | PT l128: office 1.0000, finephrase 0.9405 | ONNX q8 l140: proxy 0.8853 / 28.27 qps | office 1.0000; holdout-a03 1.0000 |

Policy rerankers

| Repo | Primary role | Non-quantized path | Quantized path | Extra trustworthy signals |
|---|---|---|---|---|
| temsa/search-reranker-broad-policy-v1 | Broad policy-tuned reranker | PT l224: policy-all 0.9270 / 7.94 qps | ONNX q8 l224: policy-all 0.9205 / 27.55 qps | office 0.9537; holdout-a04 0.9049 |
| temsa/search-reranker-broad-policy-v3 | Current broad policy successor | ONNX fp32 l224: policy-all 0.9259 / 15.75 qps | ONNX q8 reduce_range l224: policy-all 0.9268 / 30.12 qps | office 0.9676; holdout-v3 0.9286 |
| temsa/search-reranker-broad-policy-v4 | Current broad policy serving release | Same raw checkpoint family as v3; q8 gov_broad_v1 is the recommended path | ONNX q8 reduce_range + gov_broad_v1 l224: policy-all 0.9257 / 31.64 qps | office 0.9676; holdout-v3 0.9271; holdout-v4 0.9583 |
| temsa/search-reranker-broad-policy-v5 | Current broad policy serving release | Same raw checkpoint family as v4; q8 gov_broad_v1 is the recommended path | ONNX q8 reduce_range + gov_broad_v1 l224/t10: policy-all 0.9711 / 26.12 qps | office 1.0000; holdout-v3 1.0000; holdout-v4 1.0000 |
| temsa/search-reranker-broad-policy-v6 | Broad policy serving release with corrected temporal gate | Same raw checkpoint family as v5; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l176/t10: corrected holdout-v5 0.8657 / 25.74 qps | office-l160 1.0000; legacy-holdout-v4 0.9444; legacy-policy-all 0.9167 |
| temsa/search-reranker-broad-policy-v7 | Broad policy serving release with stronger latest-turn and topic-specific news routing | Same raw checkpoint family as v6; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l176/t10: fresh holdout-v6 0.9375 / 15.65 qps | office-l224 1.0000; office-holdout-v4 0.9659; corrected-holdout-v5 1.0000; legacy-policy-all 0.9196 |
| temsa/search-reranker-broad-policy-v8 | Broad policy serving-profile release with shorter broad and office routes | Same raw checkpoint family as v7; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l160/t10: fresh holdout-v7 0.9500 / 29.85 qps | office-l208 1.0000; office-holdout-v3 1.0000; office-holdout-v4 0.9659; fresh-holdout-v6 0.9500 |
| temsa/search-reranker-broad-policy-v9 | Broad policy serving release with stronger year-specific change-history routing | Same raw checkpoint family as v8; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l160/t10: gov.ie category-policy corrected 1.0000 / 26.01 qps | legacy-policy-all 0.9282; fresh-holdout-v6 0.9500; office-valid 1.0000; office-holdout-v4 0.9659 |

Dynamic q8 CPU at K=20

These rows use the current local K=20 benchmark harness:

  • proxy_k20: policy=none
  • broad_policy_k20: policy=gov_broad_v1
  • office_holder_k20: policy=office_holder_v1
  • fixed threads=10, batch_size=32

| Repo | Broad len | Office len | Proxy K=20 | Broad-policy K=20 | Office-holder K=20 | Notes |
|---|---|---|---|---|---|---|
| temsa/search-reranker-broad-v1 | 160 | 160 | 0.9458 / 6.93 qps | 0.9437 / 9.10 qps | 0.6073 / 8.23 qps | broad baseline |
| temsa/search-reranker-broad-v1-qint8 | 160 | 160 | 0.9458 / 6.65 qps | 0.9437 / 6.62 qps | 0.6073 / 5.91 qps | q8 sibling repo |
| temsa/search-reranker-broad-policy-v1 | 224 | 224 | 0.9414 / 7.20 qps | 0.9125 / 11.32 qps | 0.6966 / 6.49 qps | first policy model |
| temsa/search-reranker-broad-policy-v2 | 224 | 224 | 0.9414 / 6.44 qps | 0.9125 / 9.40 qps | 0.6966 / 7.75 qps | office-policy serving fix era |
| temsa/search-reranker-broad-policy-v3 | 224 | 224 | 0.9364 / 7.38 qps | 0.9250 / 12.20 qps | 0.7451 / 7.15 qps | custom CPU continuation |
| temsa/search-reranker-broad-policy-v4 | 224 | 224 | 0.9364 / 7.16 qps | 0.9250 / 11.40 qps | 0.7451 / 6.90 qps | gov_broad_v1 serving route |
| temsa/search-reranker-broad-policy-v5 | 224 | 224 | 0.9364 / 7.00 qps | 0.9250 / 12.35 qps | 0.7451 / 6.47 qps | stronger serving release |
| temsa/search-reranker-broad-policy-v6 | 176 | 160 | 0.9433 / 8.09 qps | 0.9250 / 9.97 qps | 0.7186 / 9.77 qps | corrected temporal gate |
| temsa/search-reranker-broad-policy-v7 | 176 | 224 | 0.9433 / 7.91 qps | 0.9250 / 12.11 qps | 0.7451 / 7.49 qps | stronger latest-turn routing |
| temsa/search-reranker-broad-policy-v8 | 160 | 208 | 0.9371 / 9.49 qps | 0.9250 / 11.21 qps | 0.7527 / 7.01 qps | current broad fallback |
| temsa/search-reranker-broad-policy-v9 | 160 | 208 | 0.9371 / 8.97 qps | 0.9250 / 11.56 qps | 0.7527 / 6.62 qps | current broad fallback + change-history fix |
| temsa/search-reranker-irishgov-l5-k10-v1 | 140 | 140 | 0.8872 / 15.56 qps | 0.8938 / 21.68 qps | 0.6751 / 21.01 qps | fast K=10 direct |
| temsa/search-reranker-irishgov-l5-k10-v2 | 140 | 140 | 0.8853 / 18.06 qps | 0.8854 / 18.87 qps | 0.6327 / 17.93 qps | fast K=10 successor |
| temsa/search-reranker-irishgov-l6-fast-v1 | 128 | 128 | 0.9286 / 18.70 qps | 0.9042 / 17.65 qps | 0.6934 / 20.36 qps | fast stage-1 v1 |
| temsa/search-reranker-irishgov-l6-fast-v2 | 128 | 128 | 0.9286 / 18.59 qps | 0.9042 / 19.67 qps | 0.6934 / 19.30 qps | fast stage-1 v2 |
| temsa/search-reranker-irishgov-l6-fast-v3 | 128 | 128 | 0.9286 / 18.48 qps | 0.9042 / 18.60 qps | 0.6934 / 19.30 qps | shorter serving profile |
| temsa/search-reranker-irishgov-l6-fast-v4 | 128 | 128 | 0.9184 / 17.66 qps | 0.9008 / 21.83 qps | 0.7197 / 17.31 qps | margin-MSE per-channel |
| temsa/search-reranker-irishgov-l6-fast-v5 | 104 | 128 | 0.8987 / 20.85 qps | 0.9133 / 20.29 qps | 0.7197 / 15.45 qps | current fast route |

Intentional gaps:

  • search-reranker-broad-v1: the local reports include strong fp32 ONNX proxy and office data, but not a matching fp32 in-domain finephrase / hard-K10 / holdout-A03 set, so those are not claimed here.
  • search-reranker-irishgov-l5-k10-v2: the local non-quantized reports are trustworthy for office / finephrase / hard-K10, but not for the same proxy runtime shape as the shipped q8 path.
  • search-reranker-irishgov-l6-fast-v1: the local non-quantized reports cover proxy and office, but not the fresh holdout-A03 slice used for the q8 K=10 comparison.