How to use ykae/monarch-bert-base-mnli with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-classification", model="ykae/monarch-bert-base-mnli", trust_remote_code=True)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("ykae/monarch-bert-base-mnli", trust_remote_code=True)
    model = AutoModelForSequenceClassification.from_pretrained("ykae/monarch-bert-base-mnli", trust_remote_code=True)
Update README.md
README.md
CHANGED
@@ -70,7 +70,7 @@ Measured on a single NVIDIA H100 using `torch.compile(mode="max-autotune")`.
 | **Parameters** | 85.65M | **28.98M** | **-66.2%** |
 | **Compute (GFLOPs)** | 696.5 | **232.6** | **-66.6%** |
 | **Throughput (TPS)** | 7261 | **9029** | **+24.3%** |
-| **Latency (Batch 32)** | 4.41 ms | **3.54 ms** | ⚡ **24
+| **Latency (Batch 32)** | 4.41 ms | **3.54 ms** | ⚡ **+24.6% Faster** |
 | **Accuracy (MNLI)** | 83.62% | **78.34%** | **-5.28%** |
 
 ## Usage
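
The last column of the table in the diff above can be checked from the two value columns. The sketch below is illustrative only: the helper names `relative_change` and `speedup` are hypothetical and not part of the repository. It assumes the parameter, compute, and throughput deltas are plain relative changes, that the accuracy delta is a difference in percentage points, and that "+24.6% Faster" is the latency ratio minus one rather than the relative latency reduction.

```python
def relative_change(baseline: float, new: float) -> float:
    """Percent change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0


def speedup(baseline_latency: float, new_latency: float) -> float:
    """Percent speedup, computed as the latency ratio minus one."""
    return (baseline_latency / new_latency - 1.0) * 100.0


# Values copied from the table rows in the diff.
params = relative_change(85.65, 28.98)        # parameters, in millions
compute = relative_change(696.5, 232.6)       # GFLOPs
throughput = relative_change(7261, 9029)      # TPS
latency_speedup = speedup(4.41, 3.54)         # ms at batch 32
latency_reduction = relative_change(4.41, 3.54)
accuracy_points = 78.34 - 83.62               # percentage points, not relative

print(f"Parameters:        {params:+.1f}%")
print(f"Compute:           {compute:+.1f}%")
print(f"Throughput:        {throughput:+.1f}%")
print(f"Latency speedup:   {latency_speedup:+.1f}%")
print(f"Latency reduction: {latency_reduction:+.1f}%")
print(f"Accuracy:          {accuracy_points:+.2f} points")
```

Under these assumptions the latency figure depends on the convention: the ratio-based speedup reproduces the 24.6% in the table, while the same two measurements expressed as a relative reduction in latency come out smaller.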