Uploaded using `kernel-builder`.
README.md
CHANGED
````diff
@@ -1,76 +1,43 @@
 ---
+library_name: kernels
 license: apache-2.0
-tags:
-- kernels
 ---
 
-
+This is the repository card of kernels-community/megablocks that has been pushed on the Hub. It was built to be used with the [`kernels` library](https://github.com/huggingface/kernels). This card was automatically generated.
 
-
-uv run https://huggingface.co/kernels-community/megablocks/raw/main/readme_example.py
-```
+## How to use
 
 ```python
-#
-# requires-python = "==3.10"
-# dependencies = [
-# "numpy",
-# "kernels",
-# "torch"
-# ]
-# ///
-
-import torch
-from collections import namedtuple
-
+# make sure `kernels` is installed: `pip install -U kernels`
 from kernels import get_kernel
 
-
-
-torch.cuda.manual_seed(42)
-
-# Download optimized kernels from the Hugging Face hub
-megablocks = get_kernel("kernels-community/megablocks")
-print("MegaBlocks kernel downloaded successfully.")
-
-model = megablocks.layers.MegaBlocksMoeMLP()
-model.experts = namedtuple("Experts", ["gate_up_proj", "gate_down_proj", "down_proj", "hidden_size"])
-print("MegaBlocksMoeMLP instance created successfully.")
-
-# Config
-ne, hs, isz = 128, 1152, 3072
+kernel_module = get_kernel("kernels-community/megablocks")
+MyReplacementLayer = kernel_module.MyReplacementLayer
 
-
-model.router = torch.nn.Linear(hs, ne, device="cuda")
-torch.nn.init.kaiming_uniform_(model.router.weight)
-
-# Expert layers with realistic weights
-e = model.experts
-e.gate_up_proj = torch.nn.Parameter(torch.randn(ne, hs, isz, device="cuda") * 0.02)
-e.gate_up_proj_bias = torch.nn.Parameter(torch.zeros(ne, isz, device="cuda"))
-e.down_proj = torch.nn.Parameter(torch.randn(ne, 1536, hs, device="cuda") * 0.02)
-e.down_proj_bias = torch.nn.Parameter(torch.zeros(ne, hs, device="cuda"))
-e.hidden_size = hs
-print("Expert layers initialized successfully.")
-
-# Test with normalized input
-x = torch.randn(1, 1, hs, device="cuda") * 0.1
-output, expert_weights = model(x)
-print("Model forward pass completed successfully.")
-
-print(f"Output shape: {output.shape}")
-print(f"Output range: [{output.min():.3f}, {output.max():.3f}]")
-print(f"Output: {output.flatten()[:10]}")
-print(f"Expert weights sum: {expert_weights.sum():.3f}")
+MyReplacementLayer(...)
 ```
 
-##
-
-
-
-
-
-
-
-
-
+## Available functions
+- `MyReplacementLayer`
+- `exclusive_cumsum`
+- `inclusive_cumsum`
+- `histogram`
+- `indices`
+- `replicate_forward`
+- `replicate_backward`
+- `sort`
+- `cumsum`
+- `argsort`
+- `Arguments`
+- `ParallelDroplessMLP`
+- `dMoE`
+- `SparseGLU`
+- `MLP`
+- `SparseMLP`
+- `MoE`
+- `ParallelMLP`
+- `get_load_balancing_loss`
+
+## Benchmarks
+
+Benchmarking script is available for this kernel. Run `kernels benchmark kernels-community/megablocks`.
````