# MIRAS Language Model
A character-level language model trained on Shakespeare using the MIRAS (Memory-Integrated Recurrent Attention System) architecture.
## Model Details
- Embedding dimension: 384
- Layers: 4
- Block size: 128
- Memory type: deep
- Attentional bias: l2
- Retention: l2
- Vocabulary size: 65
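
These values are also recorded in the bundled `config.json`. A minimal sketch for verifying them locally, assuming the files have been downloaded to `./miras` as in the Quick Start below:

```python
import json

# Print the saved configuration to confirm the hyperparameters above
# (assumes config.json was downloaded to ./miras, as in the Quick Start)
with open("./miras/config.json") as f:
    config = json.load(f)
print(config)
```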
## Installation

```bash
pip install torch huggingface_hub
```
## Usage

### Quick Start
```python
from huggingface_hub import hf_hub_download
import torch

# Download files
for f in ["modeling_miras.py", "model.pt", "config.json"]:
    hf_hub_download(repo_id="av-codes/miras-shakespeare", filename=f, local_dir="./miras")

# Import and load
import sys
sys.path.insert(0, "./miras")
from modeling_miras import load_miras_model

model, encode, decode, config = load_miras_model("./miras")
model.eval()

# Generate text
context = torch.zeros((1, 1), dtype=torch.long)
output = model.generate(context, max_new_tokens=200, temperature=0.8)
print(decode(output[0].tolist()))
```
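
To seed generation with an actual prompt instead of the zero token, encode a string with the `encode` helper returned by `load_miras_model`. A sketch, assuming `encode` maps a string to a list of token IDs (the inverse of the `decode` usage above):

```python
# Encode a prompt and continue generation from it
prompt = "ROMEO:"
context = torch.tensor([encode(prompt)], dtype=torch.long)
output = model.generate(context, max_new_tokens=200, temperature=0.8)
print(decode(output[0].tolist()))
```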
### Using the Helper Function
```python
import torch
from modeling_miras import load_miras_model  # assumes modeling_miras.py is on your path (see Quick Start)

# Load directly from the Hub
model, encode, decode, config = load_miras_model("av-codes/miras-shakespeare")
model.eval()

# Generate
context = torch.zeros((1, 1), dtype=torch.long)
generated = model.generate(context, max_new_tokens=100)
print(decode(generated[0].tolist()))
```
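
Generation also works on GPU. A minimal sketch, assuming the model is an ordinary `torch.nn.Module` that can be moved between devices:

```python
# Move the model and the context tensor to GPU when available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
context = torch.zeros((1, 1), dtype=torch.long, device=device)
generated = model.generate(context, max_new_tokens=100)
print(decode(generated[0].tolist()))
```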
## Files

- `model.pt` - Model weights and architecture config
- `config.json` - Full configuration including vocabulary
- `modeling_miras.py` - Complete model architecture code
## Training
Trained for 5000 iterations on the TinyShakespeare dataset.
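
For reference, character-level models of this kind typically build their vocabulary (here, 65 tokens) directly from the corpus. A sketch of the standard setup; the authoritative mapping for this checkpoint ships in `config.json`:

```python
# Standard character-level vocabulary construction for TinyShakespeare
text = open("input.txt").read()             # the raw corpus
chars = sorted(set(text))                   # 65 unique characters
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}
encode = lambda s: [stoi[c] for c in s]             # string -> list of IDs
decode = lambda ids: "".join(itos[i] for i in ids)  # IDs -> string
```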
## Architecture

MIRAS uses a novel memory-based attention mechanism with three configurable components (a sketch of inspecting them at load time follows this list):

- Memory type: `linear` (matrix memory) or `deep` (MLP memory)
- Attentional bias: `l2`, `lp`, or `huber` loss functions
- Retention: `l2`, `kl`, or `elastic` weight update rules
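
As a sketch of how these choices might surface in the loaded configuration, assuming `config` is a plain dict; the key names below are assumptions, not a confirmed schema, so check `config.json` for the real field names:

```python
from modeling_miras import load_miras_model

model, encode, decode, config = load_miras_model("av-codes/miras-shakespeare")

# Hypothetical key names -- verify against config.json
print(config.get("memory_type"))       # expected: "deep"
print(config.get("attentional_bias"))  # expected: "l2"
print(config.get("retention"))         # expected: "l2"
```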