# MIRAS Language Model
A character-level language model trained on Shakespeare using the MIRAS (Memory-Integrated Recurrent Attention System) architecture.
## Model Details
- Embedding dimension: 384
- Layers: 4
- Block size: 128
- Memory type: deep
- Attentional bias: l2
- Retention: l2
- Vocabulary size: 65
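
These values are also recorded in the bundled `config.json`. A minimal sketch for verifying them locally, assuming the files have been downloaded to `./miras` as in the Quick Start below:

```python
import json

# Print the saved configuration to confirm the hyperparameters above
# (assumes config.json was downloaded to ./miras, as in the Quick Start)
with open("./miras/config.json") as f:
    config = json.load(f)
print(config)
```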
## Installation

```bash
pip install torch huggingface_hub
```
## Usage

### Quick Start
```python
from huggingface_hub import hf_hub_download
import torch

# Download files
for f in ["modeling_miras.py", "model.pt", "config.json"]:
    hf_hub_download(repo_id="av-codes/miras-shakespeare", filename=f, local_dir="./miras")

# Import and load
import sys
sys.path.insert(0, "./miras")
from modeling_miras import load_miras_model

model, encode, decode, config = load_miras_model("./miras")
model.eval()

# Generate text
context = torch.zeros((1, 1), dtype=torch.long)
output = model.generate(context, max_new_tokens=200, temperature=0.8)
print(decode(output[0].tolist()))
```
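
To seed generation with an actual prompt instead of the zero token, encode a string with the `encode` helper returned by `load_miras_model`. A sketch, assuming `encode` maps a string to a list of token IDs (the inverse of the `decode` usage above):

```python
# Encode a prompt and continue generation from it
prompt = "ROMEO:"
context = torch.tensor([encode(prompt)], dtype=torch.long)
output = model.generate(context, max_new_tokens=200, temperature=0.8)
print(decode(output[0].tolist()))
```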
### Using the Helper Function
```python
import torch
from modeling_miras import load_miras_model  # assumes modeling_miras.py is on your path (see Quick Start)

# Load directly from the Hub
model, encode, decode, config = load_miras_model("av-codes/miras-shakespeare")
model.eval()

# Generate
context = torch.zeros((1, 1), dtype=torch.long)
generated = model.generate(context, max_new_tokens=100)
print(decode(generated[0].tolist()))
```
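
Generation also works on GPU. A minimal sketch, assuming the model is an ordinary `torch.nn.Module` that can be moved between devices:

```python
# Move the model and the context tensor to GPU when available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
context = torch.zeros((1, 1), dtype=torch.long, device=device)
generated = model.generate(context, max_new_tokens=100)
print(decode(generated[0].tolist()))
```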
## Files

- `model.pt` - Model weights and architecture config
- `config.json` - Full configuration including vocabulary
- `modeling_miras.py` - Complete model architecture code
## Training
Trained for 5000 iterations on the TinyShakespeare dataset.
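
For reference, character-level models of this kind typically build their vocabulary (here, 65 tokens) directly from the corpus. A sketch of the standard setup; the authoritative mapping for this checkpoint ships in `config.json`:

```python
# Standard character-level vocabulary construction for TinyShakespeare
text = open("input.txt").read()             # the raw corpus
chars = sorted(set(text))                   # 65 unique characters
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}
encode = lambda s: [stoi[c] for c in s]             # string -> list of IDs
decode = lambda ids: "".join(itos[i] for i in ids)  # IDs -> string
```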
## Architecture

MIRAS uses a novel memory-based attention mechanism with three configurable components (a sketch of inspecting them at load time follows this list):

- Memory type: `linear` (matrix memory) or `deep` (MLP memory)
- Attentional bias: `l2`, `lp`, or `huber` loss functions
- Retention: `l2`, `kl`, or `elastic` weight update rules
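
As a sketch of how these choices might surface in the loaded configuration, assuming `config` is a plain dict; the key names below are assumptions, not a confirmed schema, so check `config.json` for the real field names:

```python
from modeling_miras import load_miras_model

model, encode, decode, config = load_miras_model("av-codes/miras-shakespeare")

# Hypothetical key names -- verify against config.json
print(config.get("memory_type"))       # expected: "deep"
print(config.get("attentional_bias"))  # expected: "l2"
print(config.get("retention"))         # expected: "l2"
```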