Latin Intertextuality Classifier

This model is a fine-tuned version of bowphs/PhilBerta for sequence classification of intertextual links between Jerome (Hieronymus) and other classical authors. It is intended to integrate with the LociSimiles Python package for Latin intertextuality workflows: https://julianschelb.github.io/locisimiles/api/.

Model Description

  • Task: Binary classification for detecting intertextual links between classical Latin authors
  • Model type: Sequence classification
  • Base model: bowphs/PhilBerta
  • Parameters: ~0.1B (F32 safetensors)
  • Max input tokens: 512
  • Language: Latin
  • License: Apache 2.0

Usage

For sequence-pair classification, the tokenizer assembles the final input sequence using encoder-style special tokens:

<s> Jerome_phrase </s></s> Candidate_phrase </s>
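The pattern above can be sketched as simple string assembly (a minimal illustration only; `build_pair` is a hypothetical helper, and in practice the tokenizer inserts these tokens for you, as the complete example below shows):

```python
def build_pair(jerome_phrase: str, candidate_phrase: str) -> str:
    """Assemble a sequence pair with encoder-style special tokens:
    <s> A </s></s> B </s>."""
    return f"<s> {jerome_phrase} </s></s> {candidate_phrase} </s>"

pair = build_pair("omnia fert aetas", "saepe ego longos")
print(pair)
```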

Here is a complete example:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("julian-schelb/philberta-class-lat-intertext-v2")
model = AutoModelForSequenceClassification.from_pretrained("julian-schelb/philberta-class-lat-intertext-v2")

# Define your sentence pair
sentence1 = "omnia fert aetas, animum quoque; saepe ego longos cantando puerum memini me condere soles."
sentence2 = "saepe ego longos cantando puerum memini me condere soles."

# Tokenize the sentence pair for the model
inputs = tokenizer(
    sentence1,  # Hieronymus
    sentence2,  # Candidate classical author
    add_special_tokens=True,
    truncation=True,
    max_length=512,
    padding="max_length",
    return_tensors="pt",
)

# Run inference in evaluation mode, without gradient tracking
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    # With binary labels 0 = "no citation" and 1 = "citation",
    # probs[0][1] is the probability of an intertextual link.
    print("Prediction probabilities:", probs)
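To turn the logits into a decision, take the softmax over the two classes and pick the higher-probability one. A minimal sketch with dummy logits, assuming the label mapping stated above (0 = "no citation", 1 = "citation"):

```python
import torch

# Dummy logits for one sentence pair (illustration only;
# real logits come from outputs.logits above).
logits = torch.tensor([[-1.2, 2.3]])
probs = torch.nn.functional.softmax(logits, dim=-1)
label_id = int(probs.argmax(dim=-1))  # predicted class index
citation_prob = float(probs[0, 1])    # probability of a link
print(label_id, citation_prob)
```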

Citation

@misc{schelb2026locisimilesbenchmarkextracting,
      title={Loci Similes: A Benchmark for Extracting Intertextualities in Latin Literature},
      author={Julian Schelb and Michael Wittweiler and Marie Revellio and Barbara Feichtinger and Andreas Spitz},
      year={2026},
      eprint={2601.07533},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2601.07533},
}