---
model-index:
- name: augustulus-latin-sentiment-lora
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: Ancient Latin Sentiment (Custom)
      type: custom
    metrics:
    - type: accuracy
      value: 75
      name: Accuracy (with linguistic post-processing)
    - type: accuracy
      value: 37.5
      name: Raw Model Accuracy
license: llama3.1
language:
- la
base_model:
- meta-llama/Llama-3.1-8B-Instruct
tags:
- gguf
- quantized
- llama-cpp
---

# Augustulus Latin Sentiment Analysis LoRA

**Developed by Team Trojan Parse**

*University of Florida Senior Design Project*

A LoRA (Low-Rank Adaptation) adapter fine-tuned on **Llama-3.1-8B-Instruct** for fine-grained sentiment classification of Ancient Latin texts across seven emotional intensity levels.

## Project Information

- **Team Name:** Trojan Parse
- **Team Members:**
  - Alex John
  - Ryan Willson
  - Byron Boatright
  - Jake Marotta
  - Duncan Fuller
- **Project Repository:** [GitHub: Trojan-Parse-Project](https://github.com/alxxjohn/Trojan-Parse-Project)
- **Advisor:** Eleni Bozia, Ph.D., Dr. phil.
  (Associate Professor of Classics and Digital Humanities)
- **Advisor Department:** Department of Classics, University of Florida

## Model Description

- **Model Name:** `augustulus-latin-sentiment-lora`
- **Model type:** LoRA Adapter for Ancient Language Sentiment Classification
- **Language:** Classical/Ancient Latin
- **Base model:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- **License:** Llama 3.1 Community License
- **Purpose:** Academic research and historical text analysis

## Sentiment Categories

Our model classifies Ancient Latin texts into seven emotional intensity levels:

### Positive Sentiments

- **EXTREMELY POSITIVE (+3)**: *exsultatio, jubilum, beatitudo, summa felicitas*
  - Examples: Triumphal declarations, ultimate joy, divine blessing
- **VERY POSITIVE (+2)**: *gaudium, laetitia, amor, gloria, victoria, laudare*
  - Examples: Military victories, celebrations, expressions of love/honor
- **MODERATELY POSITIVE (+1)**: *felix, laetus, bonus, pulcher, spes*
  - Examples: General contentment, hope, pleasant situations

### Neutral (0)

- Factual statements, descriptions without emotional valence

### Negative Sentiments

- **MODERATELY NEGATIVE (-1)**: *malus, tristis, anxius, timor*
  - Examples: Minor concerns, sadness, mild fear
- **VERY NEGATIVE (-2)**: *dolor magnus, timor vehemens, ira, furor*
  - Examples: Great pain, intense anger, serious threats
- **EXTREMELY NEGATIVE (-3)**: *desperatio, exitium, cruciatus, malum*
  - Examples: Utter despair, destruction, torture, ultimate evil

## Performance

| Configuration | Accuracy | Notes |
|:---|:---:|:---|
| Base Llama 3.1 (zero-shot) | 43.8% | Unreliable, biased toward extremes |
| LoRA Adapter (raw predictions) | 37.5% | Systematic but conservative |
| **LoRA + Linguistic Rules** | **75.0%** | Production-ready |

### Category-Level Performance

- **Neutral Detection:** 100% accuracy (3/3 test cases)
- **Moderate Categories:** 100% accuracy (learned systematic patterns)
- **Extreme Categories:** 83.3% accuracy (with intensity calibration)

## Training Approach

Our training methodology combined multiple data sources and validation strategies:

### Data Pipeline (5-day development cycle)

**Phase 1: Initial Generation**

- Few-shot generation using base Llama 3.1
- Context-aware synthetic examples
- Balanced across all six sentiment categories

**Phase 2: Consensus Filtering**

- Trained multiple LoRA variants on hand-annotated data
- Consensus filtering: kept examples where ≥2 models agreed
- Reduced noise and improved training data quality

**Phase 3: Corpus Mining**

- Mined authentic Ancient Latin texts from Perseus Digital Library
- Extracted high-confidence positive examples (previously underrepresented)
- Combined ~40,000 corpus examples with synthetic data

**Phase 4: Final Training & Iteration**

- Balanced dataset: 9,000 examples (1,500 per category)
- Distributed training with data-parallel strategy
- Multiple training runs to optimize hyperparameters

### Final Training Configuration

- **Training Examples:** 9,000 (balanced across 7 categories)
- **Training Epochs:** 15
- **Architecture:** LoRA adapter (rank: 128, alpha: 256)
- **Optimization:** 8-bit quantization for efficiency
- **Hardware:** High-performance GPU cluster
- **Framework:** PyTorch, HuggingFace Transformers, PEFT

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model and its tokenizer
base_model = "meta-llama/Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load Team Trojan Parse's adapter on top of the base model
# Replace YOUR_USERNAME with the Hugging Face account that hosts the adapter
model = PeftModel.from_pretrained(model, "YOUR_USERNAME/augustulus-latin-sentiment-lora")
model.eval()

# Classify sentiment
def classify_latin_sentiment(text):
    """Return the sentiment label the model generates for a Latin passage."""
    prompt = f'''Classify the sentiment of this Latin text as: VERY NEGATIVE, MODERATELY NEGATIVE, NEUTRAL, MODERATELY POSITIVE, VERY POSITIVE, or EXTREMELY POSITIVE.

Latin text: {text}

Sentiment:'''
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding; a temperature setting has no effect when do_sample=False
    outputs = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=False,
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # The decoded text includes the prompt, so keep only what follows the last "Sentiment:"
    return response.split("Sentiment:")[-1].strip()

# Example: extremely positive (triumph)
text = "Victoria splendidissima! Dux gloriam aeternam meruit!"
print(classify_latin_sentiment(text))
# Output: EXTREMELY POSITIVE

# Example: very negative (suffering in war)
text = "Bellum crudele et longum populum afflixerat."
print(classify_latin_sentiment(text))
# Output: VERY NEGATIVE
```

---

## GGUF Model Download and Local Usage (Merged Fine-Tune)

The LoRA adapter has been merged with the base model and quantized to **Q8_0** (8-bit) precision for efficient deployment on CPU/GPU via tools like `llama.cpp` and `Ollama`.

### 💾 File Details

- **File Name:** `augustulus-latin-sentiment-8b-q8_0.gguf`
- **Size:** 8.0 GB
- **Quantization:** Q8_0 (Recommended for best balance of speed and accuracy)

---

### Llama 3.1 License Notice

**IMPORTANT**: This model (including the GGUF file) is a derivative of Meta’s **Llama 3.1** model and is governed by the [**Meta Llama 3.1 Community License**](https://llama.meta.com/llama3_1/license).

- **Attribution**: If you redistribute or build products with this model, you must include the statement **“Built with Llama”** in a prominent location (e.g., README, UI footer, about page).
- **Commercial Use**: Allowed without additional permission as long as your product or service has **fewer than 700 million monthly active users**. Above that threshold, you need a separate commercial license from Meta.


See the full license text here: https://llama.meta.com/llama3_1/license

---

### Usage Example (with Ollama)

This workflow uses a custom **Modelfile** to set the strict sentiment task and gives the model a simple local name.

#### Create Modelfile

Save the following content as a file named `Modelfile`:

```text
# Modelfile for the Augustulus Latin Sentiment Model
FROM hf.co/TronCodes/augustulus-latin-sentiment-lora/augustulus-latin-sentiment-8b-q8_0.gguf

SYSTEM """
You are Augustulus, an expert in Classical Latin sentiment analysis.
Your task is to respond ONLY with one of the following exact labels:
EXTREMELY POSITIVE, VERY POSITIVE, MODERATELY POSITIVE, NEUTRAL, MODERATELY NEGATIVE, VERY NEGATIVE, or EXTREMELY NEGATIVE.
Do not provide any conversational text or explanation.
"""

TEMPLATE """
{{ if .System }}<|start_header_id|>system<|end_header_id|>{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
"""

PARAMETER temperature 0.1
PARAMETER num_predict 20
PARAMETER stop "<|eot_id|>"
```

---

#### Create and Run

```bash
ollama create augustulus-latin -f Modelfile
ollama run augustulus-latin
```

## Acknowledgments

We gratefully acknowledge:

* Dr. Eleni Bozia (Ph.D., Dr. phil.) - Senior Project Advisor
* University of Florida Department of Humanities - Computing resources and support
* Perseus Digital Library - Access to Classical Latin corpus
* Meta AI - Llama 3.1 base model
* HuggingFace - PEFT library and model hosting infrastructure

## Citation

```bibtex
@misc{trojan_parse_latin_sentiment_2025,
  author = {{Team Trojan Parse}},
  title = {Augustulus Latin Sentiment Analysis LoRA},
  year = {2025},
  publisher = {University of Florida},
  journal = {HuggingFace Model Hub},
  howpublished = {\url{https://huggingface.co/TronCodes/augustulus-latin-sentiment-lora}}
}
```
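
## Parsing Model Output (Illustrative Sketch)

Both usage paths (the Transformers snippet and the Ollama Modelfile) return the label as free text, so downstream code usually has to normalize it. The sketch below is one possible way to do that and is not part of the project's code: the names `LABEL_SCORES` and `parse_sentiment` are hypothetical, and the only thing taken from this model card is the label-to-score mapping in the Sentiment Categories section. It does not implement the linguistic post-processing rules mentioned under Performance.

```python
import re

# Label-to-score mapping copied from the "Sentiment Categories" section.
# The dictionary name and the helper below are illustrative assumptions.
LABEL_SCORES = {
    "EXTREMELY POSITIVE": 3,
    "VERY POSITIVE": 2,
    "MODERATELY POSITIVE": 1,
    "NEUTRAL": 0,
    "MODERATELY NEGATIVE": -1,
    "VERY NEGATIVE": -2,
    "EXTREMELY NEGATIVE": -3,
}

# One alternation over all known labels, matched on word boundaries.
_LABEL_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(label) for label in LABEL_SCORES) + r")\b"
)


def parse_sentiment(response):
    """Return (label, score) for the first known label in `response`, or None."""
    # Upper-case and collapse whitespace so "very   negative\n" still matches.
    normalized = " ".join(response.upper().split())
    match = _LABEL_PATTERN.search(normalized)
    if match is None:
        return None
    label = match.group(1)
    return label, LABEL_SCORES[label]
```

Returning `None` for unrecognized text, rather than defaulting to `NEUTRAL`, keeps malformed generations visible so they can be counted or retried instead of silently skewing results.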