# Codette Orchestrator GGUF - Llama 3.1 8B
Quantized GGUF model for the Codette Multi-Perspective Reasoning System.
This is a Llama 3.1 8B Instruct model with the orchestrator LoRA merged in and quantized to Q4_K_M format for efficient local inference via llama.cpp.
## Model Details
| Property | Value |
|---|---|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Merged Adapter | Orchestrator (query routing + debate coordination) |
| Quantization | Q4_K_M (4-bit, ~4.6 GB) |
| Context Length | 4096 tokens |
| Format | GGUF (llama.cpp compatible) |
## What is Codette?
Codette is a multi-perspective AI reasoning system that approaches problems through 9 specialized cognitive lenses:
| Adapter | Perspective |
|---|---|
| Newton | Analytical physics and systematic reasoning |
| DaVinci | Creative invention and cross-domain thinking |
| Empathy | Emotional intelligence and human understanding |
| Philosophy | Conceptual analysis and ethical reasoning |
| Quantum | Probabilistic thinking and uncertainty |
| Consciousness | Recursive cognition (RC+xi framework) |
| Multi-Perspective | Cross-lens synthesis |
| Systems Architecture | Modularity, scalability, engineering |
| Orchestrator | Query routing, debate coordination, coherence monitoring |
## Architecture (Phase 6+)
- Semantic Tension Engine: Measures epistemic tension (xi) between perspectives
- Coherence Field (Gamma): Real-time monitoring for reasoning collapse
- Quantum Spiderweb: Belief propagation across adapter network
- AEGIS Governance: 6-framework ethical validation
- Executive Controller: Routes queries by complexity (SIMPLE/MEDIUM/COMPLEX)
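The routing step can be illustrated with a toy heuristic. Everything below is hypothetical: the keyword cues and length thresholds are made up for illustration and are not Codette's actual classifier.

```python
# Toy sketch of complexity-based query routing. The cue list and the
# 20-word threshold are invented for this example; the real Executive
# Controller's classification logic is not published in this card.

def classify_complexity(query: str) -> str:
    """Bucket a query as SIMPLE, MEDIUM, or COMPLEX from rough surface cues."""
    words = query.split()
    multi_lens_cues = {"perspectives", "ethical", "consciousness", "tradeoffs"}
    if any(w.lower().strip("?.,") in multi_lens_cues for w in words):
        return "COMPLEX"   # likely needs several adapters plus debate
    if len(words) > 20:
        return "MEDIUM"    # single adapter with coherence monitoring
    return "SIMPLE"        # direct answer from the orchestrator alone

print(classify_complexity("What is 2 + 2?"))                                    # SIMPLE
print(classify_complexity("Explain consciousness from multiple perspectives"))  # COMPLEX
```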
## Usage

### With llama.cpp

```bash
./llama-server -m codette-orchestrator-Q4_K_M.gguf -c 4096 -ngl 35
```
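Recent llama.cpp server builds expose an OpenAI-compatible chat endpoint. A minimal client sketch is below; the port and `/v1/chat/completions` path follow `llama-server` defaults, and the actual request is left commented out since it requires a running server:

```python
import json
from urllib import request

# Payload in the OpenAI chat-completions shape accepted by llama-server's
# /v1/chat/completions endpoint (default port 8080).
payload = {
    "messages": [{"role": "user", "content": "Explain entropy simply."}],
    "max_tokens": 256,
    "temperature": 0.7,
}

req = request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment with a running llama-server:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```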
### With Codette Web UI

```bash
git clone https://github.com/Raiff1982/codette
cd codette
codette_web.bat
```
The GGUF model serves as the base, with 9 LoRA adapters hot-swapped at inference time for perspective-specific reasoning.
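One way to sketch the hot-swap step: map each perspective to its adapter file and pass the chosen path when constructing the model. The adapter file names below are hypothetical placeholders (the real files ship in Raiff1982/codette-lora-adapters), and the `lora_path` usage is shown commented out since it needs the model weights on disk:

```python
# Hypothetical perspective-to-adapter mapping; file names are placeholders,
# not the actual names in the codette-lora-adapters repo.
ADAPTERS = {
    "newton": "adapters/newton.gguf",
    "davinci": "adapters/davinci.gguf",
    "empathy": "adapters/empathy.gguf",
}

def adapter_for(perspective: str) -> str:
    """Resolve a perspective name to its adapter path, defaulting to Newton."""
    return ADAPTERS.get(perspective.lower(), ADAPTERS["newton"])

print(adapter_for("Empathy"))  # adapters/empathy.gguf

# The chosen path would then be loaded via llama-cpp-python, e.g.:
# llm = Llama(model_path="codette-orchestrator-Q4_K_M.gguf",
#             lora_path=adapter_for("empathy"), n_ctx=4096)
```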
### With llama-cpp-python

```python
from llama_cpp import Llama

llm = Llama(
    model_path="codette-orchestrator-Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=35,
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain consciousness from multiple perspectives"}],
    max_tokens=512,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```
## Related Repos
- Raiff1982/codette-lora-adapters - 9 LoRA adapters for hot-swap
- Raiff1982/codette-llama-3.1-8b-merged - Full-precision merged model
- Raiff1982/Codette-Reasoning - Training datasets
## Training

Trained with QLoRA on a Hugging Face A10G GPU:
- LoRA rank: 16, alpha: 32, dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj
- 4-bit quantization (NF4 + double quantization)
- ~2000-4000 examples per adapter
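The hyperparameters above map onto a QLoRA setup roughly as follows. This is a sketch of the configuration values only, not the original training script; with Hugging Face `peft`/`transformers` these dicts would populate `LoraConfig` and `BitsAndBytesConfig`:

```python
# QLoRA hyperparameters from this card, expressed as plain config dicts
# (sketch only; the original training script is not published here).
lora_config = {
    "r": 16,            # LoRA rank
    "lora_alpha": 32,   # scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}
quant_config = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",       # NF4 quantization
    "bnb_4bit_use_double_quant": True,  # double quantization
}

# Effective LoRA scaling applied to the low-rank update: alpha / r
print(lora_config["lora_alpha"] / lora_config["r"])  # 2.0
```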
## License
Subject to the Llama 3.1 Community License.