# MVP 2 - Sprint 1: Define Prompt Ontology & Enhance KG for Prompts

**Sprint Created:** 2025-06-08
**Builds Upon:** MVP 1 (Successfully Completed - 100%)
**Sprint Duration:** 3-4 hours
**Priority:** HIGH (Foundation for all MVP 2 development)
**Development Environment:** Claude 4.0 + Cursor IDE

---

## 🎯 Sprint 1 Objectives

### **Primary Goals**

1. **Define MCPPrompt Ontology** - Create a comprehensive data structure for prompt templates
2. **Create Rich Prompt Metadata** - Build a sample prompt dataset for testing and demonstration
3. **Extend InMemoryKG for Prompts** - Add structured prompt storage and retrieval
4. **Implement Prompt Vector Indexing** - Enable semantic search across prompt templates
5. **Update Application Initialization** - Integrate prompt loading into the startup process

### **Success Criteria**

- [ ] MCPPrompt dataclass defined with comprehensive fields
- [ ] 6+ diverse prompt templates created in JSON format
- [ ] InMemoryKG enhanced for dual tool+prompt management
- [ ] Vector indexing supports both tools and prompts seamlessly
- [ ] Application startup includes prompt loading and indexing
- [ ] All new functionality covered by unit tests (20+ new tests)
- [ ] No regression in existing MVP 1 functionality
- [ ] All quality checks passing (lint, format, type-check, test)

---

## 📋 Task Breakdown

### **Task 1: Define MCPPrompt Ontology & Associated Tests**

**Estimated Time:** 60-75 minutes
**Priority:** HIGH
**Dependencies:** None

#### **Sub-Task 1.1: Create MCPPrompt Dataclass**

- **Status:** Todo
- **Estimated Time:** 30-45 minutes
- **Description:** Define a comprehensive MCPPrompt dataclass with all required fields
- **Acceptance Criteria:**
  1. MCPPrompt dataclass with proper type hints and defaults
  2. JSON serialization compatibility
  3. Validation for required fields
  4. Integration with the existing ontology module

**Claude/Cursor Prompt:**

```cursor
**TASK: Define MCPPrompt Dataclass and Initial Tests**

**Objective:** Create the foundational MCPPrompt dataclass and comprehensive unit tests.

**Action 1: Modify `kg_services/ontology.py`**

1. Open `@kg_services/ontology.py`
2. Add the necessary imports if not present:
```python
from dataclasses import dataclass, field
from typing import List, Dict, Optional
```
3. Below the existing `MCPTool` dataclass, add a new `MCPPrompt` dataclass with these fields:
```python
@dataclass
class MCPPrompt:
    """Represents a prompt template for MCP tool usage."""

    prompt_id: str
    name: str
    description: str
    target_tool_id: str  # Links to MCPTool.tool_id
    template_string: str  # Template with {{variable}} placeholders
    tags: List[str] = field(default_factory=list)
    input_variables: List[str] = field(default_factory=list)
    use_case: str = ""  # Specific use case description
    difficulty_level: str = "beginner"  # beginner, intermediate, advanced
    example_inputs: Dict[str, str] = field(default_factory=dict)  # Example values

    def __post_init__(self):
        """Validate prompt data after initialization."""
        if not self.prompt_id:
            raise ValueError("prompt_id cannot be empty")
        if not self.name:
            raise ValueError("name cannot be empty")
        if not self.target_tool_id:
            raise ValueError("target_tool_id cannot be empty")
        if not self.template_string:
            raise ValueError("template_string cannot be empty")
```

**Action 2: Create/Update `tests/kg_services/test_ontology.py`**

1. Open or create `@tests/kg_services/test_ontology.py`
2. Add comprehensive test methods:
```python
def test_mcp_prompt_creation():
    """Test MCPPrompt basic creation with all fields."""
    prompt = MCPPrompt(
        prompt_id="test_prompt_v1",
        name="Test Prompt",
        description="A test prompt for validation",
        target_tool_id="test_tool_v1",
        template_string="Process this: {{input_text}}",
        tags=["test", "validation"],
        input_variables=["input_text"],
        use_case="Testing purposes",
        difficulty_level="beginner",
        example_inputs={"input_text": "sample text"},
    )

    assert prompt.prompt_id == "test_prompt_v1"
    assert prompt.name == "Test Prompt"
    assert prompt.description == "A test prompt for validation"
    assert prompt.target_tool_id == "test_tool_v1"
    assert prompt.template_string == "Process this: {{input_text}}"
    assert prompt.tags == ["test", "validation"]
    assert prompt.input_variables == ["input_text"]
    assert prompt.use_case == "Testing purposes"
    assert prompt.difficulty_level == "beginner"
    assert prompt.example_inputs == {"input_text": "sample text"}


def test_mcp_prompt_defaults():
    """Test MCPPrompt creation with minimal required fields."""
    prompt = MCPPrompt(
        prompt_id="minimal_prompt",
        name="Minimal Prompt",
        description="A minimal prompt",
        target_tool_id="tool_id",
        template_string="{{input}}",
    )

    assert prompt.tags == []
    assert prompt.input_variables == []
    assert prompt.use_case == ""
    assert prompt.difficulty_level == "beginner"
    assert prompt.example_inputs == {}


def test_mcp_prompt_validation():
    """Test MCPPrompt validation in __post_init__."""
    import pytest

    # Test empty prompt_id
    with pytest.raises(ValueError, match="prompt_id cannot be empty"):
        MCPPrompt(prompt_id="", name="Test", description="Test",
                  target_tool_id="tool", template_string="{{input}}")

    # Test empty name
    with pytest.raises(ValueError, match="name cannot be empty"):
        MCPPrompt(prompt_id="test", name="", description="Test",
                  target_tool_id="tool", template_string="{{input}}")

    # Test empty target_tool_id
    with pytest.raises(ValueError, match="target_tool_id cannot be empty"):
        MCPPrompt(prompt_id="test", name="Test", description="Test",
                  target_tool_id="", template_string="{{input}}")

    # Test empty template_string
    with pytest.raises(ValueError, match="template_string cannot be empty"):
        MCPPrompt(prompt_id="test", name="Test", description="Test",
                  target_tool_id="tool", template_string="")
```

Apply coding standards from `@.cursor/rules/python_gradio_basic.mdc`. Generate the complete code for both files.
```
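Acceptance criterion 2 (JSON serialization compatibility) can be spot-checked with a quick round trip through `dataclasses.asdict`; a minimal sketch, assuming the `MCPPrompt` definition above:

```python
import json
from dataclasses import asdict

from kg_services.ontology import MCPPrompt

# Serialize a prompt to JSON and rebuild it from the parsed dict.
original = MCPPrompt(
    prompt_id="demo_v1",
    name="Demo Prompt",
    description="Round-trip demo",
    target_tool_id="sentiment_analyzer_v1",
    template_string="Analyze: {{text}}",
    input_variables=["text"],
)
payload = json.dumps(asdict(original))
restored = MCPPrompt(**json.loads(payload))
assert restored == original  # dataclasses compare field by field
```

Because every field is a plain string, list, or dict, no custom encoder is needed; this is the same `MCPPrompt(**prompt_data)` pattern the KG loader in Task 2 relies on.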
#### **Sub-Task 1.2: Create Rich Initial Prompt Metadata**

- **Status:** Todo
- **Estimated Time:** 45-60 minutes
- **Dependencies:** Sub-Task 1.1
- **Description:** Create a comprehensive prompt dataset with diverse, high-quality templates
- **Acceptance Criteria:**
  1. 6+ diverse prompt templates covering all MVP 1 tools
  2. Multiple prompt styles per tool (concise, detailed, specialized)
  3. Rich descriptions optimized for semantic embedding
  4. Valid JSON structure matching the MCPPrompt schema

**Claude/Cursor Prompt:**

```cursor
**TASK: Create Rich Initial Prompt Metadata**

**Objective:** Create a comprehensive `data/initial_prompts.json` file with diverse, high-quality prompt templates.

**Action: Create `data/initial_prompts.json`**

1. Create the file `@data/initial_prompts.json`
2. Structure it as a JSON array containing 6 MCPPrompt objects
3. Ensure coverage of all tools from MVP 1:
   - sentiment_analyzer_v1
   - summarizer_v1
   - image_caption_generator_stub_v1
   - code_quality_linter_v1
4. Create multiple prompt styles for variety:
```json
[
  {
    "prompt_id": "sentiment_customer_feedback_v1",
    "name": "Customer Feedback Sentiment Analysis",
    "description": "Analyzes customer feedback for business insights, focusing on actionable sentiment patterns, emotional indicators, and customer satisfaction levels. Optimized for business decision-making.",
    "target_tool_id": "sentiment_analyzer_v1",
    "template_string": "Analyze the sentiment of this customer feedback and provide actionable business insights:\n\nCustomer Feedback: {{customer_feedback}}\n\nPlease provide:\n1. Overall sentiment (positive/negative/neutral) with confidence score\n2. Key emotional indicators and specific language patterns\n3. Actionable business insights and recommendations\n4. Areas for improvement based on the feedback",
    "tags": ["sentiment", "customer", "business", "feedback", "analysis", "actionable"],
    "input_variables": ["customer_feedback"],
    "use_case": "Business customer feedback analysis for actionable insights",
    "difficulty_level": "intermediate",
    "example_inputs": {
      "customer_feedback": "The product arrived late and the packaging was damaged, but the customer service team was very helpful in resolving the issue quickly. Overall satisfied but shipping needs improvement."
    }
  },
  {
    "prompt_id": "sentiment_social_media_v1",
    "name": "Social Media Sentiment Monitor",
    "description": "Quick sentiment analysis for social media posts, comments, and reviews. Focuses on immediate emotional tone detection and viral content assessment.",
    "target_tool_id": "sentiment_analyzer_v1",
    "template_string": "Analyze the sentiment of this social media content: {{social_content}}\n\nProvide: sentiment (positive/negative/neutral), intensity (low/medium/high), and key emotional words.",
    "tags": ["sentiment", "social", "media", "quick", "viral", "emotions"],
    "input_variables": ["social_content"],
    "use_case": "Real-time social media monitoring and content moderation",
    "difficulty_level": "beginner",
    "example_inputs": {
      "social_content": "Just tried the new coffee shop downtown - amazing latte art and perfect temperature! ☕️✨ #coffeelover"
    }
  },
  {
    "prompt_id": "summarizer_executive_v1",
    "name": "Executive Summary Generator",
    "description": "Creates concise executive summaries for business documents, reports, and lengthy content. Focuses on key decisions, financial impacts, and strategic recommendations.",
    "target_tool_id": "summarizer_v1",
    "template_string": "Create an executive summary of the following document:\n\n{{document_content}}\n\nExecutive Summary should include:\n- Key findings and conclusions\n- Critical decisions or recommendations\n- Financial or strategic impacts\n- Next steps or action items\n\nLimit to 3-4 bullet points maximum.",
    "tags": ["summary", "executive", "business", "decisions", "strategic", "concise"],
    "input_variables": ["document_content"],
    "use_case": "Executive briefings and board presentations",
    "difficulty_level": "advanced",
    "example_inputs": {
      "document_content": "Q3 Financial Report: Revenue increased 15% year-over-year to $2.3M, primarily driven by new product launches..."
    }
  },
  {
    "prompt_id": "summarizer_academic_v1",
    "name": "Academic Research Summarizer",
    "description": "Summarizes academic papers, research articles, and scholarly content with focus on methodology, findings, and implications for further research.",
    "target_tool_id": "summarizer_v1",
    "template_string": "Summarize this academic content with focus on research methodology and findings:\n\n{{research_content}}\n\nSummary should cover:\n1. Research question and methodology\n2. Key findings and data insights\n3. Implications and future research directions\n4. Limitations and considerations",
    "tags": ["summary", "academic", "research", "methodology", "findings", "scholarly"],
    "input_variables": ["research_content"],
    "use_case": "Academic literature review and research synthesis",
    "difficulty_level": "advanced",
    "example_inputs": {
      "research_content": "This study examined the impact of machine learning algorithms on natural language processing tasks using a dataset of 50,000 text samples..."
    }
  },
  {
    "prompt_id": "image_caption_detailed_v1",
    "name": "Detailed Image Caption Generator",
    "description": "Generates comprehensive, descriptive captions for images with focus on visual elements, composition, context, and accessibility. Suitable for detailed documentation and accessibility purposes.",
    "target_tool_id": "image_caption_generator_stub_v1",
    "template_string": "Generate a detailed, descriptive caption for this image:\n\nImage: {{image_reference}}\n\nInclude:\n- Main subjects and objects in the scene\n- Setting, lighting, and composition details\n- Colors, textures, and visual style\n- Any text or signage visible\n- Overall mood or atmosphere\n\nMake it accessible and informative for visually impaired users.",
    "tags": ["image", "caption", "detailed", "accessibility", "descriptive", "visual", "documentation"],
    "input_variables": ["image_reference"],
    "use_case": "Accessibility compliance and detailed image documentation",
    "difficulty_level": "intermediate",
    "example_inputs": {
      "image_reference": "A photograph of a busy coffee shop interior with customers at wooden tables"
    }
  },
  {
    "prompt_id": "code_review_security_v1",
    "name": "Security-Focused Code Review",
    "description": "Performs comprehensive code quality analysis with emphasis on security vulnerabilities, best practices, and maintainability. Provides actionable recommendations for improvement.",
    "target_tool_id": "code_quality_linter_v1",
    "template_string": "Review this code for quality and security issues:\n\n```{{programming_language}}\n{{code_content}}\n```\n\nProvide analysis for:\n1. Security vulnerabilities and potential exploits\n2. Code quality and maintainability issues\n3. Performance optimization opportunities\n4. Best practice violations\n5. Specific recommendations for improvement\n\nPrioritize security concerns and provide actionable fixes.",
    "tags": ["code", "review", "security", "quality", "vulnerability", "best-practices", "maintainability"],
    "input_variables": ["programming_language", "code_content"],
    "use_case": "Security audits and code quality assurance for production systems",
    "difficulty_level": "advanced",
    "example_inputs": {
      "programming_language": "python",
      "code_content": "def authenticate_user(username, password):\n    if username == 'admin' and password == 'password123':\n        return True\n    return False"
    }
  }
]
```

Ensure:
- All prompt_ids are unique
- All target_tool_ids match existing tools from MVP 1
- Descriptions are rich and suitable for semantic embedding
- Templates use clear {{variable}} syntax
- Example inputs are realistic and helpful
- Difficulty levels are appropriately assigned
```
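Rendering templates is out of scope for Sprint 1, but a short sketch of how the `{{variable}}` syntax and `example_inputs` are expected to combine can help sanity-check the JSON above. The `render` helper here is hypothetical, not part of this sprint's deliverables:

```python
import re

from kg_services.ontology import MCPPrompt

def render(prompt: MCPPrompt, values: dict) -> str:
    """Fill {{variable}} placeholders; fail loudly on missing inputs."""
    missing = [v for v in prompt.input_variables if v not in values]
    if missing:
        raise ValueError(f"Missing inputs: {missing}")
    # Replace each {{name}} with its value; leave unknown names untouched.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: str(values.get(m.group(1), m.group(0))),
        prompt.template_string,
    )
```

For every template in `initial_prompts.json`, `render(prompt, prompt.example_inputs)` should yield a complete, tool-ready prompt with no `{{...}}` left over; that property is a cheap validation check on the dataset.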
---

### **Task 2: Extend InMemoryKG for Prompts (Structured & Vector Parts)**

**Estimated Time:** 90-120 minutes
**Priority:** HIGH
**Dependencies:** Task 1 completion

#### **Sub-Task 2.1: Add Prompt Storage and Retrieval (Structured)**

- **Status:** Todo
- **Estimated Time:** 45-60 minutes
- **Dependencies:** Sub-Task 1.2
- **Description:** Extend InMemoryKG to handle MCPPrompt objects for structured storage and retrieval
- **Acceptance Criteria:**
  1. Prompt loading from JSON with error handling
  2. Prompt retrieval by ID and by tool ID
  3. Integration with the existing KG structure
  4. Comprehensive unit test coverage

**Claude/Cursor Prompt:**

```cursor
**TASK: Extend InMemoryKG for Structured Prompt Storage and Retrieval**

**Objective:** Modify `InMemoryKG` to handle loading, storing, and retrieving `MCPPrompt` objects alongside the existing tool functionality.

**Action 1: Modify `kg_services/knowledge_graph.py`**

1. Open `@kg_services/knowledge_graph.py`
2. Add an import for MCPPrompt:
```python
from .ontology import MCPTool, MCPPrompt
```
3. In the `InMemoryKG.__init__` method, add:
```python
self.prompts: Dict[str, MCPPrompt] = {}
```
4. Add a new method `load_prompts_from_json`:
```python
def load_prompts_from_json(self, filepath: str) -> None:
    """Load MCPPrompt objects from a JSON file."""
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            prompts_data = json.load(f)

        self.prompts.clear()  # Clear existing prompts
        for prompt_data in prompts_data:
            prompt = MCPPrompt(**prompt_data)
            self.prompts[prompt.prompt_id] = prompt

        print(f"Loaded {len(self.prompts)} prompts into structured KG.")
    except FileNotFoundError:
        print(f"Warning: Prompt file '{filepath}' not found. No prompts loaded.")
    except json.JSONDecodeError as e:
        print(f"Warning: Invalid JSON in prompt file '{filepath}': {e}. No prompts loaded.")
    except Exception as e:
        print(f"Warning: Error loading prompts from '{filepath}': {e}. No prompts loaded.")
```
5. Add a new method `get_prompt_by_id`:
```python
def get_prompt_by_id(self, prompt_id: str) -> Optional[MCPPrompt]:
    """Retrieve a prompt by ID."""
    return self.prompts.get(prompt_id)
```
6. Add a new method `get_prompts_for_tool`:
```python
def get_prompts_for_tool(self, tool_id: str) -> List[MCPPrompt]:
    """Get all prompts targeting a specific tool."""
    return [prompt for prompt in self.prompts.values() if prompt.target_tool_id == tool_id]
```

**Action 2: Update `tests/kg_services/test_knowledge_graph.py`**

1. Open `@tests/kg_services/test_knowledge_graph.py`
2. Ensure `json` is imported (the new tests use `json.dumps`) and add an import for MCPPrompt:
```python
import json

from kg_services.ontology import MCPTool, MCPPrompt
```
3. Add test methods:
```python
def test_kg_initialization_includes_prompts():
    """Test that InMemoryKG initializes with an empty prompts dictionary."""
    kg = InMemoryKG()
    assert isinstance(kg.prompts, dict)
    assert len(kg.prompts) == 0


def test_load_prompts_from_json_success(tmp_path):
    """Test successful loading of prompts from JSON."""
    kg = InMemoryKG()

    # Create test prompt data
    test_prompts = [
        {
            "prompt_id": "test_prompt_1",
            "name": "Test Prompt 1",
            "description": "First test prompt",
            "target_tool_id": "test_tool_1",
            "template_string": "Process: {{input}}",
            "tags": ["test"],
            "input_variables": ["input"],
        },
        {
            "prompt_id": "test_prompt_2",
            "name": "Test Prompt 2",
            "description": "Second test prompt",
            "target_tool_id": "test_tool_2",
            "template_string": "Analyze: {{data}}",
            "tags": ["test", "analyze"],
            "input_variables": ["data"],
        },
    ]

    # Write test data to a temporary file
    test_file = tmp_path / "test_prompts.json"
    test_file.write_text(json.dumps(test_prompts))

    # Load prompts
    kg.load_prompts_from_json(str(test_file))

    # Verify loading
    assert len(kg.prompts) == 2
    assert "test_prompt_1" in kg.prompts
    assert "test_prompt_2" in kg.prompts

    prompt1 = kg.prompts["test_prompt_1"]
    assert prompt1.name == "Test Prompt 1"
    assert prompt1.target_tool_id == "test_tool_1"
    assert prompt1.template_string == "Process: {{input}}"


def test_get_prompt_by_id():
    """Test prompt retrieval by ID."""
    kg = InMemoryKG()

    # Add a test prompt directly
    test_prompt = MCPPrompt(
        prompt_id="test_id",
        name="Test Prompt",
        description="Test description",
        target_tool_id="tool_1",
        template_string="{{input}}",
    )
    kg.prompts["test_id"] = test_prompt

    # Test existing ID
    retrieved = kg.get_prompt_by_id("test_id")
    assert retrieved is not None
    assert retrieved.name == "Test Prompt"

    # Test non-existent ID
    assert kg.get_prompt_by_id("non_existent") is None


def test_get_prompts_for_tool():
    """Test retrieving prompts by tool ID."""
    kg = InMemoryKG()

    # Add test prompts for different tools
    prompt1 = MCPPrompt("p1", "Prompt 1", "Desc 1", "tool_a", "{{input}}")
    prompt2 = MCPPrompt("p2", "Prompt 2", "Desc 2", "tool_a", "{{data}}")
    prompt3 = MCPPrompt("p3", "Prompt 3", "Desc 3", "tool_b", "{{text}}")

    kg.prompts["p1"] = prompt1
    kg.prompts["p2"] = prompt2
    kg.prompts["p3"] = prompt3

    # Test tool with multiple prompts
    tool_a_prompts = kg.get_prompts_for_tool("tool_a")
    assert len(tool_a_prompts) == 2
    assert all(p.target_tool_id == "tool_a" for p in tool_a_prompts)

    # Test tool with a single prompt
    tool_b_prompts = kg.get_prompts_for_tool("tool_b")
    assert len(tool_b_prompts) == 1
    assert tool_b_prompts[0].prompt_id == "p3"

    # Test tool with no prompts
    assert kg.get_prompts_for_tool("tool_c") == []


def test_load_prompts_handles_file_not_found(capsys):
    """Test handling of a missing prompt file."""
    kg = InMemoryKG()
    kg.load_prompts_from_json("non_existent_file.json")

    assert len(kg.prompts) == 0
    captured = capsys.readouterr()
    assert "Warning: Prompt file 'non_existent_file.json' not found" in captured.out


def test_load_prompts_handles_invalid_json(tmp_path, capsys):
    """Test handling of invalid JSON in the prompt file."""
    kg = InMemoryKG()

    # Create a file with invalid JSON
    test_file = tmp_path / "invalid.json"
    test_file.write_text("{ invalid json content")

    kg.load_prompts_from_json(str(test_file))

    assert len(kg.prompts) == 0
    captured = capsys.readouterr()
    assert "Warning: Invalid JSON" in captured.out
```

Apply coding standards from `@.cursor/rules/python_gradio_basic.mdc`. Generate the complete modifications and test methods.
```
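Once the methods above land, the structured layer can be exercised from a REPL; a minimal usage sketch, assuming the `data/initial_prompts.json` created in Sub-Task 1.2:

```python
from kg_services.knowledge_graph import InMemoryKG

kg = InMemoryKG()
kg.load_prompts_from_json("data/initial_prompts.json")

# Single lookup by ID, then all prompts targeting one tool.
prompt = kg.get_prompt_by_id("sentiment_customer_feedback_v1")
print(prompt.name if prompt else "not found")
for p in kg.get_prompts_for_tool("summarizer_v1"):
    print(f"{p.prompt_id}: {p.use_case}")
```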
#### **Sub-Task 2.2: Extend Vector Indexing for Prompts**

- **Status:** Todo
- **Estimated Time:** 60-75 minutes
- **Dependencies:** Sub-Task 2.1
- **Description:** Add prompt embedding generation and semantic search capabilities
- **Acceptance Criteria:**
  1. Prompt embeddings generated and stored alongside tool embeddings
  2. Semantic search functionality for prompts
  3. Integration with the existing vector search infrastructure
  4. Performance optimization for dual indexing

**Claude/Cursor Prompt:**

```cursor
**TASK: Extend InMemoryKG for Prompt Embedding Indexing and Search**

**Objective:** Enhance `InMemoryKG` to generate embeddings for prompts and enable semantic search across prompt templates.

**Action 1: Modify `kg_services/knowledge_graph.py` (InMemoryKG class)**

1. Open `@kg_services/knowledge_graph.py`
2. In `InMemoryKG.__init__`, add new attributes:
```python
# Existing tool embedding attributes remain...
# Add prompt embedding attributes
self.prompt_embeddings: List[List[float]] = []
self.prompt_ids_for_vectors: List[str] = []
```
3. Modify the `build_vector_index` method to include prompt processing:
```python
def build_vector_index(self, embedder: EmbeddingService) -> None:
    """Build the vector index for both tools and prompts."""
    # Clear existing embeddings
    self.tool_embeddings.clear()
    self.tool_ids_for_vectors.clear()
    self.prompt_embeddings.clear()
    self.prompt_ids_for_vectors.clear()

    # Index tools (existing logic)
    print(f"Indexing {len(self.tools)} tools...")
    for tool_id, tool in self.tools.items():
        text_to_embed = f"{tool.name} - {tool.description} Tags: {', '.join(tool.tags)}"
        embedding = embedder.get_embedding(text_to_embed)
        if embedding is not None:
            self.tool_embeddings.append(embedding)
            self.tool_ids_for_vectors.append(tool_id)
        else:
            print(f"Warning: Could not generate embedding for tool {tool.name}")

    # Index prompts (new logic)
    print(f"Indexing {len(self.prompts)} prompts...")
    for prompt_id, prompt in self.prompts.items():
        # Create rich text for embedding
        text_to_embed = self._create_prompt_embedding_text(prompt)
        embedding = embedder.get_embedding(text_to_embed)
        if embedding is not None:
            self.prompt_embeddings.append(embedding)
            self.prompt_ids_for_vectors.append(prompt_id)
        else:
            print(f"Warning: Could not generate embedding for prompt {prompt.name}")

    print(f"Vector index built: {len(self.tool_embeddings)} tools, {len(self.prompt_embeddings)} prompts indexed.")
```
4. Add a helper method for prompt embedding text:
```python
def _create_prompt_embedding_text(self, prompt: MCPPrompt) -> str:
    """Create descriptive text for a prompt embedding."""
    parts = [
        prompt.name,
        prompt.description,
        f"Use case: {prompt.use_case}" if prompt.use_case else "",
        f"Tags: {', '.join(prompt.tags)}" if prompt.tags else "",
        f"Difficulty: {prompt.difficulty_level}",
        f"Variables: {', '.join(prompt.input_variables)}" if prompt.input_variables else "",
    ]
    return " - ".join(part for part in parts if part)
```
5. Add a `find_similar_prompts` method:
```python
def find_similar_prompts(self, query_embedding: List[float], top_k: int = 3) -> List[str]:
    """Find prompts similar to the query using cosine similarity."""
    if not self.prompt_embeddings or not query_embedding:
        return []

    similarities = []
    for i, prompt_embedding in enumerate(self.prompt_embeddings):
        similarity = self._cosine_similarity(query_embedding, prompt_embedding)
        similarities.append((similarity, self.prompt_ids_for_vectors[i]))

    # Sort by similarity (descending) and return the top_k prompt IDs
    similarities.sort(key=lambda x: x[0], reverse=True)
    return [prompt_id for _, prompt_id in similarities[:top_k]]
```

**Action 2: Update `tests/kg_services/test_knowledge_graph.py`**

1. Open `@tests/kg_services/test_knowledge_graph.py`
2. Update the existing `build_vector_index` tests to include prompt verification
3. Add new prompt-specific tests:
```python
def test_build_vector_index_includes_prompts():
    """Test that build_vector_index processes both tools and prompts."""
    kg = InMemoryKG()
    mock_embedder = MockEmbeddingService()

    # Add a test tool
    test_tool = MCPTool("tool1", "Test Tool", "A test tool", ["test"])
    kg.tools["tool1"] = test_tool

    # Add a test prompt
    test_prompt = MCPPrompt(
        prompt_id="prompt1",
        name="Test Prompt",
        description="A test prompt",
        target_tool_id="tool1",
        template_string="{{input}}",
        tags=["test"],
        use_case="testing",
    )
    kg.prompts["prompt1"] = test_prompt

    # Configure the mock to return different embeddings
    mock_embedder.set_embedding_for_text(
        "Test Tool - A test tool Tags: test",
        [0.1, 0.2, 0.3],
    )
    mock_embedder.set_embedding_for_text(
        "Test Prompt - A test prompt - Use case: testing - Tags: test - Difficulty: beginner",
        [0.4, 0.5, 0.6],
    )

    kg.build_vector_index(mock_embedder)

    # Verify both tools and prompts are indexed
    assert len(kg.tool_embeddings) == 1
    assert len(kg.prompt_embeddings) == 1
    assert kg.tool_ids_for_vectors == ["tool1"]
    assert kg.prompt_ids_for_vectors == ["prompt1"]


def test_create_prompt_embedding_text():
    """Test prompt embedding text creation."""
    kg = InMemoryKG()

    prompt = MCPPrompt(
        prompt_id="test",
        name="Test Prompt",
        description="A comprehensive test prompt",
        target_tool_id="tool1",
        template_string="{{input}}",
        tags=["test", "example"],
        input_variables=["input"],
        use_case="testing purposes",
        difficulty_level="intermediate",
    )

    text = kg._create_prompt_embedding_text(prompt)
    expected = "Test Prompt - A comprehensive test prompt - Use case: testing purposes - Tags: test, example - Difficulty: intermediate - Variables: input"
    assert text == expected


def test_find_similar_prompts_empty_index():
    """Test find_similar_prompts with an empty index."""
    kg = InMemoryKG()
    result = kg.find_similar_prompts([0.1, 0.2, 0.3])
    assert result == []


def test_find_similar_prompts_logic():
    """Test find_similar_prompts similarity ranking."""
    kg = InMemoryKG()

    # Manually set up prompt embeddings and IDs
    kg.prompt_embeddings = [
        [1.0, 0.0, 0.0],  # prompt_a
        [0.0, 1.0, 0.0],  # prompt_b
        [0.0, 0.0, 1.0],  # prompt_c
    ]
    kg.prompt_ids_for_vectors = ["prompt_a", "prompt_b", "prompt_c"]

    # Query most similar to prompt_a
    query_embedding = [0.9, 0.1, 0.0]
    results = kg.find_similar_prompts(query_embedding, top_k=2)

    # Should return prompt_a first, then prompt_b
    assert len(results) == 2
    assert results[0] == "prompt_a"
    assert results[1] == "prompt_b"


def test_find_similar_prompts_respects_top_k():
    """Test that find_similar_prompts respects the top_k parameter."""
    kg = InMemoryKG()

    # Set up 3 prompt embeddings
    kg.prompt_embeddings = [[1, 0], [0, 1], [0.5, 0.5]]
    kg.prompt_ids_for_vectors = ["p1", "p2", "p3"]

    # Test different top_k values
    results_1 = kg.find_similar_prompts([1, 0], top_k=1)
    assert len(results_1) == 1

    results_2 = kg.find_similar_prompts([1, 0], top_k=2)
    assert len(results_2) == 2

    results_all = kg.find_similar_prompts([1, 0], top_k=10)
    assert len(results_all) == 3  # Can't return more than available
```

Apply coding standards from `@.cursor/rules/python_gradio_basic.mdc`. Generate all code modifications and comprehensive test methods.
```
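`find_similar_prompts` delegates to the `_cosine_similarity` helper that MVP 1's tool search already relies on; in case that helper ever needs revisiting, here is a pure-Python sketch of its expected shape (an assumption about the existing code, not a prescription):

```python
import math
from typing import List

def _cosine_similarity(self, vec_a: List[float], vec_b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 for zero vectors."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With this definition, the ranking test above holds: a query of `[0.9, 0.1, 0.0]` scores ≈0.99 against `prompt_a`, ≈0.11 against `prompt_b`, and 0.0 against `prompt_c`.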
---

### **Task 3: Update Application Initialization**

**Estimated Time:** 20-30 minutes
**Priority:** MEDIUM
**Dependencies:** Task 2 completion

#### **Sub-Task 3.1: Integrate Prompt Loading in app.py**

- **Status:** Todo
- **Estimated Time:** 20-30 minutes
- **Dependencies:** Sub-Task 2.2
- **Description:** Update application startup to load prompts and include them in vector indexing
- **Acceptance Criteria:**
  1. Prompts loaded during application startup
  2. Vector index includes both tools and prompts
  3. Proper error handling and logging
  4. No breaking changes to existing functionality

**Claude/Cursor Prompt:**

```cursor
**TASK: Update app.py Initialization to Include Prompt Loading**

**Objective:** Modify the global service initialization in `app.py` to load prompts and include them in the vector indexing process.

**Action: Modify `app.py`**

1. Open `@app.py`
2. Locate the global service initialization section (around lines 20-50)
3. After the line that loads tools, add prompt loading:
```python
# Existing tool loading
print("Loading tools from data/initial_tools.json...")
knowledge_graph_instance.load_tools_from_json("data/initial_tools.json")
print(f"✅ Loaded {len(knowledge_graph_instance.tools)} tools.")

# Add prompt loading
print("Loading prompts from data/initial_prompts.json...")
knowledge_graph_instance.load_prompts_from_json("data/initial_prompts.json")
print(f"✅ Loaded {len(knowledge_graph_instance.prompts)} prompts.")
```
4. Update the vector index building message:
```python
# Update existing message
print("Building vector index for tools and prompts (may take a moment for first run)...")
knowledge_graph_instance.build_vector_index(embedding_service_instance)
print(f"✅ Vector index built successfully: {len(knowledge_graph_instance.tool_embeddings)} tools, {len(knowledge_graph_instance.prompt_embeddings)} prompts indexed.")
```
5. Ensure the initialization block has proper error handling that covers prompt loading:
```python
try:
    # ... existing initialization code ...

    # Tool loading
    print("Loading tools from data/initial_tools.json...")
    knowledge_graph_instance.load_tools_from_json("data/initial_tools.json")
    print(f"✅ Loaded {len(knowledge_graph_instance.tools)} tools.")

    # Prompt loading
    print("Loading prompts from data/initial_prompts.json...")
    knowledge_graph_instance.load_prompts_from_json("data/initial_prompts.json")
    print(f"✅ Loaded {len(knowledge_graph_instance.prompts)} prompts.")

    # Vector indexing
    print("Building vector index for tools and prompts (may take a moment for first run)...")
    knowledge_graph_instance.build_vector_index(embedding_service_instance)
    print(f"✅ Vector index built successfully: {len(knowledge_graph_instance.tool_embeddings)} tools, {len(knowledge_graph_instance.prompt_embeddings)} prompts indexed.")

    # ... rest of initialization ...
except FileNotFoundError as e:
    print(f"❌ FATAL: Data file not found: {e}")
    print("Please ensure both data/initial_tools.json and data/initial_prompts.json exist.")
except Exception as e:
    print(f"❌ FATAL: Error during service initialization: {e}")
    print("The application may not function correctly.")
```

**Verification Requirements:**
- No changes to existing tool loading logic
- Prompt loading integrated seamlessly
- Enhanced logging for better debugging
- Proper error messages that mention both files
- No regression in MVP 1 functionality

Generate the modified initialization section with proper error handling and logging.
```
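The manual verification above can be backed by a small automated smoke test that exercises only the structured layer (no embedding service needed); a sketch, where the file paths and minimum counts reflect this sprint's targets:

```python
from kg_services.knowledge_graph import InMemoryKG

def test_startup_data_files_are_consistent():
    """Smoke test: both data files load and every prompt targets a real tool."""
    kg = InMemoryKG()
    kg.load_tools_from_json("data/initial_tools.json")
    kg.load_prompts_from_json("data/initial_prompts.json")

    assert len(kg.tools) >= 4
    assert len(kg.prompts) >= 6
    # Referential integrity between the two datasets.
    for prompt in kg.prompts.values():
        assert prompt.target_tool_id in kg.tools
```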
---

### **Task 4: Sprint Wrap-up & Final Checks**

**Estimated Time:** 30-45 minutes
**Priority:** HIGH
**Dependencies:** Task 3 completion

#### **Sub-Task 4.1: Dependencies and Environment Update**

- **Status:** Todo
- **Estimated Time:** 10-15 minutes
- **Description:** Update dependency files and regenerate the lock file
- **Acceptance Criteria:**
  1. requirements.lock updated and synchronized
  2. No new runtime dependencies (this sprint adds no external libs)
  3. Development environment properly configured

#### **Sub-Task 4.2: Comprehensive Quality Checks**

- **Status:** Todo
- **Estimated Time:** 15-20 minutes
- **Dependencies:** Sub-Task 4.1
- **Description:** Execute all quality assurance checks
- **Acceptance Criteria:**
  1. All linting rules pass
  2. Code properly formatted
  3. Type checking passes
  4. All unit tests pass (target: 60+ tests total)

#### **Sub-Task 4.3: Integration Testing**

- **Status:** Todo
- **Estimated Time:** 10-15 minutes
- **Dependencies:** Sub-Task 4.2
- **Description:** Manual verification of the full integration
- **Acceptance Criteria:**
  1. Application starts successfully
  2. Both tools and prompts load correctly
  3. Vector index builds for both types
  4. No regressions in existing functionality

**Final Quality Check Commands:**

```bash
# Update dependencies
just lock
just install

# Run all quality checks
just lint
just format
just type-check
just test

# Manual integration test
python app.py
# Verify console output shows both tools and prompts loading

# Final commit
git add -A
python scripts/smart_commit.py "feat: implement MVP 2 Sprint 1 - MCPPrompt ontology and KG integration"
git push origin main
```

---

## 📊 Sprint 1 Success Metrics

### **Functional Metrics**

- **New Data Structures:** 1 (MCPPrompt dataclass)
- **New Data Files:** 1 (initial_prompts.json with 6+ prompts)
- **Enhanced KG Methods:** 5 (load, retrieve, index, search for prompts)
- **New Unit Tests:** 20+ (comprehensive test coverage)

### **Technical Metrics**

- **Code Quality:** A+ (maintained from MVP 1)
- **Test Coverage:** >80% for new code
- **Type Safety:** 100% (full type hints)
- **Integration:** Seamless with existing MVP 1 code

### **Performance Metrics**

- **Startup Time:** <5 seconds (including prompt loading)
- **Vector Index Build:** <30 seconds (for 4 tools + 6 prompts)
- **Memory Usage:** <50MB increase (for prompt data)
- **API Response Time:** MVP 1 performance maintained

---

## 🔍 Risk Assessment

### **Technical Risks**

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Prompt JSON Schema Changes | Low | Medium | Comprehensive validation in `MCPPrompt.__post_init__` |
| Vector Index Performance | Low | Low | Small dataset, proven embedding approach |
| Test Complexity | Medium | Low | Incremental testing, clear test separation |

### **Integration Risks**

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| MVP 1 Regression | Low | High | Extensive existing test coverage, no changes to tool logic |
| Startup Time Increase | Medium | Low | Monitor startup performance, optimize if needed |
| Memory Usage Growth | Low | Low | Small prompt dataset, efficient storage |

---

## 🚀 Sprint 1 Deliverables

### **Code Artifacts**

1. **Enhanced kg_services/ontology.py** - MCPPrompt dataclass with validation
2. **Rich data/initial_prompts.json** - 6+ diverse, high-quality prompt templates
3. **Extended kg_services/knowledge_graph.py** - Dual tool+prompt management
4. **Updated app.py** - Integrated prompt loading in initialization
5. **Comprehensive Test Suite** - 20+ new unit tests for all new functionality

### **Quality Assurance**

1. **100% Type Coverage** - All new code with proper type hints
2. **Comprehensive Testing** - Unit tests for all new methods and edge cases
3. **Documentation** - Clear docstrings and inline comments
4. **Performance Validation** - No regression in MVP 1 performance
5. **Integration Verification** - End-to-end testing of the enhanced system
---

## 🎯 Sprint 1 Definition of Done

### **Technical Completion**

- [ ] MCPPrompt dataclass implemented with validation
- [ ] 6+ diverse prompt templates created and validated
- [ ] InMemoryKG enhanced for dual tool+prompt management
- [ ] Vector indexing supports semantic search across prompts
- [ ] Application initialization includes prompt loading
- [ ] All new functionality covered by unit tests

### **Quality Standards**

- [ ] All quality checks passing (lint, format, type-check, test)
- [ ] No regressions in existing MVP 1 functionality
- [ ] Code follows project standards and conventions
- [ ] Comprehensive error handling and logging
- [ ] Performance targets maintained

### **Integration Readiness**

- [ ] Application starts successfully with enhanced initialization
- [ ] Both tools and prompts load and index correctly
- [ ] System ready for Sprint 2 (enhanced planner development)
- [ ] All changes committed and CI pipeline green

---

## 📋 Post-Sprint 1 Review

### **Expected Outcomes**

- Foundational prompt ontology established
- Rich prompt metadata created for all MVP 1 tools
- Knowledge graph enhanced for dual tool+prompt management
- Vector indexing supporting semantic search across both types
- Application initialization seamlessly including prompts
- Comprehensive test coverage for all new functionality

### **What's Next (Sprint 2)**

- Enhance SimplePlannerAgent to suggest tool+prompt pairs
- Implement intelligent prompt selection logic
- Create the PlannedStep data structure
- Maintain backward compatibility with the MVP 1 interface

### **Key Success Factors**

1. **Solid Foundation** - Comprehensive data structures and validation
2. **Rich Data** - High-quality prompt templates for effective testing
3. **Seamless Integration** - No disruption to existing MVP 1 functionality
4. **Quality First** - Extensive testing and error handling
5. **Future Ready** - Architecture prepared for advanced planning logic

---

**Sprint 1 Start Date:** TBD
**Estimated Completion:** 3-4 hours
**Confidence Level:** HIGH (building on proven MVP 1 architecture)
**Risk Level:** LOW (additive changes, no breaking modifications)

*This comprehensive Sprint 1 plan provides the foundation for transforming KGraph-MCP from a tool discovery system into an intelligent tool+prompt suggestion platform.*