MVP 2 Sprint 2 - Comprehensive Plan

Enhanced Planner for Tool+Prompt Pairs

Date: 2025-06-08
Sprint Goal: Modify SimplePlannerAgent to select both relevant MCPTool and corresponding MCPPrompt, returning structured PlannedStep objects
Duration: 3-5 hours
Status: πŸš€ READY TO START

🎯 Sprint 2 Objectives

Goal Evolution: MVP1 β†’ MVP2 Sprint 2

  • MVP1: User Query β†’ Tool Discovery β†’ Tool Suggestion
  • MVP2 Sprint 2: User Query β†’ Tool Discovery β†’ Prompt Selection β†’ (Tool + Prompt) Suggestion

Key Deliverables

  1. PlannedStep Ontology - New dataclass for structured tool+prompt pairs
  2. Enhanced SimplePlannerAgent - Semantic tool+prompt selection logic
  3. Updated Application Integration - Backend support for new planner output
  4. Comprehensive Testing - Full coverage of new planning workflow

πŸ“‹ Task Breakdown

Task 2.1: Define PlannedStep Dataclass (60 mins)

Files: kg_services/ontology.py, tests/kg_services/test_ontology.py

Objective: Create structured data representation for planner output

Implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PlannedStep:
    """Represents a planned step combining a tool and its prompt."""
    tool: MCPTool
    prompt: MCPPrompt
    relevance_score: Optional[float] = None  # Future use
```
Testing Requirements:

  • Test PlannedStep creation with valid tool+prompt pairs
  • Validate type safety and field access
  • Test optional relevance_score functionality
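The testing requirements above can be sketched as pytest-style cases. The `MCPTool`/`MCPPrompt` stand-ins below are minimal illustrative shapes so the example is self-contained; the real classes live in kg_services/ontology.py and their fields may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Minimal stand-ins for the real ontology classes (field names are
# illustrative assumptions, not the actual kg_services.ontology definitions).
@dataclass
class MCPTool:
    tool_id: str
    name: str

@dataclass
class MCPPrompt:
    prompt_id: str
    target_tool_id: str
    template: str

@dataclass
class PlannedStep:
    """Represents a planned step combining a tool and its prompt."""
    tool: MCPTool
    prompt: MCPPrompt
    relevance_score: Optional[float] = None  # Future use

def test_planned_step_creation():
    tool = MCPTool(tool_id="t1", name="summarizer")
    prompt = MCPPrompt(prompt_id="p1", target_tool_id="t1", template="Summarize: {text}")
    step = PlannedStep(tool=tool, prompt=prompt)
    # Field access is type-safe and the pairing is consistent
    assert step.tool.tool_id == step.prompt.target_tool_id
    assert step.relevance_score is None  # optional field defaults to None

def test_planned_step_with_score():
    tool = MCPTool(tool_id="t1", name="summarizer")
    prompt = MCPPrompt(prompt_id="p1", target_tool_id="t1", template="Summarize: {text}")
    step = PlannedStep(tool=tool, prompt=prompt, relevance_score=0.87)
    assert step.relevance_score == 0.87
```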

Task 2.2: Refactor SimplePlannerAgent (180 mins)

Files: agents/planner.py, tests/agents/test_planner.py

Objective: Implement combined tool+prompt selection logic

Key Algorithm:

  1. Tool Selection: Find relevant tools using semantic search
  2. Prompt Filtering: Get prompts targeting each selected tool
  3. Prompt Ranking: Semantically rank prompts against user query
  4. PlannedStep Assembly: Create structured output

Implementation Strategy:

```python
def generate_plan(self, user_query: str, top_k_plans: int = 1) -> List[PlannedStep]:
    # 1. Get query embedding
    query_embedding = self.embedder.get_embedding(user_query)

    # 2. Find candidate tools
    tool_ids = self.kg.find_similar_tools(query_embedding, top_k=3)

    # 3. For each tool, find and rank prompts
    planned_steps = []
    for tool_id in tool_ids:
        tool = self.kg.get_tool_by_id(tool_id)
        prompts = [p for p in self.kg.prompts.values()
                   if p.target_tool_id == tool.tool_id]

        # 4. Select best prompt semantically
        best_prompt = self._select_best_prompt(prompts, query_embedding)
        if best_prompt:
            planned_steps.append(PlannedStep(tool=tool, prompt=best_prompt))

    return planned_steps[:top_k_plans]
```

Testing Requirements:

  • Test no tools found scenario
  • Test tool found but no prompts scenario
  • Test tool with single prompt selection
  • Test tool with multiple prompts - semantic selection
  • Test top_k_plans limiting functionality
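The edge-case scenarios above can be exercised with mocks. Since the real `SimplePlannerAgent` constructor is not shown in this plan, the sketch below uses a dependency-injected inline planner with the same algorithm as a stand-in:

```python
from types import SimpleNamespace
from unittest.mock import MagicMock

def generate_plan(kg, embedder, user_query, top_k_plans=1):
    """Inline stand-in for SimplePlannerAgent.generate_plan, so the
    edge cases can be tested without the real class wiring."""
    query_embedding = embedder.get_embedding(user_query)
    steps = []
    for tool_id in kg.find_similar_tools(query_embedding, top_k=3):
        tool = kg.get_tool_by_id(tool_id)
        prompts = [p for p in kg.prompts.values() if p.target_tool_id == tool.tool_id]
        if prompts:
            steps.append((tool, prompts[0]))
    return steps[:top_k_plans]

def test_no_tools_found():
    kg = MagicMock()
    kg.find_similar_tools.return_value = []  # semantic search finds nothing
    embedder = MagicMock()
    embedder.get_embedding.return_value = [0.0]
    assert generate_plan(kg, embedder, "anything") == []

def test_tool_without_prompts():
    kg = MagicMock()
    kg.find_similar_tools.return_value = ["t1"]
    kg.get_tool_by_id.return_value = SimpleNamespace(tool_id="t1")
    kg.prompts = {}  # tool exists but no prompts target it
    embedder = MagicMock()
    embedder.get_embedding.return_value = [0.0]
    assert generate_plan(kg, embedder, "anything") == []
```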

Task 2.3: Update Application Integration (45 mins)

Files: app.py, tests/test_app.py

Objective: Update backend to use new planner method

Changes Required:

  1. Update handle_find_tools to call generate_plan() instead of suggest_tools()
  2. Handle PlannedStep output format (temporary backward compatibility)
  3. Ensure no UI crashes during transition

Implementation:

```python
def handle_find_tools(query: str) -> dict:
    if not planner_agent:
        return {"error": "Planner not available"}

    planned_steps = planner_agent.generate_plan(query, top_k_plans=1)

    if not planned_steps:
        return {"info": f"No actionable plans found for: '{query}'"}

    # Temporary: extract tool for display (UI update in Sprint 3)
    first_plan = planned_steps[0]
    return format_tool_for_display(first_plan.tool)
```
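The "no UI crashes" requirement amounts to checking all three branches of the handler. A dependency-injected variant makes that testable without the app's module-level globals (the real app.py wiring is assumed, not shown here):

```python
from unittest.mock import MagicMock

def handle_find_tools(query, planner_agent, format_tool_for_display):
    """Dependency-injected sketch of the app handler; names mirror the
    plan above, not necessarily the actual app.py code."""
    if not planner_agent:
        return {"error": "Planner not available"}

    planned_steps = planner_agent.generate_plan(query, top_k_plans=1)

    if not planned_steps:
        return {"info": f"No actionable plans found for: '{query}'"}

    # Backward compatibility: surface only the tool until Sprint 3's UI work
    return format_tool_for_display(planned_steps[0].tool)
```

Each branch returns a dict, so the Gradio layer always receives a displayable payload even when planning fails.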

Task 2.4: Quality Assurance & Deployment (30 mins)

Objective: Ensure code quality and system stability

Checklist:

  • Run just lint - Code style compliance
  • Run just format - Automatic formatting
  • Run just type-check - Type safety validation
  • Run just test - Full test suite execution
  • Manual integration testing
  • Update requirements.lock if needed
  • Commit and push changes
  • Verify CI pipeline success

πŸ”§ Technical Architecture

Data Flow Evolution

```
User Query
    ↓
Query Embedding (OpenAI)
    ↓
Tool Semantic Search (Knowledge Graph)
    ↓
Prompt Filtering (by target_tool_id)
    ↓
Prompt Semantic Ranking (vs Query)
    ↓
PlannedStep Assembly
    ↓
Structured Output (Tool + Prompt)
```

New Components Introduced

  1. PlannedStep Dataclass - Structured output format
  2. Enhanced Planning Logic - Tool+prompt selection
  3. Semantic Prompt Ranking - Context-aware prompt selection
  4. Backward Compatible Interface - Smooth transition support

Integration Points

  • Knowledge Graph: Extended prompt search capabilities
  • Embedding Service: Dual-purpose tool+prompt ranking
  • Application Layer: Updated method signatures and handling

πŸ§ͺ Testing Strategy

Unit Test Coverage

  • PlannedStep Tests: Creation, validation, type safety
  • Planner Logic Tests: All selection scenarios and edge cases
  • Integration Tests: End-to-end workflow validation
  • Error Handling Tests: Graceful failure scenarios

Test Scenarios

  1. Happy Path: Query β†’ Tool β†’ Prompt β†’ PlannedStep
  2. No Tools Found: Empty result handling
  3. Tool Without Prompts: Graceful skipping
  4. Multiple Prompts: Semantic selection validation
  5. Edge Cases: Empty queries, API failures

Manual Testing Checklist

  • Application starts successfully with new planner
  • Tool suggestions still work (backward compatibility)
  • No crashes in UI during tool selection
  • Logging shows enhanced planning information

πŸ“Š Success Metrics

| Metric | Target | Validation Method |
|---|---|---|
| PlannedStep Creation | ✅ Complete | Unit tests pass |
| Tool+Prompt Selection | ✅ Semantic accuracy | Integration tests |
| Backward Compatibility | ✅ No breaking changes | Manual testing |
| Code Quality | ✅ All checks pass | CI pipeline |
| Test Coverage | ✅ >90% for new code | pytest coverage |

πŸ”„ Sprint Dependencies

Prerequisites (Completed in Sprint 1)

  • βœ… MCPPrompt ontology established
  • βœ… Knowledge graph extended for prompts
  • βœ… Vector indexing for prompt search
  • βœ… Initial prompt dataset created

Deliverables for Sprint 3

  • βœ… PlannedStep objects ready for UI display
  • βœ… Enhanced planner generating structured output
  • βœ… Backend integration supporting rich display
  • βœ… Test coverage preventing regressions

🚨 Risk Mitigation

Potential Challenges

  1. Semantic Prompt Selection Complexity

    • Risk: Overly complex ranking logic
    • Mitigation: Start with simple cosine similarity, iterate
  2. Performance with Multiple Prompts

    • Risk: Slow response times
    • Mitigation: Use pre-computed embeddings, limit candidates
  3. Test Complexity

    • Risk: Difficult to mock complex interactions
    • Mitigation: Break into smaller, testable units
  4. Backward Compatibility

    • Risk: Breaking existing functionality
    • Mitigation: Careful interface design, thorough testing
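The pre-computed-embeddings mitigation for risk 2 can be as simple as a caching wrapper, so repeated planning calls never re-embed the same text. A minimal sketch, assuming the embedder exposes `get_embedding` as in this plan:

```python
class CachedEmbedder:
    """Wraps an embedder so each unique text is embedded only once.
    Illustrative sketch of the 'pre-computed embeddings' mitigation;
    the wrapped get_embedding interface is taken from this plan."""

    def __init__(self, embedder):
        self._embedder = embedder
        self._cache = {}

    def get_embedding(self, text: str):
        if text not in self._cache:
            # Only reaches the real (slow, billable) embedder on a cache miss
            self._cache[text] = self._embedder.get_embedding(text)
        return self._cache[text]
```

Because prompt texts are static between dataset updates, this turns per-query prompt ranking into dictionary lookups after the first pass.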

🎯 Sprint 3 Preparation

Ready for Next Sprint

After Sprint 2 completion, Sprint 3 can focus on:

  • UI enhancements to display PlannedStep information
  • Rich prompt template display with variables
  • Interactive input field generation
  • Enhanced user experience for tool+prompt workflows

Plan created for MVP 2 Sprint 2 - Enhanced Planner for Tool+Prompt Pairs
Estimated effort: 3-5 hours
Focus: Backend logic enhancement and structured output