MVP 2 Sprint 2 - Comprehensive Plan
Enhanced Planner for Tool+Prompt Pairs
Date: 2025-06-08
Sprint Goal: Modify SimplePlannerAgent to select both a relevant MCPTool and a corresponding MCPPrompt, returning structured PlannedStep objects
Duration: 3-5 hours
Status: 🚀 READY TO START
🎯 Sprint 2 Objectives
Goal Evolution: MVP1 → MVP2 Sprint 2
- MVP1: User Query → Tool Discovery → Tool Suggestion
- MVP2 Sprint 2: User Query → Tool Discovery → Prompt Selection → (Tool + Prompt) Suggestion
Key Deliverables
- PlannedStep Ontology - New dataclass for structured tool+prompt pairs
- Enhanced SimplePlannerAgent - Semantic tool+prompt selection logic
- Updated Application Integration - Backend support for new planner output
- Comprehensive Testing - Full coverage of new planning workflow
📋 Task Breakdown
Task 2.1: Define PlannedStep Dataclass (60 mins)
Files: kg_services/ontology.py, tests/kg_services/test_ontology.py
Objective: Create structured data representation for planner output
Implementation:
```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PlannedStep:
    """Represents a planned step combining a tool and its prompt."""
    tool: MCPTool
    prompt: MCPPrompt
    relevance_score: Optional[float] = None  # Future use
```
Testing Requirements:
- Test PlannedStep creation with valid tool+prompt pairs
- Validate type safety and field access
- Test optional relevance_score functionality
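A minimal test sketch covering these requirements. Only `tool_id` and `target_tool_id` are fixed by the planner logic in Task 2.2; the other MCPTool/MCPPrompt constructor arguments shown are illustrative assumptions:

```python
# tests/kg_services/test_ontology.py -- sketch; MCPTool/MCPPrompt fields are assumed
from kg_services.ontology import MCPPrompt, MCPTool, PlannedStep

def test_planned_step_holds_matching_pair():
    tool = MCPTool(tool_id="t1", name="web_search", description="Searches the web")
    prompt = MCPPrompt(prompt_id="p1", target_tool_id="t1", template="Search for {query}")
    step = PlannedStep(tool=tool, prompt=prompt)
    # The pair is consistent and the optional score defaults to None
    assert step.prompt.target_tool_id == step.tool.tool_id
    assert step.relevance_score is None

def test_planned_step_accepts_relevance_score():
    tool = MCPTool(tool_id="t1", name="web_search", description="Searches the web")
    prompt = MCPPrompt(prompt_id="p1", target_tool_id="t1", template="Search for {query}")
    step = PlannedStep(tool=tool, prompt=prompt, relevance_score=0.87)
    assert 0.0 <= step.relevance_score <= 1.0
```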
Task 2.2: Refactor SimplePlannerAgent (180 mins)
Files: agents/planner.py, tests/agents/test_planner.py
Objective: Implement combined tool+prompt selection logic
Key Algorithm:
- Tool Selection: Find relevant tools using semantic search
- Prompt Filtering: Get prompts targeting each selected tool
- Prompt Ranking: Semantically rank prompts against user query
- PlannedStep Assembly: Create structured output
Implementation Strategy:
```python
def generate_plan(self, user_query: str, top_k_plans: int = 1) -> List[PlannedStep]:
    # 1. Get query embedding
    query_embedding = self.embedder.get_embedding(user_query)

    # 2. Find candidate tools
    tool_ids = self.kg.find_similar_tools(query_embedding, top_k=3)

    # 3. For each tool, find and rank prompts
    planned_steps = []
    for tool_id in tool_ids:
        tool = self.kg.get_tool_by_id(tool_id)
        prompts = [p for p in self.kg.prompts.values()
                   if p.target_tool_id == tool.tool_id]

        # 4. Select best prompt semantically
        best_prompt = self._select_best_prompt(prompts, query_embedding)
        if best_prompt:
            planned_steps.append(PlannedStep(tool=tool, prompt=best_prompt))

    return planned_steps[:top_k_plans]
```
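The helper `_select_best_prompt` is referenced above but not specified in this plan. A minimal sketch, assuming each MCPPrompt carries a pre-computed `embedding` vector from the Sprint 1 indexing, is plain cosine similarity:

```python
import numpy as np

# Method of SimplePlannerAgent; a sketch, not the final ranking logic
def _select_best_prompt(
    self, prompts: List[MCPPrompt], query_embedding: List[float]
) -> Optional[MCPPrompt]:
    """Return the prompt most similar to the query, or None if no candidates."""
    if not prompts:
        return None
    query_vec = np.asarray(query_embedding, dtype=float)
    query_norm = np.linalg.norm(query_vec)
    best_prompt, best_score = None, -1.0
    for prompt in prompts:
        prompt_vec = np.asarray(prompt.embedding, dtype=float)  # assumed attribute
        score = float(query_vec @ prompt_vec) / (query_norm * np.linalg.norm(prompt_vec))
        if score > best_score:
            best_prompt, best_score = prompt, score
    return best_prompt
```

The winning score would also be a natural value for the optional relevance_score field on PlannedStep.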
Testing Requirements:
- Test no tools found scenario
- Test tool found but no prompts scenario
- Test tool with single prompt selection
- Test tool with multiple prompts - semantic selection
- Test top_k_plans limiting functionality
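A sketch of the first two scenarios, mocking the knowledge graph and embedder. The constructor keyword arguments are assumptions; the real SimplePlannerAgent signature may differ:

```python
# tests/agents/test_planner.py -- sketch; constructor kwargs are assumed
from unittest.mock import MagicMock

from agents.planner import SimplePlannerAgent

def _make_mocks():
    kg = MagicMock()
    embedder = MagicMock()
    embedder.get_embedding.return_value = [0.1] * 1536
    return kg, embedder

def test_no_tools_found_returns_empty_plan():
    kg, embedder = _make_mocks()
    kg.find_similar_tools.return_value = []  # semantic search yields nothing
    planner = SimplePlannerAgent(kg=kg, embedder=embedder)
    assert planner.generate_plan("completely unknown task") == []

def test_tool_without_prompts_is_skipped():
    kg, embedder = _make_mocks()
    kg.find_similar_tools.return_value = ["t1"]
    kg.get_tool_by_id.return_value = MagicMock(tool_id="t1")
    kg.prompts = {}  # no prompts target t1, so no PlannedStep is produced
    planner = SimplePlannerAgent(kg=kg, embedder=embedder)
    assert planner.generate_plan("search the web") == []
```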
Task 2.3: Update Application Integration (45 mins)
Files: app.py, tests/test_app.py
Objective: Update backend to use new planner method
Changes Required:
- Update `handle_find_tools` to call `generate_plan()` instead of `suggest_tools()`
- Handle `PlannedStep` output format (temporary backward compatibility)
- Ensure no UI crashes during transition
Implementation:
```python
def handle_find_tools(query: str) -> dict:
    if not planner_agent:
        return {"error": "Planner not available"}

    planned_steps = planner_agent.generate_plan(query, top_k_plans=1)
    if not planned_steps:
        return {"info": f"No actionable plans found for: '{query}'"}

    # Temporary: extract tool for display (UI update in Sprint 3)
    first_plan = planned_steps[0]
    return format_tool_for_display(first_plan.tool)
```
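Here `format_tool_for_display` carries over from the MVP1 display path. For reference, a minimal sketch; the returned field names are assumptions and should match whatever the current UI already renders:

```python
def format_tool_for_display(tool: MCPTool) -> dict:
    """Flatten an MCPTool into the dict shape the current UI renders."""
    return {
        "tool_id": tool.tool_id,
        "name": tool.name,                # assumed MCPTool field
        "description": tool.description,  # assumed MCPTool field
    }
```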
Task 2.4: Quality Assurance & Deployment (30 mins)
Objective: Ensure code quality and system stability
Checklist:
- Run `just lint` - Code style compliance
- Run `just format` - Automatic formatting
- Run `just type-check` - Type safety validation
- Run `just test` - Full test suite execution
- Manual integration testing
- Update requirements.lock if needed
- Commit and push changes
- Verify CI pipeline success
🔧 Technical Architecture
Data Flow Evolution
User Query
↓
Query Embedding (OpenAI)
↓
Tool Semantic Search (Knowledge Graph)
↓
Prompt Filtering (by target_tool_id)
↓
Prompt Semantic Ranking (vs Query)
↓
PlannedStep Assembly
↓
Structured Output (Tool + Prompt)
New Components Introduced
- PlannedStep Dataclass - Structured output format
- Enhanced Planning Logic - Tool+prompt selection
- Semantic Prompt Ranking - Context-aware prompt selection
- Backward Compatible Interface - Smooth transition support
Integration Points
- Knowledge Graph: Extended prompt search capabilities
- Embedding Service: Dual-purpose tool+prompt ranking
- Application Layer: Updated method signatures and handling
🧪 Testing Strategy
Unit Test Coverage
- PlannedStep Tests: Creation, validation, type safety
- Planner Logic Tests: All selection scenarios and edge cases
- Integration Tests: End-to-end workflow validation
- Error Handling Tests: Graceful failure scenarios
Test Scenarios
- Happy Path: Query → Tool → Prompt → PlannedStep
- No Tools Found: Empty result handling
- Tool Without Prompts: Graceful skipping
- Multiple Prompts: Semantic selection validation
- Edge Cases: Empty queries, API failures (empty-query case sketched below)
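For the empty-query edge case, the expectation can be pinned down as follows; note that degrading to an empty plan rather than raising is an assumption about desired behavior, and the mocked constructor matches the sketch in Task 2.2:

```python
from unittest.mock import MagicMock

from agents.planner import SimplePlannerAgent

def test_empty_query_yields_empty_plan():
    # Assumption: an empty query degrades to "no plans", not an exception
    kg = MagicMock()
    kg.find_similar_tools.return_value = []
    embedder = MagicMock()
    embedder.get_embedding.return_value = [0.0] * 1536
    planner = SimplePlannerAgent(kg=kg, embedder=embedder)
    assert planner.generate_plan("") == []
```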
Manual Testing Checklist
- Application starts successfully with new planner
- Tool suggestions still work (backward compatibility)
- No crashes in UI during tool selection
- Logging shows enhanced planning information
📊 Success Metrics
| Metric | Target | Validation Method |
|---|---|---|
| PlannedStep Creation | ✅ Complete | Unit tests pass |
| Tool+Prompt Selection | ✅ Semantic accuracy | Integration tests |
| Backward Compatibility | ✅ No breaking changes | Manual testing |
| Code Quality | ✅ All checks pass | CI pipeline |
| Test Coverage | ✅ >90% for new code | pytest coverage |
🔗 Sprint Dependencies
Prerequisites (Completed in Sprint 1)
- ✅ MCPPrompt ontology established
- ✅ Knowledge graph extended for prompts
- ✅ Vector indexing for prompt search
- ✅ Initial prompt dataset created
Deliverables for Sprint 3
- ✅ PlannedStep objects ready for UI display
- ✅ Enhanced planner generating structured output
- ✅ Backend integration supporting rich display
- ✅ Test coverage preventing regressions
🚨 Risk Mitigation
Potential Challenges
Semantic Prompt Selection Complexity
- Risk: Overly complex ranking logic
- Mitigation: Start with simple cosine similarity, iterate
Performance with Multiple Prompts
- Risk: Slow response times
- Mitigation: Use pre-computed embeddings, limit candidates
Test Complexity
- Risk: Difficult to mock complex interactions
- Mitigation: Break into smaller, testable units
Backward Compatibility
- Risk: Breaking existing functionality
- Mitigation: Careful interface design, thorough testing
🎯 Sprint 3 Preparation
Ready for Next Sprint
After Sprint 2 completion, Sprint 3 can focus on:
- UI enhancements to display PlannedStep information
- Rich prompt template display with variables
- Interactive input field generation
- Enhanced user experience for tool+prompt workflows
Plan created for MVP 2 Sprint 2 - Enhanced Planner for Tool+Prompt Pairs
Estimated effort: 3-5 hours
Focus: Backend logic enhancement and structured output