Sprint 3 (MVP 1): Implementing the Simplified Planner Agent

Sprint Overview

Goal: Implement the SimplePlannerAgent logic that orchestrates the EmbeddingService and InMemoryKG to take a user query, process it, and return a list of suggested MCPTool objects. This sprint focuses on the agent's internal logic; UI integration comes in Sprint 4.
Duration: Estimated 2-4 hours (flexible within Hackathon Day 1 or early Day 2).
Core Primitives Focused On: Tool (being suggested by the agent).
Key Artifacts by End of Sprint:
- agents/planner.py: SimplePlannerAgent class fully implemented and tested.
- Unit tests for SimplePlannerAgent.
- All code linted, formatted, type-checked, and passing CI.

Task List

Task 3.1: Implement `SimplePlannerAgent` Core Logic

Status: Todo
Parent MVP: MVP 1
Parent Sprint (MVP 1): Sprint 3
Description: In agents/planner.py, fully implement the SimplePlannerAgent class.
- The __init__ method should accept instances of InMemoryKG and EmbeddingService.
- The suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool] method should:
  1. Handle empty or whitespace-only user_query (return empty list).
  2. Call self.embedder.get_embedding(user_query) to get the query embedding.
  3. If embedding fails (returns None), log a warning and return an empty list.
  4. Call self.kg.find_similar_tools(query_embedding, top_k=top_k) to get relevant tool_ids.
  5. For each tool_id returned, call self.kg.get_tool_by_id(tool_id) to retrieve the MCPTool object.
  6. Collect valid MCPTool objects and return them.
  7. Add print statements or basic logging for observability during development.
Acceptance Criteria:
1. SimplePlannerAgent class and suggest_tools method are fully implemented.
2. Method correctly utilizes injected InMemoryKG and EmbeddingService.
3. Handles edge cases like empty queries or embedding failures gracefully.
4. Unit tests (using mocks for dependencies) pass.
TDD Approach: In tests/agents/test_planner.py (create this file and tests/agents/__init__.py):
- test_planner_init: Test constructor.
- test_suggest_tools_empty_query: Assert returns [].
- test_suggest_tools_embedding_failure: Mock EmbeddingService.get_embedding to return None. Assert returns [] and logs a warning.
- test_suggest_tools_no_similar_tools_found: Mock InMemoryKG.find_similar_tools to return []. Assert suggest_tools returns [].
- test_suggest_tools_success:
  - Mock EmbeddingService.get_embedding to return a dummy query vector.
  - Mock InMemoryKG.find_similar_tools to return a list of dummy tool IDs (e.g., ["tool1", "tool2"]).
  - Mock InMemoryKG.get_tool_by_id to return specific MCPTool instances for "tool1" and "tool2", and None for any other ID.
  - Call planner.suggest_tools("some query") and assert it returns the expected list of MCPTool objects.
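
The test cases listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual test file. To keep it runnable on its own, it defines a condensed stand-in for SimplePlannerAgent and a two-field MCPTool; in the repository the tests would import the real classes from agents.planner and kg_services.ontology instead. The tool_id field and the tool names are assumptions made for illustration, and the warning is captured from stdout because the reference implementation uses print.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import List
from unittest.mock import MagicMock


@dataclass
class MCPTool:
    """Stand-in for kg_services.ontology.MCPTool (the real fields may differ)."""
    tool_id: str
    name: str


class SimplePlannerAgent:
    """Condensed stand-in mirroring the behavior specified in Task 3.1."""

    def __init__(self, kg, embedder):
        self.kg = kg
        self.embedder = embedder

    def suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]:
        if not user_query or not user_query.strip():
            return []
        query_embedding = self.embedder.get_embedding(user_query)
        if query_embedding is None:
            print(f"Warning: Could not generate embedding for query: {user_query}")
            return []
        tool_ids = self.kg.find_similar_tools(query_embedding, top_k=top_k)
        found = (self.kg.get_tool_by_id(tool_id) for tool_id in tool_ids)
        return [tool for tool in found if tool is not None]


def make_planner():
    """Build a planner whose two dependencies are mocks."""
    kg, embedder = MagicMock(), MagicMock()
    return SimplePlannerAgent(kg, embedder), kg, embedder


def test_suggest_tools_empty_query():
    planner, kg, embedder = make_planner()
    assert planner.suggest_tools("   ") == []
    embedder.get_embedding.assert_not_called()


def test_suggest_tools_embedding_failure():
    planner, kg, embedder = make_planner()
    embedder.get_embedding.return_value = None
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        assert planner.suggest_tools("some query") == []
    assert "Warning" in captured.getvalue()
    kg.find_similar_tools.assert_not_called()


def test_suggest_tools_success():
    planner, kg, embedder = make_planner()
    tools = {
        "tool1": MCPTool("tool1", "Summarizer"),
        "tool2": MCPTool("tool2", "Translator"),
    }
    embedder.get_embedding.return_value = [0.1, 0.2, 0.3]
    kg.find_similar_tools.return_value = ["tool1", "missing", "tool2"]
    kg.get_tool_by_id.side_effect = tools.get  # dict.get returns None for unknown IDs
    assert planner.suggest_tools("some query") == [tools["tool1"], tools["tool2"]]
    kg.find_similar_tools.assert_called_once_with([0.1, 0.2, 0.3], top_k=3)
```

Using dict.get as the side_effect gives the "specific instances for known IDs, None for anything else" behavior in one line, and the extra "missing" ID checks that unknown tools are skipped rather than returned as None.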

Task 3.2: Integration Point for Main Application (`app.py`)

Status: Todo
Parent MVP: MVP 1
Parent Sprint (MVP 1): Sprint 3
Description: In app.py (where the Gradio app will live), add the initialization logic for InMemoryKG, EmbeddingService, and SimplePlannerAgent. Place this logic at module level so that it runs once, when the Gradio app starts.
- Load tools into the KG: knowledge_graph.load_tools_from_json("data/initial_tools.json").
- Build the vector index: knowledge_graph.build_vector_index(embedding_service).
- Pass the initialized knowledge_graph and embedding_service to the SimplePlannerAgent constructor.
- Handle potential errors during this initialization (e.g., API key not set, data file not found) by printing clear error messages.
Acceptance Criteria:
1. app.py contains the global initialization block for KG, Embedder, and Planner.
2. Initialization sequence is correct (load data, then build index).
3. Basic error handling for initialization is present.
4. The app can be run (python app.py) without crashing during this setup phase (Gradio UI itself is not built yet, but initialization should complete).
TDD Approach: This is primarily setup code. Manual testing by running python app.py and checking console output is key. Unit tests for the individual components (KG, Embedder) should already cover their internal logic.

Task 3.3: Update Dependencies & Run All Checks

Status: Todo
Parent MVP: MVP 1
Parent Sprint (MVP 1): Sprint 3
Description:
1. Review requirements.txt and requirements-dev.txt for any new additions (e.g., python-dotenv if used for local API key loading).
2. Regenerate requirements.lock: uv pip compile requirements.txt requirements-dev.txt --all-extras -o requirements.lock.
3. Run just install (or uv pip sync requirements.lock).
4. Run just lint, just format, just type-check, just test.
5. Commit all changes.
6. Push to GitHub and verify CI pipeline passes.
Acceptance Criteria:
1. requirements.lock is updated.
2. All just checks pass locally.
3. Code committed and pushed.
4. GitHub Actions CI pipeline passes for the sprint's commits.

Implementation Guidance

Task 3.1 Implementation Details

# In agents/planner.py
from typing import List
from kg_services.ontology import MCPTool
from kg_services.knowledge_graph import InMemoryKG
from kg_services.embedder import EmbeddingService


class SimplePlannerAgent:
    """
    A simplified planner agent that suggests tools based on user queries
    using semantic similarity search.
    """
    
    def __init__(self, kg: InMemoryKG, embedder: EmbeddingService):
        """Initialize the planner with knowledge graph and embedding service."""
        self.kg = kg
        self.embedder = embedder
    
    def suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]:
        """
        Suggest relevant tools based on user query using semantic similarity.
        
        Args:
            user_query: Natural language query from user
            top_k: Maximum number of tools to suggest
            
        Returns:
            List of relevant MCPTool objects, ordered by relevance
        """
        # Handle empty or whitespace-only queries
        if not user_query or not user_query.strip():
            print("Warning: Empty or whitespace-only query provided")
            return []
        
        # Get embedding for the user query
        query_embedding = self.embedder.get_embedding(user_query)
        if query_embedding is None:
            print(f"Warning: Could not generate embedding for query: {user_query}")
            return []
        
        # Find similar tools using the knowledge graph
        similar_tool_ids = self.kg.find_similar_tools(query_embedding, top_k=top_k)
        
        # Retrieve actual MCPTool objects
        suggested_tools: List[MCPTool] = []
        for tool_id in similar_tool_ids:
            tool = self.kg.get_tool_by_id(tool_id)
            if tool is not None:
                suggested_tools.append(tool)
        
        print(f"Planner suggested tools: {[t.name for t in suggested_tools]} for query: '{user_query}'")
        return suggested_tools
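

The planner delegates ranking to InMemoryKG.find_similar_tools, whose internals are not shown in this plan. As a rough illustration of what "semantic similarity search" could mean here, the sketch below ranks tool IDs by cosine similarity between the query vector and stored tool vectors. The use of cosine similarity, the dictionary-based index, the free-function signature, and the toy vectors are all assumptions for illustration, not the actual InMemoryKG implementation.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar_tools(
    query_embedding: Sequence[float],
    tool_vectors: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[str]:
    """Return up to top_k tool IDs, most similar to the query first."""
    ranked = sorted(
        tool_vectors,
        key=lambda tool_id: cosine_similarity(query_embedding, tool_vectors[tool_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 2-D vectors; real embeddings have hundreds or thousands of dimensions.
toy_index = {
    "summarizer": [1.0, 0.0],
    "translator": [0.0, 1.0],
    "sentiment": [0.7, 0.7],
}
print(find_similar_tools([0.9, 0.1], toy_index, top_k=2))
# prints ['summarizer', 'sentiment']
```

A full sort is fine for the handful of tools expected in MVP 1; a larger catalog would likely call for a vectorized or approximate nearest-neighbor approach.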

Task 3.2 Implementation Details

# In app.py - Global Service Initialization Section
print("Attempting to initialize KGraph-MCP services...")

embedding_service_instance = None
knowledge_graph_instance = None
planner_agent_instance = None

try:
    # For local dev, ensure .env is loaded if API keys are there
    # from dotenv import load_dotenv
    # load_dotenv() # Add python-dotenv to requirements-dev.txt if needed

    from kg_services.embedder import EmbeddingService
    from kg_services.knowledge_graph import InMemoryKG
    from agents.planner import SimplePlannerAgent

    embedding_service_instance = EmbeddingService()
    knowledge_graph_instance = InMemoryKG()
    
    # Load initial tools data
    print("Loading tools from data/initial_tools.json...")
    knowledge_graph_instance.load_tools_from_json("data/initial_tools.json")
    
    # Build vector index (this will make actual API calls)
    print("Building vector index (may take a moment for first run)...")
    knowledge_graph_instance.build_vector_index(embedding_service_instance)
    print("Vector index built successfully.")
    
    # Initialize the planner agent
    planner_agent_instance = SimplePlannerAgent(knowledge_graph_instance, embedding_service_instance)
    print("KGraph-MCP services initialized successfully.")

except FileNotFoundError as e:
    print(f"FATAL: Data file not found: {e}")
    print("Please ensure data/initial_tools.json exists.")
except Exception as e:
    print(f"FATAL: Error during service initialization: {e}")
    print("The application may not function correctly.")
    # In a real app, you might want to prevent Gradio from launching or show an error state.
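

The closing comment above mentions showing an error state when startup fails. One possible way to do that is sketched below: the initialization sequence is wrapped in a callable, the outcome is captured as a (result, error message) pair, and a handler reports the stored error instead of failing on a None planner. The names try_initialize, suggest_or_error, and broken_build are hypothetical helpers introduced for this example; they are not part of the codebase described in this plan.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def try_initialize(build: Callable[[], T]) -> Tuple[Optional[T], Optional[str]]:
    """Run build() and return (result, None), or (None, message) if it raises."""
    try:
        return build(), None
    except FileNotFoundError as exc:
        return None, f"Data file not found: {exc}"
    except Exception as exc:  # deliberately broad: startup should not crash the app
        return None, f"Error during service initialization: {exc}"


def suggest_or_error(planner, init_error: Optional[str], query: str):
    """Hypothetical handler: surface the startup error instead of failing later."""
    if planner is None:
        return f"Service unavailable: {init_error}"
    return planner.suggest_tools(query)


def broken_build():
    # Stands in for the real load/build/construct sequence failing on a missing file.
    raise FileNotFoundError("data/initial_tools.json")


planner, init_error = try_initialize(broken_build)
print(suggest_or_error(planner, init_error, "summarize this text"))
# prints Service unavailable: Data file not found: data/initial_tools.json
```

Keeping the error message alongside the (possibly None) planner would let the Sprint 4 UI display a clear explanation rather than raising an AttributeError on the first query.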

Task 3.3 Implementation Details

This task is primarily operational:

Dependency Review:

# Check if python-dotenv is needed
grep -r "dotenv" . --include="*.py"
# If found, add to requirements-dev.txt

Quality Checks:

# Update dependencies
uv pip compile requirements.txt requirements-dev.txt --all-extras -o requirements.lock
uv pip sync requirements.lock

# Run all quality checks
just lint
just format
just type-check
just test

Commit and Push:

# Stage all changes
git add -A

# Use conventional commit
python scripts/smart_commit.py "implement SimplePlannerAgent and app integration"

# Push to trigger CI
git push origin main

End of Sprint 3 Review

What's Done:

- The SimplePlannerAgent is implemented, capable of taking a user query and using the EmbeddingService and InMemoryKG to produce a list of relevant MCPTool objects.
- The main application (app.py) now initializes all backend services on startup.
- Unit tests cover the planner agent's logic using mocks.
- The backend logic for the "KG-Powered Tool Suggester" is now complete.

What's Next (Sprint 4):

- Implement the Gradio UI in app.py to take user input and display the suggestions from the SimplePlannerAgent.
- Create an interactive demo interface for the hackathon.
- Add visualization components for the knowledge graph.

Key Benefits Achieved:

- Separation of Concerns: Agent logic is decoupled from the UI.
- Testability: Comprehensive unit tests with mocked dependencies.
- Robustness: Proper error handling for edge cases.
- Integration Ready: Clean interfaces for UI integration in Sprint 4.

Potential Blockers/Issues:

- API key configuration for the embedding service.
- Performance of vector index building with real API calls.
- Memory usage with larger tool datasets.

Implementation Priority: This sprint establishes the core intelligence of MVP1. The next sprint will expose this functionality through an interactive user interface, completing the hackathon demo.