	# Sprint 3 (MVP 1): Implementing the Simplified Planner Agent

	## Sprint Overview

	- Goal: Implement the `SimplePlannerAgent` logic that orchestrates the `EmbeddingService` and `InMemoryKG` to take a user query, process it, and return a list of suggested `MCPTool` objects. This sprint focuses on the agent's internal logic; UI integration comes in Sprint 4.
	- Duration: Estimated 2-4 hours (flexible within Hackathon Day 1 or early Day 2).
	- Core Primitives Focused On: Tool (being suggested by the agent).
	- Key Artifacts by End of Sprint:
	    - `agents/planner.py`: `SimplePlannerAgent` class fully implemented and tested.
	    - Unit tests for `SimplePlannerAgent`.
	    - All code linted, formatted, type-checked, and passing CI.

	---

	## Task List

	### Task 3.1: Implement `SimplePlannerAgent` Core Logic

	- Status: Todo
	- Parent MVP: MVP 1
	- Parent Sprint (MVP 1): Sprint 3
	- Description: In `agents/planner.py`, fully implement the `SimplePlannerAgent` class.
	    - The `__init__` method should accept instances of `InMemoryKG` and `EmbeddingService`.
	    - The `suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]` method should:
	        1. Handle empty or whitespace-only `user_query` (return empty list).
	        2. Call `self.embedder.get_embedding(user_query)` to get the query embedding.
	        3. If embedding fails (returns `None`), log a warning and return an empty list.
	        4. Call `self.kg.find_similar_tools(query_embedding, top_k=top_k)` to get relevant `tool_id`s.
	        5. For each `tool_id` returned, call `self.kg.get_tool_by_id(tool_id)` to retrieve the `MCPTool` object.
	        6. Collect valid `MCPTool` objects and return them.
	        7. Add print statements or basic logging for observability during development.
	- Acceptance Criteria:
	    1. `SimplePlannerAgent` class and `suggest_tools` method are fully implemented.
	    2. Method correctly utilizes injected `InMemoryKG` and `EmbeddingService`.
	    3. Handles edge cases like empty queries or embedding failures gracefully.
	    4. Unit tests (using mocks for dependencies) pass.
	- TDD Approach: In `tests/agents/test_planner.py` (create this file and `tests/agents/__init__.py`):
	    - `test_planner_init`: Test constructor.
	    - `test_suggest_tools_empty_query`: Assert returns `[]`.
	    - `test_suggest_tools_embedding_failure`: Mock `EmbeddingService.get_embedding` to return `None`. Assert returns `[]` and logs a warning.
	    - `test_suggest_tools_no_similar_tools_found`: Mock `InMemoryKG.find_similar_tools` to return `[]`. Assert `suggest_tools` returns `[]`.
	    - `test_suggest_tools_success`:
	        - Mock `EmbeddingService.get_embedding` to return a dummy query vector.
	        - Mock `InMemoryKG.find_similar_tools` to return a list of dummy tool IDs (e.g., `["tool1", "tool2"]`).
	        - Mock `InMemoryKG.get_tool_by_id` to return specific `MCPTool` instances for "tool1" and "tool2", and `None` for any other ID.
	        - Call `planner.suggest_tools("some query")` and assert it returns the expected list of `MCPTool` objects.

	### Task 3.2: Integration Point for Main Application (`app.py`)

	- Status: Todo
	- Parent MVP: MVP 1
	- Parent Sprint (MVP 1): Sprint 3
	- Description: In `app.py` (where the Gradio app will be), add the initialization logic for `InMemoryKG`, `EmbeddingService`, and `SimplePlannerAgent`. This logic should be placed globally so it runs once when the Gradio app starts.
	    - Load tools into the KG: `knowledge_graph.load_tools_from_json("data/initial_tools.json")`.
	    - Build the vector index: `knowledge_graph.build_vector_index(embedding_service)`.
	    - Pass the initialized `knowledge_graph` and `embedding_service` to the `SimplePlannerAgent` constructor.
	    - Handle potential errors during this initialization (e.g., API key not set, data file not found) by printing clear error messages.
	- Acceptance Criteria:
	    1. `app.py` contains the global initialization block for KG, Embedder, and Planner.
	    2. Initialization sequence is correct (load data, then build index).
	    3. Basic error handling for initialization is present.
	    4. The app can be run (`python app.py`) without crashing during this setup phase (Gradio UI itself is not built yet, but initialization should complete).
	- TDD Approach: This is primarily setup code. Manual testing by running `python app.py` and checking console output is key. Unit tests for the individual components (KG, Embedder) should already cover their internal logic.
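
Because building the vector index calls the real embedding API, the manual `python app.py` check can be slow or fail without an API key. One option is an offline stand-in, sketched below. `FakeEmbeddingService` and `cosine_similarity` are hypothetical names that are not part of the project. The sketch assumes the real service only needs a `get_embedding(text)` method that returns a list of floats or `None`. The vectors are hashed bag-of-words counts, so they show word overlap rather than meaning.

```python
import hashlib
import math
from typing import List, Optional


class FakeEmbeddingService:
    """Hypothetical offline stand-in; NOT the project's EmbeddingService."""

    def __init__(self, dimensions: int = 64) -> None:
        self.dimensions = dimensions

    def get_embedding(self, text: str) -> Optional[List[float]]:
        """Return a deterministic unit-length vector, or None for blank text."""
        tokens = text.lower().split()
        if not tokens:
            return None
        vector = [0.0] * self.dimensions
        for token in tokens:
            # Hash each token into a bucket so the result is stable across runs.
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vector[int.from_bytes(digest[:4], "big") % self.dimensions] += 1.0
        norm = math.sqrt(sum(value * value for value in vector))
        return [value / norm for value in vector]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Dot product of two unit-length vectors, which equals their cosine."""
    return sum(x * y for x, y in zip(a, b))
```

Passing an instance of this class wherever `EmbeddingService()` is constructed would let the load-then-index sequence run without network access, provided `build_vector_index` relies only on `get_embedding`.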

	### Task 3.3: Update Dependencies & Run All Checks

	- Status: Todo
	- Parent MVP: MVP 1
	- Parent Sprint (MVP 1): Sprint 3
	- Description:
	    1. Review `requirements.txt` and `requirements-dev.txt` for any new additions (e.g., `python-dotenv` if used for local API key loading).
	    2. Regenerate `requirements.lock`: `uv pip compile requirements.txt requirements-dev.txt --all-extras -o requirements.lock`.
	    3. Run `just install` (or `uv pip sync requirements.lock`).
	    4. Run `just lint`, `just format`, `just type-check`, `just test`.
	    5. Commit all changes.
	    6. Push to GitHub and verify CI pipeline passes.
	- Acceptance Criteria:
	    1. `requirements.lock` is updated.
	    2. All `just` checks pass locally.
	    3. Code committed and pushed.
	    4. GitHub Actions CI pipeline passes for the sprint's commits.

	---

	## Implementation Guidance

	### Task 3.1 Implementation Details

	```python
	# In agents/planner.py
	import logging
	from typing import List

	from kg_services.embedder import EmbeddingService
	from kg_services.knowledge_graph import InMemoryKG
	from kg_services.ontology import MCPTool

	# Module-level logger so tests can assert on emitted warnings
	logger = logging.getLogger(__name__)


	class SimplePlannerAgent:
	    """
	    A simplified planner agent that suggests tools based on user queries
	    using semantic similarity search.
	    """

	    def __init__(self, kg: InMemoryKG, embedder: EmbeddingService):
	        """Initialize the planner with knowledge graph and embedding service."""
	        self.kg = kg
	        self.embedder = embedder

	    def suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]:
	        """
	        Suggest relevant tools based on user query using semantic similarity.

	        Args:
	            user_query: Natural language query from user
	            top_k: Maximum number of tools to suggest

	        Returns:
	            List of relevant MCPTool objects, ordered by relevance
	        """
	        # Handle empty or whitespace-only queries
	        if not user_query or not user_query.strip():
	            logger.warning("Empty or whitespace-only query provided")
	            return []

	        # Get embedding for the user query
	        query_embedding = self.embedder.get_embedding(user_query)
	        if query_embedding is None:
	            logger.warning("Could not generate embedding for query: %s", user_query)
	            return []

	        # Find similar tools using the knowledge graph
	        similar_tool_ids = self.kg.find_similar_tools(query_embedding, top_k=top_k)

	        # Retrieve actual MCPTool objects, skipping IDs the KG cannot resolve
	        suggested_tools: List[MCPTool] = []
	        for tool_id in similar_tool_ids:
	            tool = self.kg.get_tool_by_id(tool_id)
	            if tool is not None:
	                suggested_tools.append(tool)

	        logger.info(
	            "Planner suggested tools: %s for query: '%s'",
	            [t.name for t in suggested_tools],
	            user_query,
	        )
	        return suggested_tools
	```
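
The TDD cases listed under Task 3.1 could be sketched roughly as follows. To keep the snippet runnable on its own, it defines a compact stand-in for `SimplePlannerAgent` and a hypothetical two-field `MCPTool`. In the real `tests/agents/test_planner.py` both would be imported from the project instead, and the logger name `planner_sketch` is likewise an assumption. It uses `unittest` so that `assertLogs` can check the warning, and pytest can also collect `unittest.TestCase` classes.

```python
import logging
import unittest
from dataclasses import dataclass
from typing import List
from unittest.mock import MagicMock

logger = logging.getLogger("planner_sketch")  # assumed logger name


@dataclass
class MCPTool:
    """Hypothetical stand-in; the real class lives in kg_services.ontology."""
    tool_id: str
    name: str


class SimplePlannerAgent:
    """Compact stand-in that follows the steps listed in Task 3.1."""

    def __init__(self, kg, embedder):
        self.kg = kg
        self.embedder = embedder

    def suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]:
        if not user_query or not user_query.strip():
            return []
        query_embedding = self.embedder.get_embedding(user_query)
        if query_embedding is None:
            logger.warning("Could not generate embedding for query: %s", user_query)
            return []
        tool_ids = self.kg.find_similar_tools(query_embedding, top_k=top_k)
        tools = (self.kg.get_tool_by_id(tool_id) for tool_id in tool_ids)
        return [tool for tool in tools if tool is not None]


class TestSimplePlannerAgent(unittest.TestCase):
    def setUp(self):
        # Both dependencies are mocks, so no API calls or data files are needed.
        self.kg = MagicMock()
        self.embedder = MagicMock()
        self.planner = SimplePlannerAgent(self.kg, self.embedder)

    def test_planner_init(self):
        self.assertIs(self.planner.kg, self.kg)
        self.assertIs(self.planner.embedder, self.embedder)

    def test_suggest_tools_empty_query(self):
        self.assertEqual(self.planner.suggest_tools("   "), [])
        self.embedder.get_embedding.assert_not_called()

    def test_suggest_tools_embedding_failure(self):
        self.embedder.get_embedding.return_value = None
        with self.assertLogs("planner_sketch", level="WARNING"):
            self.assertEqual(self.planner.suggest_tools("some query"), [])
        self.kg.find_similar_tools.assert_not_called()

    def test_suggest_tools_no_similar_tools_found(self):
        self.embedder.get_embedding.return_value = [0.1, 0.2]
        self.kg.find_similar_tools.return_value = []
        self.assertEqual(self.planner.suggest_tools("some query"), [])

    def test_suggest_tools_success(self):
        tools = {
            "tool1": MCPTool("tool1", "Tool One"),
            "tool2": MCPTool("tool2", "Tool Two"),
        }
        self.embedder.get_embedding.return_value = [0.1, 0.2]
        self.kg.find_similar_tools.return_value = ["tool1", "missing", "tool2"]
        self.kg.get_tool_by_id.side_effect = tools.get  # None for unknown IDs
        result = self.planner.suggest_tools("some query")
        self.assertEqual(result, [tools["tool1"], tools["tool2"]])
        self.kg.find_similar_tools.assert_called_once_with([0.1, 0.2], top_k=3)
```

The success case deliberately includes an ID the KG cannot resolve, which exercises the "collect valid `MCPTool` objects" step rather than only the happy path.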

	### Task 3.2 Implementation Details

	```python
	# In app.py - Global Service Initialization Section
	print("Attempting to initialize KGraph-MCP services...")

	embedding_service_instance = None
	knowledge_graph_instance = None
	planner_agent_instance = None

	try:
	    # For local dev, ensure .env is loaded if API keys are there
	    # from dotenv import load_dotenv
	    # load_dotenv() # Add python-dotenv to requirements-dev.txt if needed

	    from kg_services.embedder import EmbeddingService
	    from kg_services.knowledge_graph import InMemoryKG
	    from agents.planner import SimplePlannerAgent

	    embedding_service_instance = EmbeddingService()
	    knowledge_graph_instance = InMemoryKG()

	    # Load initial tools data
	    print("Loading tools from data/initial_tools.json...")
	    knowledge_graph_instance.load_tools_from_json("data/initial_tools.json")

	    # Build vector index (this will make actual API calls)
	    print("Building vector index (may take a moment for first run)...")
	    knowledge_graph_instance.build_vector_index(embedding_service_instance)
	    print("Vector index built successfully.")

	    # Initialize the planner agent
	    planner_agent_instance = SimplePlannerAgent(knowledge_graph_instance, embedding_service_instance)
	    print("KGraph-MCP services initialized successfully.")

	except FileNotFoundError as e:
	    print(f"FATAL: Data file not found: {e}")
	    print("Please ensure data/initial_tools.json exists.")
	except Exception as e:
	    print(f"FATAL: Error during service initialization: {e}")
	    print("The application may not function correctly.")
	    # In a real app, you might want to prevent Gradio from launching or show an error state.
	```
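
When initialization fails, `planner_agent_instance` stays `None`, so any later caller needs to check for that. A minimal sketch of such a guard follows. The function name `suggest_or_explain` and the wording of the message are assumptions for illustration only. It relies solely on `suggest_tools` and on the `name` attribute of `MCPTool`, which the planner code above already uses.

```python
from typing import Any, List, Optional


def suggest_or_explain(planner: Optional[Any], user_query: str, top_k: int = 3) -> List[str]:
    """Return suggested tool names, or one explanatory message if startup failed."""
    if planner is None:
        # Initialization failed earlier; report an error state instead of crashing.
        return ["Tool suggestions are unavailable: service initialization failed."]
    tools = planner.suggest_tools(user_query, top_k=top_k)
    return [tool.name for tool in tools]
```

A UI callback could then call `suggest_or_explain(planner_agent_instance, query)` without caring whether startup succeeded.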

	### Task 3.3 Implementation Details

	This task is primarily operational:

	1. Dependency Review:
	```bash
	# Check if python-dotenv is needed
	grep -r "dotenv" . --include="*.py"
	# If found, add to requirements-dev.txt
	```

	2. Quality Checks:
	```bash
	# Update dependencies
	uv pip compile requirements.txt requirements-dev.txt --all-extras -o requirements.lock
	uv pip sync requirements.lock

	# Run all quality checks
	just lint
	just format
	just type-check
	just test
	```

	3. Commit and Push:
	```bash
	# Stage all changes
	git add -A

	# Use conventional commit
	python scripts/smart_commit.py "implement SimplePlannerAgent and app integration"

	# Push to trigger CI
	git push origin main
	```

	---

	## End of Sprint 3 Review

	### What's Done:
	- The `SimplePlannerAgent` is implemented, capable of taking a user query and using the `EmbeddingService` and `InMemoryKG` to produce a list of relevant `MCPTool` objects.
	- The main application (`app.py`) now initializes all backend services on startup.
	- Unit tests cover the planner agent's logic using mocks.
	- The backend logic for the "KG-Powered Tool Suggester" is now complete.

	### What's Next (Sprint 4):
	- Implement the Gradio UI in `app.py` to take user input and display the suggestions from the `SimplePlannerAgent`.
	- Create an interactive demo interface for the hackathon.
	- Add visualization components for the knowledge graph.

	### Key Benefits Achieved:
	1. Separation of Concerns: Agent logic is decoupled from UI
	2. Testability: Comprehensive unit tests with mocked dependencies
	3. Robustness: Proper error handling for edge cases
	4. Integration Ready: Clean interfaces for UI integration in Sprint 4

	### Potential Blockers/Issues:
	- API key configuration for embedding service
	- Performance of vector index building with real API calls
	- Memory usage with larger tool datasets
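
One possible mitigation for the index-building cost is to cache embeddings on disk so repeated startups do not call the API again for unchanged tool descriptions. The sketch below is an assumption, not part of the planned design. `CachedEmbedder` is a hypothetical wrapper, and it presumes that `build_vector_index` only needs an object with a `get_embedding(text)` method.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List, Optional


class CachedEmbedder:
    """Hypothetical wrapper that memoizes embeddings in a JSON file."""

    def __init__(
        self,
        embed: Callable[[str], Optional[List[float]]],
        cache_path: Path,
    ) -> None:
        self._embed = embed
        self._cache_path = cache_path
        self._cache: Dict[str, List[float]] = {}
        if cache_path.exists():
            self._cache = json.loads(cache_path.read_text(encoding="utf-8"))

    def get_embedding(self, text: str) -> Optional[List[float]]:
        """Return a cached vector if present; otherwise embed and store it."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._cache:
            return self._cache[key]
        vector = self._embed(text)
        if vector is not None:  # failures are never cached
            self._cache[key] = vector
            self._cache_path.parent.mkdir(parents=True, exist_ok=True)
            # Rewrites the whole file on each miss; fine for a small tool set.
            self._cache_path.write_text(json.dumps(self._cache), encoding="utf-8")
        return vector
```

The wrapper would be built from the real service's `get_embedding` method plus a cache file path of your choosing. The cache file should be deleted whenever the embedding model changes, since the key covers only the input text.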

	---

	Implementation Priority: This sprint establishes the core intelligence of MVP 1. The next sprint will expose this functionality through an interactive user interface, completing the hackathon demo.