# Sprint 3 (MVP 1): Implementing the Simplified Planner Agent

## Sprint Overview

- **Goal:** Implement the `SimplePlannerAgent` logic that orchestrates the `EmbeddingService` and `InMemoryKG` to take a user query, process it, and return a list of suggested `MCPTool` objects. *This sprint focuses on the agent's internal logic; UI integration comes in Sprint 4.*
- **Duration:** Estimated 2-4 hours (flexible within Hackathon Day 1 or early Day 2).
- **Core Primitives Focused On:** Tool (being suggested by the agent).
- **Key Artifacts by End of Sprint:**
  - `agents/planner.py`: `SimplePlannerAgent` class fully implemented and tested.
  - Unit tests for `SimplePlannerAgent`.
  - All code linted, formatted, type-checked, and passing CI.

---

## Task List

### Task 3.1: Implement `SimplePlannerAgent` Core Logic

- **Status:** Todo
- **Parent MVP:** MVP 1
- **Parent Sprint (MVP 1):** Sprint 3
- **Description:** In `agents/planner.py`, fully implement the `SimplePlannerAgent` class.
  - The `__init__` method should accept instances of `InMemoryKG` and `EmbeddingService`.
  - The `suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]` method should:
    1. Handle an empty or whitespace-only `user_query` by returning an empty list.
    2. Call `self.embedder.get_embedding(user_query)` to get the query embedding.
    3. If embedding fails (returns `None`), log a warning and return an empty list.
    4. Call `self.kg.find_similar_tools(query_embedding, top_k=top_k)` to get relevant `tool_id`s.
    5. For each `tool_id` returned, call `self.kg.get_tool_by_id(tool_id)` to retrieve the `MCPTool` object.
    6. Collect valid `MCPTool` objects and return them.
    7. Add print statements or basic logging for observability during development.
- **Acceptance Criteria:**
  1. `SimplePlannerAgent` class and `suggest_tools` method are fully implemented.
  2. Method correctly utilizes injected `InMemoryKG` and `EmbeddingService`.
  3. Handles edge cases like empty queries or embedding failures gracefully.
  4. Unit tests (using mocks for dependencies) pass.
- **TDD Approach:** In `tests/agents/test_planner.py` (create this file and `tests/agents/__init__.py`):
  - `test_planner_init`: Test constructor.
  - `test_suggest_tools_empty_query`: Assert returns `[]`.
  - `test_suggest_tools_embedding_failure`: Mock `EmbeddingService.get_embedding` to return `None`. Assert returns `[]` and logs a warning.
  - `test_suggest_tools_no_similar_tools_found`: Mock `InMemoryKG.find_similar_tools` to return `[]`. Assert `suggest_tools` returns `[]`.
  - `test_suggest_tools_success`:
    - Mock `EmbeddingService.get_embedding` to return a dummy query vector.
    - Mock `InMemoryKG.find_similar_tools` to return a list of dummy tool IDs (e.g., `["tool1", "tool2"]`).
    - Mock `InMemoryKG.get_tool_by_id` to return specific `MCPTool` instances for "tool1" and "tool2", and `None` for any other ID.
    - Call `planner.suggest_tools("some query")` and assert it returns the expected list of `MCPTool` objects.
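
The test cases listed above could be sketched roughly as follows with `unittest.mock.MagicMock`. This is a minimal sketch, not the project's actual test file: `MCPTool` and `SimplePlannerAgent` are simplified stand-ins so the block runs on its own (the real tests would import them from `kg_services.ontology` and `agents.planner`), the `MCPTool` field names are assumptions, and `make_planner` is a hypothetical helper.

```python
# Sketch of tests/agents/test_planner.py using unittest.mock.
from dataclasses import dataclass
from typing import List
from unittest.mock import MagicMock


@dataclass
class MCPTool:  # stand-in; the field names are assumptions
    tool_id: str
    name: str


class SimplePlannerAgent:  # stand-in mirroring the steps listed in Task 3.1
    def __init__(self, kg, embedder):
        self.kg = kg
        self.embedder = embedder

    def suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]:
        if not user_query or not user_query.strip():
            return []
        query_embedding = self.embedder.get_embedding(user_query)
        if query_embedding is None:
            return []
        tool_ids = self.kg.find_similar_tools(query_embedding, top_k=top_k)
        tools = (self.kg.get_tool_by_id(tool_id) for tool_id in tool_ids)
        return [tool for tool in tools if tool is not None]


def make_planner():
    """Hypothetical helper: build a planner whose dependencies are both mocks."""
    kg, embedder = MagicMock(), MagicMock()
    return SimplePlannerAgent(kg, embedder), kg, embedder


def test_suggest_tools_empty_query():
    planner, kg, embedder = make_planner()
    assert planner.suggest_tools("   ") == []
    embedder.get_embedding.assert_not_called()


def test_suggest_tools_embedding_failure():
    planner, kg, embedder = make_planner()
    embedder.get_embedding.return_value = None
    assert planner.suggest_tools("some query") == []
    kg.find_similar_tools.assert_not_called()


def test_suggest_tools_success():
    planner, kg, embedder = make_planner()
    known = {"tool1": MCPTool("tool1", "Summarizer"),
             "tool2": MCPTool("tool2", "Translator")}
    embedder.get_embedding.return_value = [0.1, 0.2, 0.3]
    kg.find_similar_tools.return_value = ["tool1", "missing", "tool2"]
    kg.get_tool_by_id.side_effect = known.get  # yields None for unknown IDs

    result = planner.suggest_tools("some query", top_k=3)

    assert result == [known["tool1"], known["tool2"]]
    kg.find_similar_tools.assert_called_once_with([0.1, 0.2, 0.3], top_k=3)
```

Using `side_effect = known.get` gives the "specific instances for known IDs, `None` otherwise" behavior in one line, and the `assert_not_called` checks confirm that the early-return paths never reach the downstream dependency.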

### Task 3.2: Integration Point for Main Application (`app.py`)

- **Status:** Todo
- **Parent MVP:** MVP 1
- **Parent Sprint (MVP 1):** Sprint 3
- **Description:** In `app.py` (where the Gradio app will live), add the initialization logic for `InMemoryKG`, `EmbeddingService`, and `SimplePlannerAgent`. Place this logic at module level so it runs once when the Gradio app starts.
  - Load tools into the KG: `knowledge_graph.load_tools_from_json("data/initial_tools.json")`.
  - Build the vector index: `knowledge_graph.build_vector_index(embedding_service)`.
  - Pass the initialized `knowledge_graph` and `embedding_service` to the `SimplePlannerAgent` constructor.
  - Handle potential errors during this initialization (e.g., API key not set, data file not found) by printing clear error messages.
- **Acceptance Criteria:**
  1. `app.py` contains the global initialization block for KG, Embedder, and Planner.
  2. Initialization sequence is correct (load data, then build index).
  3. Basic error handling for initialization is present.
  4. The app can be run (`python app.py`) without crashing during this setup phase (Gradio UI itself is not built yet, but initialization should complete).
- **TDD Approach:** This is primarily setup code. Manual testing by running `python app.py` and checking console output is key. Unit tests for the individual components (KG, Embedder) should already cover their internal logic.

### Task 3.3: Update Dependencies & Run All Checks

- **Status:** Todo
- **Parent MVP:** MVP 1
- **Parent Sprint (MVP 1):** Sprint 3
- **Description:**
  1. Review `requirements.txt` and `requirements-dev.txt` for any new additions (e.g., `python-dotenv` if used for local API key loading).
  2. Regenerate `requirements.lock`: `uv pip compile requirements.txt requirements-dev.txt --all-extras -o requirements.lock`.
  3. Run `just install` (or `uv pip sync requirements.lock`).
  4. Run `just lint`, `just format`, `just type-check`, `just test`.
  5. Commit all changes.
  6. Push to GitHub and verify CI pipeline passes.
- **Acceptance Criteria:**
  1. `requirements.lock` is updated.
  2. All `just` checks pass locally.
  3. Code committed and pushed.
  4. GitHub Actions CI pipeline passes for the sprint's commits.

---

## Implementation Guidance

### Task 3.1 Implementation Details

```python
# In agents/planner.py
from typing import List
from kg_services.ontology import MCPTool
from kg_services.knowledge_graph import InMemoryKG
from kg_services.embedder import EmbeddingService


class SimplePlannerAgent:
    """
    A simplified planner agent that suggests tools based on user queries
    using semantic similarity search.
    """
    
    def __init__(self, kg: InMemoryKG, embedder: EmbeddingService):
        """Initialize the planner with knowledge graph and embedding service."""
        self.kg = kg
        self.embedder = embedder
    
    def suggest_tools(self, user_query: str, top_k: int = 3) -> List[MCPTool]:
        """
        Suggest relevant tools based on user query using semantic similarity.
        
        Args:
            user_query: Natural language query from user
            top_k: Maximum number of tools to suggest
            
        Returns:
            List of relevant MCPTool objects, ordered by relevance
        """
        # Handle empty or whitespace-only queries
        if not user_query or not user_query.strip():
            print("Warning: Empty or whitespace-only query provided")
            return []
        
        # Get embedding for the user query
        query_embedding = self.embedder.get_embedding(user_query)
        if query_embedding is None:
            print(f"Warning: Could not generate embedding for query: {user_query}")
            return []
        
        # Find similar tools using the knowledge graph
        similar_tool_ids = self.kg.find_similar_tools(query_embedding, top_k=top_k)
        
        # Retrieve actual MCPTool objects
        suggested_tools: List[MCPTool] = []
        for tool_id in similar_tool_ids:
            tool = self.kg.get_tool_by_id(tool_id)
            if tool is not None:
                suggested_tools.append(tool)
        
        print(f"Planner suggested tools: {[t.name for t in suggested_tools]} for query: '{user_query}'")
        return suggested_tools
```
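
The task description asks for a logged warning on embedding failure, while the listing above reports through `print`. As one possible alternative, the same flow could report through the standard `logging` module, which makes the "logs a warning" test assertion straightforward. This is a sketch under assumptions: `LoggingPlannerAgent`, the `Embedder` and `ToolStore` protocols, and the logger name are hypothetical and only mirror the method names used in this sprint.

```python
import logging
from typing import Any, List, Optional, Protocol, Sequence

logger = logging.getLogger("agents.planner")  # logger name is an assumption


class Embedder(Protocol):
    """Structural type for the one embedder method the planner calls."""
    def get_embedding(self, text: str) -> Optional[List[float]]: ...


class ToolStore(Protocol):
    """Structural type for the two knowledge-graph methods the planner calls."""
    def find_similar_tools(self, query_embedding: Sequence[float], top_k: int = 3) -> List[str]: ...
    def get_tool_by_id(self, tool_id: str) -> Optional[Any]: ...


class LoggingPlannerAgent:
    """Variant of SimplePlannerAgent that reports through logging, not print."""

    def __init__(self, kg: ToolStore, embedder: Embedder) -> None:
        self.kg = kg
        self.embedder = embedder

    def suggest_tools(self, user_query: str, top_k: int = 3) -> List[Any]:
        if not user_query or not user_query.strip():
            logger.warning("Empty or whitespace-only query provided")
            return []
        query_embedding = self.embedder.get_embedding(user_query)
        if query_embedding is None:
            logger.warning("Could not generate embedding for query: %r", user_query)
            return []
        tool_ids = self.kg.find_similar_tools(query_embedding, top_k=top_k)
        tools = [self.kg.get_tool_by_id(tool_id) for tool_id in tool_ids]
        found = [tool for tool in tools if tool is not None]
        logger.info("Suggested %d tool(s) for query %r", len(found), user_query)
        return found
```

With this variant, a pytest test can check the warning through the `caplog` fixture, and a `unittest.TestCase` can use `assertLogs`; with `print`, the equivalent check would go through pytest's `capsys` fixture instead.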

### Task 3.2 Implementation Details

```python
# In app.py - Global Service Initialization Section
print("Attempting to initialize KGraph-MCP services...")

embedding_service_instance = None
knowledge_graph_instance = None
planner_agent_instance = None

try:
    # For local dev, ensure .env is loaded if API keys are there
    # from dotenv import load_dotenv
    # load_dotenv() # Add python-dotenv to requirements-dev.txt if needed

    from kg_services.embedder import EmbeddingService
    from kg_services.knowledge_graph import InMemoryKG
    from agents.planner import SimplePlannerAgent

    embedding_service_instance = EmbeddingService()
    knowledge_graph_instance = InMemoryKG()
    
    # Load initial tools data
    print("Loading tools from data/initial_tools.json...")
    knowledge_graph_instance.load_tools_from_json("data/initial_tools.json")
    
    # Build vector index (this will make actual API calls)
    print("Building vector index (may take a moment for first run)...")
    knowledge_graph_instance.build_vector_index(embedding_service_instance)
    print("Vector index built successfully.")
    
    # Initialize the planner agent
    planner_agent_instance = SimplePlannerAgent(knowledge_graph_instance, embedding_service_instance)
    print("KGraph-MCP services initialized successfully.")

except FileNotFoundError as e:
    print(f"FATAL: Data file not found: {e}")
    print("Please ensure data/initial_tools.json exists.")
except Exception as e:
    print(f"FATAL: Error during service initialization: {e}")
    print("The application may not function correctly.")
    # In a real app, you might want to prevent Gradio from launching or show an error state.
```
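
Because the `except` branches above only print and continue, `planner_agent_instance` stays `None` after a failed startup, so any later caller has to check for that. One way to surface an error state rather than crash is sketched below. `handle_user_query` and its status messages are hypothetical and not part of the sprint plan; the sketch assumes only that the planner exposes `suggest_tools(query, top_k=...)`.

```python
from typing import Any, List, Optional, Tuple


def handle_user_query(
    planner: Optional[Any], query: str, top_k: int = 3
) -> Tuple[List[Any], str]:
    """Return (tools, status message) without raising when the planner is missing.

    Hypothetical helper: a UI callback could call this and display the message.
    """
    if planner is None:
        # Startup initialization failed, so report that instead of crashing.
        return [], "Planner unavailable: service initialization failed at startup."
    tools = planner.suggest_tools(query, top_k=top_k)
    if not tools:
        return [], "No matching tools found."
    return tools, f"Found {len(tools)} tool(s)."
```

A caller would then write something like `tools, status = handle_user_query(planner_agent_instance, "summarize this text")` and show `status` to the user in either case.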

### Task 3.3 Implementation Details

This task is primarily operational:

1. **Dependency Review:**
   ```bash
   # Check if python-dotenv is needed
   grep -r "dotenv" . --include="*.py"
   # If found, add to requirements-dev.txt
   ```

2. **Quality Checks:**
   ```bash
   # Update dependencies
   uv pip compile requirements.txt requirements-dev.txt --all-extras -o requirements.lock
   uv pip sync requirements.lock
   
   # Run all quality checks
   just lint
   just format
   just type-check
   just test
   ```

3. **Commit and Push:**
   ```bash
   # Stage all changes
   git add -A
   
   # Use conventional commit
   python scripts/smart_commit.py "implement SimplePlannerAgent and app integration"
   
   # Push to trigger CI
   git push origin main
   ```

---

## End of Sprint 3 Review

### **What's Done:**
- The `SimplePlannerAgent` is implemented, capable of taking a user query and using the `EmbeddingService` and `InMemoryKG` to produce a list of relevant `MCPTool` objects.
- The main application (`app.py`) now initializes all backend services on startup.
- Unit tests cover the planner agent's logic using mocks.
- The backend logic for the "KG-Powered Tool Suggester" is now complete.

### **What's Next (Sprint 4):**
- Implement the Gradio UI in `app.py` to take user input and display the suggestions from the `SimplePlannerAgent`.
- Create an interactive demo interface for the hackathon.
- Add visualization components for the knowledge graph.

### **Key Benefits Achieved:**
1. **Separation of Concerns:** Agent logic is decoupled from UI
2. **Testability:** Comprehensive unit tests with mocked dependencies
3. **Robustness:** Proper error handling for edge cases
4. **Integration Ready:** Clean interfaces for UI integration in Sprint 4

### **Potential Blockers/Issues:**
- API key configuration for embedding service
- Performance of vector index building with real API calls
- Memory usage with larger tool datasets

---

**Implementation Priority:** This sprint establishes the core intelligence of MVP 1. The next sprint will expose this functionality through an interactive user interface, completing the hackathon demo.