πŸ† Multi-Track Hackathon Submission
Sprint 2 (MVP 1): Real Embeddings & Semantic Search Logic

Sprint Overview

  • Goal: Integrate live LLM API calls for generating embeddings, build the vector index within the InMemoryKG, and implement the core semantic search functionality. Still no UI; the focus remains on backend KG capabilities.
  • Duration: Estimated 3-5 hours (flexible within Hackathon Day 1, following Sprint 1).
  • Core Primitives Focused On: Tool (its description being embedded and searched).
  • Key Artifacts by End of Sprint:
    • kg_services/embedder.py: EmbeddingService.get_embedding method now makes live API calls.
    • kg_services/knowledge_graph.py: InMemoryKG.build_vector_index now uses real embeddings, and InMemoryKG.find_similar_tools performs actual cosine similarity search.
    • Updated unit tests, potentially including tests that mock the LLM API calls.
    • Updated requirements.txt (if new LLM client libraries were added) and requirements.lock.
    • All code linted, formatted, type-checked, and passing CI.

Task List

Task 2.1: Implement Live LLM API Call in EmbeddingService

  • Status: Todo
  • Parent MVP: MVP 1
  • Parent Sprint (MVP 1): Sprint 2
  • Description: Modify kg_services/embedder.py's EmbeddingService.get_embedding method to make actual API calls to your chosen LLM provider (OpenAI, Anthropic, or Azure OpenAI) to generate text embeddings.
    • Ensure API keys are handled securely via environment variables (e.g., loaded using python-dotenv for local dev, and set as secrets in GitHub Actions/Hugging Face Spaces).
    • Add necessary LLM client libraries (e.g., openai, anthropic) to requirements.txt if not already there.
  • Acceptance Criteria:
    1. get_embedding method successfully calls the chosen LLM API and returns a valid embedding vector (list of floats).
    2. Handles potential API errors gracefully (e.g., logs an error and returns None or raises a custom exception).
    3. requirements.txt updated with LLM client library.
    4. Unit tests (with API mocking) pass.
  • TDD Approach: In tests/kg_services/test_embedder.py, refactor/add tests:
    • test_get_embedding_live_success: Mocks the LLM client's create (or equivalent) method to return a sample successful embedding response. Verifies the method processes this correctly.
    • test_get_embedding_api_error: Mocks the LLM client to raise an API error. Verifies get_embedding handles this gracefully.
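The mocking strategy for both tests can be sketched with `unittest.mock.MagicMock`; this assumes an OpenAI-style response shape (`.data[0].embedding`) and illustrative model names — adapt for your provider:

```python
# Sketch of the mock setup for test_get_embedding_live_success /
# test_get_embedding_api_error. Response shape assumed OpenAI-style.
from unittest.mock import MagicMock

mock_client = MagicMock()
mock_item = MagicMock()
mock_item.embedding = [0.1, 0.2, 0.3]  # the fake embedding the test controls
mock_client.embeddings.create.return_value.data = [mock_item]

# The service would make this call internally; the mock records it.
response = mock_client.embeddings.create(
    model="text-embedding-3-small", input="hello"
)
assert response.data[0].embedding == [0.1, 0.2, 0.3]

# For the error-path test, make the same call raise instead:
mock_client.embeddings.create.side_effect = RuntimeError("simulated API error")
```

Injecting `mock_client` into EmbeddingService (e.g. via a constructor parameter) keeps the tests free of real network calls.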

Task 2.2: Implement Real Vector Index Building in InMemoryKG

  • Status: Todo
  • Parent MVP: MVP 1
  • Parent Sprint (MVP 1): Sprint 2
  • Description: Modify kg_services/knowledge_graph.py's InMemoryKG.build_vector_index method.
    • It should now iterate through the loaded self.tools.
    • For each tool, construct a descriptive text string (e.g., from name, description, tags).
    • Use the (now live) EmbeddingService instance to get a real embedding for this text.
    • Store these real embeddings in self.tool_embeddings and corresponding tool_ids in self.tool_ids_for_vectors.
    • Handle cases where get_embedding might return None (e.g., skip that tool or use a zero vector with a warning).
  • Acceptance Criteria:
    1. build_vector_index populates self.tool_embeddings with actual vectors from the LLM API.
    2. Correctly associates embeddings with tool_ids.
    3. Handles potential embedding failures for individual tools.
    4. Unit tests pass.
  • TDD Approach: In tests/kg_services/test_knowledge_graph.py:
    • test_build_vector_index_with_real_embeddings:
      • Needs a mock EmbeddingService that returns predictable (but distinct) vectors for different inputs.
      • Load sample tools into InMemoryKG.
      • Call build_vector_index with the mock embedder.
      • Assert that self.tool_embeddings contains the expected number of vectors and that they match what the mock embedder would have returned.
      • Assert self.tool_ids_for_vectors is populated correctly.
    • test_build_vector_index_handles_embedding_failure:
      • Mock EmbeddingService.get_embedding to return None for one of the tools.
      • Assert that the index is built for other tools and the failed one is handled (e.g., skipped or has a zero vector).
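One way to get "predictable but distinct" vectors is a hand-rolled test double; `FakeEmbedder` below is a hypothetical sketch (its name and vector scheme are illustrative, not part of the project code):

```python
# Hypothetical deterministic stand-in for EmbeddingService in unit tests.
from typing import List, Optional, Set


class FakeEmbedder:
    """Returns a distinct, repeatable vector per input text."""

    def __init__(self, fail_on: Optional[Set[str]] = None):
        self.fail_on = fail_on or set()  # texts that simulate an API failure
        self.calls: List[str] = []       # recorded inputs, for assertions

    def get_embedding(self, text: str) -> Optional[List[float]]:
        self.calls.append(text)
        if text in self.fail_on:
            return None  # mimics get_embedding's failure contract
        # Derived from the text itself, so different tools get different vectors.
        return [float(len(text)), float(sum(map(ord, text)) % 97)]
```

Pass an instance into build_vector_index for the happy-path test, and construct it with `fail_on={...}` for the failure-handling test.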

Task 2.3: Implement Cosine Similarity Search in InMemoryKG

  • Status: Todo
  • Parent MVP: MVP 1
  • Parent Sprint (MVP 1): Sprint 2
  • Description: Modify kg_services/knowledge_graph.py's InMemoryKG.find_similar_tools method.
    • It should now use numpy to perform cosine similarity calculations between the input query_embedding and each of the real embeddings stored in self.tool_embeddings.
    • Return the tool_ids of the top_k most similar tools.
    • Ensure numpy is in requirements.txt.
  • Acceptance Criteria:
    1. find_similar_tools correctly calculates cosine similarities and returns the top_k tool IDs.
    2. Handles empty self.tool_embeddings case.
    3. numpy is listed in requirements.txt.
    4. Unit tests pass.
  • TDD Approach: In tests/kg_services/test_knowledge_graph.py:
    • test_cosine_similarity_calculation (if _cosine_similarity is a helper, test it directly with known vectors).
    • test_find_similar_tools_with_populated_index:
      • Manually set kg.tool_embeddings and kg.tool_ids_for_vectors with a few known vectors and IDs.
      • Provide a query_embedding that is known to be most similar to one of them.
      • Call find_similar_tools and assert that the correct tool_id(s) are returned in the correct order.
    • test_find_similar_tools_empty_index: Assert it returns an empty list.
    • test_find_similar_tools_top_k_respected: Test with different top_k values.

Task 2.4: Update Dependencies & Run All Checks

  • Status: Todo
  • Parent MVP: MVP 1
  • Parent Sprint (MVP 1): Sprint 2
  • Description:
    1. Ensure requirements.txt includes openai (or anthropic) and numpy.
    2. Ensure requirements-dev.txt includes python-dotenv. unittest.mock is part of the standard library, so it needs no entry; only an external mocking library (e.g. pytest-mock) would.
    3. Regenerate requirements.lock: uv pip compile requirements.txt requirements-dev.txt --all-extras -o requirements.lock.
    4. Run just install (or uv pip sync requirements.lock).
    5. Run just lint, just format, just type-check, just test.
    6. Commit all changes.
    7. Push to GitHub and verify CI pipeline passes. Note: Live API calls in CI for tests are usually avoided. Ensure your tests for EmbeddingService use mocks. The build_vector_index tests should also use a mocked embedder.
  • Acceptance Criteria:
    1. requirements.lock is updated.
    2. All just checks pass locally.
    3. Code committed and pushed.
    4. GitHub Actions CI pipeline passes for the sprint's commits (leveraging mocks for API calls).

End of Sprint 2 Review

  • What's Done:
    • EmbeddingService can now generate real embeddings using an LLM API.
    • InMemoryKG can build a vector index using these real embeddings.
    • InMemoryKG can perform semantic search (cosine similarity) over the indexed tools.
    • Unit tests cover the new functionalities, using mocks for external API calls.
    • The backend logic for tool suggestion based on semantic similarity is complete.
  • What's Next (Sprint 3):
    • Implement the SimplePlannerAgent logic that ties together the EmbeddingService and InMemoryKG to process a user query and suggest tools.
  • Blockers/Issues:
    • API key setup and management (ensure it's smooth for local dev and CI doesn't expose keys).
    • Potential rate limits or costs if generating many embeddings for testing (though for 3-5 tools, this should be minimal).

Implementation Guidance

Task 2.1 Implementation Details

# In kg_services/embedder.py
# Refactor the EmbeddingService class:
# - In __init__:
#     - Initialize the appropriate LLM client (OpenAI, AzureOpenAI, or Anthropic).
#     - Read API keys and any necessary endpoint/deployment information from environment variables.
#     - Add python-dotenv to requirements-dev.txt and load .env in __init__ if a .env file exists.
# - In get_embedding(self, text: str) -> Optional[List[float]]:
#     - Replace the placeholder logic with an actual API call to the embedding endpoint.
#     - Include error handling (try-except block) for API calls.
#     - Ensure the text is preprocessed if necessary.

Task 2.2 Implementation Details

# In kg_services/knowledge_graph.py
# Refactor InMemoryKG.build_vector_index(self, embedder: EmbeddingService):
# - Clear self.tool_embeddings and self.tool_ids_for_vectors at the start.
# - Iterate through self.tools.items().
# - For each tool_id, tool:
#     - Construct a meaningful text: f"{tool.name} - {tool.description} Tags: {', '.join(tool.tags)}"
#     - Call embedding = embedder.get_embedding(text_to_embed)
#     - If embedding is not None and is valid:
#         - Append embedding to self.tool_embeddings
#         - Append tool_id to self.tool_ids_for_vectors
#     - Else (embedding failed):
#         - Log a warning
#         - Optionally, append a zero vector and the tool_id
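Put together, the method might look like the sketch below. `Tool` here is a minimal stand-in dataclass (the real primitive lives elsewhere in the project), and the embedder is any object exposing `get_embedding`:

```python
# kg_services/knowledge_graph.py (sketch; Tool is a stand-in dataclass)
import logging
from dataclasses import dataclass, field
from typing import Dict, List

logger = logging.getLogger(__name__)


@dataclass
class Tool:
    name: str
    description: str
    tags: List[str] = field(default_factory=list)


class InMemoryKG:
    def __init__(self) -> None:
        self.tools: Dict[str, Tool] = {}
        self.tool_embeddings: List[List[float]] = []
        self.tool_ids_for_vectors: List[str] = []

    def build_vector_index(self, embedder) -> None:
        self.tool_embeddings.clear()
        self.tool_ids_for_vectors.clear()
        for tool_id, tool in self.tools.items():
            text = f"{tool.name} - {tool.description} Tags: {', '.join(tool.tags)}"
            embedding = embedder.get_embedding(text)
            if embedding is not None:
                self.tool_embeddings.append(embedding)
                self.tool_ids_for_vectors.append(tool_id)
            else:
                # Skip the tool; alternatively append a zero vector so the
                # index stays aligned with all tool_ids.
                logger.warning("Skipping tool %s: embedding failed", tool_id)
```

The skip-with-warning branch keeps the two lists aligned, which find_similar_tools relies on.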

Task 2.3 Implementation Details

# In kg_services/knowledge_graph.py
# Refactor InMemoryKG._cosine_similarity(self, vec1: List[float], vec2: List[float]) -> float:
# - Ensure inputs vec1 and vec2 are converted to np.array.
# - Perform dot product.
# - Calculate norms.
# - Handle potential division by zero if a norm is zero (return 0.0 similarity).
# - Return the cosine similarity.
# Refactor InMemoryKG.find_similar_tools(self, query_embedding: List[float], top_k: int = 3) -> List[str]:
# - If not self.tool_embeddings or not query_embedding, return [].
# - Calculate similarities: Iterate through self.tool_embeddings, calling _cosine_similarity.
# - Create pairs of (similarity_score, tool_id).
# - Sort these pairs in descending order of similarity score.
# - Return the tool_ids from the top top_k pairs.