# Visualization Agent AI agent that generates architectural sketches of disaster-resistant buildings using Google's Gemini image generation API (Nano Banana). ## Overview The Visualization Agent receives risk assessment data and building specifications to create contextual architectural visualizations. It analyzes disaster risks (seismic, volcanic, hydrometeorological) and generates prompts that incorporate appropriate disaster-resistant features, then uses Gemini's image generation API to create visual representations. ## Features - **Risk-Aware Visualization**: Incorporates disaster-resistant features based on risk assessment - **Building Type Support**: Generates appropriate architecture for residential, commercial, institutional, industrial, and infrastructure projects - **Philippine Context**: Includes tropical climate and local architectural considerations - **Fast Generation**: Uses gemini-2.5-flash-image model for quick results (~10-20 seconds) - **Detailed Metadata**: Returns prompt used, features included, and generation timestamp - **Backward Compatibility**: Supports both legacy format (risk_data + building_type) and new format (prompt + construction_data) ## Architecture ### High-Level Architecture ``` Orchestrator Agent ↓ Visualization Agent (FastAPI) ├─→ Request Validator ├─→ Prompt Generator │ ├─→ Hazard Analyzer │ ├─→ Feature Mapper │ └─→ Context Builder ├─→ Gemini API Client │ ├─→ API Request Handler │ ├─→ Error Handler │ └─→ Response Parser └─→ Response Formatter ├─→ Base64 Encoder ├─→ Metadata Generator └─→ Feature List Compiler ↓ Returns VisualizationData to Orchestrator ↓ Gradio UI (displays image) ``` ### Component Details #### 1. VisualizationAgent (Main Class) **Responsibilities**: - Orchestrate the visualization generation process - Coordinate between prompt generator and API client - Handle errors and format responses **Key Methods**: - `generate_visualization()`: Main entry point - `_validate_input()`: Validate request parameters - `_format_response()`: Format final response with metadata #### 2. PromptGenerator **Responsibilities**: - Analyze risk data to identify relevant hazards - Map hazards to visual features - Generate descriptive prompts for Gemini API - Add Philippine architectural context **Key Methods**: - `generate_prompt()`: Create complete prompt - `_extract_hazard_features()`: Extract features from risk data - `_get_building_description()`: Get building type description - `_add_philippine_context()`: Add contextual elements - `_prioritize_features()`: Prioritize features for multi-hazard scenarios **Feature Mapping Logic**: ```python # Seismic hazards if risk_data.seismic_risk == "high": features.append("Reinforced concrete frame with cross-bracing") features.append("Moment-resisting frames") # Flood hazards if risk_data.flood_risk == "high": features.append("Elevated first floor on stilts") # Volcanic hazards if risk_data.volcanic_risk == "high": features.append("Steep-pitched roof for ash shedding") ``` #### 3. GeminiAPIClient **Responsibilities**: - Communicate with Google Gemini API - Handle API authentication - Manage timeouts and retries - Parse API responses **Key Methods**: - `generate_image()`: Call Gemini API - `_handle_api_error()`: Convert API errors to structured format - `_validate_response()`: Validate API response **API Configuration**: - Model: `gemini-2.5-flash-image` - Resolution: 1024x1024 - Format: PNG - Timeout: 30 seconds #### 4. Response Formatter **Responsibilities**: - Encode image data to base64 - Generate metadata - Compile features list - Format final response **Metadata Included**: - Prompt used for generation - Model version - Generation timestamp (ISO 8601) - Image format and resolution - List of disaster-resistant features ### Data Flow ``` 1. Request arrives at FastAPI endpoint ↓ 2. Request validation (Pydantic models) ↓ 3. VisualizationAgent.generate_visualization() ↓ 4. PromptGenerator.generate_prompt() - Analyze risk_data - Extract hazard features - Get building description - Add Philippine context - Compile final prompt ↓ 5. GeminiAPIClient.generate_image() - Send prompt to Gemini API - Wait for response (10-20 seconds) - Receive image bytes ↓ 6. Response Formatter - Encode image to base64 - Generate metadata - Compile features list ↓ 7. Return VisualizationResponse ↓ 8. Orchestrator receives response ↓ 9. Gradio UI displays image ``` ### Error Handling Flow ``` Error occurs at any stage ↓ Error caught by try/except block ↓ Error categorized (AUTH, RATE_LIMIT, NETWORK, etc.) ↓ ErrorDetail object created ↓ Response with success=false returned ↓ Orchestrator handles error gracefully ↓ UI shows error message or continues without visualization ``` ### Technology Stack - **Framework**: FastAPI (HTTP server) - **AI API**: Google Gemini (gemini-2.5-flash-image) - **Image Processing**: Pillow (PIL) - **Data Validation**: Pydantic v2 - **Deployment**: Blaxel platform - **Language**: Python 3.11+ ### Performance Characteristics - **Latency**: 10-20 seconds typical (Gemini API call) - **Throughput**: 5 concurrent requests - **Memory**: ~300MB per request - **CPU**: Minimal (mostly I/O bound) - **Network**: ~2-5MB per request (image download) ## Installation ### Prerequisites - Python 3.11+ - Gemini API key (Google AI Studio) ### Setup 1. Install dependencies: ```bash cd visualization-agent pip install -r requirements.txt ``` 2. Configure environment variables: ```bash cp .env.example .env # Edit .env and add your GEMINI_API_KEY ``` 3. Test the agent: ```bash python test_agent.py ``` ## Usage ### As HTTP Service Start the FastAPI server: ```bash python main.py ``` Send POST request to generate visualization: ```bash curl -X POST http://localhost:8000/ \ -H "Content-Type: application/json" \ -d '{ "risk_data": { "seismic_risk": "high", "flood_risk": "medium", "location": {"latitude": 14.5995, "longitude": 120.9842} }, "building_type": "residential_single_family", "recommendations": {...} }' ``` ### As Python Module ```python from agent import VisualizationAgent agent = VisualizationAgent() visualization_data = agent.generate_visualization( risk_data=risk_data, building_type="residential_single_family", recommendations=recommendations ) # Access generated image image_base64 = visualization_data.image_base64 prompt_used = visualization_data.prompt_used features = visualization_data.features_included ``` ## API Reference ### Endpoint ``` POST / ``` ### Request Formats The agent supports **three request formats** for backward compatibility: #### Format 1: Legacy Format (risk_data + building_type) This format is supported for backward compatibility with older orchestrator versions: ```json { "risk_data": { "location": {...}, "hazards": {...} }, "building_type": "residential_single_family", "recommendations": {...} // optional } ``` The agent automatically converts this to the new format by: 1. Generating a prompt based on building_type 2. Creating construction_data from risk_data, building_type, and recommendations 3. Processing as a context-aware request #### Format 2: New Format (prompt + construction_data) This is the recommended format for new integrations: ```json { "prompt": "A disaster-resistant school building in the Philippines", "construction_data": { "building_type": "institutional_school", "location": {...}, "risk_data": {...}, "recommendations": {...} }, "config": { "aspect_ratio": "16:9", "image_size": "1K" } } ``` #### Format 3: Basic Format (prompt only) For simple use cases without context: ```json { "prompt": "A modern disaster-resistant building in the Philippines" } ``` ### Request Format Details #### Complete Request Schema ```python { "risk_data": { "seismic_risk": str, # "low", "medium", "high" "flood_risk": str, # "low", "medium", "high" "volcanic_risk": str, # "low", "medium", "high" "location": { "latitude": float, # 4.0 to 21.0 (Philippines) "longitude": float, # 116.0 to 127.0 (Philippines) "municipality": str, # Optional "province": str # Optional }, "hazards": [ # Optional, detailed hazard list { "type": str, # "seismic", "volcanic", "hydrometeorological" "category": str, # Specific hazard category "severity": str, # "low", "medium", "high" "description": str # Human-readable description } ] }, "building_type": str, # See Building Types section "recommendations": { # Optional "structural": [ { "category": str, "priority": str, "description": str } ] } } ``` #### Minimal Request ```python { "risk_data": { "seismic_risk": "high", "flood_risk": "low", "volcanic_risk": "low", "location": { "latitude": 14.5995, "longitude": 120.9842 } }, "building_type": "residential_single_family" } ``` ### Response Format #### Success Response ```python { "success": true, "visualization_data": { "image_base64": str, # Base64-encoded PNG image "prompt_used": str, # Full prompt sent to Gemini "model_version": str, # "gemini-2.5-flash-image" "generation_timestamp": str, # ISO 8601 format "image_format": "PNG", # Always PNG "resolution": "1024x1024", # Always 1024x1024 "features_included": [str] # List of disaster-resistant features }, "error": null } ``` #### Error Response ```python { "success": false, "visualization_data": null, "error": { "code": str, # Error code (see Error Codes section) "message": str, # Human-readable error message "retry_possible": bool # Whether retry is recommended } } ``` ### Request Parameters #### risk_data (required) | Field | Type | Required | Description | |-------|------|----------|-------------| | `seismic_risk` | string | Yes | Overall seismic risk level: "low", "medium", "high" | | `flood_risk` | string | Yes | Overall flood risk level: "low", "medium", "high" | | `volcanic_risk` | string | Yes | Overall volcanic risk level: "low", "medium", "high" | | `location` | object | Yes | Geographic location data | | `hazards` | array | No | Detailed hazard information | #### location (required) | Field | Type | Required | Description | |-------|------|----------|-------------| | `latitude` | float | Yes | Latitude (4.0 to 21.0 for Philippines) | | `longitude` | float | Yes | Longitude (116.0 to 127.0 for Philippines) | | `municipality` | string | No | Municipality name | | `province` | string | No | Province name | #### building_type (required) | Value | Description | |-------|-------------| | `residential_single_family` | Single-family home | | `residential_multi_family` | Multi-family residential (2-4 units) | | `residential_high_rise` | High-rise apartment building | | `commercial_office` | Modern office building | | `commercial_retail` | Retail shopping center | | `industrial_warehouse` | Industrial warehouse facility | | `institutional_school` | School building | | `institutional_hospital` | Hospital or healthcare facility | | `infrastructure_bridge` | Bridge structure | | `mixed_use` | Mixed-use development | #### recommendations (optional) Optional construction recommendations from research agent. If provided, may influence feature selection. ### Response Fields #### visualization_data | Field | Type | Description | |-------|------|-------------| | `image_base64` | string | Base64-encoded PNG image data | | `prompt_used` | string | Complete prompt sent to Gemini API | | `model_version` | string | Gemini model version used | | `generation_timestamp` | string | ISO 8601 timestamp of generation | | `image_format` | string | Always "PNG" | | `resolution` | string | Always "1024x1024" | | `features_included` | array | List of disaster-resistant features shown | #### error | Field | Type | Description | |-------|------|-------------| | `code` | string | Error code (see Error Codes section) | | `message` | string | Human-readable error description | | `retry_possible` | boolean | Whether the request can be retried | ### HTTP Status Codes | Status Code | Description | |-------------|-------------| | 200 | Success (check `success` field in response) | | 400 | Bad Request (invalid input parameters) | | 401 | Unauthorized (invalid API key) | | 429 | Too Many Requests (rate limit exceeded) | | 500 | Internal Server Error | | 504 | Gateway Timeout (generation took > 30 seconds) | ### Rate Limits - **Free Tier**: 60 requests per minute - **Paid Tier**: Varies by plan - **Concurrent Requests**: Maximum 5 simultaneous requests ### Authentication When deployed on Blaxel: ```bash curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \ -H "Content-Type: application/json" \ -H "Authorization: Bearer {BLAXEL_API_KEY}" \ -d @request.json ``` ### Content Type All requests and responses use `application/json`. ## Supported Building Types The agent supports 10 building type categories, each with specific architectural characteristics: ### Residential Buildings | Building Type | Code | Description | Typical Features | |---------------|------|-------------|------------------| | Single Family | `residential_single_family` | Single-family home | 1-2 stories, pitched roof, residential scale | | Multi Family | `residential_multi_family` | Multi-family residential (2-4 units) | 2-3 stories, multiple entrances, shared spaces | | High Rise | `residential_high_rise` | High-rise apartment building | 10+ stories, elevator core, balconies | ### Commercial Buildings | Building Type | Code | Description | Typical Features | |---------------|------|-------------|------------------| | Office | `commercial_office` | Modern office building | 3-10 stories, glass facade, modern design | | Retail | `commercial_retail` | Retail shopping center | 1-2 stories, large windows, parking area | ### Industrial Buildings | Building Type | Code | Description | Typical Features | |---------------|------|-------------|------------------| | Warehouse | `industrial_warehouse` | Industrial warehouse facility | Large open space, high ceilings, loading docks | ### Institutional Buildings | Building Type | Code | Description | Typical Features | |---------------|------|-------------|------------------| | School | `institutional_school` | School building with classrooms | 1-3 stories, multiple wings, playground area | | Hospital | `institutional_hospital` | Hospital or healthcare facility | 3-5 stories, emergency entrance, medical design | ### Infrastructure | Building Type | Code | Description | Typical Features | |---------------|------|-------------|------------------| | Bridge | `infrastructure_bridge` | Bridge structure | Span structure, support columns, roadway | ### Mixed Use | Building Type | Code | Description | Typical Features | |---------------|------|-------------|------------------| | Mixed Use | `mixed_use` | Mixed-use development | Commercial ground floor, residential upper floors | ### Building Type Selection Guide Choose the appropriate building type based on your project: - **Residential Projects**: Use `residential_single_family` for houses, `residential_multi_family` for apartments/condos, `residential_high_rise` for towers - **Commercial Projects**: Use `commercial_office` for office buildings, `commercial_retail` for shops/malls - **Industrial Projects**: Use `industrial_warehouse` for factories, warehouses, distribution centers - **Public Buildings**: Use `institutional_school` for schools, `institutional_hospital` for hospitals/clinics - **Infrastructure**: Use `infrastructure_bridge` for bridges, overpasses - **Mixed Projects**: Use `mixed_use` for buildings combining residential and commercial spaces ## Prompt Generation Strategy The Visualization Agent uses a sophisticated prompt generation strategy to create contextual, risk-aware architectural visualizations. ### Prompt Template Structure ``` [Building Type Description] in the Philippines, designed for disaster resistance. Key Features: - [Hazard-specific feature 1] - [Hazard-specific feature 2] - [Hazard-specific feature 3] Architectural Style: [Philippine context] Setting: [Tropical environment with appropriate landscaping] Perspective: [Exterior view showing structural features] Style: Architectural sketch, professional rendering ``` ### Feature Prioritization When multiple hazards are present, the agent prioritizes features based on: 1. **Risk Level**: High-risk hazards get priority over medium/low 2. **Structural Impact**: Features that affect the entire building structure 3. **Visual Prominence**: Features that are clearly visible in architectural sketches ### Philippine Context Integration The agent automatically adds contextual elements: - Tropical climate considerations (ventilation, sun protection) - Local architectural styles and materials - Appropriate landscaping (palm trees, tropical vegetation) - Regional building practices ## Hazard-to-Feature Mappings The agent uses detailed mappings to translate risk data into visual features: ### Seismic Hazards | Hazard Type | Risk Level | Visual Features | |-------------|-----------|-----------------| | Active Fault | High | Reinforced concrete frame with visible cross-bracing | | Ground Shaking | High | Moment-resisting frames, shear walls | | Liquefaction | Medium-High | Deep pile foundation visible at base | | Earthquake | All | Structural reinforcements, seismic joints | **Example Features**: - Reinforced concrete frame with cross-bracing - Moment-resisting frames - Shear walls - Deep pile foundations - Seismic isolation systems ### Volcanic Hazards | Hazard Type | Risk Level | Visual Features | |-------------|-----------|-----------------| | Ashfall | High | Steep-pitched roof (45°+ angle) for ash shedding | | Pyroclastic Flow | High | Reinforced concrete construction, protective barriers | | Lahar | Medium-High | Elevated foundation, diversion channels | | Volcanic Activity | All | Robust roof structure, sealed openings | **Example Features**: - Steep-pitched roof for ash shedding - Reinforced concrete construction - Protective barriers and walls - Elevated foundation - Sealed ventilation systems ### Hydrometeorological Hazards | Hazard Type | Risk Level | Visual Features | |-------------|-----------|-----------------| | Flood | High | Elevated first floor on stilts (2-3 meters) | | Storm Surge | High | Coastal reinforcement, breakwaters | | Severe Winds | High | Aerodynamic roof design, hurricane straps | | Typhoon | High | Wind-resistant construction, storm shutters | | Landslide | Medium-High | Retaining walls, terraced foundation | **Example Features**: - Elevated first floor on stilts - Raised foundation - Flood barriers - Aerodynamic roof design - Hurricane straps - Storm shutters - Retaining walls - Terraced foundation ### Multi-Hazard Scenarios When multiple hazards are present, the agent combines features intelligently: **Example: High Seismic + High Flood** - Elevated foundation on reinforced concrete piles - Moment-resisting frames visible in structure - Cross-bracing on elevated sections **Example: High Volcanic + Medium Wind** - Steep-pitched roof with aerodynamic design - Reinforced concrete construction - Storm shutters on windows ## Example Requests and Responses ### Example 1: Residential Building with High Seismic Risk **Request**: ```json { "risk_data": { "seismic_risk": "high", "flood_risk": "low", "volcanic_risk": "low", "location": { "latitude": 14.5995, "longitude": 120.9842, "municipality": "Manila", "province": "Metro Manila" }, "hazards": [ { "type": "seismic", "category": "active_fault", "severity": "high", "description": "Near active fault line" } ] }, "building_type": "residential_single_family", "recommendations": { "structural": [ { "category": "foundation", "priority": "critical", "description": "Use reinforced concrete foundation with seismic isolation" } ] } } ``` **Response**: ```json { "success": true, "visualization_data": { "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)", "prompt_used": "Single-family home in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Reinforced concrete frame with visible cross-bracing\n- Moment-resisting frames for earthquake protection\n- Deep pile foundation visible at base\n\nArchitectural Style: Modern Filipino residential with tropical design elements\nSetting: Tropical environment with palm trees and lush vegetation\nPerspective: Exterior view showing structural reinforcements\nStyle: Architectural sketch, professional rendering", "model_version": "gemini-2.5-flash-image", "generation_timestamp": "2024-01-15T10:30:45.123Z", "image_format": "PNG", "resolution": "1024x1024", "features_included": [ "Reinforced concrete frame with cross-bracing", "Moment-resisting frames", "Deep pile foundation" ] } } ``` ### Example 2: Commercial Building with Flood Risk **Request**: ```json { "risk_data": { "seismic_risk": "low", "flood_risk": "high", "volcanic_risk": "low", "location": { "latitude": 10.3157, "longitude": 123.8854, "municipality": "Cebu City", "province": "Cebu" }, "hazards": [ { "type": "hydrometeorological", "category": "flood", "severity": "high", "description": "Flood-prone area" } ] }, "building_type": "commercial_office" } ``` **Response**: ```json { "success": true, "visualization_data": { "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)", "prompt_used": "Modern office building in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Elevated first floor on reinforced concrete stilts (2-3 meters)\n- Flood barriers around perimeter\n- Water-resistant materials for lower levels\n\nArchitectural Style: Contemporary Filipino commercial architecture\nSetting: Urban tropical environment with flood management features\nPerspective: Exterior view showing elevated foundation\nStyle: Architectural sketch, professional rendering", "model_version": "gemini-2.5-flash-image", "generation_timestamp": "2024-01-15T10:32:18.456Z", "image_format": "PNG", "resolution": "1024x1024", "features_included": [ "Elevated first floor on stilts", "Flood barriers", "Water-resistant construction" ] } } ``` ### Example 3: Multi-Hazard Scenario **Request**: ```json { "risk_data": { "seismic_risk": "high", "flood_risk": "medium", "volcanic_risk": "high", "location": { "latitude": 13.2572, "longitude": 123.8144, "municipality": "Legazpi", "province": "Albay" }, "hazards": [ { "type": "volcanic", "category": "ashfall", "severity": "high" }, { "type": "seismic", "category": "earthquake", "severity": "high" } ] }, "building_type": "institutional_school" } ``` **Response**: ```json { "success": true, "visualization_data": { "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)", "prompt_used": "School building with classrooms in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Steep-pitched reinforced concrete roof for volcanic ash shedding\n- Reinforced concrete frame with seismic cross-bracing\n- Moment-resisting frames for earthquake protection\n- Protective barriers around building perimeter\n\nArchitectural Style: Institutional Filipino architecture with disaster-resistant design\nSetting: Tropical environment near volcanic area with protective landscaping\nPerspective: Exterior view showing roof design and structural reinforcements\nStyle: Architectural sketch, professional rendering", "model_version": "gemini-2.5-flash-image", "generation_timestamp": "2024-01-15T10:35:22.789Z", "image_format": "PNG", "resolution": "1024x1024", "features_included": [ "Steep-pitched roof for ash shedding", "Reinforced concrete frame with cross-bracing", "Moment-resisting frames", "Protective barriers" ] } } ``` ### Example 4: Error Response **Request**: ```json { "risk_data": {...}, "building_type": "residential_single_family" } ``` **Response** (Invalid API Key): ```json { "success": false, "visualization_data": null, "error": { "code": "AUTH_ERROR", "message": "Invalid or missing Gemini API key. Please check your GEMINI_API_KEY environment variable.", "retry_possible": false } } ``` ## Error Handling The agent provides comprehensive error handling with detailed error codes and messages. ### Error Codes | Error Code | Description | Retry Possible | Recommended Action | |------------|-------------|----------------|-------------------| | `AUTH_ERROR` | Invalid or missing API key | No | Check GEMINI_API_KEY environment variable | | `RATE_LIMIT` | API quota exceeded | Yes | Wait and retry after delay (typically 60 seconds) | | `GENERATION_FAILED` | Image generation failed | Yes | Retry with same or modified prompt | | `NETWORK_ERROR` | Connection issues | Yes | Check internet connection and retry | | `TIMEOUT` | Generation took longer than 30 seconds | Yes | Retry or simplify prompt | | `INVALID_INPUT` | Invalid request parameters | No | Check request format and parameters | | `MODEL_ERROR` | Gemini model error | Yes | Retry or contact support | ### Error Response Format All errors follow this structure: ```json { "success": false, "visualization_data": null, "error": { "code": "ERROR_CODE", "message": "Human-readable error description", "retry_possible": true/false } } ``` ### Retry Strategy For errors with `retry_possible: true`: 1. **Rate Limit Errors**: Wait 60 seconds before retrying 2. **Network Errors**: Retry immediately, then with exponential backoff (2s, 4s, 8s) 3. **Generation Failures**: Retry up to 2 times with same prompt 4. **Timeouts**: Retry once, then consider simplifying the prompt ### Error Logging All errors are logged with full context: - Request parameters - Error type and message - Timestamp - Stack trace (for debugging) Example log entry: ``` 2024-01-15 10:30:45 ERROR [VisualizationAgent] Image generation failed Error: RATE_LIMIT Message: API quota exceeded Request: building_type=residential_single_family, location=Manila Retry: true ``` ## Deployment ### Prerequisites Before deploying, ensure you have: 1. **Blaxel CLI installed**: ```bash pip install blaxel ``` 2. **Blaxel account and workspace**: - Sign up at [blaxel.ai](https://blaxel.ai) - Create a workspace - Get your API key 3. **Gemini API key**: - Get API key from [Google AI Studio](https://makersuite.google.com/app/apikey) - Add to `.env` file ### Local Development 1. **Install dependencies**: ```bash cd visualization-agent pip install -r requirements.txt ``` 2. **Configure environment**: ```bash cp .env.example .env # Edit .env and add: # GEMINI_API_KEY=your_api_key_here ``` 3. **Run locally**: ```bash python main.py ``` 4. **Test the agent**: ```bash python test_agent.py ``` ### Blaxel Platform Deployment #### Step 1: Configure Environment Variables Create or update `.env` file: ```bash GEMINI_API_KEY=your_gemini_api_key GEMINI_MODEL=gemini-2.5-flash-image ``` #### Step 2: Review Configuration Check `blaxel.toml` configuration: ```toml name = "visualization-agent" type = "agent" [env] GEMINI_API_KEY = "${GEMINI_API_KEY}" GEMINI_MODEL = "${GEMINI_MODEL}" [runtime] timeout = 30 memory = 512 [entrypoint] prod = "python main.py" [[triggers]] id = "trigger-visualization-agent" type = "http" timeout = 30 [triggers.configuration] path = "agents/visualization-agent/process" retry = 1 authenticationType = "private" ``` #### Step 3: Deploy to Blaxel ```bash cd visualization-agent bl deploy --env-file .env ``` #### Step 4: Verify Deployment The agent will be available at: ``` https://run.blaxel.ai/{workspace}/agents/visualization-agent ``` Test the deployed agent: ```bash curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \ -H "Content-Type: application/json" \ -H "Authorization: Bearer {BLAXEL_API_KEY}" \ -d @test_request.json ``` ### Configuration Options #### Runtime Configuration | Parameter | Default | Description | |-----------|---------|-------------| | `timeout` | 30 | Maximum execution time (seconds) | | `memory` | 512 | Memory limit (MB) | | `retry` | 1 | Number of retry attempts | #### Model Configuration | Parameter | Default | Description | |-----------|---------|-------------| | `GEMINI_MODEL` | gemini-2.5-flash-image | Gemini model version | | `GEMINI_API_KEY` | (required) | Google Gemini API key | #### Image Configuration - **Resolution**: 1024x1024 (fixed) - **Format**: PNG - **Watermark**: SynthID (automatic) ### Integration with Orchestrator The orchestrator agent calls the visualization agent automatically. To integrate: 1. **Update orchestrator's `blaxel.toml`**: ```toml [[resources]] id = "visualization-agent" type = "agent" name = "visualization-agent" ``` 2. **Orchestrator calls visualization agent**: ```python visualization_response = await self.execute_visualization( risk_data=risk_data, building_type=building_type, recommendations=recommendations ) ``` 3. **Response flows to Gradio UI**: - Image displayed in visualization tab - Metadata shown alongside image - Features list displayed ### Monitoring and Logs #### View Logs ```bash bl logs visualization-agent ``` #### Monitor Performance Key metrics to monitor: - **Generation Time**: Should be < 30 seconds - **Success Rate**: Should be > 95% - **Error Rate**: Monitor for rate limit errors - **Memory Usage**: Should stay under 512MB #### Common Issues 1. **Timeout Errors**: - Increase timeout in `blaxel.toml` - Simplify prompts - Check Gemini API status 2. **Rate Limit Errors**: - Implement request throttling - Upgrade Gemini API quota - Add retry logic with backoff 3. **Memory Issues**: - Increase memory limit in `blaxel.toml` - Optimize image processing - Check for memory leaks ### Scaling Considerations For high-volume deployments: 1. **Increase Memory**: Set to 1024MB for better performance 2. **Add Caching**: Cache generated images for identical requests 3. **Load Balancing**: Deploy multiple instances 4. **Rate Limiting**: Implement request queuing 5. **Monitoring**: Set up alerts for errors and performance ### Security Best Practices 1. **API Key Management**: - Never commit API keys to version control - Use environment variables only - Rotate keys regularly 2. **Authentication**: - Use `authenticationType = "private"` in blaxel.toml - Require BLAXEL_API_KEY for all requests - Validate request signatures 3. **Input Validation**: - Validate all input parameters - Sanitize location data - Check building type against allowed values 4. **Output Security**: - Ensure generated images don't contain sensitive data - Add watermarks (automatic with SynthID) - Log all generation requests ## Testing ### Unit Tests Run the unit test suite: ```bash python test_agent.py ``` The test suite includes: #### Agent Initialization Tests - Verify agent initializes with correct configuration - Check Gemini API client setup - Validate environment variable loading #### Building Type Tests - **Residential Single Family**: High seismic risk scenario - **Commercial Office**: Flood risk scenario - **Institutional School**: Multiple hazards (volcanic + seismic) - **Industrial Warehouse**: Wind resistance scenario - **Infrastructure Bridge**: Multi-hazard scenario #### Risk Scenario Tests - High seismic risk with active fault - High flood risk in coastal area - High volcanic risk with ashfall - Multiple hazards combined - Low risk baseline scenario #### Error Handling Tests - Invalid API key handling - Network error simulation - Timeout handling - Rate limit error handling - Invalid input validation #### Response Format Tests - Base64 encoding validation - Metadata completeness - Timestamp format verification - Features list accuracy ### Integration Tests Test the HTTP endpoint: ```bash python test_http_endpoint.py ``` Integration tests cover: - POST endpoint functionality - Request validation - Response format compatibility - Error response handling - Orchestrator integration ### Manual Testing Test with real Gemini API: 1. **Set up environment**: ```bash export GEMINI_API_KEY=your_api_key ``` 2. **Run test script**: ```bash python test_agent.py ``` 3. **Verify output**: - Check generated images are valid PNG files - Verify disaster-resistant features are visible - Confirm metadata is accurate - Validate generation time < 30 seconds ### Test Coverage Current test coverage: - **Prompt Generation**: 100% - **API Client**: 95% (excluding live API calls) - **Response Formatting**: 100% - **Error Handling**: 100% - **Integration**: 90% ### Performance Testing Test performance metrics: ```bash # Test generation time time python test_agent.py # Test concurrent requests python -m pytest test_concurrent.py -n 5 # Test memory usage python -m memory_profiler test_agent.py ``` Expected performance: - **Generation Time**: 10-20 seconds average - **Memory Usage**: < 300MB per request - **Success Rate**: > 95% - **Concurrent Requests**: 5 simultaneous requests supported ## Performance - **Generation Time**: 10-20 seconds typical - **Image Size**: ~500KB - 2MB per image - **Resolution**: 1024x1024 pixels - **Format**: PNG with SynthID watermark ## Troubleshooting ### Common Issues and Solutions #### Issue: "Invalid API Key" Error **Symptoms**: ```json { "error": { "code": "AUTH_ERROR", "message": "Invalid or missing Gemini API key" } } ``` **Solutions**: 1. Check `.env` file contains `GEMINI_API_KEY=your_key` 2. Verify API key is valid at [Google AI Studio](https://makersuite.google.com/app/apikey) 3. Ensure no extra spaces or quotes around the key 4. Restart the agent after updating `.env` #### Issue: Rate Limit Exceeded **Symptoms**: ```json { "error": { "code": "RATE_LIMIT", "message": "API quota exceeded" } } ``` **Solutions**: 1. Wait 60 seconds before retrying 2. Check your API quota at Google AI Studio 3. Upgrade to higher quota tier if needed 4. Implement request throttling in orchestrator #### Issue: Generation Timeout **Symptoms**: - Request takes longer than 30 seconds - Timeout error returned **Solutions**: 1. Simplify the prompt (reduce number of features) 2. Check Gemini API status 3. Increase timeout in `blaxel.toml` (not recommended) 4. Retry the request #### Issue: Poor Quality Visualizations **Symptoms**: - Generated images don't show disaster-resistant features clearly - Building type doesn't match expectations **Solutions**: 1. Verify risk data is accurate and complete 2. Check building type is correct 3. Ensure hazard severity levels are set appropriately 4. Review prompt generation logic in `agent.py` #### Issue: Network Errors **Symptoms**: ```json { "error": { "code": "NETWORK_ERROR", "message": "Connection failed" } } ``` **Solutions**: 1. Check internet connection 2. Verify firewall allows HTTPS to Google APIs 3. Check proxy settings if applicable 4. Retry with exponential backoff #### Issue: Memory Errors **Symptoms**: - Agent crashes with out-of-memory error - Slow performance **Solutions**: 1. Increase memory limit in `blaxel.toml` to 1024MB 2. Check for memory leaks in custom code 3. Reduce concurrent request limit 4. Monitor memory usage with profiling tools ### Debug Mode Enable debug logging: ```python import logging logging.basicConfig(level=logging.DEBUG) ``` This will show: - Detailed API request/response logs - Prompt generation steps - Error stack traces - Performance metrics ### Getting Help If you encounter issues not covered here: 1. Check the logs: `bl logs visualization-agent` 2. Review the [Gemini API documentation](https://ai.google.dev/docs) 3. Check the main project documentation 4. Contact support with: - Error message and code - Request payload (sanitized) - Timestamp of the error - Agent version and configuration ## Limitations ### Technical Limitations - **Internet Connection**: Requires active internet for Gemini API - **API Rate Limits**: Subject to Gemini API quotas (varies by tier) - **Generation Time**: 10-30 seconds per image (cannot be reduced) - **Resolution**: Fixed at 1024x1024 pixels - **Format**: PNG only (no JPEG, SVG, or other formats) - **Watermark**: SynthID watermark automatically added (cannot be removed) ### Functional Limitations - **Artistic Interpretation**: Generated images are conceptual sketches, not engineering drawings - **Feature Visibility**: Some structural features may not be clearly visible in exterior views - **Accuracy**: AI-generated images may not perfectly represent all specified features - **Consistency**: Multiple generations with same prompt may produce different results - **Detail Level**: Cannot generate detailed floor plans or technical specifications ### Geographic Limitations - **Philippine Context**: Optimized for Philippine architecture and climate - **Location Data**: Requires valid Philippine coordinates - **Regional Styles**: May not accurately represent all regional architectural variations ### Use Case Limitations **Appropriate Uses**: - Conceptual visualization for stakeholders - Initial design exploration - Communication tool for non-technical audiences - Marketing and presentation materials **Inappropriate Uses**: - Engineering drawings or construction blueprints - Structural analysis or calculations - Building permit applications - Detailed cost estimation basis - Legal or contractual documentation ## Integration The Visualization Agent integrates with: - **Orchestrator Agent**: Receives requests and returns visualization data - **Gradio UI**: Displays generated images in the web interface - **Risk Assessment Agent**: Uses risk data to inform feature selection ## Environment Variables ### Required Variables - **`GEMINI_API_KEY`** (required): Google Gemini API key for image generation - Get your API key from: https://makersuite.google.com/app/apikey - Alternative: `GOOGLE_API_KEY` can be used instead of `GEMINI_API_KEY` ### Optional Variables - **`VISUALIZATION_MODEL`** (optional): Gemini model version to use - Default: `gemini-2.5-flash-image` - Options: `gemini-2.5-flash-image`, `gemini-3-pro-image-preview` - Example: `VISUALIZATION_MODEL=gemini-2.5-flash-image` - **`VISUALIZATION_OUTPUT_DIR`** (optional): Directory where generated images will be saved - Default: `./generated_images` - Example: `VISUALIZATION_OUTPUT_DIR=./my_images` - Note: Directory will be created automatically if it doesn't exist ### Environment Variable Priority The agent loads configuration in the following priority order (highest to lowest): 1. **Constructor parameters**: Values passed directly to `VisualizationAgent()` 2. **Environment variables**: Values from `.env` file or system environment 3. **Default values**: Built-in defaults Example: ```python # Priority 1: Constructor parameter (highest) agent = VisualizationAgent(model="gemini-3-pro-image-preview") # Priority 2: Environment variable # VISUALIZATION_MODEL=gemini-2.5-flash-image # Priority 3: Default value (lowest) # Default: gemini-2.5-flash-image ``` ### Setting Environment Variables #### Local Development Create a `.env` file in the `visualization-agent` directory: ```bash # Required GEMINI_API_KEY=your_gemini_api_key_here # Optional VISUALIZATION_MODEL=gemini-2.5-flash-image VISUALIZATION_OUTPUT_DIR=./generated_images ``` #### Blaxel Deployment Set environment variables in `blaxel.toml`: ```toml [env] GEMINI_API_KEY = "${GEMINI_API_KEY}" VISUALIZATION_MODEL = "${VISUALIZATION_MODEL}" VISUALIZATION_OUTPUT_DIR = "${VISUALIZATION_OUTPUT_DIR}" ``` Then deploy with environment file: ```bash bl deploy --env-file .env ``` #### System Environment Set environment variables in your shell: ```bash # Bash/Zsh export GEMINI_API_KEY=your_api_key export VISUALIZATION_MODEL=gemini-2.5-flash-image export VISUALIZATION_OUTPUT_DIR=./generated_images # Windows Command Prompt set GEMINI_API_KEY=your_api_key set VISUALIZATION_MODEL=gemini-2.5-flash-image set VISUALIZATION_OUTPUT_DIR=./generated_images # Windows PowerShell $env:GEMINI_API_KEY="your_api_key" $env:VISUALIZATION_MODEL="gemini-2.5-flash-image" $env:VISUALIZATION_OUTPUT_DIR="./generated_images" ``` ## Best Practices ### Prompt Optimization 1. **Be Specific**: Include detailed hazard information for better feature selection 2. **Prioritize Hazards**: Focus on the most critical risks (high severity) 3. **Provide Context**: Include location data for better Philippine context 4. **Use Recommendations**: Pass construction recommendations when available ### Performance Optimization 1. **Batch Requests**: Group multiple visualizations when possible 2. **Cache Results**: Cache generated images for identical requests 3. **Async Processing**: Use async/await for concurrent requests 4. **Monitor Quotas**: Track API usage to avoid rate limits ### Error Handling 1. **Implement Retries**: Retry transient errors with exponential backoff 2. **Graceful Degradation**: Continue without visualization if generation fails 3. **Log Errors**: Log all errors with full context for debugging 4. **User Feedback**: Provide clear error messages to users ### Security 1. **Protect API Keys**: Never expose GEMINI_API_KEY in client code 2. **Validate Input**: Always validate and sanitize input parameters 3. **Rate Limiting**: Implement rate limiting to prevent abuse 4. **Monitor Usage**: Track API usage and set up alerts ### Integration 1. **Async Calls**: Call visualization agent asynchronously from orchestrator 2. **Timeout Handling**: Set appropriate timeouts (30+ seconds) 3. **Fallback Logic**: Have fallback behavior if visualization fails 4. **Response Validation**: Validate response format before using ## Frequently Asked Questions ### General Questions **Q: How long does it take to generate a visualization?** A: Typically 10-20 seconds, with a maximum timeout of 30 seconds. **Q: Can I generate multiple visualizations for the same building?** A: Yes, but each request may produce slightly different results due to AI generation variability. **Q: What image format is returned?** A: PNG format, base64-encoded, at 1024x1024 resolution. **Q: Can I remove the SynthID watermark?** A: No, the watermark is automatically added by Gemini API and cannot be removed. ### Technical Questions **Q: Can I use a different Gemini model?** A: Yes, set `GEMINI_MODEL` environment variable, but gemini-2.5-flash-image is recommended for speed. **Q: How do I increase the image resolution?** A: Currently fixed at 1024x1024. Higher resolutions may be supported in future Gemini models. **Q: Can I generate images without an internet connection?** A: No, the agent requires internet access to call the Gemini API. **Q: How many concurrent requests can the agent handle?** A: Up to 5 concurrent requests, limited by memory and API quotas. ### Integration Questions **Q: How does the orchestrator call the visualization agent?** A: Via HTTP POST to the Blaxel endpoint with risk data and building type. **Q: What happens if visualization generation fails?** A: The orchestrator continues without visualization data, and the UI shows a message. **Q: Can I call the visualization agent directly from the UI?** A: Not recommended. Always call through the orchestrator for proper coordination. **Q: How is the generated image displayed in the UI?** A: The Gradio UI decodes the base64 image and displays it in the visualization tab. ### Cost and Limits **Q: How much does it cost to generate a visualization?** A: Depends on your Gemini API plan. Check Google AI Studio for pricing. **Q: What are the rate limits?** A: Free tier: 60 requests/minute. Paid tiers vary by plan. **Q: Can I increase my API quota?** A: Yes, upgrade your Gemini API plan at Google AI Studio. **Q: Is there a limit on the number of visualizations?** A: Only limited by your API quota and rate limits. ### Troubleshooting **Q: Why am I getting "Invalid API Key" errors?** A: Check that GEMINI_API_KEY is set correctly in your .env file and is valid. **Q: Why are my visualizations timing out?** A: Check your internet connection and Gemini API status. Simplify prompts if needed. **Q: Why don't the disaster-resistant features show clearly?** A: Ensure risk data is accurate and hazard severity is set appropriately. AI generation may vary. **Q: How do I debug generation issues?** A: Enable debug logging and check the prompt_used field in the response. ## Roadmap ### Planned Features - **Multiple View Angles**: Generate front, side, and aerial views - **Before/After Comparisons**: Show standard vs. disaster-resistant designs - **Higher Resolution**: Support 4K resolution with Gemini 3 Pro - **Style Variations**: Allow users to choose architectural styles - **Annotation Overlay**: Add labels pointing to disaster-resistant features - **Interactive Refinement**: Support multi-turn conversations for improvements - **Cost Visualization**: Overlay cost information on the visualization - **3D Models**: Generate 3D models in addition to 2D sketches ### Future Enhancements - Caching layer for identical requests - Batch processing for multiple buildings - Custom style templates - Integration with CAD software - Export to additional formats (SVG, PDF) - Localization for other languages ## Contributing This agent is part of the Disaster Risk Construction Planner system. For contributions: 1. Follow the existing code structure and patterns 2. Add tests for new features 3. Update documentation 4. Ensure compatibility with orchestrator and UI ## Version History - **v1.0.0** (2024-01): Initial release - Basic visualization generation - Support for 10 building types - Integration with orchestrator - Gemini 2.5 Flash Image model ## License Part of the Disaster Risk Construction Planner system. ## Support For issues or questions: - Check this documentation first - Review the troubleshooting section - Check the main project documentation - Review Gemini API documentation at [Google AI Studio](https://ai.google.dev/docs) ## Acknowledgments - Google Gemini API for image generation - Blaxel platform for agent deployment - Philippines Disaster Risk data sources - Open-source community for tools and libraries