Visualization Agent

AI agent that generates architectural sketches of disaster-resistant buildings using Google's Gemini image generation API (Nano Banana).

Overview

The Visualization Agent receives risk assessment data and building specifications to create contextual architectural visualizations. It analyzes disaster risks (seismic, volcanic, hydrometeorological) and generates prompts that incorporate appropriate disaster-resistant features, then uses Gemini's image generation API to create visual representations.

Features

  • Risk-Aware Visualization: Incorporates disaster-resistant features based on risk assessment
  • Building Type Support: Generates appropriate architecture for residential, commercial, institutional, industrial, and infrastructure projects
  • Philippine Context: Includes tropical climate and local architectural considerations
  • Fast Generation: Uses the gemini-2.5-flash-image model for quick results (~10-20 seconds)
  • Detailed Metadata: Returns prompt used, features included, and generation timestamp
  • Backward Compatibility: Supports the legacy format (risk_data + building_type), the new format (prompt + construction_data), and a prompt-only format

Architecture

High-Level Architecture

Orchestrator Agent
    ↓
Visualization Agent (FastAPI)
    ├─→ Request Validator
    ├─→ Prompt Generator
    │   ├─→ Hazard Analyzer
    │   ├─→ Feature Mapper
    │   └─→ Context Builder
    ├─→ Gemini API Client
    │   ├─→ API Request Handler
    │   ├─→ Error Handler
    │   └─→ Response Parser
    └─→ Response Formatter
        ├─→ Base64 Encoder
        ├─→ Metadata Generator
        └─→ Feature List Compiler
    ↓
Returns VisualizationData to Orchestrator
    ↓
Gradio UI (displays image)

Component Details

1. VisualizationAgent (Main Class)

Responsibilities:

  • Orchestrate the visualization generation process
  • Coordinate between prompt generator and API client
  • Handle errors and format responses

Key Methods:

  • generate_visualization(): Main entry point
  • _validate_input(): Validate request parameters
  • _format_response(): Format final response with metadata

2. PromptGenerator

Responsibilities:

  • Analyze risk data to identify relevant hazards
  • Map hazards to visual features
  • Generate descriptive prompts for Gemini API
  • Add Philippine architectural context

Key Methods:

  • generate_prompt(): Create complete prompt
  • _extract_hazard_features(): Extract features from risk data
  • _get_building_description(): Get building type description
  • _add_philippine_context(): Add contextual elements
  • _prioritize_features(): Prioritize features for multi-hazard scenarios

Feature Mapping Logic:

# Excerpt: risk_data is the risk_data object from the validated request
features = []

# Seismic hazards
if risk_data.seismic_risk == "high":
    features.append("Reinforced concrete frame with cross-bracing")
    features.append("Moment-resisting frames")

# Flood hazards
if risk_data.flood_risk == "high":
    features.append("Elevated first floor on stilts")

# Volcanic hazards
if risk_data.volcanic_risk == "high":
    features.append("Steep-pitched roof for ash shedding")
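
The same mapping can also be written as a lookup table, which keeps the rules in one place as more hazards are added. This is a minimal sketch, not the agent's actual code: the RiskData dataclass, the HIGH_RISK_FEATURES table, and extract_hazard_features are hypothetical names, and only the feature strings come from the excerpt above.

```python
from dataclasses import dataclass


@dataclass
class RiskData:
    """Hypothetical stand-in for the agent's risk model."""
    seismic_risk: str = "low"
    flood_risk: str = "low"
    volcanic_risk: str = "low"


# Feature strings are taken from the excerpt above; the table layout is an assumption.
HIGH_RISK_FEATURES = {
    "seismic_risk": [
        "Reinforced concrete frame with cross-bracing",
        "Moment-resisting frames",
    ],
    "flood_risk": ["Elevated first floor on stilts"],
    "volcanic_risk": ["Steep-pitched roof for ash shedding"],
}


def extract_hazard_features(risk_data: RiskData) -> list[str]:
    """Collect the features for every risk field that is rated "high"."""
    features: list[str] = []
    for field_name, field_features in HIGH_RISK_FEATURES.items():
        if getattr(risk_data, field_name) == "high":
            features.extend(field_features)
    return features
```

Calling extract_hazard_features(RiskData(seismic_risk="high")) yields the two seismic features in the order listed in the table.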

3. GeminiAPIClient

Responsibilities:

  • Communicate with Google Gemini API
  • Handle API authentication
  • Manage timeouts and retries
  • Parse API responses

Key Methods:

  • generate_image(): Call Gemini API
  • _handle_api_error(): Convert API errors to structured format
  • _validate_response(): Validate API response

API Configuration:

  • Model: gemini-2.5-flash-image
  • Resolution: 1024x1024
  • Format: PNG
  • Timeout: 30 seconds

4. Response Formatter

Responsibilities:

  • Encode image data to base64
  • Generate metadata
  • Compile features list
  • Format final response

Metadata Included:

  • Prompt used for generation
  • Model version
  • Generation timestamp (ISO 8601)
  • Image format and resolution
  • List of disaster-resistant features
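
As an illustration of the formatting step, the sketch below encodes image bytes and assembles the metadata fields listed above. The function name format_visualization_data is hypothetical; the dictionary keys follow the Success Response shown later in this document. Note that datetime.isoformat() writes the UTC offset as +00:00 rather than Z, which is also valid ISO 8601.

```python
import base64
from datetime import datetime, timezone


def format_visualization_data(
    image_bytes: bytes,
    prompt: str,
    features: list[str],
    model_version: str = "gemini-2.5-flash-image",
) -> dict:
    """Build the visualization_data payload from raw image bytes (sketch)."""
    return {
        # Base64 text is plain ASCII, so it is safe to embed in JSON.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "prompt_used": prompt,
        "model_version": model_version,
        # Timezone-aware UTC timestamp in ISO 8601 format.
        "generation_timestamp": datetime.now(timezone.utc).isoformat(),
        "image_format": "PNG",
        "resolution": "1024x1024",
        "features_included": list(features),
    }
```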

Data Flow

1. Request arrives at FastAPI endpoint
   ↓
2. Request validation (Pydantic models)
   ↓
3. VisualizationAgent.generate_visualization()
   ↓
4. PromptGenerator.generate_prompt()
   - Analyze risk_data
   - Extract hazard features
   - Get building description
   - Add Philippine context
   - Compile final prompt
   ↓
5. GeminiAPIClient.generate_image()
   - Send prompt to Gemini API
   - Wait for response (10-20 seconds)
   - Receive image bytes
   ↓
6. Response Formatter
   - Encode image to base64
   - Generate metadata
   - Compile features list
   ↓
7. Return VisualizationResponse
   ↓
8. Orchestrator receives response
   ↓
9. Gradio UI displays image

Error Handling Flow

Error occurs at any stage
   ↓
Error caught by try/except block
   ↓
Error categorized (AUTH, RATE_LIMIT, NETWORK, etc.)
   ↓
ErrorDetail object created
   ↓
Response with success=false returned
   ↓
Orchestrator handles error gracefully
   ↓
UI shows error message or continues without visualization
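
The flow above might be sketched as follows. The ErrorDetail fields match the Error Response documented below, and the retry flags follow the Error Codes table. The mapping from built-in Python exception types to error codes is an assumption made for illustration; the agent's real classification logic is not shown in this document.

```python
from dataclasses import asdict, dataclass


@dataclass
class ErrorDetail:
    code: str
    message: str
    retry_possible: bool


# Codes that the Error Codes table marks as retryable.
RETRYABLE_CODES = {
    "RATE_LIMIT", "GENERATION_FAILED", "NETWORK_ERROR", "TIMEOUT", "MODEL_ERROR",
}

# Assumed mapping from exception types to error codes (illustrative only).
EXCEPTION_CODES = [
    (PermissionError, "AUTH_ERROR"),
    (TimeoutError, "TIMEOUT"),
    (ConnectionError, "NETWORK_ERROR"),
    (ValueError, "INVALID_INPUT"),
]


def categorize_error(exc: Exception) -> ErrorDetail:
    """Map an exception to a structured ErrorDetail."""
    code = next(
        (code for exc_type, code in EXCEPTION_CODES if isinstance(exc, exc_type)),
        "GENERATION_FAILED",  # fallback for anything unrecognized
    )
    return ErrorDetail(code=code, message=str(exc), retry_possible=code in RETRYABLE_CODES)


def error_response(exc: Exception) -> dict:
    """Wrap the error in the success=false response envelope."""
    return {
        "success": False,
        "visualization_data": None,
        "error": asdict(categorize_error(exc)),
    }
```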

Technology Stack

  • Framework: FastAPI (HTTP server)
  • AI API: Google Gemini (gemini-2.5-flash-image)
  • Image Processing: Pillow (PIL)
  • Data Validation: Pydantic v2
  • Deployment: Blaxel platform
  • Language: Python 3.11+

Performance Characteristics

  • Latency: 10-20 seconds typical (Gemini API call)
  • Throughput: 5 concurrent requests
  • Memory: ~300MB per request
  • CPU: Minimal (mostly I/O bound)
  • Network: ~2-5MB per request (image download)

Installation

Prerequisites

  • Python 3.11+
  • Gemini API key (Google AI Studio)

Setup

  1. Install dependencies:

    cd visualization-agent
    pip install -r requirements.txt

  2. Configure environment variables:

    cp .env.example .env
    # Edit .env and add your GEMINI_API_KEY

  3. Test the agent:

    python test_agent.py

Usage

As HTTP Service

Start the FastAPI server:

python main.py

Send a POST request to generate a visualization:

curl -X POST http://localhost:8000/ \
  -H "Content-Type: application/json" \
  -d '{
    "risk_data": {
      "seismic_risk": "high",
      "flood_risk": "medium",
      "location": {"latitude": 14.5995, "longitude": 120.9842}
    },
    "building_type": "residential_single_family",
    "recommendations": {...}
  }'
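
For callers that prefer Python to curl, the same request can be built with the standard library. This is a sketch: build_request is a hypothetical helper, and the optional api_key argument mirrors the Bearer header used for Blaxel deployments. Sending is left commented out because it needs a running server.

```python
import json
import urllib.request
from typing import Optional


def build_request(url: str, payload: dict, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a JSON POST request for the visualization endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = {
    "risk_data": {
        "seismic_risk": "high",
        "flood_risk": "medium",
        "location": {"latitude": 14.5995, "longitude": 120.9842},
    },
    "building_type": "residential_single_family",
}
request = build_request("http://localhost:8000/", payload)

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=30) as response:
#     body = json.load(response)
```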

As Python Module

from agent import VisualizationAgent

agent = VisualizationAgent()

visualization_data = agent.generate_visualization(
    risk_data=risk_data,
    building_type="residential_single_family",
    recommendations=recommendations
)

# Access generated image
image_base64 = visualization_data.image_base64
prompt_used = visualization_data.prompt_used
features = visualization_data.features_included
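
To turn image_base64 back into a file, decode it and write the bytes. The sketch below also checks the 8-byte PNG signature before writing; save_png is a hypothetical helper name, not part of the agent.

```python
import base64
from pathlib import Path

# Every PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_png(image_base64: str, path: str) -> int:
    """Decode a base64 PNG, write it to path, and return the byte count."""
    data = base64.b64decode(image_base64, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    Path(path).write_bytes(data)
    return len(data)
```

With the response above, save_png(visualization_data.image_base64, "sketch.png") would write the generated image to disk.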

API Reference

Endpoint

POST /

Request Formats

The agent supports three request formats for backward compatibility:

Format 1: Legacy Format (risk_data + building_type)

This format is supported for backward compatibility with older orchestrator versions:

{
    "risk_data": {
        "location": {...},
        "hazards": {...}
    },
    "building_type": "residential_single_family",
    "recommendations": {...}  // optional
}

The agent automatically converts this to the new format by:

  1. Generating a prompt based on building_type
  2. Creating construction_data from risk_data, building_type, and recommendations
  3. Processing as a context-aware request
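
A rough sketch of that conversion is shown below. The function name legacy_to_new and the prompt wording are assumptions; the agent derives its own prompt text from building_type, and its exact output is not documented here.

```python
def legacy_to_new(payload: dict) -> dict:
    """Convert a legacy request into the prompt + construction_data shape."""
    risk_data = payload["risk_data"]
    building_type = payload["building_type"]
    # Assumed prompt wording, for illustration only.
    readable_type = building_type.replace("_", " ")
    prompt = f"A disaster-resistant {readable_type} building in the Philippines"
    return {
        "prompt": prompt,
        "construction_data": {
            "building_type": building_type,
            "location": risk_data.get("location"),
            "risk_data": risk_data,
            # recommendations is optional in the legacy format.
            "recommendations": payload.get("recommendations"),
        },
    }
```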

Format 2: New Format (prompt + construction_data)

This is the recommended format for new integrations:

{
    "prompt": "A disaster-resistant school building in the Philippines",
    "construction_data": {
        "building_type": "institutional_school",
        "location": {...},
        "risk_data": {...},
        "recommendations": {...}
    },
    "config": {
        "aspect_ratio": "16:9",
        "image_size": "1K"
    }
}

Format 3: Basic Format (prompt only)

For simple use cases without context:

{
    "prompt": "A modern disaster-resistant building in the Philippines"
}
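
One way to tell the three formats apart is to look at which top-level keys are present. The sketch below is an assumption about how such detection could work, not the agent's actual dispatch code; detect_request_format is a hypothetical name.

```python
def detect_request_format(payload: dict) -> str:
    """Return "new", "basic", or "legacy" based on the top-level keys."""
    has_prompt = "prompt" in payload
    if has_prompt and "construction_data" in payload:
        return "new"      # Format 2: prompt + construction_data
    if has_prompt:
        return "basic"    # Format 3: prompt only
    if "risk_data" in payload and "building_type" in payload:
        return "legacy"   # Format 1: risk_data + building_type
    raise ValueError("request matches none of the supported formats")
```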

Request Format Details

Complete Request Schema

{
    "risk_data": {
        "seismic_risk": str,           # "low", "medium", "high"
        "flood_risk": str,              # "low", "medium", "high"
        "volcanic_risk": str,           # "low", "medium", "high"
        "location": {
            "latitude": float,          # 4.0 to 21.0 (Philippines)
            "longitude": float,         # 116.0 to 127.0 (Philippines)
            "municipality": str,        # Optional
            "province": str             # Optional
        },
        "hazards": [                    # Optional, detailed hazard list
            {
                "type": str,            # "seismic", "volcanic", "hydrometeorological"
                "category": str,        # Specific hazard category
                "severity": str,        # "low", "medium", "high"
                "description": str      # Human-readable description
            }
        ]
    },
    "building_type": str,               # See Building Types section
    "recommendations": {                # Optional
        "structural": [
            {
                "category": str,
                "priority": str,
                "description": str
            }
        ]
    }
}

Minimal Request

{
    "risk_data": {
        "seismic_risk": "high",
        "flood_risk": "low",
        "volcanic_risk": "low",
        "location": {
            "latitude": 14.5995,
            "longitude": 120.9842
        }
    },
    "building_type": "residential_single_family"
}

Response Format

Success Response

{
    "success": true,
    "visualization_data": {
        "image_base64": str,            # Base64-encoded PNG image
        "prompt_used": str,             # Full prompt sent to Gemini
        "model_version": str,           # "gemini-2.5-flash-image"
        "generation_timestamp": str,    # ISO 8601 format
        "image_format": "PNG",          # Always PNG
        "resolution": "1024x1024",      # Always 1024x1024
        "features_included": [str]      # List of disaster-resistant features
    },
    "error": null
}

Error Response

{
    "success": false,
    "visualization_data": null,
    "error": {
        "code": str,                    # Error code (see Error Codes section)
        "message": str,                 # Human-readable error message
        "retry_possible": bool          # Whether retry is recommended
    }
}

Request Parameters

risk_data (required)

Field | Type | Required | Description
seismic_risk | string | Yes | Overall seismic risk level: "low", "medium", "high"
flood_risk | string | Yes | Overall flood risk level: "low", "medium", "high"
volcanic_risk | string | Yes | Overall volcanic risk level: "low", "medium", "high"
location | object | Yes | Geographic location data
hazards | array | No | Detailed hazard information

location (required)

Field | Type | Required | Description
latitude | float | Yes | Latitude (4.0 to 21.0 for Philippines)
longitude | float | Yes | Longitude (116.0 to 127.0 for Philippines)
municipality | string | No | Municipality name
province | string | No | Province name
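
The coordinate bounds above can be checked before a request is sent. This is a minimal sketch with a hypothetical validate_location helper; the agent itself validates requests with Pydantic models, whose definitions are not shown here.

```python
# Bounds taken from the location table above.
PH_LATITUDE_RANGE = (4.0, 21.0)
PH_LONGITUDE_RANGE = (116.0, 127.0)


def validate_location(location: dict) -> list[str]:
    """Return a list of problems; an empty list means the location is valid."""
    errors = []
    bounds = (("latitude", PH_LATITUDE_RANGE), ("longitude", PH_LONGITUDE_RANGE))
    for key, (low, high) in bounds:
        value = location.get(key)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{key} is required and must be a number")
        elif not low <= value <= high:
            errors.append(f"{key} {value} is outside {low} to {high}")
    return errors
```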

building_type (required)

Value | Description
residential_single_family | Single-family home
residential_multi_family | Multi-family residential (2-4 units)
residential_high_rise | High-rise apartment building
commercial_office | Modern office building
commercial_retail | Retail shopping center
industrial_warehouse | Industrial warehouse facility
institutional_school | School building
institutional_hospital | Hospital or healthcare facility
infrastructure_bridge | Bridge structure
mixed_use | Mixed-use development

recommendations (optional)

Optional construction recommendations from the research agent. If provided, they may influence feature selection.

Response Fields

visualization_data

Field | Type | Description
image_base64 | string | Base64-encoded PNG image data
prompt_used | string | Complete prompt sent to Gemini API
model_version | string | Gemini model version used
generation_timestamp | string | ISO 8601 timestamp of generation
image_format | string | Always "PNG"
resolution | string | Always "1024x1024"
features_included | array | List of disaster-resistant features shown

error

Field | Type | Description
code | string | Error code (see Error Codes section)
message | string | Human-readable error description
retry_possible | boolean | Whether the request can be retried

HTTP Status Codes

Status Code | Description
200 | Success (check success field in response)
400 | Bad Request (invalid input parameters)
401 | Unauthorized (invalid API key)
429 | Too Many Requests (rate limit exceeded)
500 | Internal Server Error
504 | Gateway Timeout (generation took > 30 seconds)

Rate Limits

  • Free Tier: 60 requests per minute
  • Paid Tier: Varies by plan
  • Concurrent Requests: Maximum 5 simultaneous requests
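
A client that issues many requests can stay under the concurrency limit with a semaphore. The sketch below is client-side throttling under that assumption; run_limited is a hypothetical helper, and each job stands in for one call to the agent.

```python
import asyncio

MAX_CONCURRENT_REQUESTS = 5  # limit stated above


async def run_limited(jobs, limit: int = MAX_CONCURRENT_REQUESTS):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Wait for a free slot before starting the job.
        async with semaphore:
            return await job()

    # gather preserves the order of the input jobs in its results.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Usage: results = asyncio.run(run_limited(list_of_jobs)), where each job is an async callable that performs one request.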

Authentication

When deployed on Blaxel:

curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {BLAXEL_API_KEY}" \
  -d @request.json

Content Type

All requests and responses use application/json.

Supported Building Types

The agent supports 10 building type categories, each with specific architectural characteristics:

Residential Buildings

Building Type | Code | Description | Typical Features
Single Family | residential_single_family | Single-family home | 1-2 stories, pitched roof, residential scale
Multi Family | residential_multi_family | Multi-family residential (2-4 units) | 2-3 stories, multiple entrances, shared spaces
High Rise | residential_high_rise | High-rise apartment building | 10+ stories, elevator core, balconies

Commercial Buildings

Building Type | Code | Description | Typical Features
Office | commercial_office | Modern office building | 3-10 stories, glass facade, modern design
Retail | commercial_retail | Retail shopping center | 1-2 stories, large windows, parking area

Industrial Buildings

Building Type | Code | Description | Typical Features
Warehouse | industrial_warehouse | Industrial warehouse facility | Large open space, high ceilings, loading docks

Institutional Buildings

Building Type | Code | Description | Typical Features
School | institutional_school | School building with classrooms | 1-3 stories, multiple wings, playground area
Hospital | institutional_hospital | Hospital or healthcare facility | 3-5 stories, emergency entrance, medical design

Infrastructure

Building Type | Code | Description | Typical Features
Bridge | infrastructure_bridge | Bridge structure | Span structure, support columns, roadway

Mixed Use

Building Type | Code | Description | Typical Features
Mixed Use | mixed_use | Mixed-use development | Commercial ground floor, residential upper floors

Building Type Selection Guide

Choose the appropriate building type based on your project:

  • Residential Projects: Use residential_single_family for houses, residential_multi_family for apartments/condos, residential_high_rise for towers
  • Commercial Projects: Use commercial_office for office buildings, commercial_retail for shops/malls
  • Industrial Projects: Use industrial_warehouse for factories, warehouses, distribution centers
  • Public Buildings: Use institutional_school for schools, institutional_hospital for hospitals/clinics
  • Infrastructure: Use infrastructure_bridge for bridges, overpasses
  • Mixed Projects: Use mixed_use for buildings combining residential and commercial spaces

Prompt Generation Strategy

The Visualization Agent builds each prompt from a fixed template, filling it with the building description, the hazard-specific features, and Philippine context.

Prompt Template Structure

[Building Type Description] in the Philippines, designed for disaster resistance.

Key Features:
- [Hazard-specific feature 1]
- [Hazard-specific feature 2]
- [Hazard-specific feature 3]

Architectural Style: [Philippine context]
Setting: [Tropical environment with appropriate landscaping]
Perspective: [Exterior view showing structural features]
Style: Architectural sketch, professional rendering
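
Filling the template is plain string assembly. The sketch below uses a hypothetical build_prompt function; its output layout matches the prompt_used strings shown in the examples later in this document, but the agent's real implementation may differ.

```python
def build_prompt(
    building_description: str,
    features: list[str],
    style: str,
    setting: str,
    perspective: str,
) -> str:
    """Fill the prompt template with a building description and features."""
    feature_lines = "\n".join(f"- {feature}" for feature in features)
    return (
        f"{building_description} in the Philippines, designed for disaster resistance.\n\n"
        f"Key Features:\n{feature_lines}\n\n"
        f"Architectural Style: {style}\n"
        f"Setting: {setting}\n"
        f"Perspective: {perspective}\n"
        "Style: Architectural sketch, professional rendering"
    )
```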

Feature Prioritization

When multiple hazards are present, the agent prioritizes features based on:

  1. Risk Level: High-risk hazards get priority over medium/low
  2. Structural Impact: Features that affect the entire building structure
  3. Visual Prominence: Features that are clearly visible in architectural sketches
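
The first criterion can be sketched as a stable sort on risk level. This example deliberately models only risk level; structural impact and visual prominence would need additional scoring data that this document does not specify. The prioritize_features name and the limit parameter are assumptions.

```python
# Lower rank sorts first.
RISK_RANK = {"high": 0, "medium": 1, "low": 2}


def prioritize_features(candidates: list[tuple[str, str]], limit: int = 4) -> list[str]:
    """Order (feature, risk_level) pairs by risk level and drop duplicates."""
    # sorted() is stable, so features with equal risk keep their input order.
    ranked = sorted(candidates, key=lambda item: RISK_RANK.get(item[1], len(RISK_RANK)))
    seen: set[str] = set()
    ordered: list[str] = []
    for feature, _risk_level in ranked:
        if feature not in seen:
            seen.add(feature)
            ordered.append(feature)
    return ordered[:limit]
```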

Philippine Context Integration

The agent automatically adds contextual elements:

  • Tropical climate considerations (ventilation, sun protection)
  • Local architectural styles and materials
  • Appropriate landscaping (palm trees, tropical vegetation)
  • Regional building practices

Hazard-to-Feature Mappings

The agent uses detailed mappings to translate risk data into visual features:

Seismic Hazards

Hazard Type | Risk Level | Visual Features
Active Fault | High | Reinforced concrete frame with visible cross-bracing
Ground Shaking | High | Moment-resisting frames, shear walls
Liquefaction | Medium-High | Deep pile foundation visible at base
Earthquake | All | Structural reinforcements, seismic joints

Example Features:

  • Reinforced concrete frame with cross-bracing
  • Moment-resisting frames
  • Shear walls
  • Deep pile foundations
  • Seismic isolation systems

Volcanic Hazards

Hazard Type | Risk Level | Visual Features
Ashfall | High | Steep-pitched roof (45°+ angle) for ash shedding
Pyroclastic Flow | High | Reinforced concrete construction, protective barriers
Lahar | Medium-High | Elevated foundation, diversion channels
Volcanic Activity | All | Robust roof structure, sealed openings

Example Features:

  • Steep-pitched roof for ash shedding
  • Reinforced concrete construction
  • Protective barriers and walls
  • Elevated foundation
  • Sealed ventilation systems

Hydrometeorological Hazards

Hazard Type | Risk Level | Visual Features
Flood | High | Elevated first floor on stilts (2-3 meters)
Storm Surge | High | Coastal reinforcement, breakwaters
Severe Winds | High | Aerodynamic roof design, hurricane straps
Typhoon | High | Wind-resistant construction, storm shutters
Landslide | Medium-High | Retaining walls, terraced foundation

Example Features:

  • Elevated first floor on stilts
  • Raised foundation
  • Flood barriers
  • Aerodynamic roof design
  • Hurricane straps
  • Storm shutters
  • Retaining walls
  • Terraced foundation

Multi-Hazard Scenarios

When multiple hazards are present, the agent merges the per-hazard features into a single list, favoring features that address more than one hazard:

Example: High Seismic + High Flood

  • Elevated foundation on reinforced concrete piles
  • Moment-resisting frames visible in structure
  • Cross-bracing on elevated sections

Example: High Volcanic + Medium Wind

  • Steep-pitched roof with aerodynamic design
  • Reinforced concrete construction
  • Storm shutters on windows

Example Requests and Responses

Example 1: Residential Building with High Seismic Risk

Request:

{
  "risk_data": {
    "seismic_risk": "high",
    "flood_risk": "low",
    "volcanic_risk": "low",
    "location": {
      "latitude": 14.5995,
      "longitude": 120.9842,
      "municipality": "Manila",
      "province": "Metro Manila"
    },
    "hazards": [
      {
        "type": "seismic",
        "category": "active_fault",
        "severity": "high",
        "description": "Near active fault line"
      }
    ]
  },
  "building_type": "residential_single_family",
  "recommendations": {
    "structural": [
      {
        "category": "foundation",
        "priority": "critical",
        "description": "Use reinforced concrete foundation with seismic isolation"
      }
    ]
  }
}

Response:

{
  "success": true,
  "visualization_data": {
    "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
    "prompt_used": "Single-family home in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Reinforced concrete frame with visible cross-bracing\n- Moment-resisting frames for earthquake protection\n- Deep pile foundation visible at base\n\nArchitectural Style: Modern Filipino residential with tropical design elements\nSetting: Tropical environment with palm trees and lush vegetation\nPerspective: Exterior view showing structural reinforcements\nStyle: Architectural sketch, professional rendering",
    "model_version": "gemini-2.5-flash-image",
    "generation_timestamp": "2024-01-15T10:30:45.123Z",
    "image_format": "PNG",
    "resolution": "1024x1024",
    "features_included": [
      "Reinforced concrete frame with cross-bracing",
      "Moment-resisting frames",
      "Deep pile foundation"
    ]
  }
}

Example 2: Commercial Building with Flood Risk

Request:

{
  "risk_data": {
    "seismic_risk": "low",
    "flood_risk": "high",
    "volcanic_risk": "low",
    "location": {
      "latitude": 10.3157,
      "longitude": 123.8854,
      "municipality": "Cebu City",
      "province": "Cebu"
    },
    "hazards": [
      {
        "type": "hydrometeorological",
        "category": "flood",
        "severity": "high",
        "description": "Flood-prone area"
      }
    ]
  },
  "building_type": "commercial_office"
}

Response:

{
  "success": true,
  "visualization_data": {
    "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
    "prompt_used": "Modern office building in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Elevated first floor on reinforced concrete stilts (2-3 meters)\n- Flood barriers around perimeter\n- Water-resistant materials for lower levels\n\nArchitectural Style: Contemporary Filipino commercial architecture\nSetting: Urban tropical environment with flood management features\nPerspective: Exterior view showing elevated foundation\nStyle: Architectural sketch, professional rendering",
    "model_version": "gemini-2.5-flash-image",
    "generation_timestamp": "2024-01-15T10:32:18.456Z",
    "image_format": "PNG",
    "resolution": "1024x1024",
    "features_included": [
      "Elevated first floor on stilts",
      "Flood barriers",
      "Water-resistant construction"
    ]
  }
}

Example 3: Multi-Hazard Scenario

Request:

{
  "risk_data": {
    "seismic_risk": "high",
    "flood_risk": "medium",
    "volcanic_risk": "high",
    "location": {
      "latitude": 13.2572,
      "longitude": 123.8144,
      "municipality": "Legazpi",
      "province": "Albay"
    },
    "hazards": [
      {
        "type": "volcanic",
        "category": "ashfall",
        "severity": "high"
      },
      {
        "type": "seismic",
        "category": "earthquake",
        "severity": "high"
      }
    ]
  },
  "building_type": "institutional_school"
}

Response:

{
  "success": true,
  "visualization_data": {
    "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
    "prompt_used": "School building with classrooms in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Steep-pitched reinforced concrete roof for volcanic ash shedding\n- Reinforced concrete frame with seismic cross-bracing\n- Moment-resisting frames for earthquake protection\n- Protective barriers around building perimeter\n\nArchitectural Style: Institutional Filipino architecture with disaster-resistant design\nSetting: Tropical environment near volcanic area with protective landscaping\nPerspective: Exterior view showing roof design and structural reinforcements\nStyle: Architectural sketch, professional rendering",
    "model_version": "gemini-2.5-flash-image",
    "generation_timestamp": "2024-01-15T10:35:22.789Z",
    "image_format": "PNG",
    "resolution": "1024x1024",
    "features_included": [
      "Steep-pitched roof for ash shedding",
      "Reinforced concrete frame with cross-bracing",
      "Moment-resisting frames",
      "Protective barriers"
    ]
  }
}

Example 4: Error Response

Request:

{
  "risk_data": {...},
  "building_type": "residential_single_family"
}

Response (Invalid API Key):

{
  "success": false,
  "visualization_data": null,
  "error": {
    "code": "AUTH_ERROR",
    "message": "Invalid or missing Gemini API key. Please check your GEMINI_API_KEY environment variable.",
    "retry_possible": false
  }
}

Error Handling

The agent returns structured errors. Each error carries a code, a human-readable message, and a flag that says whether a retry is worthwhile.

Error Codes

Error Code | Description | Retry Possible | Recommended Action
AUTH_ERROR | Invalid or missing API key | No | Check GEMINI_API_KEY environment variable
RATE_LIMIT | API quota exceeded | Yes | Wait and retry after delay (typically 60 seconds)
GENERATION_FAILED | Image generation failed | Yes | Retry with same or modified prompt
NETWORK_ERROR | Connection issues | Yes | Check internet connection and retry
TIMEOUT | Generation took longer than 30 seconds | Yes | Retry or simplify prompt
INVALID_INPUT | Invalid request parameters | No | Check request format and parameters
MODEL_ERROR | Gemini model error | Yes | Retry or contact support

Error Response Format

All errors follow this structure:

{
  "success": false,
  "visualization_data": null,
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error description",
    "retry_possible": true/false
  }
}

Retry Strategy

For errors with retry_possible: true:

  1. Rate Limit Errors: Wait 60 seconds before retrying
  2. Network Errors: Retry immediately, then with exponential backoff (2s, 4s, 8s)
  3. Generation Failures: Retry up to 2 times with same prompt
  4. Timeouts: Retry once, then consider simplifying the prompt
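
The schedule above can be captured in a small retry wrapper. This is a sketch under stated assumptions: AgentError and call_with_retries are hypothetical names, a rate-limited call is retried once after 60 seconds (the list does not say how many times), and MODEL_ERROR is not retried here because no schedule is given for it.

```python
import time


class AgentError(Exception):
    """Hypothetical exception carrying one of the documented error codes."""

    def __init__(self, code: str, message: str = ""):
        super().__init__(message or code)
        self.code = code


# Seconds to wait before each retry, per error code.
RETRY_DELAYS = {
    "RATE_LIMIT": [60.0],                   # assumed: a single retry
    "NETWORK_ERROR": [0.0, 2.0, 4.0, 8.0],  # immediate, then exponential backoff
    "GENERATION_FAILED": [0.0, 0.0],        # up to 2 retries, same prompt
    "TIMEOUT": [0.0],                       # retry once
}


def call_with_retries(func, sleep=time.sleep):
    """Call func(), retrying according to RETRY_DELAYS for the raised code."""
    attempt = 0
    while True:
        try:
            return func()
        except AgentError as exc:
            delays = RETRY_DELAYS.get(exc.code, [])
            if attempt >= len(delays):
                raise  # not retryable, or retries exhausted
            sleep(delays[attempt])
            attempt += 1
```

The sleep parameter exists so that tests can pass a recording function instead of actually waiting.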

Error Logging

All errors are logged with full context:

  • Request parameters
  • Error type and message
  • Timestamp
  • Stack trace (for debugging)

Example log entry:

2024-01-15 10:30:45 ERROR [VisualizationAgent] Image generation failed
  Error: RATE_LIMIT
  Message: API quota exceeded
  Request: building_type=residential_single_family, location=Manila
  Retry: true

Deployment

Prerequisites

Before deploying, ensure you have:

  1. Blaxel CLI installed:

    pip install blaxel
    
  2. Blaxel account and workspace:

    • Sign up at blaxel.ai
    • Create a workspace
    • Get your API key
  3. Gemini API key (from Google AI Studio)

Local Development

  1. Install dependencies:

    cd visualization-agent
    pip install -r requirements.txt
    
  2. Configure environment:

    cp .env.example .env
    # Edit .env and add:
    # GEMINI_API_KEY=your_api_key_here
    
  3. Run locally:

    python main.py
    
  4. Test the agent:

    python test_agent.py
    

Blaxel Platform Deployment

Step 1: Configure Environment Variables

Create or update .env file:

GEMINI_API_KEY=your_gemini_api_key
GEMINI_MODEL=gemini-2.5-flash-image

Step 2: Review Configuration

Check blaxel.toml configuration:

name = "visualization-agent"
type = "agent"

[env]
GEMINI_API_KEY = "${GEMINI_API_KEY}"
GEMINI_MODEL = "${GEMINI_MODEL}"

[runtime]
timeout = 30
memory = 512

[entrypoint]
prod = "python main.py"

[[triggers]]
id = "trigger-visualization-agent"
type = "http"
timeout = 30

[triggers.configuration]
path = "agents/visualization-agent/process"
retry = 1
authenticationType = "private"

Step 3: Deploy to Blaxel

cd visualization-agent
bl deploy --env-file .env

Step 4: Verify Deployment

The agent will be available at:

https://run.blaxel.ai/{workspace}/agents/visualization-agent

Test the deployed agent:

curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {BLAXEL_API_KEY}" \
  -d @test_request.json

Configuration Options

Runtime Configuration

Parameter | Default | Description
timeout | 30 | Maximum execution time (seconds)
memory | 512 | Memory limit (MB)
retry | 1 | Number of retry attempts

Model Configuration

Parameter | Default | Description
GEMINI_MODEL | gemini-2.5-flash-image | Gemini model version
GEMINI_API_KEY | (required) | Google Gemini API key

Image Configuration

  • Resolution: 1024x1024 (fixed)
  • Format: PNG
  • Watermark: SynthID (automatic)

Integration with Orchestrator

The orchestrator agent calls the visualization agent automatically. To integrate:

  1. Update orchestrator's blaxel.toml:

    [[resources]]
    id = "visualization-agent"
    type = "agent"
    name = "visualization-agent"
    
  2. Orchestrator calls visualization agent:

    visualization_response = await self.execute_visualization(
        risk_data=risk_data,
        building_type=building_type,
        recommendations=recommendations
    )
    
  3. Response flows to Gradio UI:

    • Image displayed in visualization tab
    • Metadata shown alongside image
    • Features list displayed

Monitoring and Logs

View Logs

bl logs visualization-agent

Monitor Performance

Key metrics to monitor:

  • Generation Time: Should be < 30 seconds
  • Success Rate: Should be > 95%
  • Error Rate: Monitor for rate limit errors
  • Memory Usage: Should stay under 512MB

Common Issues

  1. Timeout Errors:

    • Increase timeout in blaxel.toml
    • Simplify prompts
    • Check Gemini API status
  2. Rate Limit Errors:

    • Implement request throttling
    • Upgrade Gemini API quota
    • Add retry logic with backoff
  3. Memory Issues:

    • Increase memory limit in blaxel.toml
    • Optimize image processing
    • Check for memory leaks

Scaling Considerations

For high-volume deployments:

  1. Increase Memory: Set to 1024MB for better performance
  2. Add Caching: Cache generated images for identical requests
  3. Load Balancing: Deploy multiple instances
  4. Rate Limiting: Implement request queuing
  5. Monitoring: Set up alerts for errors and performance
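
For the caching item, identical requests need a stable key. One possible approach, sketched below with hypothetical names, hashes a canonical JSON form of the request so that key order does not matter. A cache hit returns the earlier image rather than generating a new one, so this only suits callers that accept a repeated result.

```python
import hashlib
import json
from typing import Optional


def request_cache_key(payload: dict) -> str:
    """Hash a canonical JSON form so that key order does not change the key."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class VisualizationCache:
    """In-memory cache keyed by request content (illustrative only)."""

    def __init__(self) -> None:
        self._entries: dict[str, dict] = {}

    def get(self, payload: dict) -> Optional[dict]:
        return self._entries.get(request_cache_key(payload))

    def put(self, payload: dict, response: dict) -> None:
        self._entries[request_cache_key(payload)] = response
```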

Security Best Practices

  1. API Key Management:

    • Never commit API keys to version control
    • Use environment variables only
    • Rotate keys regularly
  2. Authentication:

    • Use authenticationType = "private" in blaxel.toml
    • Require BLAXEL_API_KEY for all requests
    • Validate request signatures
  3. Input Validation:

    • Validate all input parameters
    • Sanitize location data
    • Check building type against allowed values
  4. Output Security:

    • Ensure generated images don't contain sensitive data
    • Add watermarks (automatic with SynthID)
    • Log all generation requests

Testing

Unit Tests

Run the unit test suite:

python test_agent.py

The test suite includes:

Agent Initialization Tests

  • Verify agent initializes with correct configuration
  • Check Gemini API client setup
  • Validate environment variable loading

Building Type Tests

  • Residential Single Family: High seismic risk scenario
  • Commercial Office: Flood risk scenario
  • Institutional School: Multiple hazards (volcanic + seismic)
  • Industrial Warehouse: Wind resistance scenario
  • Infrastructure Bridge: Multi-hazard scenario

Risk Scenario Tests

  • High seismic risk with active fault
  • High flood risk in coastal area
  • High volcanic risk with ashfall
  • Multiple hazards combined
  • Low risk baseline scenario

Error Handling Tests

  • Invalid API key handling
  • Network error simulation
  • Timeout handling
  • Rate limit error handling
  • Invalid input validation

Response Format Tests

  • Base64 encoding validation
  • Metadata completeness
  • Timestamp format verification
  • Features list accuracy

Integration Tests

Test the HTTP endpoint:

python test_http_endpoint.py

Integration tests cover:

  • POST endpoint functionality
  • Request validation
  • Response format compatibility
  • Error response handling
  • Orchestrator integration

Manual Testing

Test with the real Gemini API:

  1. Set up environment:

    export GEMINI_API_KEY=your_api_key
    
  2. Run test script:

    python test_agent.py
    
  3. Verify output:

    • Check generated images are valid PNG files
    • Verify disaster-resistant features are visible
    • Confirm metadata is accurate
    • Validate generation time < 30 seconds

Test Coverage

Current test coverage:

  • Prompt Generation: 100%
  • API Client: 95% (excluding live API calls)
  • Response Formatting: 100%
  • Error Handling: 100%
  • Integration: 90%

Performance Testing

Test performance metrics:

# Test generation time
time python test_agent.py

# Test concurrent requests (the -n option requires the pytest-xdist plugin)
python -m pytest test_concurrent.py -n 5

# Test memory usage (requires the memory_profiler package)
python -m memory_profiler test_agent.py

Expected performance:

  • Generation Time: 10-20 seconds average
  • Memory Usage: < 300MB per request
  • Success Rate: > 95%
  • Concurrent Requests: 5 simultaneous requests supported

Performance

  • Generation Time: 10-20 seconds typical
  • Image Size: ~500KB - 2MB per image
  • Resolution: 1024x1024 pixels
  • Format: PNG with SynthID watermark

Troubleshooting

Common Issues and Solutions

Issue: "Invalid API Key" Error

Symptoms:

{
  "error": {
    "code": "AUTH_ERROR",
    "message": "Invalid or missing Gemini API key"
  }
}

Solutions:

  1. Check that the .env file contains GEMINI_API_KEY=your_key
  2. Verify API key is valid at Google AI Studio
  3. Ensure no extra spaces or quotes around the key
  4. Restart the agent after updating .env
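
The whitespace and quoting problems from step 3 are easy to miss by eye. A minimal sketch of a format check follows; the helper name is hypothetical, and it only inspects the string's shape, not whether Google accepts the key.

```python
def check_api_key_format(raw_value):
    """Return a list of formatting problems found in a raw GEMINI_API_KEY value."""
    if raw_value is None or raw_value == "":
        return ["GEMINI_API_KEY is not set"]

    problems = []
    stripped = raw_value.strip()
    if raw_value != stripped:
        problems.append("leading or trailing whitespace")
    # Quotes copied from a shell command or .env line often end up inside the value.
    if stripped[:1] in ("'", '"') or stripped[-1:] in ("'", '"'):
        problems.append("surrounding quotes")
    return problems
```

A typical call is `check_api_key_format(os.environ.get("GEMINI_API_KEY"))` after importing `os`.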

Issue: Rate Limit Exceeded

Symptoms:

{
  "error": {
    "code": "RATE_LIMIT",
    "message": "API quota exceeded"
  }
}

Solutions:

  1. Wait 60 seconds before retrying
  2. Check your API quota at Google AI Studio
  3. Upgrade to higher quota tier if needed
  4. Implement request throttling in orchestrator
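
Request throttling in the orchestrator could look like the sliding-window sketch below. The class is hypothetical and the limits passed to it are placeholders; use the quota that applies to your own API tier.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Allow at most max_calls within any window of period seconds."""

    def __init__(self, max_calls: int, period: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._calls = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record_call(self) -> None:
        """Record that a request was just sent."""
        self._calls.append(self.clock())
```

Before each request, the caller sleeps for `wait_time()` seconds if it is non-zero, sends the request, and then calls `record_call()`.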

Issue: Generation Timeout

Symptoms:

  • Request takes longer than 30 seconds
  • Timeout error returned

Solutions:

  1. Simplify the prompt (reduce number of features)
  2. Check Gemini API status
  3. Increase timeout in blaxel.toml (not recommended)
  4. Retry the request

Issue: Poor Quality Visualizations

Symptoms:

  • Generated images don't show disaster-resistant features clearly
  • Building type doesn't match expectations

Solutions:

  1. Verify risk data is accurate and complete
  2. Check building type is correct
  3. Ensure hazard severity levels are set appropriately
  4. Review prompt generation logic in agent.py

Issue: Network Errors

Symptoms:

{
  "error": {
    "code": "NETWORK_ERROR",
    "message": "Connection failed"
  }
}

Solutions:

  1. Check internet connection
  2. Verify firewall allows HTTPS to Google APIs
  3. Check proxy settings if applicable
  4. Retry with exponential backoff
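
Retrying with exponential backoff can be sketched as follows. TransientError is a hypothetical marker exception used for illustration; in practice you would catch whatever exceptions your HTTP client raises for network failures.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures such as network errors."""


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying on TransientError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter keeps concurrent callers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the defaults, the delays are roughly 1, 2, and 4 seconds before the fourth and final attempt.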

Issue: Memory Errors

Symptoms:

  • Agent crashes with out-of-memory error
  • Slow performance

Solutions:

  1. Increase memory limit in blaxel.toml to 1024MB
  2. Check for memory leaks in custom code
  3. Reduce concurrent request limit
  4. Monitor memory usage with profiling tools

Debug Mode

Enable debug logging:

import logging
logging.basicConfig(level=logging.DEBUG)

This will show:

  • Detailed API request/response logs
  • Prompt generation steps
  • Error stack traces
  • Performance metrics

Getting Help

If you encounter issues not covered here:

  1. Check the logs: bl logs visualization-agent
  2. Review the Gemini API documentation
  3. Check the main project documentation
  4. Contact support with:
    • Error message and code
    • Request payload (sanitized)
    • Timestamp of the error
    • Agent version and configuration

Limitations

Technical Limitations

  • Internet Connection: Requires active internet for Gemini API
  • API Rate Limits: Subject to Gemini API quotas (varies by tier)
  • Generation Time: Typically 10-20 seconds per image, up to the 30-second timeout (determined by the Gemini API and cannot be reduced)
  • Resolution: Fixed at 1024x1024 pixels
  • Format: PNG only (no JPEG, SVG, or other formats)
  • Watermark: SynthID watermark automatically added (cannot be removed)

Functional Limitations

  • Artistic Interpretation: Generated images are conceptual sketches, not engineering drawings
  • Feature Visibility: Some structural features may not be clearly visible in exterior views
  • Accuracy: AI-generated images may not perfectly represent all specified features
  • Consistency: Multiple generations with same prompt may produce different results
  • Detail Level: Cannot generate detailed floor plans or technical specifications

Geographic Limitations

  • Philippine Context: Optimized for Philippine architecture and climate
  • Location Data: Requires valid Philippine coordinates
  • Regional Styles: May not accurately represent all regional architectural variations
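
A coarse pre-check for the coordinate requirement might look like this. The bounding box is a rough approximation chosen for illustration, not the agent's actual validation rule, and it will also accept some points in the surrounding sea.

```python
# Approximate bounding box for the Philippines (assumed values, in degrees).
PH_LAT_RANGE = (4.0, 21.5)
PH_LON_RANGE = (116.0, 127.0)


def is_plausible_philippine_location(latitude: float, longitude: float) -> bool:
    """Return True if the point falls inside the rough Philippine bounding box."""
    return (
        PH_LAT_RANGE[0] <= latitude <= PH_LAT_RANGE[1]
        and PH_LON_RANGE[0] <= longitude <= PH_LON_RANGE[1]
    )
```

A check like this catches swapped latitude/longitude values and obviously foreign coordinates before a request is sent.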

Use Case Limitations

Appropriate Uses:

  • Conceptual visualization for stakeholders
  • Initial design exploration
  • Communication tool for non-technical audiences
  • Marketing and presentation materials

Inappropriate Uses:

  • Engineering drawings or construction blueprints
  • Structural analysis or calculations
  • Building permit applications
  • Detailed cost estimation basis
  • Legal or contractual documentation

Integration

The Visualization Agent integrates with:

  • Orchestrator Agent: Receives requests and returns visualization data
  • Gradio UI: Displays generated images in the web interface
  • Risk Assessment Agent: Uses risk data to inform feature selection

Environment Variables

Required Variables

  • GEMINI_API_KEY (required): Google Gemini API key for image generation

Optional Variables

  • VISUALIZATION_MODEL (optional): Gemini model version to use

    • Default: gemini-2.5-flash-image
    • Options: gemini-2.5-flash-image, gemini-3-pro-image-preview
    • Example: VISUALIZATION_MODEL=gemini-2.5-flash-image
  • VISUALIZATION_OUTPUT_DIR (optional): Directory where generated images will be saved

    • Default: ./generated_images
    • Example: VISUALIZATION_OUTPUT_DIR=./my_images
    • Note: Directory will be created automatically if it doesn't exist

Environment Variable Priority

The agent loads configuration in the following priority order (highest to lowest):

  1. Constructor parameters: Values passed directly to VisualizationAgent()
  2. Environment variables: Values from .env file or system environment
  3. Default values: Built-in defaults

Example:

# Priority 1: Constructor parameter (highest)
agent = VisualizationAgent(model="gemini-3-pro-image-preview")

# Priority 2: Environment variable
# VISUALIZATION_MODEL=gemini-2.5-flash-image

# Priority 3: Default value (lowest)
# Default: gemini-2.5-flash-image
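
The priority order can be expressed as a short resolution function. This is a sketch of the described behavior under the assumption that empty values are treated as unset; it is not the agent's actual implementation, and resolve_model is a hypothetical name.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash-image"


def resolve_model(model=None, environ=None):
    """Pick the model name: constructor argument, then environment, then default."""
    environ = os.environ if environ is None else environ
    if model:
        return model  # Priority 1: value passed explicitly
    # Priority 2: environment variable; Priority 3: built-in default
    return environ.get("VISUALIZATION_MODEL") or DEFAULT_MODEL
```

Passing `environ` explicitly keeps the function testable without modifying the real process environment.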

Setting Environment Variables

Local Development

Create a .env file in the visualization-agent directory:

# Required
GEMINI_API_KEY=your_gemini_api_key_here

# Optional
VISUALIZATION_MODEL=gemini-2.5-flash-image
VISUALIZATION_OUTPUT_DIR=./generated_images

Blaxel Deployment

Set environment variables in blaxel.toml:

[env]
GEMINI_API_KEY = "${GEMINI_API_KEY}"
VISUALIZATION_MODEL = "${VISUALIZATION_MODEL}"
VISUALIZATION_OUTPUT_DIR = "${VISUALIZATION_OUTPUT_DIR}"

Then deploy with environment file:

bl deploy --env-file .env

System Environment

Set environment variables in your shell:

# Bash/Zsh
export GEMINI_API_KEY=your_api_key
export VISUALIZATION_MODEL=gemini-2.5-flash-image
export VISUALIZATION_OUTPUT_DIR=./generated_images

REM Windows Command Prompt
set GEMINI_API_KEY=your_api_key
set VISUALIZATION_MODEL=gemini-2.5-flash-image
set VISUALIZATION_OUTPUT_DIR=./generated_images

# Windows PowerShell
$env:GEMINI_API_KEY="your_api_key"
$env:VISUALIZATION_MODEL="gemini-2.5-flash-image"
$env:VISUALIZATION_OUTPUT_DIR="./generated_images"

Best Practices

Prompt Optimization

  1. Be Specific: Include detailed hazard information for better feature selection
  2. Prioritize Hazards: Focus on the most critical risks (high severity)
  3. Provide Context: Include location data for better Philippine context
  4. Use Recommendations: Pass construction recommendations when available

Performance Optimization

  1. Batch Requests: Group multiple visualizations when possible
  2. Cache Results: Cache generated images for identical requests
  3. Async Processing: Use async/await for concurrent requests
  4. Monitor Quotas: Track API usage to avoid rate limits
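
Caching for identical requests can be sketched with a hash of the canonicalized request as the key. The class and function names are hypothetical, and the in-memory dictionary stands in for whatever store a real deployment would use.

```python
import hashlib
import json


def request_cache_key(request: dict) -> str:
    """Build a stable key: identical requests hash the same regardless of key order."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class VisualizationCache:
    """In-memory cache that calls generate(request) only on a cache miss."""

    def __init__(self):
        self._store = {}

    def get_or_generate(self, request: dict, generate):
        key = request_cache_key(request)
        if key not in self._store:
            self._store[key] = generate(request)
        return self._store[key]
```

Because repeated generations of the same prompt can differ, caching trades that variability for lower latency and API usage; skip the cache when a fresh variation is wanted.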

Error Handling

  1. Implement Retries: Retry transient errors with exponential backoff
  2. Graceful Degradation: Continue without visualization if generation fails
  3. Log Errors: Log all errors with full context for debugging
  4. User Feedback: Provide clear error messages to users

Security

  1. Protect API Keys: Never expose GEMINI_API_KEY in client code
  2. Validate Input: Always validate and sanitize input parameters
  3. Rate Limiting: Implement rate limiting to prevent abuse
  4. Monitor Usage: Track API usage and set up alerts

Integration

  1. Async Calls: Call visualization agent asynchronously from orchestrator
  2. Timeout Handling: Set appropriate timeouts (30+ seconds)
  3. Fallback Logic: Have fallback behavior if visualization fails
  4. Response Validation: Validate response format before using
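
The async call, timeout, and fallback practices combine naturally in one wrapper. In this sketch, `generate` stands for any coroutine function that calls the visualization agent; its name and signature are assumptions for illustration.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def get_visualization_or_none(generate, request, timeout_seconds=30.0):
    """Await generate(request); return None instead of raising on timeout or error."""
    try:
        return await asyncio.wait_for(generate(request), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        logger.warning("Visualization timed out after %.1f seconds", timeout_seconds)
    except Exception:
        # Broad catch is deliberate: visualization is optional for the caller.
        logger.exception("Visualization failed")
    return None
```

Returning None lets the orchestrator continue with the rest of its response and leave it to the UI to show a message in place of the image.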

Frequently Asked Questions

General Questions

Q: How long does it take to generate a visualization? A: Typically 10-20 seconds, with a maximum timeout of 30 seconds.

Q: Can I generate multiple visualizations for the same building? A: Yes, but each request may produce slightly different results due to AI generation variability.

Q: What image format is returned? A: PNG format, base64-encoded, at 1024x1024 resolution.

Q: Can I remove the SynthID watermark? A: No, the watermark is automatically added by the Gemini API and cannot be removed.

Technical Questions

Q: Can I use a different Gemini model? A: Yes, set the VISUALIZATION_MODEL environment variable (or pass model to the VisualizationAgent constructor), but gemini-2.5-flash-image is recommended for speed.

Q: How do I increase the image resolution? A: Currently fixed at 1024x1024. Higher resolutions may be supported in future Gemini models.

Q: Can I generate images without an internet connection? A: No, the agent requires internet access to call the Gemini API.

Q: How many concurrent requests can the agent handle? A: Up to 5 concurrent requests, limited by memory and API quotas.

Integration Questions

Q: How does the orchestrator call the visualization agent? A: Via HTTP POST to the Blaxel endpoint with risk data and building type.

Q: What happens if visualization generation fails? A: The orchestrator continues without visualization data, and the UI shows a message.

Q: Can I call the visualization agent directly from the UI? A: Not recommended. Always call through the orchestrator for proper coordination.

Q: How is the generated image displayed in the UI? A: The Gradio UI decodes the base64 image and displays it in the visualization tab.

Cost and Limits

Q: How much does it cost to generate a visualization? A: Depends on your Gemini API plan. Check Google AI Studio for pricing.

Q: What are the rate limits? A: Free tier: 60 requests/minute at the time of writing. Paid tiers vary by plan. Check Google AI Studio for current limits.

Q: Can I increase my API quota? A: Yes, upgrade your Gemini API plan at Google AI Studio.

Q: Is there a limit on the number of visualizations? A: Only limited by your API quota and rate limits.

Troubleshooting

Q: Why am I getting "Invalid API Key" errors? A: Check that GEMINI_API_KEY is set correctly in your .env file and is valid.

Q: Why are my visualizations timing out? A: Check your internet connection and Gemini API status. Simplify prompts if needed.

Q: Why don't the disaster-resistant features show clearly? A: Ensure risk data is accurate and hazard severity is set appropriately. AI generation may vary.

Q: How do I debug generation issues? A: Enable debug logging and check the prompt_used field in the response.

Roadmap

Planned Features

  • Multiple View Angles: Generate front, side, and aerial views
  • Before/After Comparisons: Show standard vs. disaster-resistant designs
  • Higher Resolution: Support 4K resolution with Gemini 3 Pro
  • Style Variations: Allow users to choose architectural styles
  • Annotation Overlay: Add labels pointing to disaster-resistant features
  • Interactive Refinement: Support multi-turn conversations for improvements
  • Cost Visualization: Overlay cost information on the visualization
  • 3D Models: Generate 3D models in addition to 2D sketches

Future Enhancements

  • Caching layer for identical requests
  • Batch processing for multiple buildings
  • Custom style templates
  • Integration with CAD software
  • Export to additional formats (SVG, PDF)
  • Localization for other languages

Contributing

This agent is part of the Disaster Risk Construction Planner system. For contributions:

  1. Follow the existing code structure and patterns
  2. Add tests for new features
  3. Update documentation
  4. Ensure compatibility with orchestrator and UI

Version History

  • v1.0.0 (2024-01): Initial release
    • Basic visualization generation
    • Support for 10 building types
    • Integration with orchestrator
    • Gemini 2.5 Flash Image model

License

Part of the Disaster Risk Construction Planner system.

Support

For issues or questions:

  • Check this documentation first
  • Review the troubleshooting section
  • Check the main project documentation
  • Review Gemini API documentation at Google AI Studio

Acknowledgments

  • Google Gemini API for image generation
  • Blaxel platform for agent deployment
  • Philippines Disaster Risk data sources
  • Open-source community for tools and libraries