Visualization Agent

An AI agent that generates architectural sketches of disaster-resistant buildings using Google's Gemini image generation API (Nano Banana).

Overview

The Visualization Agent receives risk assessment data and building specifications to create contextual architectural visualizations. It analyzes disaster risks (seismic, volcanic, hydrometeorological) and generates prompts that incorporate appropriate disaster-resistant features, then uses Gemini's image generation API to create visual representations.

Features

Risk-Aware Visualization: Incorporates disaster-resistant features based on risk assessment
Building Type Support: Generates appropriate architecture for residential, commercial, institutional, industrial, and infrastructure projects
Philippine Context: Includes tropical climate and local architectural considerations
Fast Generation: Uses the gemini-2.5-flash-image model, which typically returns an image in 10-20 seconds
Detailed Metadata: Returns prompt used, features included, and generation timestamp
Backward Compatibility: Supports the legacy format (risk_data + building_type), the new format (prompt + construction_data), and a prompt-only basic format

Architecture

High-Level Architecture

Orchestrator Agent
    ↓
Visualization Agent (FastAPI)
    ├─→ Request Validator
    ├─→ Prompt Generator
    │   ├─→ Hazard Analyzer
    │   ├─→ Feature Mapper
    │   └─→ Context Builder
    ├─→ Gemini API Client
    │   ├─→ API Request Handler
    │   ├─→ Error Handler
    │   └─→ Response Parser
    └─→ Response Formatter
        ├─→ Base64 Encoder
        ├─→ Metadata Generator
        └─→ Feature List Compiler
    ↓
Returns VisualizationData to Orchestrator
    ↓
Gradio UI (displays image)

Component Details

1. VisualizationAgent (Main Class)

Responsibilities:

Orchestrate the visualization generation process
Coordinate between prompt generator and API client
Handle errors and format responses

Key Methods:

generate_visualization(): Main entry point
_validate_input(): Validate request parameters
_format_response(): Format final response with metadata

2. PromptGenerator

Responsibilities:

Analyze risk data to identify relevant hazards
Map hazards to visual features
Generate descriptive prompts for Gemini API
Add Philippine architectural context

Key Methods:

generate_prompt(): Create complete prompt
_extract_hazard_features(): Extract features from risk data
_get_building_description(): Get building type description
_add_philippine_context(): Add contextual elements
_prioritize_features(): Prioritize features for multi-hazard scenarios

Feature Mapping Logic:

# Start with an empty list; each high-risk hazard appends its visual features
features = []

# Seismic hazards
if risk_data.seismic_risk == "high":
    features.append("Reinforced concrete frame with cross-bracing")
    features.append("Moment-resisting frames")

# Flood hazards
if risk_data.flood_risk == "high":
    features.append("Elevated first floor on stilts")

# Volcanic hazards
if risk_data.volcanic_risk == "high":
    features.append("Steep-pitched roof for ash shedding")

3. GeminiAPIClient

Responsibilities:

Communicate with Google Gemini API
Handle API authentication
Manage timeouts and retries
Parse API responses

Key Methods:

generate_image(): Call Gemini API
_handle_api_error(): Convert API errors to structured format
_validate_response(): Validate API response

API Configuration:

Model: gemini-2.5-flash-image
Resolution: 1024x1024
Format: PNG
Timeout: 30 seconds

4. Response Formatter

Responsibilities:

Encode image data to base64
Generate metadata
Compile features list
Format final response

Metadata Included:

Prompt used for generation
Model version
Generation timestamp (ISO 8601)
Image format and resolution
List of disaster-resistant features
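
As a rough illustration of the formatter's job, the sketch below encodes image bytes to base64 and assembles the metadata fields listed above. The function name `format_visualization_data` and its signature are assumptions made for this example, not the agent's actual code; the dictionary keys follow the response schema documented in the API Reference.

```python
import base64
from datetime import datetime, timezone


def format_visualization_data(image_bytes, prompt, features,
                              model_version="gemini-2.5-flash-image"):
    """Build a metadata dictionary shaped like the documented response fields."""
    # ISO 8601 timestamp in UTC with millisecond precision and a "Z" suffix
    timestamp = (
        datetime.now(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "prompt_used": prompt,
        "model_version": model_version,
        "generation_timestamp": timestamp,
        "image_format": "PNG",
        "resolution": "1024x1024",
        "features_included": list(features),
    }
```

The image format and resolution are hard-coded here only because this README describes them as fixed.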

Data Flow

1. Request arrives at FastAPI endpoint
   ↓
2. Request validation (Pydantic models)
   ↓
3. VisualizationAgent.generate_visualization()
   ↓
4. PromptGenerator.generate_prompt()
   - Analyze risk_data
   - Extract hazard features
   - Get building description
   - Add Philippine context
   - Compile final prompt
   ↓
5. GeminiAPIClient.generate_image()
   - Send prompt to Gemini API
   - Wait for response (10-20 seconds)
   - Receive image bytes
   ↓
6. Response Formatter
   - Encode image to base64
   - Generate metadata
   - Compile features list
   ↓
7. Return VisualizationResponse
   ↓
8. Orchestrator receives response
   ↓
9. Gradio UI displays image

Error Handling Flow

Error occurs at any stage
   ↓
Error caught by try/except block
   ↓
Error categorized (AUTH_ERROR, RATE_LIMIT, NETWORK_ERROR, etc.)
   ↓
ErrorDetail object created
   ↓
Response with success=false returned
   ↓
Orchestrator handles error gracefully
   ↓
UI shows error message or continues without visualization
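
The flow above could be sketched as follows. `ErrorDetail` is named in the flow, but its fields here are inferred from the documented error response, and the exception-to-code mapping in `categorize` is purely an assumption for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class ErrorDetail:
    code: str
    message: str
    retry_possible: bool


# Retry flags follow the Error Codes table in this README.
RETRYABLE = {"RATE_LIMIT", "GENERATION_FAILED", "NETWORK_ERROR", "TIMEOUT", "MODEL_ERROR"}


def categorize(exc):
    """Map a Python exception to an error code (hypothetical mapping)."""
    if isinstance(exc, PermissionError):
        code = "AUTH_ERROR"
    elif isinstance(exc, TimeoutError):
        code = "TIMEOUT"
    elif isinstance(exc, ConnectionError):
        code = "NETWORK_ERROR"
    elif isinstance(exc, ValueError):
        code = "INVALID_INPUT"
    else:
        code = "GENERATION_FAILED"
    return ErrorDetail(code=code, message=str(exc), retry_possible=code in RETRYABLE)


def safe_generate(generate, *args, **kwargs):
    """Run `generate` and always return a response-shaped dictionary."""
    try:
        data = generate(*args, **kwargs)
        return {"success": True, "visualization_data": data, "error": None}
    except Exception as exc:
        return {"success": False, "visualization_data": None,
                "error": asdict(categorize(exc))}
```

Because `safe_generate` never raises, a caller such as the orchestrator only needs to inspect the `success` field.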

Technology Stack

Framework: FastAPI (HTTP server)
AI API: Google Gemini (gemini-2.5-flash-image)
Image Processing: Pillow (PIL)
Data Validation: Pydantic v2
Deployment: Blaxel platform
Language: Python 3.11+

Performance Characteristics

Latency: 10-20 seconds typical (Gemini API call)
Throughput: 5 concurrent requests
Memory: ~300MB per request
CPU: Minimal (mostly I/O bound)
Network: ~2-5MB per request (image download)

Installation

Prerequisites

Python 3.11+
Gemini API key (Google AI Studio)

Setup

Install dependencies:

cd visualization-agent
pip install -r requirements.txt

Configure environment variables:

cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

Test the agent:

python test_agent.py

Usage

As HTTP Service

Start the FastAPI server:

python main.py

Send POST request to generate visualization:

curl -X POST http://localhost:8000/ \
  -H "Content-Type: application/json" \
  -d '{
    "risk_data": {
      "seismic_risk": "high",
      "flood_risk": "medium",
      "volcanic_risk": "low",
      "location": {"latitude": 14.5995, "longitude": 120.9842}
    },
    "building_type": "residential_single_family",
    "recommendations": {...}
  }'

As Python Module

from agent import VisualizationAgent

agent = VisualizationAgent()

visualization_data = agent.generate_visualization(
    risk_data=risk_data,
    building_type="residential_single_family",
    recommendations=recommendations
)

# Access generated image
image_base64 = visualization_data.image_base64
prompt_used = visualization_data.prompt_used
features = visualization_data.features_included
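
If the image needs to be written to disk, the base64 string can be decoded with the standard library. This is a minimal sketch; `decode_image` and `save_image` are hypothetical helpers, not part of the agent.

```python
import base64
from pathlib import Path


def decode_image(image_base64):
    """Turn the base64 string from the response back into raw image bytes."""
    return base64.b64decode(image_base64)


def save_image(image_base64, path):
    """Write the decoded image to `path` and return the resulting Path."""
    out = Path(path)
    out.write_bytes(decode_image(image_base64))
    return out
```

For example, `save_image(visualization_data.image_base64, "sketch.png")` would write the sketch to the current working directory.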

API Reference

Endpoint

POST /

Request Formats

The agent supports three request formats for backward compatibility:

Format 1: Legacy Format (risk_data + building_type)

This format is supported for backward compatibility with older orchestrator versions:

{
    "risk_data": {
        "location": {...},
        "hazards": {...}
    },
    "building_type": "residential_single_family",
    "recommendations": {...}  // optional
}

The agent automatically converts this to the new format by:

Generating a prompt based on building_type
Creating construction_data from risk_data, building_type, and recommendations
Processing as a context-aware request
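
The conversion steps above might look roughly like this. The function `legacy_to_new` is hypothetical, the prompt wording is a placeholder rather than the agent's real template, and the `construction_data` keys mirror the new-format example in this section.

```python
def legacy_to_new(request):
    """Convert a legacy request into the prompt + construction_data shape."""
    building_type = request["building_type"]
    risk_data = request["risk_data"]
    # Placeholder prompt: the real agent generates its own wording.
    readable = building_type.replace("_", " ")
    prompt = f"A disaster-resistant {readable} building in the Philippines"
    return {
        "prompt": prompt,
        "construction_data": {
            "building_type": building_type,
            "location": risk_data.get("location"),
            "risk_data": risk_data,
            # recommendations is optional in the legacy format
            "recommendations": request.get("recommendations"),
        },
    }
```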

Format 2: New Format (prompt + construction_data)

This is the recommended format for new integrations:

{
    "prompt": "A disaster-resistant school building in the Philippines",
    "construction_data": {
        "building_type": "institutional_school",
        "location": {...},
        "risk_data": {...},
        "recommendations": {...}
    },
    "config": {
        "aspect_ratio": "16:9",
        "image_size": "1K"
    }
}

Format 3: Basic Format (prompt only)

For simple use cases without context:

{
    "prompt": "A modern disaster-resistant building in the Philippines"
}
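
One way to tell the three formats apart is by their top-level keys. The sketch below assumes that key presence alone is enough to classify a request; `detect_format` is a hypothetical helper and the agent's own dispatch logic may differ.

```python
def detect_format(request):
    """Classify a request as 'legacy', 'new', or 'basic' by its top-level keys."""
    if "risk_data" in request and "building_type" in request:
        return "legacy"
    if "prompt" in request and "construction_data" in request:
        return "new"
    if "prompt" in request:
        return "basic"
    raise ValueError("Unrecognized request format")
```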

Request Format Details

Complete Request Schema

{
    "risk_data": {
        "seismic_risk": str,           # "low", "medium", "high"
        "flood_risk": str,              # "low", "medium", "high"
        "volcanic_risk": str,           # "low", "medium", "high"
        "location": {
            "latitude": float,          # 4.0 to 21.0 (Philippines)
            "longitude": float,         # 116.0 to 127.0 (Philippines)
            "municipality": str,        # Optional
            "province": str             # Optional
        },
        "hazards": [                    # Optional, detailed hazard list
            {
                "type": str,            # "seismic", "volcanic", "hydrometeorological"
                "category": str,        # Specific hazard category
                "severity": str,        # "low", "medium", "high"
                "description": str      # Human-readable description
            }
        ]
    },
    "building_type": str,               # See Building Types section
    "recommendations": {                # Optional
        "structural": [
            {
                "category": str,
                "priority": str,
                "description": str
            }
        ]
    }
}

Minimal Request

{
    "risk_data": {
        "seismic_risk": "high",
        "flood_risk": "low",
        "volcanic_risk": "low",
        "location": {
            "latitude": 14.5995,
            "longitude": 120.9842
        }
    },
    "building_type": "residential_single_family"
}

Response Format

Success Response

{
    "success": true,
    "visualization_data": {
        "image_base64": str,            # Base64-encoded PNG image
        "prompt_used": str,             # Full prompt sent to Gemini
        "model_version": str,           # "gemini-2.5-flash-image"
        "generation_timestamp": str,    # ISO 8601 format
        "image_format": "PNG",          # Always PNG
        "resolution": "1024x1024",      # Always 1024x1024
        "features_included": [str]      # List of disaster-resistant features
    },
    "error": null
}

Error Response

{
    "success": false,
    "visualization_data": null,
    "error": {
        "code": str,                    # Error code (see Error Codes section)
        "message": str,                 # Human-readable error message
        "retry_possible": bool          # Whether retry is recommended
    }
}

Request Parameters

risk_data (required)

Field	Type	Required	Description
`seismic_risk`	string	Yes	Overall seismic risk level: "low", "medium", "high"
`flood_risk`	string	Yes	Overall flood risk level: "low", "medium", "high"
`volcanic_risk`	string	Yes	Overall volcanic risk level: "low", "medium", "high"
`location`	object	Yes	Geographic location data
`hazards`	array	No	Detailed hazard information

location (required)

Field	Type	Required	Description
`latitude`	float	Yes	Latitude (4.0 to 21.0 for Philippines)
`longitude`	float	Yes	Longitude (116.0 to 127.0 for Philippines)
`municipality`	string	No	Municipality name
`province`	string	No	Province name

building_type (required)

Value	Description
`residential_single_family`	Single-family home
`residential_multi_family`	Multi-family residential (2-4 units)
`residential_high_rise`	High-rise apartment building
`commercial_office`	Modern office building
`commercial_retail`	Retail shopping center
`industrial_warehouse`	Industrial warehouse facility
`institutional_school`	School building
`institutional_hospital`	Hospital or healthcare facility
`infrastructure_bridge`	Bridge structure
`mixed_use`	Mixed-use development
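
The constraints in the tables above could be checked before a request is sent. The sketch below is one possible client-side check, not the agent's Pydantic models; `validate_request` and its error strings are hypothetical.

```python
RISK_LEVELS = {"low", "medium", "high"}
BUILDING_TYPES = {
    "residential_single_family", "residential_multi_family", "residential_high_rise",
    "commercial_office", "commercial_retail", "industrial_warehouse",
    "institutional_school", "institutional_hospital", "infrastructure_bridge",
    "mixed_use",
}


def validate_request(request):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    risk_data = request.get("risk_data") or {}
    for field in ("seismic_risk", "flood_risk", "volcanic_risk"):
        value = risk_data.get(field)
        if not isinstance(value, str) or value not in RISK_LEVELS:
            problems.append(f"{field} must be one of {sorted(RISK_LEVELS)}")
    location = risk_data.get("location") or {}
    lat, lon = location.get("latitude"), location.get("longitude")
    # Bounds are the Philippine ranges given in the location table
    if not isinstance(lat, (int, float)) or not 4.0 <= lat <= 21.0:
        problems.append("latitude must be between 4.0 and 21.0")
    if not isinstance(lon, (int, float)) or not 116.0 <= lon <= 127.0:
        problems.append("longitude must be between 116.0 and 127.0")
    building_type = request.get("building_type")
    if not isinstance(building_type, str) or building_type not in BUILDING_TYPES:
        problems.append("building_type is not a supported value")
    return problems
```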

recommendations (optional)

Optional construction recommendations from the research agent. If provided, they may influence feature selection.

Response Fields

visualization_data

Field	Type	Description
`image_base64`	string	Base64-encoded PNG image data
`prompt_used`	string	Complete prompt sent to Gemini API
`model_version`	string	Gemini model version used
`generation_timestamp`	string	ISO 8601 timestamp of generation
`image_format`	string	Always "PNG"
`resolution`	string	Always "1024x1024"
`features_included`	array	List of disaster-resistant features shown

error

Field	Type	Description
`code`	string	Error code (see Error Codes section)
`message`	string	Human-readable error description
`retry_possible`	boolean	Whether the request can be retried

HTTP Status Codes

Status Code	Description
200	Success (check `success` field in response)
400	Bad Request (invalid input parameters)
401	Unauthorized (invalid API key)
429	Too Many Requests (rate limit exceeded)
500	Internal Server Error
504	Gateway Timeout (generation took > 30 seconds)

Rate Limits

Free Tier: 60 requests per minute
Paid Tier: Varies by plan
Concurrent Requests: Maximum 5 simultaneous requests
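
A caller can stay within the five-request concurrency limit by gating calls with a semaphore. The following is a sketch using `asyncio`; `run_limited` is a hypothetical helper, and `jobs` stands in for whatever zero-argument coroutine functions actually send the requests.

```python
import asyncio

MAX_CONCURRENT = 5  # matches the concurrent-request limit stated above


async def run_limited(jobs, limit=MAX_CONCURRENT):
    """Await zero-argument coroutine functions, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Only `limit` jobs can hold the semaphore at once; the rest wait here.
        async with semaphore:
            return await job()

    # gather preserves the order of `jobs` in the returned list
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

This only limits concurrency; it does not enforce the per-minute request quota.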

Authentication

When deployed on Blaxel:

curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {BLAXEL_API_KEY}" \
  -d @request.json

Content Type

All requests and responses use application/json.

Supported Building Types

The agent supports 10 building type categories, each with specific architectural characteristics:

Residential Buildings

Building Type	Code	Description	Typical Features
Single Family	`residential_single_family`	Single-family home	1-2 stories, pitched roof, residential scale
Multi Family	`residential_multi_family`	Multi-family residential (2-4 units)	2-3 stories, multiple entrances, shared spaces
High Rise	`residential_high_rise`	High-rise apartment building	10+ stories, elevator core, balconies

Commercial Buildings

Building Type	Code	Description	Typical Features
Office	`commercial_office`	Modern office building	3-10 stories, glass facade, modern design
Retail	`commercial_retail`	Retail shopping center	1-2 stories, large windows, parking area

Industrial Buildings

Building Type	Code	Description	Typical Features
Warehouse	`industrial_warehouse`	Industrial warehouse facility	Large open space, high ceilings, loading docks

Institutional Buildings

Building Type	Code	Description	Typical Features
School	`institutional_school`	School building with classrooms	1-3 stories, multiple wings, playground area
Hospital	`institutional_hospital`	Hospital or healthcare facility	3-5 stories, emergency entrance, medical design

Infrastructure

Building Type	Code	Description	Typical Features
Bridge	`infrastructure_bridge`	Bridge structure	Span structure, support columns, roadway

Mixed Use

Building Type	Code	Description	Typical Features
Mixed Use	`mixed_use`	Mixed-use development	Commercial ground floor, residential upper floors

Building Type Selection Guide

Choose the appropriate building type based on your project:

Residential Projects: Use residential_single_family for houses, residential_multi_family for apartments/condos, residential_high_rise for towers
Commercial Projects: Use commercial_office for office buildings, commercial_retail for shops/malls
Industrial Projects: Use industrial_warehouse for factories, warehouses, distribution centers
Public Buildings: Use institutional_school for schools, institutional_hospital for hospitals/clinics
Infrastructure: Use infrastructure_bridge for bridges, overpasses
Mixed Projects: Use mixed_use for buildings combining residential and commercial spaces

Prompt Generation Strategy

The Visualization Agent builds each prompt from a fixed template, filling in the building type description, hazard-specific features, and Philippine context so that the visualization reflects the assessed risks.

Prompt Template Structure

[Building Type Description] in the Philippines, designed for disaster resistance.

Key Features:
- [Hazard-specific feature 1]
- [Hazard-specific feature 2]
- [Hazard-specific feature 3]

Architectural Style: [Philippine context]
Setting: [Tropical environment with appropriate landscaping]
Perspective: [Exterior view showing structural features]
Style: Architectural sketch, professional rendering
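
Filling the template can be pictured as plain string formatting. The sketch below is an assumption about how the pieces fit together, checked only against the `prompt_used` strings in the examples in this README; `build_prompt` and its parameter names are hypothetical.

```python
def build_prompt(building_description, features, style, setting, perspective):
    """Fill the prompt template with concrete values."""
    feature_lines = "\n".join(f"- {feature}" for feature in features)
    return (
        f"{building_description} in the Philippines, designed for disaster resistance.\n\n"
        f"Key Features:\n{feature_lines}\n\n"
        f"Architectural Style: {style}\n"
        f"Setting: {setting}\n"
        f"Perspective: {perspective}\n"
        "Style: Architectural sketch, professional rendering"
    )
```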

Feature Prioritization

When multiple hazards are present, the agent prioritizes features based on:

Risk Level: High-risk hazards get priority over medium/low
Structural Impact: Features that affect the entire building structure
Visual Prominence: Features that are clearly visible in architectural sketches
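
The three criteria above can be expressed as a sort key. This is only a sketch: the candidate dictionary keys (`text`, `risk`, `structural`, `visible`), the `limit` default, and the strict ordering of the criteria are assumptions, not the behavior of `_prioritize_features()`.

```python
RISK_ORDER = {"high": 0, "medium": 1, "low": 2}


def prioritize_features(candidates, limit=4):
    """Sort candidate features by risk, structural impact, then visibility.

    Each candidate is a dict with hypothetical keys: "text", "risk",
    "structural" (bool), and "visible" (bool).
    """
    ranked = sorted(
        candidates,
        # False sorts before True, so structural and visible features come first
        key=lambda c: (RISK_ORDER[c["risk"]], not c["structural"], not c["visible"]),
    )
    return [c["text"] for c in ranked[:limit]]
```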

Philippine Context Integration

The agent automatically adds contextual elements:

Tropical climate considerations (ventilation, sun protection)
Local architectural styles and materials
Appropriate landscaping (palm trees, tropical vegetation)
Regional building practices

Hazard-to-Feature Mappings

The agent uses detailed mappings to translate risk data into visual features:

Seismic Hazards

Hazard Type	Risk Level	Visual Features
Active Fault	High	Reinforced concrete frame with visible cross-bracing
Ground Shaking	High	Moment-resisting frames, shear walls
Liquefaction	Medium-High	Deep pile foundation visible at base
Earthquake	All	Structural reinforcements, seismic joints

Example Features:

Reinforced concrete frame with cross-bracing
Moment-resisting frames
Shear walls
Deep pile foundations
Seismic isolation systems

Volcanic Hazards

Hazard Type	Risk Level	Visual Features
Ashfall	High	Steep-pitched roof (45°+ angle) for ash shedding
Pyroclastic Flow	High	Reinforced concrete construction, protective barriers
Lahar	Medium-High	Elevated foundation, diversion channels
Volcanic Activity	All	Robust roof structure, sealed openings

Example Features:

Steep-pitched roof for ash shedding
Reinforced concrete construction
Protective barriers and walls
Elevated foundation
Sealed ventilation systems

Hydrometeorological Hazards

Hazard Type	Risk Level	Visual Features
Flood	High	Elevated first floor on stilts (2-3 meters)
Storm Surge	High	Coastal reinforcement, breakwaters
Severe Winds	High	Aerodynamic roof design, hurricane straps
Typhoon	High	Wind-resistant construction, storm shutters
Landslide	Medium-High	Retaining walls, terraced foundation

Example Features:

Elevated first floor on stilts
Raised foundation
Flood barriers
Aerodynamic roof design
Hurricane straps
Storm shutters
Retaining walls
Terraced foundation

Multi-Hazard Scenarios

When multiple hazards are present, the agent merges their features into a single list, where one feature may address more than one hazard:

Example: High Seismic + High Flood

Elevated foundation on reinforced concrete piles
Moment-resisting frames visible in structure
Cross-bracing on elevated sections

Example: High Volcanic + Medium Wind

Steep-pitched roof with aerodynamic design
Reinforced concrete construction
Storm shutters on windows
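
The hazard tables above can be pictured as a lookup keyed by hazard type and category. The sketch below covers only a few rows; the category spellings `ground_shaking` and `typhoon` are guesses (only `active_fault`, `ashfall`, `flood`, and `earthquake` appear in this README's examples), and `FEATURE_MAP` and `features_for` are hypothetical names.

```python
FEATURE_MAP = {
    ("seismic", "active_fault"): ["Reinforced concrete frame with visible cross-bracing"],
    ("seismic", "ground_shaking"): ["Moment-resisting frames", "Shear walls"],
    ("volcanic", "ashfall"): ["Steep-pitched roof for ash shedding"],
    ("hydrometeorological", "flood"): ["Elevated first floor on stilts"],
    ("hydrometeorological", "typhoon"): ["Wind-resistant construction", "Storm shutters"],
}


def features_for(hazards):
    """Collect features for each hazard, skipping duplicates and keeping order."""
    features = []
    for hazard in hazards:
        key = (hazard["type"], hazard["category"])
        # Unknown hazard categories contribute no features
        for feature in FEATURE_MAP.get(key, []):
            if feature not in features:
                features.append(feature)
    return features
```

This simple merge does not synthesize combined features such as "Elevated foundation on reinforced concrete piles"; it only concatenates and de-duplicates.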

Example Requests and Responses

Example 1: Residential Building with High Seismic Risk

Request:

{
  "risk_data": {
    "seismic_risk": "high",
    "flood_risk": "low",
    "volcanic_risk": "low",
    "location": {
      "latitude": 14.5995,
      "longitude": 120.9842,
      "municipality": "Manila",
      "province": "Metro Manila"
    },
    "hazards": [
      {
        "type": "seismic",
        "category": "active_fault",
        "severity": "high",
        "description": "Near active fault line"
      }
    ]
  },
  "building_type": "residential_single_family",
  "recommendations": {
    "structural": [
      {
        "category": "foundation",
        "priority": "critical",
        "description": "Use reinforced concrete foundation with seismic isolation"
      }
    ]
  }
}

Response:

{
  "success": true,
  "visualization_data": {
    "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
    "prompt_used": "Single-family home in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Reinforced concrete frame with visible cross-bracing\n- Moment-resisting frames for earthquake protection\n- Deep pile foundation visible at base\n\nArchitectural Style: Modern Filipino residential with tropical design elements\nSetting: Tropical environment with palm trees and lush vegetation\nPerspective: Exterior view showing structural reinforcements\nStyle: Architectural sketch, professional rendering",
    "model_version": "gemini-2.5-flash-image",
    "generation_timestamp": "2024-01-15T10:30:45.123Z",
    "image_format": "PNG",
    "resolution": "1024x1024",
    "features_included": [
      "Reinforced concrete frame with cross-bracing",
      "Moment-resisting frames",
      "Deep pile foundation"
    ]
  }
}

Example 2: Commercial Building with Flood Risk

Request:

{
  "risk_data": {
    "seismic_risk": "low",
    "flood_risk": "high",
    "volcanic_risk": "low",
    "location": {
      "latitude": 10.3157,
      "longitude": 123.8854,
      "municipality": "Cebu City",
      "province": "Cebu"
    },
    "hazards": [
      {
        "type": "hydrometeorological",
        "category": "flood",
        "severity": "high",
        "description": "Flood-prone area"
      }
    ]
  },
  "building_type": "commercial_office"
}

Response:

{
  "success": true,
  "visualization_data": {
    "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
    "prompt_used": "Modern office building in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Elevated first floor on reinforced concrete stilts (2-3 meters)\n- Flood barriers around perimeter\n- Water-resistant materials for lower levels\n\nArchitectural Style: Contemporary Filipino commercial architecture\nSetting: Urban tropical environment with flood management features\nPerspective: Exterior view showing elevated foundation\nStyle: Architectural sketch, professional rendering",
    "model_version": "gemini-2.5-flash-image",
    "generation_timestamp": "2024-01-15T10:32:18.456Z",
    "image_format": "PNG",
    "resolution": "1024x1024",
    "features_included": [
      "Elevated first floor on stilts",
      "Flood barriers",
      "Water-resistant construction"
    ]
  }
}

Example 3: Multi-Hazard Scenario

Request:

{
  "risk_data": {
    "seismic_risk": "high",
    "flood_risk": "medium",
    "volcanic_risk": "high",
    "location": {
      "latitude": 13.2572,
      "longitude": 123.8144,
      "municipality": "Legazpi",
      "province": "Albay"
    },
    "hazards": [
      {
        "type": "volcanic",
        "category": "ashfall",
        "severity": "high"
      },
      {
        "type": "seismic",
        "category": "earthquake",
        "severity": "high"
      }
    ]
  },
  "building_type": "institutional_school"
}

Response:

{
  "success": true,
  "visualization_data": {
    "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
    "prompt_used": "School building with classrooms in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Steep-pitched reinforced concrete roof for volcanic ash shedding\n- Reinforced concrete frame with seismic cross-bracing\n- Moment-resisting frames for earthquake protection\n- Protective barriers around building perimeter\n\nArchitectural Style: Institutional Filipino architecture with disaster-resistant design\nSetting: Tropical environment near volcanic area with protective landscaping\nPerspective: Exterior view showing roof design and structural reinforcements\nStyle: Architectural sketch, professional rendering",
    "model_version": "gemini-2.5-flash-image",
    "generation_timestamp": "2024-01-15T10:35:22.789Z",
    "image_format": "PNG",
    "resolution": "1024x1024",
    "features_included": [
      "Steep-pitched roof for ash shedding",
      "Reinforced concrete frame with cross-bracing",
      "Moment-resisting frames",
      "Protective barriers"
    ]
  }
}

Example 4: Error Response

Request:

{
  "risk_data": {...},
  "building_type": "residential_single_family"
}

Response (Invalid API Key):

{
  "success": false,
  "visualization_data": null,
  "error": {
    "code": "AUTH_ERROR",
    "message": "Invalid or missing Gemini API key. Please check your GEMINI_API_KEY environment variable.",
    "retry_possible": false
  }
}

Error Handling

The agent reports every failure as a structured error with a code, a human-readable message, and a retry hint.

Error Codes

Error Code	Description	Retry Possible	Recommended Action
`AUTH_ERROR`	Invalid or missing API key	No	Check GEMINI_API_KEY environment variable
`RATE_LIMIT`	API quota exceeded	Yes	Wait and retry after delay (typically 60 seconds)
`GENERATION_FAILED`	Image generation failed	Yes	Retry with same or modified prompt
`NETWORK_ERROR`	Connection issues	Yes	Check internet connection and retry
`TIMEOUT`	Generation took longer than 30 seconds	Yes	Retry or simplify prompt
`INVALID_INPUT`	Invalid request parameters	No	Check request format and parameters
`MODEL_ERROR`	Gemini model error	Yes	Retry or contact support

Error Response Format

All errors follow this structure:

{
  "success": false,
  "visualization_data": null,
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error description",
    "retry_possible": true/false
  }
}

Retry Strategy

For errors with retry_possible: true:

Rate Limit Errors: Wait 60 seconds before retrying
Network Errors: Retry immediately, then with exponential backoff (2s, 4s, 8s)
Generation Failures: Retry up to 2 times with same prompt
Timeouts: Retry once, then consider simplifying the prompt
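
A client could encode this strategy as a per-code schedule of waits. The sketch below is an interpretation of the list above, not the agent's built-in behavior: `retry_delays` and `call_with_retry` are hypothetical, a single retry is assumed for rate limits, and the schedule is chosen from the first error only.

```python
import time


def retry_delays(code):
    """Return the waits (in seconds) before each retry for an error code."""
    schedule = {
        "RATE_LIMIT": [60],             # wait 60 seconds, then retry once
        "NETWORK_ERROR": [0, 2, 4, 8],  # immediate retry, then backoff
        "GENERATION_FAILED": [0, 0],    # up to 2 retries with the same prompt
        "TIMEOUT": [0],                 # retry once
    }
    return schedule.get(code, [])


def call_with_retry(send, sleep=time.sleep):
    """Call `send()` until it succeeds or the retry schedule is exhausted."""
    response = send()
    if response.get("success"):
        return response
    error = response.get("error") or {}
    if not error.get("retry_possible"):
        return response
    for delay in retry_delays(error.get("code")):
        sleep(delay)
        response = send()
        if response.get("success"):
            break
    return response
```

Passing `sleep` as a parameter makes it possible to test the schedule without actually waiting.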

Error Logging

All errors are logged with full context:

Request parameters
Error type and message
Timestamp
Stack trace (for debugging)

Example log entry:

2024-01-15 10:30:45 ERROR [VisualizationAgent] Image generation failed
  Error: RATE_LIMIT
  Message: API quota exceeded
  Request: building_type=residential_single_family, location=Manila
  Retry: true

Deployment

Prerequisites

Before deploying, ensure you have:

Blaxel CLI installed:
```
pip install blaxel
```
Blaxel account and workspace:
- Sign up at blaxel.ai
- Create a workspace
- Get your API key
Gemini API key:
- Get API key from Google AI Studio
- Add to .env file

Local Development

Install dependencies:

cd visualization-agent
pip install -r requirements.txt

Configure environment:

cp .env.example .env
# Edit .env and add:
# GEMINI_API_KEY=your_api_key_here

Run locally:
```
python main.py
```
Test the agent:
```
python test_agent.py
```

Blaxel Platform Deployment

Step 1: Configure Environment Variables

Create or update .env file:

GEMINI_API_KEY=your_gemini_api_key
GEMINI_MODEL=gemini-2.5-flash-image

Step 2: Review Configuration

Check blaxel.toml configuration:

name = "visualization-agent"
type = "agent"

[env]
GEMINI_API_KEY = "${GEMINI_API_KEY}"
GEMINI_MODEL = "${GEMINI_MODEL}"

[runtime]
timeout = 30
memory = 512

[entrypoint]
prod = "python main.py"

[[triggers]]
id = "trigger-visualization-agent"
type = "http"
timeout = 30

[triggers.configuration]
path = "agents/visualization-agent/process"
retry = 1
authenticationType = "private"

Step 3: Deploy to Blaxel

cd visualization-agent
bl deploy --env-file .env

Step 4: Verify Deployment

The agent will be available at:

https://run.blaxel.ai/{workspace}/agents/visualization-agent

Test the deployed agent:

curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {BLAXEL_API_KEY}" \
  -d @test_request.json

Configuration Options

Runtime Configuration

Parameter	Default	Description
`timeout`	30	Maximum execution time (seconds)
`memory`	512	Memory limit (MB)
`retry`	1	Number of retry attempts

Model Configuration

Parameter	Default	Description
`GEMINI_MODEL`	gemini-2.5-flash-image	Gemini model version
`GEMINI_API_KEY`	(required)	Google Gemini API key

Image Configuration

Resolution: 1024x1024 (fixed)
Format: PNG
Watermark: SynthID (automatic)

Integration with Orchestrator

The orchestrator agent calls the visualization agent automatically. To integrate:

Update orchestrator's blaxel.toml:

[[resources]]
id = "visualization-agent"
type = "agent"
name = "visualization-agent"

Orchestrator calls visualization agent:

visualization_response = await self.execute_visualization(
    risk_data=risk_data,
    building_type=building_type,
    recommendations=recommendations
)

Response flows to Gradio UI:
- Image displayed in visualization tab
- Metadata shown alongside image
- Features list displayed

Monitoring and Logs

View Logs

bl logs visualization-agent

Monitor Performance

Key metrics to monitor:

Generation Time: Should be < 30 seconds
Success Rate: Should be > 95%
Error Rate: Monitor for rate limit errors
Memory Usage: Should stay under 512MB

Common Issues

Timeout Errors:
- Increase timeout in blaxel.toml
- Simplify prompts
- Check Gemini API status
Rate Limit Errors:
- Implement request throttling
- Upgrade Gemini API quota
- Add retry logic with backoff
Memory Issues:
- Increase memory limit in blaxel.toml
- Optimize image processing
- Check for memory leaks

Scaling Considerations

For high-volume deployments:

Increase Memory: Set to 1024MB for better performance
Add Caching: Cache generated images for identical requests
Load Balancing: Deploy multiple instances
Rate Limiting: Implement request queuing
Monitoring: Set up alerts for errors and performance
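
The caching suggestion above could be approached by hashing a canonical form of the request. This is a sketch under the assumption that requests are plain JSON-serializable dictionaries; `cache_key` and `ResponseCache` are hypothetical names, and the cache is unbounded and in-memory.

```python
import hashlib
import json


def cache_key(request):
    """Hash a request so identical payloads map to the same key."""
    # sort_keys makes the hash independent of dictionary key order
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ResponseCache:
    """A tiny in-memory cache; a real deployment would likely bound its size."""

    def __init__(self):
        self._store = {}

    def get_or_generate(self, request, generate):
        """Return the cached response, calling `generate(request)` on a miss."""
        key = cache_key(request)
        if key not in self._store:
            self._store[key] = generate(request)
        return self._store[key]
```

Note that a cached response returns the same image every time, whereas a fresh generation for the same request may differ.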

Security Best Practices

API Key Management:
- Never commit API keys to version control
- Use environment variables only
- Rotate keys regularly
Authentication:
- Use authenticationType = "private" in blaxel.toml
- Require BLAXEL_API_KEY for all requests
- Validate request signatures
Input Validation:
- Validate all input parameters
- Sanitize location data
- Check building type against allowed values
Output Security:
- Ensure generated images don't contain sensitive data
- Add watermarks (automatic with SynthID)
- Log all generation requests

Testing

Unit Tests

Run the unit test suite:

python test_agent.py

The test suite includes:

Agent Initialization Tests

Verify agent initializes with correct configuration
Check Gemini API client setup
Validate environment variable loading

Building Type Tests

Residential Single Family: High seismic risk scenario
Commercial Office: Flood risk scenario
Institutional School: Multiple hazards (volcanic + seismic)
Industrial Warehouse: Wind resistance scenario
Infrastructure Bridge: Multi-hazard scenario

Risk Scenario Tests

High seismic risk with active fault
High flood risk in coastal area
High volcanic risk with ashfall
Multiple hazards combined
Low risk baseline scenario

Error Handling Tests

Invalid API key handling
Network error simulation
Timeout handling
Rate limit error handling
Invalid input validation

Response Format Tests

Base64 encoding validation
Metadata completeness
Timestamp format verification
Features list accuracy

Integration Tests

Test the HTTP endpoint:

python test_http_endpoint.py

Integration tests cover:

POST endpoint functionality
Request validation
Response format compatibility
Error response handling
Orchestrator integration

Manual Testing

Test with real Gemini API:

Set up environment:
```
export GEMINI_API_KEY=your_api_key
```
Run test script:
```
python test_agent.py
```
Verify output:
- Check generated images are valid PNG files
- Verify disaster-resistant features are visible
- Confirm metadata is accurate
- Validate generation time < 30 seconds
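
The PNG check in the list above can be automated without extra dependencies by reading the file signature and the IHDR header. This is a minimal sketch; `png_dimensions` is a hypothetical helper, and it inspects only the first 24 bytes rather than validating the whole file.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(image_base64):
    """Return (width, height) if the data starts like a PNG, else raise ValueError."""
    data = base64.b64decode(image_base64)
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", then big-endian width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

The returned tuple can then be compared against the resolution reported in the response metadata.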

Test Coverage

Current test coverage:

Prompt Generation: 100%
API Client: 95% (excluding live API calls)
Response Formatting: 100%
Error Handling: 100%
Integration: 90%

Performance Testing

Test performance metrics:

# Test generation time
time python test_agent.py

# Test concurrent requests (the -n flag requires the pytest-xdist plugin)
python -m pytest test_concurrent.py -n 5

# Test memory usage (requires the memory_profiler package)
python -m memory_profiler test_agent.py

Expected performance:

Generation Time: 10-20 seconds average
Memory Usage: < 300MB per request
Success Rate: > 95%
Concurrent Requests: 5 simultaneous requests supported

Performance

Generation Time: 10-20 seconds typical
Image Size: ~500KB - 2MB per image
Resolution: 1024x1024 pixels
Format: PNG with SynthID watermark

Troubleshooting

Common Issues and Solutions

Issue: "Invalid API Key" Error

Symptoms:

{
  "error": {
    "code": "AUTH_ERROR",
    "message": "Invalid or missing Gemini API key"
  }
}

Solutions:

Check .env file contains GEMINI_API_KEY=your_key
Verify API key is valid at Google AI Studio
Ensure no extra spaces or quotes around the key
Restart the agent after updating .env

Issue: Rate Limit Exceeded

Symptoms:

{
  "error": {
    "code": "RATE_LIMIT",
    "message": "API quota exceeded"
  }
}

Solutions:

Wait 60 seconds before retrying
Check your API quota at Google AI Studio
Upgrade to higher quota tier if needed
Implement request throttling in orchestrator
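
Request throttling in the orchestrator could look something like the sliding-window sketch below. The `Throttle` class and its parameters are hypothetical and not part of the project; the clock and sleep functions are injectable so the behaviour can be tested without real waiting.

```python
import time
from collections import deque


class Throttle:
    """Allow at most max_calls within any period_s-second window."""

    def __init__(self, max_calls, period_s, clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period_s = period_s
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def _drop_expired(self, now):
        while self._calls and now - self._calls[0] >= self._period_s:
            self._calls.popleft()

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        self._drop_expired(now)
        if len(self._calls) >= self._max_calls:
            # Sleep until the oldest call leaves the window.
            self._sleep(self._period_s - (now - self._calls[0]))
            now = self._clock()
            self._drop_expired(now)
        self._calls.append(now)
```

The orchestrator would call `throttle.wait()` immediately before each request to the agent. The right `max_calls` value depends on your API tier.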

Issue: Generation Timeout

Symptoms:

Request takes longer than 30 seconds
Timeout error returned

Solutions:

Simplify the prompt (reduce number of features)
Check Gemini API status
Increase timeout in blaxel.toml (not recommended)
Retry the request

Issue: Poor Quality Visualizations

Symptoms:

Generated images don't show disaster-resistant features clearly
Building type doesn't match expectations

Solutions:

Verify risk data is accurate and complete
Check building type is correct
Ensure hazard severity levels are set appropriately
Review prompt generation logic in agent.py

Issue: Network Errors

Symptoms:

{
  "error": {
    "code": "NETWORK_ERROR",
    "message": "Connection failed"
  }
}

Solutions:

Check internet connection
Verify firewall allows HTTPS to Google APIs
Check proxy settings if applicable
Retry with exponential backoff
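
Retrying with exponential backoff can be sketched as below. `TransientError` is a hypothetical exception standing in for whatever the caller raises on retryable failures such as `NETWORK_ERROR` or `RATE_LIMIT`, and the delay values are illustrative defaults rather than settings taken from the agent.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. NETWORK_ERROR)."""


def retry_with_backoff(
    func,
    max_attempts=4,
    base_delay=1.0,
    max_delay=30.0,
    sleep=time.sleep,
    jitter=random.random,
):
    """Call func(); on TransientError wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter scales the delay to 50-100% so clients do not retry in lockstep.
            sleep(delay * (0.5 + 0.5 * jitter()))
```

Only transient failures should be retried this way. An `AUTH_ERROR` will not succeed on a second attempt and is better reported immediately.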

Issue: Memory Errors

Symptoms:

Agent crashes with out-of-memory error
Slow performance

Solutions:

Increase memory limit in blaxel.toml to 1024MB
Check for memory leaks in custom code
Reduce concurrent request limit
Monitor memory usage with profiling tools

Debug Mode

Enable debug logging:

import logging
logging.basicConfig(level=logging.DEBUG)

This will show:

Detailed API request/response logs
Prompt generation steps
Error stack traces
Performance metrics

Getting Help

If you encounter issues not covered here:

Check the logs: bl logs visualization-agent
Review the Gemini API documentation
Check the main project documentation
Contact support with:
- Error message and code
- Request payload (sanitized)
- Timestamp of the error
- Agent version and configuration

Limitations

Technical Limitations

Internet Connection: Requires active internet for Gemini API
API Rate Limits: Subject to Gemini API quotas (varies by tier)
Generation Time: Typically 10-20 seconds per image, up to the 30-second timeout (determined by the Gemini API; cannot be reduced by the agent)
Resolution: Fixed at 1024x1024 pixels
Format: PNG only (no JPEG, SVG, or other formats)
Watermark: SynthID watermark automatically added (cannot be removed)

Functional Limitations

Artistic Interpretation: Generated images are conceptual sketches, not engineering drawings
Feature Visibility: Some structural features may not be clearly visible in exterior views
Accuracy: AI-generated images may not perfectly represent all specified features
Consistency: Multiple generations with same prompt may produce different results
Detail Level: Cannot generate detailed floor plans or technical specifications

Geographic Limitations

Philippine Context: Optimized for Philippine architecture and climate
Location Data: Requires valid Philippine coordinates
Regional Styles: May not accurately represent all regional architectural variations

Use Case Limitations

Appropriate Uses:

Conceptual visualization for stakeholders
Initial design exploration
Communication tool for non-technical audiences
Marketing and presentation materials

Inappropriate Uses:

Engineering drawings or construction blueprints
Structural analysis or calculations
Building permit applications
Detailed cost estimation basis
Legal or contractual documentation

Integration

The Visualization Agent integrates with:

Orchestrator Agent: Receives requests and returns visualization data
Gradio UI: Displays generated images in the web interface
Risk Assessment Agent: Uses risk data to inform feature selection

Environment Variables

Required Variables

GEMINI_API_KEY (required): Google Gemini API key for image generation
- Get your API key from: https://makersuite.google.com/app/apikey
- Alternative: GOOGLE_API_KEY can be used instead of GEMINI_API_KEY

Optional Variables

VISUALIZATION_MODEL (optional): Gemini model version to use
- Default: gemini-2.5-flash-image
- Options: gemini-2.5-flash-image, gemini-3-pro-image-preview
- Example: VISUALIZATION_MODEL=gemini-2.5-flash-image
VISUALIZATION_OUTPUT_DIR (optional): Directory where generated images will be saved
- Default: ./generated_images
- Example: VISUALIZATION_OUTPUT_DIR=./my_images
- Note: Directory will be created automatically if it doesn't exist

Environment Variable Priority

The agent loads configuration in the following priority order (highest to lowest):

Constructor parameters: Values passed directly to VisualizationAgent()
Environment variables: Values from .env file or system environment
Default values: Built-in defaults

Example:

# Priority 1: Constructor parameter (highest)
agent = VisualizationAgent(model="gemini-3-pro-image-preview")

# Priority 2: Environment variable
# VISUALIZATION_MODEL=gemini-2.5-flash-image

# Priority 3: Default value (lowest)
# Default: gemini-2.5-flash-image
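
The priority order could be implemented along these lines. This is a sketch under assumptions, not the agent's actual code: `resolve_setting` and `resolve_api_key` are hypothetical helpers, and treating an empty environment value as unset is a choice made here for illustration.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash-image"
DEFAULT_OUTPUT_DIR = "./generated_images"


def resolve_setting(explicit, env_name, default, env=None):
    """Constructor value first, then environment variable, then default."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    value = env.get(env_name)
    if value:  # assumption: an empty string counts as unset
        return value
    return default


def resolve_api_key(explicit=None, env=None):
    """GEMINI_API_KEY is preferred; GOOGLE_API_KEY is accepted as an alternative."""
    env = os.environ if env is None else env
    key = explicit or env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY")
    if not key or not key.strip():
        raise ValueError("Set GEMINI_API_KEY (or GOOGLE_API_KEY)")
    return key.strip()  # guard against stray whitespace around the key
```

For example, `resolve_setting(None, "VISUALIZATION_MODEL", DEFAULT_MODEL)` returns the environment value when one is set and the built-in default otherwise.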

Setting Environment Variables

Local Development

Create a .env file in the visualization-agent directory:

# Required
GEMINI_API_KEY=your_gemini_api_key_here

# Optional
VISUALIZATION_MODEL=gemini-2.5-flash-image
VISUALIZATION_OUTPUT_DIR=./generated_images

Blaxel Deployment

Set environment variables in blaxel.toml:

[env]
GEMINI_API_KEY = "${GEMINI_API_KEY}"
VISUALIZATION_MODEL = "${VISUALIZATION_MODEL}"
VISUALIZATION_OUTPUT_DIR = "${VISUALIZATION_OUTPUT_DIR}"

Then deploy with environment file:

bl deploy --env-file .env

System Environment

Set environment variables in your shell:

# Bash/Zsh
export GEMINI_API_KEY=your_api_key
export VISUALIZATION_MODEL=gemini-2.5-flash-image
export VISUALIZATION_OUTPUT_DIR=./generated_images

# Windows Command Prompt
set GEMINI_API_KEY=your_api_key
set VISUALIZATION_MODEL=gemini-2.5-flash-image
set VISUALIZATION_OUTPUT_DIR=./generated_images

# Windows PowerShell
$env:GEMINI_API_KEY="your_api_key"
$env:VISUALIZATION_MODEL="gemini-2.5-flash-image"
$env:VISUALIZATION_OUTPUT_DIR="./generated_images"

Best Practices

Prompt Optimization

Be Specific: Include detailed hazard information for better feature selection
Prioritize Hazards: Focus on the most critical risks (high severity)
Provide Context: Include location data for better Philippine context
Use Recommendations: Pass construction recommendations when available

Performance Optimization

Batch Requests: Group multiple visualizations when possible
Cache Results: Cache generated images for identical requests
Async Processing: Use async/await for concurrent requests
Monitor Quotas: Track API usage to avoid rate limits
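
Caching results for identical requests might be sketched as follows. `VisualizationCache` and `request_key` are hypothetical names, and the in-memory dictionary is an assumption made for brevity; a deployment with several workers would likely need a shared store instead.

```python
import hashlib
import json


def request_key(payload: dict) -> str:
    """Stable key: SHA-256 of the payload serialized with sorted keys."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class VisualizationCache:
    """In-memory cache wrapping a generate(payload) callable."""

    def __init__(self, generate):
        self._generate = generate
        self._store = {}

    def get_or_generate(self, payload: dict):
        key = request_key(payload)
        if key not in self._store:
            # Cache miss: generate once and remember the result.
            self._store[key] = self._generate(payload)
        return self._store[key]
```

Because generation is not deterministic, a cache of this kind also means repeated identical requests return the same image rather than a new variation.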

Error Handling

Implement Retries: Retry transient errors with exponential backoff
Graceful Degradation: Continue without visualization if generation fails
Log Errors: Log all errors with full context for debugging
User Feedback: Provide clear error messages to users

Security

Protect API Keys: Never expose GEMINI_API_KEY in client code
Validate Input: Always validate and sanitize input parameters
Rate Limiting: Implement rate limiting to prevent abuse
Monitor Usage: Track API usage and set up alerts
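
Input validation could start from a sketch like the one below. The building type identifiers are taken from the categories in the overview and may not match the agent's actual values, and the coordinate bounds are a rough bounding box for the Philippines assumed for illustration, not figures from the project.

```python
# Category names from the overview; the agent's real identifiers may differ.
ALLOWED_BUILDING_TYPES = frozenset(
    {"residential", "commercial", "institutional", "industrial", "infrastructure"}
)

# Rough bounding box for the Philippines (an assumption, not project data).
PH_LAT_RANGE = (4.0, 21.5)
PH_LON_RANGE = (116.0, 127.0)


def _in_range(value, bounds) -> bool:
    """True only for real numbers (not bool, not NaN) inside the bounds."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and bounds[0] <= value <= bounds[1]


def validate_request(building_type, latitude, longitude) -> list:
    """Return a list of validation errors; empty means the input looks acceptable."""
    errors = []
    if building_type not in ALLOWED_BUILDING_TYPES:
        errors.append(f"unsupported building_type: {building_type!r}")
    if not _in_range(latitude, PH_LAT_RANGE):
        errors.append("latitude is outside the expected Philippine range")
    if not _in_range(longitude, PH_LON_RANGE):
        errors.append("longitude is outside the expected Philippine range")
    return errors
```

Rejecting bad input before building a prompt avoids spending API quota on requests that cannot produce a meaningful image.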

Integration

Async Calls: Call visualization agent asynchronously from orchestrator
Timeout Handling: Set appropriate timeouts (30+ seconds)
Fallback Logic: Have fallback behavior if visualization fails
Response Validation: Validate response format before using
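
The integration practices above can be combined in one orchestrator-side helper, sketched here under assumptions. `call_agent` is a hypothetical coroutine that performs the HTTP request, and the only response shape relied on is the `{"error": {"code": ..., "message": ...}}` format shown in Troubleshooting.

```python
import asyncio


async def get_visualization(call_agent, payload, timeout_s=30.0):
    """Return (response, None) on success or (None, reason) on any failure."""
    try:
        # Timeout handling: give up after timeout_s seconds.
        response = await asyncio.wait_for(call_agent(payload), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None, f"visualization timed out after {timeout_s} seconds"
    except Exception as exc:  # fallback: never let visualization break the plan
        return None, f"visualization failed: {exc}"

    # Response validation: reject non-dict bodies and error envelopes.
    if not isinstance(response, dict):
        return None, "visualization returned an unexpected response type"
    if "error" in response:
        error = response["error"] if isinstance(response["error"], dict) else {}
        code = error.get("code", "UNKNOWN")
        return None, f"visualization error {code}: {error.get('message', '')}"
    return response, None
```

The caller can continue without visualization data whenever the first element is `None` and show the reason string in the UI.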

Frequently Asked Questions

General Questions

Q: How long does it take to generate a visualization? A: Typically 10-20 seconds, with a maximum timeout of 30 seconds.

Q: Can I generate multiple visualizations for the same building? A: Yes, but each request may produce slightly different results due to AI generation variability.

Q: What image format is returned? A: PNG format, base64-encoded, at 1024x1024 resolution.

Q: Can I remove the SynthID watermark? A: No, the watermark is added automatically by the Gemini API and cannot be removed.

Technical Questions

Q: Can I use a different Gemini model? A: Yes, set the VISUALIZATION_MODEL environment variable (or pass model to the VisualizationAgent constructor), but gemini-2.5-flash-image is recommended for speed.

Q: How do I increase the image resolution? A: Currently fixed at 1024x1024. Higher resolutions may be supported in future Gemini models.

Q: Can I generate images without an internet connection? A: No, the agent requires internet access to call the Gemini API.

Q: How many concurrent requests can the agent handle? A: Up to 5 concurrent requests, limited by memory and API quotas.

Integration Questions

Q: How does the orchestrator call the visualization agent? A: Via HTTP POST to the Blaxel endpoint with risk data and building type.

Q: What happens if visualization generation fails? A: The orchestrator continues without visualization data, and the UI shows a message.

Q: Can I call the visualization agent directly from the UI? A: Not recommended. Always call through the orchestrator for proper coordination.

Q: How is the generated image displayed in the UI? A: The Gradio UI decodes the base64 image and displays it in the visualization tab.

Cost and Limits

Q: How much does it cost to generate a visualization? A: Depends on your Gemini API plan. Check Google AI Studio for pricing.

Q: What are the rate limits? A: Free tier: 60 requests/minute at the time of writing. Limits vary by model and plan, so check Google AI Studio for the current values for your account.

Q: Can I increase my API quota? A: Yes, upgrade your Gemini API plan at Google AI Studio.

Q: Is there a limit on the number of visualizations? A: Only limited by your API quota and rate limits.

Troubleshooting

Q: Why am I getting "Invalid API Key" errors? A: Check that GEMINI_API_KEY is set correctly in your .env file and is valid.

Q: Why are my visualizations timing out? A: Check your internet connection and Gemini API status. Simplify prompts if needed.

Q: Why don't the disaster-resistant features show clearly? A: Ensure risk data is accurate and hazard severity is set appropriately. AI generation may vary.

Q: How do I debug generation issues? A: Enable debug logging and check the prompt_used field in the response.

Roadmap

Planned Features

Multiple View Angles: Generate front, side, and aerial views
Before/After Comparisons: Show standard vs. disaster-resistant designs
Higher Resolution: Support 4K resolution with Gemini 3 Pro
Style Variations: Allow users to choose architectural styles
Annotation Overlay: Add labels pointing to disaster-resistant features
Interactive Refinement: Support multi-turn conversations for improvements
Cost Visualization: Overlay cost information on the visualization
3D Models: Generate 3D models in addition to 2D sketches

Future Enhancements

Caching layer for identical requests
Batch processing for multiple buildings
Custom style templates
Integration with CAD software
Export to additional formats (SVG, PDF)
Localization for other languages

Contributing

This agent is part of the Disaster Risk Construction Planner system. For contributions:

Follow the existing code structure and patterns
Add tests for new features
Update documentation
Ensure compatibility with orchestrator and UI

Version History

v1.0.0 (2024-01): Initial release
- Basic visualization generation
- Support for 10 building types
- Integration with orchestrator
- Gemini 2.5 Flash Image model

License

Part of the Disaster Risk Construction Planner system.

Support

For issues or questions:

Check this documentation first
Review the troubleshooting section
Check the main project documentation
Review Gemini API documentation at Google AI Studio

Acknowledgments

Google Gemini API for image generation
Blaxel platform for agent deployment
Philippines Disaster Risk data sources
Open-source community for tools and libraries