Visualization Agent
AI agent that generates architectural sketches of disaster-resistant buildings using Google's Gemini image generation API (Nano Banana).
Overview
The Visualization Agent receives risk assessment data and building specifications to create contextual architectural visualizations. It analyzes disaster risks (seismic, volcanic, hydrometeorological) and generates prompts that incorporate appropriate disaster-resistant features, then uses Gemini's image generation API to create visual representations.
Features
- Risk-Aware Visualization: Incorporates disaster-resistant features based on risk assessment
- Building Type Support: Generates appropriate architecture for residential, commercial, institutional, industrial, and infrastructure projects
- Philippine Context: Includes tropical climate and local architectural considerations
- Fast Generation: Uses gemini-2.5-flash-image model for quick results (~10-20 seconds)
- Detailed Metadata: Returns prompt used, features included, and generation timestamp
- Backward Compatibility: Supports both legacy format (risk_data + building_type) and new format (prompt + construction_data)
Architecture
High-Level Architecture
Orchestrator Agent
    ↓
Visualization Agent (FastAPI)
    ├── Request Validator
    ├── Prompt Generator
    │   ├── Hazard Analyzer
    │   ├── Feature Mapper
    │   └── Context Builder
    ├── Gemini API Client
    │   ├── API Request Handler
    │   ├── Error Handler
    │   └── Response Parser
    └── Response Formatter
        ├── Base64 Encoder
        ├── Metadata Generator
        └── Feature List Compiler
    ↓
Returns VisualizationData to Orchestrator
    ↓
Gradio UI (displays image)
Component Details
1. VisualizationAgent (Main Class)
Responsibilities:
- Orchestrate the visualization generation process
- Coordinate between prompt generator and API client
- Handle errors and format responses
Key Methods:
- generate_visualization(): Main entry point
- _validate_input(): Validate request parameters
- _format_response(): Format final response with metadata
2. PromptGenerator
Responsibilities:
- Analyze risk data to identify relevant hazards
- Map hazards to visual features
- Generate descriptive prompts for Gemini API
- Add Philippine architectural context
Key Methods:
- generate_prompt(): Create complete prompt
- _extract_hazard_features(): Extract features from risk data
- _get_building_description(): Get building type description
- _add_philippine_context(): Add contextual elements
- _prioritize_features(): Prioritize features for multi-hazard scenarios
Feature Mapping Logic:
# Seismic hazards
if risk_data.seismic_risk == "high":
    features.append("Reinforced concrete frame with cross-bracing")
    features.append("Moment-resisting frames")
# Flood hazards
if risk_data.flood_risk == "high":
    features.append("Elevated first floor on stilts")
# Volcanic hazards
if risk_data.volcanic_risk == "high":
    features.append("Steep-pitched roof for ash shedding")
3. GeminiAPIClient
Responsibilities:
- Communicate with Google Gemini API
- Handle API authentication
- Manage timeouts and retries
- Parse API responses
Key Methods:
- generate_image(): Call Gemini API
- _handle_api_error(): Convert API errors to structured format
- _validate_response(): Validate API response
API Configuration:
- Model: gemini-2.5-flash-image
- Resolution: 1024x1024
- Format: PNG
- Timeout: 30 seconds
4. Response Formatter
Responsibilities:
- Encode image data to base64
- Generate metadata
- Compile features list
- Format final response
Metadata Included:
- Prompt used for generation
- Model version
- Generation timestamp (ISO 8601)
- Image format and resolution
- List of disaster-resistant features
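The formatting step can be sketched as follows. This is a minimal illustration, not the agent's actual code: the function name format_visualization and its parameters are hypothetical, while the output keys follow the success response documented in this README.

```python
import base64
from datetime import datetime, timezone


def format_visualization(image_bytes: bytes, prompt: str, features: list[str],
                         model: str = "gemini-2.5-flash-image") -> dict:
    """Build a visualization_data-style dict (hypothetical helper)."""
    return {
        # Raw PNG bytes become an ASCII-safe base64 string
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "prompt_used": prompt,
        "model_version": model,
        # ISO 8601 timestamp in UTC (rendered with a +00:00 offset)
        "generation_timestamp": datetime.now(timezone.utc).isoformat(),
        "image_format": "PNG",
        "resolution": "1024x1024",
        "features_included": list(features),
    }
```

Copying the features list keeps later changes to the caller's list from altering the response.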
Data Flow
1. Request arrives at FastAPI endpoint
   ↓
2. Request validation (Pydantic models)
   ↓
3. VisualizationAgent.generate_visualization()
   ↓
4. PromptGenerator.generate_prompt()
   - Analyze risk_data
   - Extract hazard features
   - Get building description
   - Add Philippine context
   - Compile final prompt
   ↓
5. GeminiAPIClient.generate_image()
   - Send prompt to Gemini API
   - Wait for response (10-20 seconds)
   - Receive image bytes
   ↓
6. Response Formatter
   - Encode image to base64
   - Generate metadata
   - Compile features list
   ↓
7. Return VisualizationResponse
   ↓
8. Orchestrator receives response
   ↓
9. Gradio UI displays image
Error Handling Flow
Error occurs at any stage
   ↓
Error caught by try/except block
   ↓
Error categorized (AUTH, RATE_LIMIT, NETWORK, etc.)
   ↓
ErrorDetail object created
   ↓
Response with success=false returned
   ↓
Orchestrator handles error gracefully
   ↓
UI shows error message or continues without visualization
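The categorization step in this flow might look like the sketch below. The mapping from Python exception types to error codes is an assumption made for illustration, as are the function names; the retry flags come from the Error Codes table in this README, and the ErrorDetail fields mirror the documented error object.

```python
from dataclasses import dataclass, asdict

# Retry flags as listed in the Error Codes table of this README.
RETRYABLE = {
    "AUTH_ERROR": False, "RATE_LIMIT": True, "GENERATION_FAILED": True,
    "NETWORK_ERROR": True, "TIMEOUT": True, "INVALID_INPUT": False,
    "MODEL_ERROR": True,
}


@dataclass
class ErrorDetail:
    code: str
    message: str
    retry_possible: bool


def categorize(exc: Exception) -> ErrorDetail:
    """Map an exception to an error code (assumed mapping, not the agent's own)."""
    if isinstance(exc, TimeoutError):
        code = "TIMEOUT"
    elif isinstance(exc, ConnectionError):
        code = "NETWORK_ERROR"
    elif isinstance(exc, PermissionError):
        code = "AUTH_ERROR"
    elif isinstance(exc, ValueError):
        code = "INVALID_INPUT"
    else:
        code = "GENERATION_FAILED"
    return ErrorDetail(code, str(exc), RETRYABLE[code])


def error_response(exc: Exception) -> dict:
    """Wrap the categorized error in the documented success=false envelope."""
    return {"success": False, "visualization_data": None,
            "error": asdict(categorize(exc))}
```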
Technology Stack
- Framework: FastAPI (HTTP server)
- AI API: Google Gemini (gemini-2.5-flash-image)
- Image Processing: Pillow (PIL)
- Data Validation: Pydantic v2
- Deployment: Blaxel platform
- Language: Python 3.11+
Performance Characteristics
- Latency: 10-20 seconds typical (Gemini API call)
- Throughput: 5 concurrent requests
- Memory: ~300MB per request
- CPU: Minimal (mostly I/O bound)
- Network: ~2-5MB per request (image download)
Installation
Prerequisites
- Python 3.11+
- Gemini API key (Google AI Studio)
Setup
- Install dependencies:
cd visualization-agent
pip install -r requirements.txt
- Configure environment variables:
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY
- Test the agent:
python test_agent.py
Usage
As HTTP Service
Start the FastAPI server:
python main.py
Send a POST request to generate a visualization:
curl -X POST http://localhost:8000/ \
-H "Content-Type: application/json" \
-d '{
"risk_data": {
"seismic_risk": "high",
"flood_risk": "medium",
"location": {"latitude": 14.5995, "longitude": 120.9842}
},
"building_type": "residential_single_family",
"recommendations": {...}
}'
As Python Module
from agent import VisualizationAgent
agent = VisualizationAgent()
visualization_data = agent.generate_visualization(
    risk_data=risk_data,
    building_type="residential_single_family",
    recommendations=recommendations
)
# Access generated image
image_base64 = visualization_data.image_base64
prompt_used = visualization_data.prompt_used
features = visualization_data.features_included
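To view the result outside the UI, the base64 string can be decoded and written to a file. A minimal sketch using only the standard library; the helper name save_image and the output path are hypothetical.

```python
import base64
from pathlib import Path


def save_image(image_base64: str, path: str) -> int:
    """Decode a base64 string and write it to disk; returns the byte count written."""
    # validate=True rejects characters outside the base64 alphabet
    raw = base64.b64decode(image_base64, validate=True)
    return Path(path).write_bytes(raw)
```

For example, save_image(visualization_data.image_base64, "sketch.png") would write the generated PNG to the current directory.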
API Reference
Endpoint
POST /
Request Formats
The agent supports three request formats for backward compatibility:
Format 1: Legacy Format (risk_data + building_type)
This format is supported for backward compatibility with older orchestrator versions:
{
"risk_data": {
"location": {...},
"hazards": {...}
},
"building_type": "residential_single_family",
"recommendations": {...} // optional
}
The agent automatically converts this to the new format by:
- Generating a prompt based on building_type
- Creating construction_data from risk_data, building_type, and recommendations
- Processing as a context-aware request
Format 2: New Format (prompt + construction_data)
This is the recommended format for new integrations:
{
"prompt": "A disaster-resistant school building in the Philippines",
"construction_data": {
"building_type": "institutional_school",
"location": {...},
"risk_data": {...},
"recommendations": {...}
},
"config": {
"aspect_ratio": "16:9",
"image_size": "1K"
}
}
Format 3: Basic Format (prompt only)
For simple use cases without context:
{
"prompt": "A modern disaster-resistant building in the Philippines"
}
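How a payload could be classified into one of these three formats, and how a legacy payload could be mapped to the new shape, is sketched below. Both function names are hypothetical, and the prompt wording in legacy_to_new is a placeholder: the agent's actual generated prompt text is not shown in this README.

```python
def detect_format(payload: dict) -> str:
    """Return 'new', 'basic', or 'legacy' for a request payload."""
    if "prompt" in payload:
        return "new" if "construction_data" in payload else "basic"
    if "risk_data" in payload and "building_type" in payload:
        return "legacy"
    raise ValueError("unrecognized request format")


def legacy_to_new(payload: dict) -> dict:
    """Convert a legacy payload to the prompt + construction_data shape."""
    risk_data = payload["risk_data"]
    construction_data = {
        "building_type": payload["building_type"],
        "location": risk_data.get("location"),
        "risk_data": risk_data,
        "recommendations": payload.get("recommendations"),
    }
    # Placeholder wording; an assumption for this sketch only.
    prompt = f"A disaster-resistant {payload['building_type']} building in the Philippines"
    return {"prompt": prompt, "construction_data": construction_data}
```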
Request Format Details
Complete Request Schema
{
"risk_data": {
"seismic_risk": str, # "low", "medium", "high"
"flood_risk": str, # "low", "medium", "high"
"volcanic_risk": str, # "low", "medium", "high"
"location": {
"latitude": float, # 4.0 to 21.0 (Philippines)
"longitude": float, # 116.0 to 127.0 (Philippines)
"municipality": str, # Optional
"province": str # Optional
},
"hazards": [ # Optional, detailed hazard list
{
"type": str, # "seismic", "volcanic", "hydrometeorological"
"category": str, # Specific hazard category
"severity": str, # "low", "medium", "high"
"description": str # Human-readable description
}
]
},
"building_type": str, # See Building Types section
"recommendations": { # Optional
"structural": [
{
"category": str,
"priority": str,
"description": str
}
]
}
}
Minimal Request
{
"risk_data": {
"seismic_risk": "high",
"flood_risk": "low",
"volcanic_risk": "low",
"location": {
"latitude": 14.5995,
"longitude": 120.9842
}
},
"building_type": "residential_single_family"
}
Response Format
Success Response
{
"success": true,
"visualization_data": {
"image_base64": str, # Base64-encoded PNG image
"prompt_used": str, # Full prompt sent to Gemini
"model_version": str, # "gemini-2.5-flash-image"
"generation_timestamp": str, # ISO 8601 format
"image_format": "PNG", # Always PNG
"resolution": "1024x1024", # Always 1024x1024
"features_included": [str] # List of disaster-resistant features
},
"error": null
}
Error Response
{
"success": false,
"visualization_data": null,
"error": {
"code": str, # Error code (see Error Codes section)
"message": str, # Human-readable error message
"retry_possible": bool # Whether retry is recommended
}
}
Request Parameters
risk_data (required)
| Field | Type | Required | Description |
|---|---|---|---|
| seismic_risk | string | Yes | Overall seismic risk level: "low", "medium", "high" |
| flood_risk | string | Yes | Overall flood risk level: "low", "medium", "high" |
| volcanic_risk | string | Yes | Overall volcanic risk level: "low", "medium", "high" |
| location | object | Yes | Geographic location data |
| hazards | array | No | Detailed hazard information |
location (required)
| Field | Type | Required | Description |
|---|---|---|---|
| latitude | float | Yes | Latitude (4.0 to 21.0 for Philippines) |
| longitude | float | Yes | Longitude (116.0 to 127.0 for Philippines) |
| municipality | string | No | Municipality name |
| province | string | No | Province name |
building_type (required)
| Value | Description |
|---|---|
| residential_single_family | Single-family home |
| residential_multi_family | Multi-family residential (2-4 units) |
| residential_high_rise | High-rise apartment building |
| commercial_office | Modern office building |
| commercial_retail | Retail shopping center |
| industrial_warehouse | Industrial warehouse facility |
| institutional_school | School building |
| institutional_hospital | Hospital or healthcare facility |
| infrastructure_bridge | Bridge structure |
| mixed_use | Mixed-use development |
recommendations (optional)
Optional construction recommendations from the research agent. If provided, they may influence feature selection.
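The constraints in the parameter tables above can be checked with a few lines of plain Python. This is an illustrative sketch only: the README states that the service validates requests with Pydantic models, and the function name validate_request and its error messages are hypothetical.

```python
RISK_LEVELS = ("low", "medium", "high")
BUILDING_TYPES = (
    "residential_single_family", "residential_multi_family",
    "residential_high_rise", "commercial_office", "commercial_retail",
    "industrial_warehouse", "institutional_school", "institutional_hospital",
    "infrastructure_bridge", "mixed_use",
)


def validate_request(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the legacy payload looks valid."""
    problems = []
    risk = payload.get("risk_data") or {}
    for field in ("seismic_risk", "flood_risk", "volcanic_risk"):
        if risk.get(field) not in RISK_LEVELS:
            problems.append(f"{field} must be one of low/medium/high")
    location = risk.get("location") or {}
    lat, lon = location.get("latitude"), location.get("longitude")
    # Coordinate bounds are the Philippine ranges given in the location table
    if not isinstance(lat, (int, float)) or not 4.0 <= lat <= 21.0:
        problems.append("latitude must be between 4.0 and 21.0")
    if not isinstance(lon, (int, float)) or not 116.0 <= lon <= 127.0:
        problems.append("longitude must be between 116.0 and 127.0")
    if payload.get("building_type") not in BUILDING_TYPES:
        problems.append("unknown building_type")
    return problems
```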
Response Fields
visualization_data
| Field | Type | Description |
|---|---|---|
| image_base64 | string | Base64-encoded PNG image data |
| prompt_used | string | Complete prompt sent to Gemini API |
| model_version | string | Gemini model version used |
| generation_timestamp | string | ISO 8601 timestamp of generation |
| image_format | string | Always "PNG" |
| resolution | string | Always "1024x1024" |
| features_included | array | List of disaster-resistant features shown |
error
| Field | Type | Description |
|---|---|---|
| code | string | Error code (see Error Codes section) |
| message | string | Human-readable error description |
| retry_possible | boolean | Whether the request can be retried |
HTTP Status Codes
| Status Code | Description |
|---|---|
| 200 | Success (check success field in response) |
| 400 | Bad Request (invalid input parameters) |
| 401 | Unauthorized (invalid API key) |
| 429 | Too Many Requests (rate limit exceeded) |
| 500 | Internal Server Error |
| 504 | Gateway Timeout (generation took > 30 seconds) |
Rate Limits
- Free Tier: 60 requests per minute
- Paid Tier: Varies by plan
- Concurrent Requests: Maximum 5 simultaneous requests
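A caller can stay under the five-request concurrency limit with a semaphore. A minimal asyncio sketch; run_limited is a hypothetical helper, and each job is assumed to be a zero-argument coroutine function that performs one request.

```python
import asyncio

MAX_CONCURRENT = 5  # concurrent request limit stated above


async def run_limited(jobs, limit: int = MAX_CONCURRENT):
    """Run coroutine functions with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each job waits here until one of the `limit` slots is free
        async with semaphore:
            return await job()

    # gather preserves the order of `jobs` in the returned list
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Results come back in the same order as the submitted jobs, which keeps them easy to match to their requests.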
Authentication
When deployed on Blaxel:
curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {BLAXEL_API_KEY}" \
-d @request.json
Content Type
All requests and responses use application/json.
Supported Building Types
The agent supports 10 building types across six categories, each with specific architectural characteristics:
Residential Buildings
| Building Type | Code | Description | Typical Features |
|---|---|---|---|
| Single Family | residential_single_family | Single-family home | 1-2 stories, pitched roof, residential scale |
| Multi Family | residential_multi_family | Multi-family residential (2-4 units) | 2-3 stories, multiple entrances, shared spaces |
| High Rise | residential_high_rise | High-rise apartment building | 10+ stories, elevator core, balconies |
Commercial Buildings
| Building Type | Code | Description | Typical Features |
|---|---|---|---|
| Office | commercial_office | Modern office building | 3-10 stories, glass facade, modern design |
| Retail | commercial_retail | Retail shopping center | 1-2 stories, large windows, parking area |
Industrial Buildings
| Building Type | Code | Description | Typical Features |
|---|---|---|---|
| Warehouse | industrial_warehouse | Industrial warehouse facility | Large open space, high ceilings, loading docks |
Institutional Buildings
| Building Type | Code | Description | Typical Features |
|---|---|---|---|
| School | institutional_school | School building with classrooms | 1-3 stories, multiple wings, playground area |
| Hospital | institutional_hospital | Hospital or healthcare facility | 3-5 stories, emergency entrance, medical design |
Infrastructure
| Building Type | Code | Description | Typical Features |
|---|---|---|---|
| Bridge | infrastructure_bridge | Bridge structure | Span structure, support columns, roadway |
Mixed Use
| Building Type | Code | Description | Typical Features |
|---|---|---|---|
| Mixed Use | mixed_use | Mixed-use development | Commercial ground floor, residential upper floors |
Building Type Selection Guide
Choose the appropriate building type based on your project:
- Residential Projects: Use residential_single_family for houses, residential_multi_family for apartments/condos, residential_high_rise for towers
- Commercial Projects: Use commercial_office for office buildings, commercial_retail for shops/malls
- Industrial Projects: Use industrial_warehouse for factories, warehouses, distribution centers
- Public Buildings: Use institutional_school for schools, institutional_hospital for hospitals/clinics
- Infrastructure: Use infrastructure_bridge for bridges, overpasses
- Mixed Projects: Use mixed_use for buildings combining residential and commercial spaces
Prompt Generation Strategy
The Visualization Agent builds each prompt from a fixed template, filling it with hazard-specific features and Philippine context to produce risk-aware architectural visualizations.
Prompt Template Structure
[Building Type Description] in the Philippines, designed for disaster resistance.
Key Features:
- [Hazard-specific feature 1]
- [Hazard-specific feature 2]
- [Hazard-specific feature 3]
Architectural Style: [Philippine context]
Setting: [Tropical environment with appropriate landscaping]
Perspective: [Exterior view showing structural features]
Style: Architectural sketch, professional rendering
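Filling this template is plain string assembly. The sketch below reproduces the layout seen in the prompt_used values of the example responses in this README; build_prompt and its parameter names are hypothetical.

```python
def build_prompt(building_description: str, features: list[str],
                 style: str, setting: str, perspective: str) -> str:
    """Assemble a prompt following the template structure above."""
    # One "- feature" bullet per line under the Key Features heading
    feature_lines = "\n".join(f"- {feature}" for feature in features)
    return (
        f"{building_description} in the Philippines, designed for disaster resistance.\n\n"
        f"Key Features:\n{feature_lines}\n\n"
        f"Architectural Style: {style}\n"
        f"Setting: {setting}\n"
        f"Perspective: {perspective}\n"
        "Style: Architectural sketch, professional rendering"
    )
```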
Feature Prioritization
When multiple hazards are present, the agent prioritizes features based on:
- Risk Level: High-risk hazards get priority over medium/low
- Structural Impact: Features that affect the entire building structure
- Visual Prominence: Features that are clearly visible in architectural sketches
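The first criterion, risk level, can be sketched as a stable sort. Structural impact and visual prominence are not quantified in this README, so the sketch leaves them out; the function name, the (feature, risk_level) input shape, and the limit of four features are assumptions.

```python
RANK = {"high": 0, "medium": 1, "low": 2}


def prioritize(candidates: list[tuple[str, str]], limit: int = 4) -> list[str]:
    """Order (feature, risk_level) pairs with high risk first and keep the top few."""
    # sorted() is stable, so features with equal risk keep their input order
    ordered = sorted(candidates, key=lambda item: RANK[item[1]])
    seen, result = set(), []
    for feature, _level in ordered:
        if feature not in seen:  # drop duplicates contributed by several hazards
            seen.add(feature)
            result.append(feature)
    return result[:limit]
```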
Philippine Context Integration
The agent automatically adds contextual elements:
- Tropical climate considerations (ventilation, sun protection)
- Local architectural styles and materials
- Appropriate landscaping (palm trees, tropical vegetation)
- Regional building practices
Hazard-to-Feature Mappings
The agent uses detailed mappings to translate risk data into visual features:
Seismic Hazards
| Hazard Type | Risk Level | Visual Features |
|---|---|---|
| Active Fault | High | Reinforced concrete frame with visible cross-bracing |
| Ground Shaking | High | Moment-resisting frames, shear walls |
| Liquefaction | Medium-High | Deep pile foundation visible at base |
| Earthquake | All | Structural reinforcements, seismic joints |
Example Features:
- Reinforced concrete frame with cross-bracing
- Moment-resisting frames
- Shear walls
- Deep pile foundations
- Seismic isolation systems
Volcanic Hazards
| Hazard Type | Risk Level | Visual Features |
|---|---|---|
| Ashfall | High | Steep-pitched roof (45°+ angle) for ash shedding |
| Pyroclastic Flow | High | Reinforced concrete construction, protective barriers |
| Lahar | Medium-High | Elevated foundation, diversion channels |
| Volcanic Activity | All | Robust roof structure, sealed openings |
Example Features:
- Steep-pitched roof for ash shedding
- Reinforced concrete construction
- Protective barriers and walls
- Elevated foundation
- Sealed ventilation systems
Hydrometeorological Hazards
| Hazard Type | Risk Level | Visual Features |
|---|---|---|
| Flood | High | Elevated first floor on stilts (2-3 meters) |
| Storm Surge | High | Coastal reinforcement, breakwaters |
| Severe Winds | High | Aerodynamic roof design, hurricane straps |
| Typhoon | High | Wind-resistant construction, storm shutters |
| Landslide | Medium-High | Retaining walls, terraced foundation |
Example Features:
- Elevated first floor on stilts
- Raised foundation
- Flood barriers
- Aerodynamic roof design
- Hurricane straps
- Storm shutters
- Retaining walls
- Terraced foundation
Multi-Hazard Scenarios
When multiple hazards are present, the agent combines features so that a single element can address more than one hazard:
Example: High Seismic + High Flood
- Elevated foundation on reinforced concrete piles
- Moment-resisting frames visible in structure
- Cross-bracing on elevated sections
Example: High Volcanic + Medium Wind
- Steep-pitched roof with aerodynamic design
- Reinforced concrete construction
- Storm shutters on windows
Example Requests and Responses
Example 1: Residential Building with High Seismic Risk
Request:
{
"risk_data": {
"seismic_risk": "high",
"flood_risk": "low",
"volcanic_risk": "low",
"location": {
"latitude": 14.5995,
"longitude": 120.9842,
"municipality": "Manila",
"province": "Metro Manila"
},
"hazards": [
{
"type": "seismic",
"category": "active_fault",
"severity": "high",
"description": "Near active fault line"
}
]
},
"building_type": "residential_single_family",
"recommendations": {
"structural": [
{
"category": "foundation",
"priority": "critical",
"description": "Use reinforced concrete foundation with seismic isolation"
}
]
}
}
Response:
{
"success": true,
"visualization_data": {
"image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
"prompt_used": "Single-family home in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Reinforced concrete frame with visible cross-bracing\n- Moment-resisting frames for earthquake protection\n- Deep pile foundation visible at base\n\nArchitectural Style: Modern Filipino residential with tropical design elements\nSetting: Tropical environment with palm trees and lush vegetation\nPerspective: Exterior view showing structural reinforcements\nStyle: Architectural sketch, professional rendering",
"model_version": "gemini-2.5-flash-image",
"generation_timestamp": "2024-01-15T10:30:45.123Z",
"image_format": "PNG",
"resolution": "1024x1024",
"features_included": [
"Reinforced concrete frame with cross-bracing",
"Moment-resisting frames",
"Deep pile foundation"
]
}
}
Example 2: Commercial Building with Flood Risk
Request:
{
"risk_data": {
"seismic_risk": "low",
"flood_risk": "high",
"volcanic_risk": "low",
"location": {
"latitude": 10.3157,
"longitude": 123.8854,
"municipality": "Cebu City",
"province": "Cebu"
},
"hazards": [
{
"type": "hydrometeorological",
"category": "flood",
"severity": "high",
"description": "Flood-prone area"
}
]
},
"building_type": "commercial_office"
}
Response:
{
"success": true,
"visualization_data": {
"image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
"prompt_used": "Modern office building in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Elevated first floor on reinforced concrete stilts (2-3 meters)\n- Flood barriers around perimeter\n- Water-resistant materials for lower levels\n\nArchitectural Style: Contemporary Filipino commercial architecture\nSetting: Urban tropical environment with flood management features\nPerspective: Exterior view showing elevated foundation\nStyle: Architectural sketch, professional rendering",
"model_version": "gemini-2.5-flash-image",
"generation_timestamp": "2024-01-15T10:32:18.456Z",
"image_format": "PNG",
"resolution": "1024x1024",
"features_included": [
"Elevated first floor on stilts",
"Flood barriers",
"Water-resistant construction"
]
}
}
Example 3: Multi-Hazard Scenario
Request:
{
"risk_data": {
"seismic_risk": "high",
"flood_risk": "medium",
"volcanic_risk": "high",
"location": {
"latitude": 13.2572,
"longitude": 123.8144,
"municipality": "Legazpi",
"province": "Albay"
},
"hazards": [
{
"type": "volcanic",
"category": "ashfall",
"severity": "high"
},
{
"type": "seismic",
"category": "earthquake",
"severity": "high"
}
]
},
"building_type": "institutional_school"
}
Response:
{
"success": true,
"visualization_data": {
"image_base64": "iVBORw0KGgoAAAANSUhEUgAA...(truncated)",
"prompt_used": "School building with classrooms in the Philippines, designed for disaster resistance.\n\nKey Features:\n- Steep-pitched reinforced concrete roof for volcanic ash shedding\n- Reinforced concrete frame with seismic cross-bracing\n- Moment-resisting frames for earthquake protection\n- Protective barriers around building perimeter\n\nArchitectural Style: Institutional Filipino architecture with disaster-resistant design\nSetting: Tropical environment near volcanic area with protective landscaping\nPerspective: Exterior view showing roof design and structural reinforcements\nStyle: Architectural sketch, professional rendering",
"model_version": "gemini-2.5-flash-image",
"generation_timestamp": "2024-01-15T10:35:22.789Z",
"image_format": "PNG",
"resolution": "1024x1024",
"features_included": [
"Steep-pitched roof for ash shedding",
"Reinforced concrete frame with cross-bracing",
"Moment-resisting frames",
"Protective barriers"
]
}
}
Example 4: Error Response
Request:
{
"risk_data": {...},
"building_type": "residential_single_family"
}
Response (Invalid API Key):
{
"success": false,
"visualization_data": null,
"error": {
"code": "AUTH_ERROR",
"message": "Invalid or missing Gemini API key. Please check your GEMINI_API_KEY environment variable.",
"retry_possible": false
}
}
Error Handling
The agent provides comprehensive error handling with detailed error codes and messages.
Error Codes
| Error Code | Description | Retry Possible | Recommended Action |
|---|---|---|---|
| AUTH_ERROR | Invalid or missing API key | No | Check GEMINI_API_KEY environment variable |
| RATE_LIMIT | API quota exceeded | Yes | Wait and retry after delay (typically 60 seconds) |
| GENERATION_FAILED | Image generation failed | Yes | Retry with same or modified prompt |
| NETWORK_ERROR | Connection issues | Yes | Check internet connection and retry |
| TIMEOUT | Generation took longer than 30 seconds | Yes | Retry or simplify prompt |
| INVALID_INPUT | Invalid request parameters | No | Check request format and parameters |
| MODEL_ERROR | Gemini model error | Yes | Retry or contact support |
Error Response Format
All errors follow this structure:
{
"success": false,
"visualization_data": null,
"error": {
"code": "ERROR_CODE",
"message": "Human-readable error description",
"retry_possible": true/false
}
}
Retry Strategy
For errors with retry_possible: true:
- Rate Limit Errors: Wait 60 seconds before retrying
- Network Errors: Retry immediately, then with exponential backoff (2s, 4s, 8s)
- Generation Failures: Retry up to 2 times with same prompt
- Timeouts: Retry once, then consider simplifying the prompt
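A client-side sketch of this strategy follows. The delay schedules are read from the list above; limiting rate-limit errors to a single retry is an assumption, since the list gives only the wait time. The function call_with_retry and its send callable, which is expected to return a parsed response dict, are hypothetical.

```python
import time

# Seconds to wait before each retry, per error code (from the list above).
RETRY_DELAYS = {
    "RATE_LIMIT": [60],             # one retry after 60 s (retry count assumed)
    "NETWORK_ERROR": [0, 2, 4, 8],  # immediate retry, then exponential backoff
    "GENERATION_FAILED": [0, 0],    # up to two retries with the same prompt
    "TIMEOUT": [0],                 # a single retry
}


def call_with_retry(send, sleep=time.sleep):
    """Call send() until it succeeds or the schedule for its error code runs out."""
    attempts = {}  # error code -> retries already used
    while True:
        response = send()
        if response.get("success"):
            return response
        error = response.get("error") or {}
        code = error.get("code")
        delays = RETRY_DELAYS.get(code, [])
        used = attempts.get(code, 0)
        if not error.get("retry_possible") or used >= len(delays):
            return response  # not retryable, or retries exhausted
        sleep(delays[used])
        attempts[code] = used + 1
```

Passing sleep as a parameter lets tests substitute a recorder for time.sleep. Codes without a schedule, such as MODEL_ERROR, are returned to the caller without retrying.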
Error Logging
All errors are logged with full context:
- Request parameters
- Error type and message
- Timestamp
- Stack trace (for debugging)
Example log entry:
2024-01-15 10:30:45 ERROR [VisualizationAgent] Image generation failed
Error: RATE_LIMIT
Message: API quota exceeded
Request: building_type=residential_single_family, location=Manila
Retry: true
Deployment
Prerequisites
Before deploying, ensure you have:
Blaxel CLI installed:
pip install blaxel
Blaxel account and workspace:
- Sign up at blaxel.ai
- Create a workspace
- Get your API key
Gemini API key:
- Get API key from Google AI Studio
- Add to .env file
Local Development
Install dependencies:
cd visualization-agent
pip install -r requirements.txt
Configure environment:
cp .env.example .env
# Edit .env and add:
# GEMINI_API_KEY=your_api_key_here
Run locally:
python main.py
Test the agent:
python test_agent.py
Blaxel Platform Deployment
Step 1: Configure Environment Variables
Create or update .env file:
GEMINI_API_KEY=your_gemini_api_key
GEMINI_MODEL=gemini-2.5-flash-image
Step 2: Review Configuration
Check blaxel.toml configuration:
name = "visualization-agent"
type = "agent"
[env]
GEMINI_API_KEY = "${GEMINI_API_KEY}"
GEMINI_MODEL = "${GEMINI_MODEL}"
[runtime]
timeout = 30
memory = 512
[entrypoint]
prod = "python main.py"
[[triggers]]
id = "trigger-visualization-agent"
type = "http"
timeout = 30
[triggers.configuration]
path = "agents/visualization-agent/process"
retry = 1
authenticationType = "private"
Step 3: Deploy to Blaxel
cd visualization-agent
bl deploy --env-file .env
Step 4: Verify Deployment
The agent will be available at:
https://run.blaxel.ai/{workspace}/agents/visualization-agent
Test the deployed agent:
curl -X POST https://run.blaxel.ai/{workspace}/agents/visualization-agent \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {BLAXEL_API_KEY}" \
-d @test_request.json
Configuration Options
Runtime Configuration
| Parameter | Default | Description |
|---|---|---|
| timeout | 30 | Maximum execution time (seconds) |
| memory | 512 | Memory limit (MB) |
| retry | 1 | Number of retry attempts |
Model Configuration
| Parameter | Default | Description |
|---|---|---|
| GEMINI_MODEL | gemini-2.5-flash-image | Gemini model version |
| GEMINI_API_KEY | (required) | Google Gemini API key |
Image Configuration
- Resolution: 1024x1024 (fixed)
- Format: PNG
- Watermark: SynthID (automatic)
Integration with Orchestrator
The orchestrator agent calls the visualization agent automatically. To integrate:
Update orchestrator's blaxel.toml:
[[resources]]
id = "visualization-agent"
type = "agent"
name = "visualization-agent"
Orchestrator calls visualization agent:
visualization_response = await self.execute_visualization(
    risk_data=risk_data,
    building_type=building_type,
    recommendations=recommendations
)
Response flows to Gradio UI:
- Image displayed in visualization tab
- Metadata shown alongside image
- Features list displayed
Monitoring and Logs
View Logs
bl logs visualization-agent
Monitor Performance
Key metrics to monitor:
- Generation Time: Should be < 30 seconds
- Success Rate: Should be > 95%
- Error Rate: Monitor for rate limit errors
- Memory Usage: Should stay under 512MB
Common Issues
Timeout Errors:
- Increase timeout in blaxel.toml
- Simplify prompts
- Check Gemini API status
Rate Limit Errors:
- Implement request throttling
- Upgrade Gemini API quota
- Add retry logic with backoff
Memory Issues:
- Increase memory limit in blaxel.toml
- Optimize image processing
- Check for memory leaks
Scaling Considerations
For high-volume deployments:
- Increase Memory: Set to 1024MB for better performance
- Add Caching: Cache generated images for identical requests
- Load Balancing: Deploy multiple instances
- Rate Limiting: Implement request queuing
- Monitoring: Set up alerts for errors and performance
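Caching identical requests needs a key that ignores JSON key order. One possible approach, shown here with an in-memory dict, is to hash the canonical JSON form of the payload. The class and function names are hypothetical, and a production cache would also need eviction and size limits.

```python
import hashlib
import json


def cache_key(payload: dict) -> str:
    """Hash the canonical JSON form so key order does not change the result."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ImageCache:
    """In-memory cache of successful responses, keyed by request payload."""

    def __init__(self):
        self._store = {}

    def get_or_generate(self, payload: dict, generate):
        key = cache_key(payload)
        if key in self._store:
            return self._store[key]
        response = generate(payload)
        if response.get("success"):  # failures are not cached, so they can be retried
            self._store[key] = response
        return response
```

Because repeated generations with the same prompt can produce different images, a cache hit pins the first result for that request.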
Security Best Practices
API Key Management:
- Never commit API keys to version control
- Use environment variables only
- Rotate keys regularly
Authentication:
- Use authenticationType = "private" in blaxel.toml
- Require BLAXEL_API_KEY for all requests
- Validate request signatures
Input Validation:
- Validate all input parameters
- Sanitize location data
- Check building type against allowed values
Output Security:
- Ensure generated images don't contain sensitive data
- Add watermarks (automatic with SynthID)
- Log all generation requests
Testing
Unit Tests
Run the unit test suite:
python test_agent.py
The test suite includes:
Agent Initialization Tests
- Verify agent initializes with correct configuration
- Check Gemini API client setup
- Validate environment variable loading
Building Type Tests
- Residential Single Family: High seismic risk scenario
- Commercial Office: Flood risk scenario
- Institutional School: Multiple hazards (volcanic + seismic)
- Industrial Warehouse: Wind resistance scenario
- Infrastructure Bridge: Multi-hazard scenario
Risk Scenario Tests
- High seismic risk with active fault
- High flood risk in coastal area
- High volcanic risk with ashfall
- Multiple hazards combined
- Low risk baseline scenario
Error Handling Tests
- Invalid API key handling
- Network error simulation
- Timeout handling
- Rate limit error handling
- Invalid input validation
Response Format Tests
- Base64 encoding validation
- Metadata completeness
- Timestamp format verification
- Features list accuracy
Integration Tests
Test the HTTP endpoint:
python test_http_endpoint.py
Integration tests cover:
- POST endpoint functionality
- Request validation
- Response format compatibility
- Error response handling
- Orchestrator integration
Manual Testing
Test with real Gemini API:
Set up environment:
export GEMINI_API_KEY=your_api_key
Run test script:
python test_agent.py
Verify output:
- Check generated images are valid PNG files
- Verify disaster-resistant features are visible
- Confirm metadata is accurate
- Validate generation time < 30 seconds
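The PNG check can be automated without an imaging library by reading the file signature and the IHDR chunk, which stores width and height as big-endian 32-bit integers. A minimal sketch; png_dimensions is a hypothetical helper and it inspects only the header, not the full image data.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(image_base64: str) -> tuple[int, int]:
    """Return (width, height) from the IHDR chunk, or raise ValueError if not a PNG."""
    raw = base64.b64decode(image_base64)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type, then IHDR data
    if len(raw) < 24 or raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", raw[16:24])
    return width, height
```

A generated image that matches the documented resolution should yield (1024, 1024).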
Test Coverage
Current test coverage:
- Prompt Generation: 100%
- API Client: 95% (excluding live API calls)
- Response Formatting: 100%
- Error Handling: 100%
- Integration: 90%
Performance Testing
Test performance metrics:
# Test generation time
time python test_agent.py
# Test concurrent requests
python -m pytest test_concurrent.py -n 5
# Test memory usage
python -m memory_profiler test_agent.py
Expected performance:
- Generation Time: 10-20 seconds average
- Memory Usage: < 300MB per request
- Success Rate: > 95%
- Concurrent Requests: 5 simultaneous requests supported
Performance
- Generation Time: 10-20 seconds typical
- Image Size: ~500KB - 2MB per image
- Resolution: 1024x1024 pixels
- Format: PNG with SynthID watermark
Troubleshooting
Common Issues and Solutions
Issue: "Invalid API Key" Error
Symptoms:
{
"error": {
"code": "AUTH_ERROR",
"message": "Invalid or missing Gemini API key"
}
}
Solutions:
- Check .env file contains GEMINI_API_KEY=your_key
- Verify API key is valid at Google AI Studio
- Ensure no extra spaces or quotes around the key
- Restart the agent after updating .env
Issue: Rate Limit Exceeded
Symptoms:
{
"error": {
"code": "RATE_LIMIT",
"message": "API quota exceeded"
}
}
Solutions:
- Wait 60 seconds before retrying
- Check your API quota at Google AI Studio
- Upgrade to higher quota tier if needed
- Implement request throttling in orchestrator
Issue: Generation Timeout
Symptoms:
- Request takes longer than 30 seconds
- Timeout error returned
Solutions:
- Simplify the prompt (reduce number of features)
- Check Gemini API status
- Increase timeout in blaxel.toml (not recommended)
- Retry the request
Issue: Poor Quality Visualizations
Symptoms:
- Generated images don't show disaster-resistant features clearly
- Building type doesn't match expectations
Solutions:
- Verify risk data is accurate and complete
- Check building type is correct
- Ensure hazard severity levels are set appropriately
- Review prompt generation logic in agent.py
Issue: Network Errors
Symptoms:
{
"error": {
"code": "NETWORK_ERROR",
"message": "Connection failed"
}
}
Solutions:
- Check internet connection
- Verify firewall allows HTTPS to Google APIs
- Check proxy settings if applicable
- Retry with exponential backoff
Issue: Memory Errors
Symptoms:
- Agent crashes with out-of-memory error
- Slow performance
Solutions:
- Increase memory limit in blaxel.toml to 1024MB
- Check for memory leaks in custom code
- Reduce concurrent request limit
- Monitor memory usage with profiling tools
Debug Mode
Enable debug logging:
import logging
logging.basicConfig(level=logging.DEBUG)
This will show:
- Detailed API request/response logs
- Prompt generation steps
- Error stack traces
- Performance metrics
Getting Help
If you encounter issues not covered here:
- Check the logs: bl logs visualization-agent
- Review the Gemini API documentation
- Check the main project documentation
- Contact support with:
  - Error message and code
  - Request payload (sanitized)
  - Timestamp of the error
  - Agent version and configuration
Limitations
Technical Limitations
- Internet Connection: Requires active internet for Gemini API
- API Rate Limits: Subject to Gemini API quotas (varies by tier)
- Generation Time: 10-30 seconds per image (typically 10-20 seconds; cannot be reduced)
- Resolution: Fixed at 1024x1024 pixels
- Format: PNG only (no JPEG, SVG, or other formats)
- Watermark: SynthID watermark automatically added (cannot be removed)
Functional Limitations
- Artistic Interpretation: Generated images are conceptual sketches, not engineering drawings
- Feature Visibility: Some structural features may not be clearly visible in exterior views
- Accuracy: AI-generated images may not perfectly represent all specified features
- Consistency: Multiple generations with same prompt may produce different results
- Detail Level: Cannot generate detailed floor plans or technical specifications
Geographic Limitations
- Philippine Context: Optimized for Philippine architecture and climate
- Location Data: Requires valid Philippine coordinates
- Regional Styles: May not accurately represent all regional architectural variations
Use Case Limitations
Appropriate Uses:
- Conceptual visualization for stakeholders
- Initial design exploration
- Communication tool for non-technical audiences
- Marketing and presentation materials
Inappropriate Uses:
- Engineering drawings or construction blueprints
- Structural analysis or calculations
- Building permit applications
- Detailed cost estimation basis
- Legal or contractual documentation
Integration
The Visualization Agent integrates with:
- Orchestrator Agent: Receives requests and returns visualization data
- Gradio UI: Displays generated images in the web interface
- Risk Assessment Agent: Uses risk data to inform feature selection
Environment Variables
Required Variables
- GEMINI_API_KEY (required): Google Gemini API key for image generation
  - Get your API key from: https://makersuite.google.com/app/apikey
  - Alternative: GOOGLE_API_KEY can be used instead of GEMINI_API_KEY
Optional Variables
- VISUALIZATION_MODEL (optional): Gemini model version to use
  - Default: gemini-2.5-flash-image
  - Options: gemini-2.5-flash-image, gemini-3-pro-image-preview
  - Example: VISUALIZATION_MODEL=gemini-2.5-flash-image
- VISUALIZATION_OUTPUT_DIR (optional): Directory where generated images will be saved
  - Default: ./generated_images
  - Example: VISUALIZATION_OUTPUT_DIR=./my_images
  - Note: Directory will be created automatically if it doesn't exist
Environment Variable Priority
The agent loads configuration in the following priority order (highest to lowest):
- Constructor parameters: Values passed directly to VisualizationAgent()
- Environment variables: Values from .env file or system environment
- Default values: Built-in defaults
Example:
# Priority 1: Constructor parameter (highest)
agent = VisualizationAgent(model="gemini-3-pro-image-preview")
# Priority 2: Environment variable
# VISUALIZATION_MODEL=gemini-2.5-flash-image
# Priority 3: Default value (lowest)
# Default: gemini-2.5-flash-image
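The priority order can be expressed as a small resolution function. This is a sketch of the documented behavior, not the agent's actual code; resolve_model and DEFAULT_MODEL are hypothetical names.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash-image"


def resolve_model(model=None, env=None):
    """Pick the model name: constructor argument, then environment, then default."""
    env = os.environ if env is None else env
    if model:
        # Priority 1: explicit constructor parameter.
        return model
    # Priority 2: environment variable; Priority 3: built-in default.
    return env.get("VISUALIZATION_MODEL") or DEFAULT_MODEL
```

An empty VISUALIZATION_MODEL value is treated as unset here, so it falls through to the default. That is a design choice of this sketch.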
Setting Environment Variables
Local Development
Create a .env file in the visualization-agent directory:
# Required
GEMINI_API_KEY=your_gemini_api_key_here
# Optional
VISUALIZATION_MODEL=gemini-2.5-flash-image
VISUALIZATION_OUTPUT_DIR=./generated_images
Blaxel Deployment
Set environment variables in blaxel.toml:
[env]
GEMINI_API_KEY = "${GEMINI_API_KEY}"
VISUALIZATION_MODEL = "${VISUALIZATION_MODEL}"
VISUALIZATION_OUTPUT_DIR = "${VISUALIZATION_OUTPUT_DIR}"
Then deploy with environment file:
bl deploy --env-file .env
System Environment
Set environment variables in your shell:
# Bash/Zsh
export GEMINI_API_KEY=your_api_key
export VISUALIZATION_MODEL=gemini-2.5-flash-image
export VISUALIZATION_OUTPUT_DIR=./generated_images
# Windows Command Prompt
set GEMINI_API_KEY=your_api_key
set VISUALIZATION_MODEL=gemini-2.5-flash-image
set VISUALIZATION_OUTPUT_DIR=./generated_images
# Windows PowerShell
$env:GEMINI_API_KEY="your_api_key"
$env:VISUALIZATION_MODEL="gemini-2.5-flash-image"
$env:VISUALIZATION_OUTPUT_DIR="./generated_images"
Best Practices
Prompt Optimization
- Be Specific: Include detailed hazard information for better feature selection
- Prioritize Hazards: Focus on the most critical risks (high severity)
- Provide Context: Include location data for better Philippine context
- Use Recommendations: Pass construction recommendations when available
Performance Optimization
- Batch Requests: Group multiple visualizations when possible
- Cache Results: Cache generated images for identical requests
- Async Processing: Use async/await for concurrent requests
- Monitor Quotas: Track API usage to avoid rate limits
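Caching identical requests needs a key that does not depend on dictionary ordering. The sketch below hashes a canonical JSON form of the request. cache_key and VisualizationCache are hypothetical helpers, and the in-memory dictionary stands in for whatever store you actually use.

```python
import hashlib
import json


def cache_key(request):
    """Build a stable key: identical request payloads map to the same SHA-256 digest."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class VisualizationCache:
    """Tiny in-memory cache keyed by request content (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    def get_or_generate(self, request, generate):
        """Return the cached result for this request, generating it on first use."""
        key = cache_key(request)
        if key not in self._store:
            self._store[key] = generate(request)
        return self._store[key]
```

Because generation is not deterministic, a cache hit returns the first image produced for that request rather than a fresh variation.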
Error Handling
- Implement Retries: Retry transient errors with exponential backoff
- Graceful Degradation: Continue without visualization if generation fails
- Log Errors: Log all errors with full context for debugging
- User Feedback: Provide clear error messages to users
Security
- Protect API Keys: Never expose GEMINI_API_KEY in client code
- Validate Input: Always validate and sanitize input parameters
- Rate Limiting: Implement rate limiting to prevent abuse
- Monitor Usage: Track API usage and set up alerts
Integration
- Async Calls: Call visualization agent asynchronously from orchestrator
- Timeout Handling: Set appropriate timeouts (30+ seconds)
- Fallback Logic: Have fallback behavior if visualization fails
- Response Validation: Validate response format before using
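The async call, timeout, and fallback advice can be combined in one wrapper. This is a minimal sketch: visualize_or_none is a hypothetical name, and generate stands for any coroutine function that calls the visualization agent.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def visualize_or_none(generate, request, timeout_seconds=30.0):
    """Await generate(request); return None on timeout or failure instead of raising."""
    try:
        return await asyncio.wait_for(generate(request), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        logger.warning("Visualization timed out after %.1f seconds", timeout_seconds)
        return None
    except Exception:
        # Fallback: log with the stack trace and let the caller continue without an image.
        logger.exception("Visualization failed")
        return None
```

A None result tells the orchestrator to continue without visualization data. The caller should still validate any non-None response before using it.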
Frequently Asked Questions
General Questions
Q: How long does it take to generate a visualization? A: Typically 10-20 seconds, with a maximum timeout of 30 seconds.
Q: Can I generate multiple visualizations for the same building? A: Yes, but each request may produce slightly different results due to AI generation variability.
Q: What image format is returned? A: PNG format, base64-encoded, at 1024x1024 resolution.
Q: Can I remove the SynthID watermark? A: No, the watermark is automatically added by Gemini API and cannot be removed.
Technical Questions
Q: Can I use a different Gemini model? A: Yes, set the VISUALIZATION_MODEL environment variable, but gemini-2.5-flash-image is recommended for speed.
Q: How do I increase the image resolution? A: Currently fixed at 1024x1024. Higher resolutions may be supported in future Gemini models.
Q: Can I generate images without an internet connection? A: No, the agent requires internet access to call the Gemini API.
Q: How many concurrent requests can the agent handle? A: Up to 5 concurrent requests, limited by memory and API quotas.
Integration Questions
Q: How does the orchestrator call the visualization agent? A: Via HTTP POST to the Blaxel endpoint with risk data and building type.
Q: What happens if visualization generation fails? A: The orchestrator continues without visualization data, and the UI shows a message.
Q: Can I call the visualization agent directly from the UI? A: Not recommended. Always call through the orchestrator for proper coordination.
Q: How is the generated image displayed in the UI? A: The Gradio UI decodes the base64 image and displays it in the visualization tab.
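Decoding the returned image outside the UI can look like the following sketch. decode_png is a hypothetical helper, and how you extract the base64 string from the response depends on the response schema, which is not assumed here.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png(image_base64):
    """Decode a base64 string and confirm the bytes start with the PNG signature."""
    data = base64.b64decode(image_base64, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return data
```

The returned bytes can be written to a .png file or passed to an image library. Invalid base64 input also raises ValueError, because binascii.Error is a subclass of it.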
Cost and Limits
Q: How much does it cost to generate a visualization? A: Depends on your Gemini API plan. Check Google AI Studio for pricing.
Q: What are the rate limits? A: Free tier: 60 requests/minute. Paid tiers vary by plan.
Q: Can I increase my API quota? A: Yes, upgrade your Gemini API plan at Google AI Studio.
Q: Is there a limit on the number of visualizations? A: Only limited by your API quota and rate limits.
Troubleshooting
Q: Why am I getting "Invalid API Key" errors? A: Check that GEMINI_API_KEY is set correctly in your .env file and is valid.
Q: Why are my visualizations timing out? A: Check your internet connection and Gemini API status. Simplify prompts if needed.
Q: Why don't the disaster-resistant features show clearly? A: Ensure risk data is accurate and hazard severity is set appropriately. AI generation may vary.
Q: How do I debug generation issues? A: Enable debug logging and check the prompt_used field in the response.
Roadmap
Planned Features
- Multiple View Angles: Generate front, side, and aerial views
- Before/After Comparisons: Show standard vs. disaster-resistant designs
- Higher Resolution: Support 4K resolution with Gemini 3 Pro
- Style Variations: Allow users to choose architectural styles
- Annotation Overlay: Add labels pointing to disaster-resistant features
- Interactive Refinement: Support multi-turn conversations for improvements
- Cost Visualization: Overlay cost information on the visualization
- 3D Models: Generate 3D models in addition to 2D sketches
Future Enhancements
- Caching layer for identical requests
- Batch processing for multiple buildings
- Custom style templates
- Integration with CAD software
- Export to additional formats (SVG, PDF)
- Localization for other languages
Contributing
This agent is part of the Disaster Risk Construction Planner system. For contributions:
- Follow the existing code structure and patterns
- Add tests for new features
- Update documentation
- Ensure compatibility with orchestrator and UI
Version History
- v1.0.0 (2024-01): Initial release
  - Basic visualization generation
  - Support for 10 building types
  - Integration with orchestrator
  - Gemini 2.5 Flash Image model
License
Part of the Disaster Risk Construction Planner system.
Support
For issues or questions:
- Check this documentation first
- Review the troubleshooting section
- Check the main project documentation
- Review Gemini API documentation at Google AI Studio
Acknowledgments
- Google Gemini API for image generation
- Blaxel platform for agent deployment
- Philippines Disaster Risk data sources
- Open-source community for tools and libraries