# Profiling Quick Start
## 1. Set Environment Variables

```bash
export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
```

Or add them to a `.env` file in the project root.
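The `.env` equivalent is a minimal sketch, assuming the standard `KEY=value` dotenv format:

```bash
# .env in the project root — same placeholder credentials as the exports above
LANGFUSE_PUBLIC_KEY="pk-..."
LANGFUSE_SECRET_KEY="sk-..."
```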
## 2. Run an Experiment

```bash
# Run the default experiment (fast vs balanced)
./system_tests/profiling/run_experiment.sh

# Compare different providers
./system_tests/profiling/run_experiment.sh --config providers_comparison.yaml

# Compare fast vs accurate modes for one provider
./system_tests/profiling/run_experiment.sh --config fast_vs_accurate.yaml

# Full matrix: providers × modes
./system_tests/profiling/run_experiment.sh --config full_matrix_comparison.yaml
```
## 3. View Results

```bash
# Start the HTTP server and open a browser
./system_tests/profiling/serve.sh --open

# Or just start the server (visit http://localhost:8080/comparison.html)
./system_tests/profiling/serve.sh
```
## Comparison Types
### Mode Comparison (Same Provider)
Compare fast vs balanced vs accurate modes using the same LLM provider.
Example output files: `fast_20250930.json`, `balanced_20250930.json`, `accurate_20250930.json`
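As a sketch, a mode-comparison config can follow the same run schema as the provider example below, assuming the middle segment of `test_id` selects the mode (the test name and iteration count are illustrative):

```yaml
experiment:
  name: "fast_vs_accurate"
  runs:
    - name: "fast"
      test_id: "settings.openai.toml:fast:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/fast_{{timestamp}}.json"
    - name: "accurate"
      test_id: "settings.openai.toml:accurate:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/accurate_{{timestamp}}.json"
```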
### Provider Comparison (Same Mode)
Compare OpenAI vs Azure vs WatsonX using the same mode (e.g., balanced).
Example output files: `openai_balanced_20250930.json`, `azure_balanced_20250930.json`, `watsonx_balanced_20250930.json`
### Full Matrix Comparison

Compare all combinations of providers and modes (2 providers × 2 modes = 4 experiments).

Example output files: `openai_fast_20250930.json`, `openai_balanced_20250930.json`, `azure_fast_20250930.json`, `azure_balanced_20250930.json`
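A matrix config is the same schema with one run per provider × mode cell; a condensed sketch (same placeholder test as above):

```yaml
experiment:
  name: "full_matrix_comparison"
  runs:
    - name: "openai_fast"
      test_id: "settings.openai.toml:fast:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/openai_fast_{{timestamp}}.json"
    - name: "openai_balanced"
      test_id: "settings.openai.toml:balanced:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/openai_balanced_{{timestamp}}.json"
    # azure_fast and azure_balanced follow the same pattern with settings.azure.toml
```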
## Available Scripts
| Script | Purpose |
|---|---|
| `run_experiment.sh` | Run profiling experiments with YAML config |
| `serve.sh` | Start HTTP server to view results |
| `bin/run_profiling.sh` | Lower-level profiling script with CLI args |
| `bin/profile_digital_sales_tasks.py` | Core Python profiling tool |
## Configuration Files

Located in `config/`:

- `default_experiment.yaml` - Fast vs Balanced comparison
- `fast_vs_accurate.yaml` - Fast vs Accurate comparison
- `providers_comparison.yaml` - OpenAI vs Azure vs WatsonX (same mode)
- `full_matrix_comparison.yaml` - Full provider × mode matrix
- `.secrets.yaml` - Your Langfuse credentials (git-ignored)
## Example: Provider Comparison

Create or use `config/providers_comparison.yaml`:
```yaml
experiment:
  name: "providers_comparison"
  runs:
    - name: "openai_balanced"
      test_id: "settings.openai.toml:balanced:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/openai_balanced_{{timestamp}}.json"
    - name: "azure_balanced"
      test_id: "settings.azure.toml:balanced:test_get_top_account_by_revenue_stream"
      iterations: 3
      output: "experiments/azure_balanced_{{timestamp}}.json"
```
Then run:
```bash
./system_tests/profiling/run_experiment.sh --config providers_comparison.yaml
./system_tests/profiling/serve.sh --open
```
## Color Coding in Charts
The comparison HTML automatically color-codes experiments:
**Modes:**

- Fast = Green 🟢
- Balanced = Blue 🔵
- Accurate = Orange 🟠
**Providers:**

- OpenAI = Teal 🟦
- Azure = Azure Blue 🔷
- WatsonX = IBM Blue 🔵
Combined labels (e.g., `openai_balanced`) get colors based on provider first, then mode.
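To make that precedence concrete, here is a minimal Python sketch of the lookup; the helper and color names are assumptions for illustration, not the viewer's actual code:

```python
# Illustrative provider-first, then-mode color lookup.
# Color values and the helper name are assumptions, not the real comparison.html logic.
PROVIDER_COLORS = {"openai": "teal", "azure": "azureblue", "watsonx": "ibmblue"}
MODE_COLORS = {"fast": "green", "balanced": "blue", "accurate": "orange"}

def pick_color(label: str) -> str:
    """Resolve a color for labels like 'openai_balanced' or 'fast_20250930'."""
    parts = label.lower().split("_")
    for part in parts:                  # provider wins if present
        if part in PROVIDER_COLORS:
            return PROVIDER_COLORS[part]
    for part in parts:                  # otherwise fall back to the mode
        if part in MODE_COLORS:
            return MODE_COLORS[part]
    return "gray"                       # neutral color for unknown labels

assert pick_color("openai_balanced") == "teal"
assert pick_color("balanced_20250930") == "blue"
```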
## Directory Structure

```
system_tests/profiling/
├── run_experiment.sh   # Main entry point
├── serve.sh            # View results
├── bin/                # Internal scripts
├── config/             # YAML configurations
├── experiments/        # Results + HTML viewer
└── reports/            # Individual reports
```
## Tips

- 💡 The HTML viewer auto-loads all JSON files in `experiments/`
- 💡 Naming format: `{provider}_{mode}_{timestamp}.json` or `{mode}_{timestamp}.json`
- 💡 CLI args override YAML config settings
- 💡 Use `{{timestamp}}` in output paths for unique files
- 💡 The retry mechanism handles Langfuse propagation delays
- 💡 Stop the server with Ctrl+C
For full documentation, see `README.md`.