AI Gateway Performance
Monitor API routing, response times, provider performance, and infrastructure health
| Metric | Value | Status |
|---|---|---|
| Avg Response Time | 124ms | -8ms improved |
| P95 Response Time | 89ms | -12ms improved |
| Error Rate | 0.15% | -0.05% improved |
| Uptime | 99.9% | Excellent |
[Chart: Response Time Distribution]
[Chart: Error Rate Monitoring]
Detailed Performance Metrics

| Metric | Value | Description |
|---|---|---|
| P50 Response Time | 45ms | 50% of requests complete within this time |
| P95 Response Time | 89ms | 95% of requests complete within this time |
| P99 Response Time | 156ms | 99% of requests complete within this time |
| Throughput | 1,247/min | Average requests processed per minute |
| Memory Usage | 2.3GB | Current memory consumption |
| CPU Utilization | 45% | Average CPU usage across instances |
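
The percentile figures above follow the usual definition: PN is the latency that N% of requests complete within. As a rough illustration, the sketch below computes percentiles with the nearest-rank method. The dashboard does not say which method it uses, so the method, the `percentile` helper, and the sample latencies are all assumptions made for this example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number ranks stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical per-request latencies in milliseconds (not real gateway data).
latencies_ms = [12, 15, 34, 45, 45, 67, 78, 89, 123, 156]

for pct in (50, 95, 99):
    print(f"P{pct}: {percentile(latencies_ms, pct)}ms")
```

Nearest-rank always returns a latency that was actually observed. Monitoring systems often interpolate between samples or estimate from histogram buckets instead, so their reported values can differ slightly.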
API Endpoints Performance

| Method | Endpoint | Description | Status | Avg Response | Success Rate | 24h Calls |
|---|---|---|---|---|---|---|
| POST | /v1/chat/completions | Generate chat completions | 200 | 245ms | 99.2% | 8,247 | 
| POST | /v1/embeddings | Create text embeddings | 200 | 123ms | 99.8% | 2,156 | 
| POST | /v1/chat/completions | Rate limited request | 429 | 12ms | 0% | 47 | 
| GET | /v1/models | List available models | 200 | 89ms | 100% | 892 | 
| POST | /v1/images/generations | Generate images from text | 200 | 2.4s | 97.1% | 234 | 
| POST | /v1/audio/transcriptions | Transcribe audio to text | 200 | 1.2s | 98.5% | 567 | 
| GET | /v1/usage | Get usage statistics | 200 | 67ms | 99.9% | 1,423 | 
| POST | /v1/fine-tuning/jobs | Create fine-tuning job | 201 | 456ms | 95.8% | 23 | 
| DELETE | /v1/files/{file_id} | Delete uploaded file | 204 | 34ms | 99.6% | 156 | 
| PATCH | /v1/assistants/{id} | Update assistant settings | 200 | 178ms | 98.9% | 89 | 
| GET | /v1/health | System health check | 200 | 15ms | 100% | 4,892 | 
| POST | /v1/moderations | Content moderation | 200 | 78ms | 99.4% | 1,234 | 
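
The per-endpoint rows can be rolled up into a single figure by weighting each success rate by its 24h call count. The sketch below does that arithmetic with the values copied from the table. The `ENDPOINT_STATS` structure and the helper names are hypothetical, and the source does not say whether the dashboard aggregates its headline numbers this way.

```python
# (endpoint label, success rate in percent, calls in the last 24h),
# copied from the table above. The 429 row is kept separate, as in the table.
ENDPOINT_STATS = [
    ("POST /v1/chat/completions", 99.2, 8247),
    ("POST /v1/embeddings", 99.8, 2156),
    ("POST /v1/chat/completions (429)", 0.0, 47),
    ("GET /v1/models", 100.0, 892),
    ("POST /v1/images/generations", 97.1, 234),
    ("POST /v1/audio/transcriptions", 98.5, 567),
    ("GET /v1/usage", 99.9, 1423),
    ("POST /v1/fine-tuning/jobs", 95.8, 23),
    ("DELETE /v1/files/{file_id}", 99.6, 156),
    ("PATCH /v1/assistants/{id}", 98.9, 89),
    ("GET /v1/health", 100.0, 4892),
    ("POST /v1/moderations", 99.4, 1234),
]


def total_calls(stats):
    """Sum the 24h call counts across all rows."""
    return sum(calls for _, _, calls in stats)


def weighted_success_rate(stats):
    """Success rate across all rows, weighted by each row's call count."""
    total = total_calls(stats)
    if total == 0:
        raise ValueError("no calls recorded")
    return sum(rate * calls for _, rate, calls in stats) / total


print(f"{total_calls(ENDPOINT_STATS):,} calls in 24h")
print(f"{weighted_success_rate(ENDPOINT_STATS):.2f}% call-weighted success rate")
```

Weighting by call count keeps low-traffic endpoints such as `/v1/fine-tuning/jobs` from skewing the aggregate the way a plain average of the Success Rate column would.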
AWS CloudWatch Performance Monitoring
Advanced performance metrics, alerting, and system health monitoring
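
As a minimal sketch of how gateway figures like these might reach CloudWatch, the example below builds a `MetricData` payload in the shape accepted by boto3's `put_metric_data`. The namespace `AIGateway/Performance`, the metric names, the `Gateway` dimension, and the `build_metric_data` helper are assumptions for illustration; the source does not describe how this integration is actually wired.

```python
from datetime import datetime, timezone


def build_metric_data(avg_ms, p95_ms, error_rate_pct, gateway="ai-gateway"):
    """Build a CloudWatch MetricData list from three gateway readings.

    Metric names and the dimension are hypothetical; adjust to your setup.
    """
    dimensions = [{"Name": "Gateway", "Value": gateway}]
    now = datetime.now(timezone.utc)

    def datum(name, value, unit):
        return {
            "MetricName": name,
            "Dimensions": dimensions,
            "Timestamp": now,
            "Value": float(value),
            "Unit": unit,
        }

    return [
        datum("AvgResponseTime", avg_ms, "Milliseconds"),
        datum("P95ResponseTime", p95_ms, "Milliseconds"),
        datum("ErrorRate", error_rate_pct, "Percent"),
    ]


# Values taken from the summary figures at the top of this page.
metric_data = build_metric_data(avg_ms=124, p95_ms=89, error_rate_pct=0.15)

# Publishing requires boto3 and AWS credentials, so it is left commented out:
# import boto3
# boto3.client("cloudwatch").put_metric_data(
#     Namespace="AIGateway/Performance",  # hypothetical namespace
#     MetricData=metric_data,
# )
```

Building the payload separately from the API call keeps the formatting logic testable without network access or credentials.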