Avg Response Time: 124ms (-8ms, improved)
P95 Response Time: 89ms (-12ms, improved)
Error Rate: 0.15% (-0.05%, improved)
Uptime: 99.9% (Excellent)
[Chart: Response Time Distribution]
[Chart: Error Rate Monitoring]
Detailed Performance Metrics
P50 Response Time: 45ms (50% of requests complete within this time)
P95 Response Time: 89ms (95% of requests complete within this time)
P99 Response Time: 156ms (99% of requests complete within this time)
Throughput: 1,247/min (average requests processed per minute)
Memory Usage: 2.3GB (current memory consumption)
CPU Utilization: 45% (average CPU usage across instances)
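
The P50, P95, and P99 figures above are percentiles: the value at or below which the stated share of requests completes. As a rough illustration only, the sketch below computes them with the nearest-rank method. The dashboard does not say how it derives its percentiles, so the method, the function name `percentile`, and the sample latencies are all assumptions made for this example.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it.

    Assumption: nearest-rank is used here for simplicity; a monitoring
    backend may interpolate or use approximate sketches instead.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical response times in milliseconds, not taken from the dashboard.
latencies_ms = [12, 15, 34, 45, 45, 67, 78, 89, 123, 156]

for p in (50, 95, 99):
    print(f"P{p}: {percentile(latencies_ms, p)}ms")
```

With only a handful of samples, P95 and P99 collapse onto the largest value, which is why high percentiles are only meaningful over a reasonably large window of requests.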
API Endpoints Performance
Method  Endpoint                  Description                Status  Avg Response  Success Rate  24h Calls
POST    /v1/chat/completions      Generate chat completions  200     245ms         99.2%         8,247
POST    /v1/embeddings            Create text embeddings     200     123ms         99.8%         2,156
POST    /v1/chat/completions      Rate limited request       429     12ms          0%            47
GET     /v1/models                List available models      200     89ms          100%          892
POST    /v1/images/generations    Generate images from text  200     2.4s          97.1%         234
POST    /v1/audio/transcriptions  Transcribe audio to text   200     1.2s          98.5%         567
GET     /v1/usage                 Get usage statistics       200     67ms          99.9%         1,423
POST    /v1/fine-tuning/jobs      Create fine-tuning job     201     456ms         95.8%         23
DELETE  /v1/files/{file_id}       Delete uploaded file       204     34ms          99.6%         156
PATCH   /v1/assistants/{id}       Update assistant settings  200     178ms         98.9%         89
GET     /v1/health                System health check        200     15ms          100%          4,892
POST    /v1/moderations           Content moderation         200     78ms          99.4%         1,234
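
The per-endpoint success rates in the table can be combined into a single figure by weighting each row by its 24h call count, so that busy endpoints count for more than rarely used ones. The sketch below is one possible way to do that and is not necessarily how the dashboard derives its overall error rate. The names `ROWS`, `total_calls`, and `weighted_success_rate` are hypothetical; the numbers are copied from the table.

```python
# (endpoint, status, success_rate_percent, calls_24h), copied from the table.
ROWS = [
    ("/v1/chat/completions", 200, 99.2, 8247),
    ("/v1/embeddings", 200, 99.8, 2156),
    ("/v1/chat/completions", 429, 0.0, 47),
    ("/v1/models", 200, 100.0, 892),
    ("/v1/images/generations", 200, 97.1, 234),
    ("/v1/audio/transcriptions", 200, 98.5, 567),
    ("/v1/usage", 200, 99.9, 1423),
    ("/v1/fine-tuning/jobs", 201, 95.8, 23),
    ("/v1/files/{file_id}", 204, 99.6, 156),
    ("/v1/assistants/{id}", 200, 98.9, 89),
    ("/v1/health", 200, 100.0, 4892),
    ("/v1/moderations", 200, 99.4, 1234),
]


def total_calls(rows):
    """Sum of the 24h call counts across all rows."""
    return sum(calls for _, _, _, calls in rows)


def weighted_success_rate(rows):
    """Call-weighted success rate in percent.

    Assumption: each row's success rate applies uniformly to all of
    that row's calls.
    """
    total = total_calls(rows)
    if total == 0:
        raise ValueError("no calls recorded")
    succeeded = sum(rate / 100 * calls for _, _, rate, calls in rows)
    return 100 * succeeded / total


print(f"Total calls (24h): {total_calls(ROWS)}")
print(f"Call-weighted success rate: {weighted_success_rate(ROWS):.2f}%")
```

A plain average of the twelve percentages would give the 47 rate-limited requests the same weight as the 8,247 chat completions, which is why the weighting matters here.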

AWS CloudWatch Performance Monitoring

Advanced performance metrics, alerting, and system health monitoring