Requests are now scheduled by each key's RPM window first, then by its current in-flight load. `Acquire Wait` is the maximum time a request waits for a slot before failing fast.
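The ordering described above can be illustrated with a minimal sketch. It is not the actual scheduler: `KeyState`, `pick_key`, `acquire`, the 60-second window, and the per-key `rpm_limit` and `max_in_flight` fields are all hypothetical names and assumptions chosen for illustration. It prefers the key with the lowest RPM window usage, breaks ties by in-flight load, and raises once the wait budget is exhausted.

```python
import time
from dataclasses import dataclass, field

WINDOW_SECONDS = 60.0  # assumed length of the RPM window


@dataclass
class KeyState:
    name: str
    rpm_limit: int        # hypothetical per-key requests-per-minute cap
    max_in_flight: int    # hypothetical per-key concurrency cap
    in_flight: int = 0
    recent: list = field(default_factory=list)  # timestamps of recent requests

    def used_in_window(self, now: float) -> int:
        # Drop timestamps that have left the window, then count the rest.
        self.recent = [t for t in self.recent if now - t < WINDOW_SECONDS]
        return len(self.recent)


def pick_key(keys, now):
    """Return the best available key, or None if every key is saturated."""
    candidates = [
        k for k in keys
        if k.used_in_window(now) < k.rpm_limit and k.in_flight < k.max_in_flight
    ]
    if not candidates:
        return None
    # Compare RPM window usage first, then current in-flight load.
    return min(
        candidates,
        key=lambda k: (k.used_in_window(now) / k.rpm_limit, k.in_flight),
    )


def acquire(keys, acquire_wait, poll=0.05, clock=time.monotonic, sleep=time.sleep):
    """Wait up to acquire_wait seconds for a slot, then fail fast."""
    deadline = clock() + acquire_wait
    while True:
        now = clock()
        key = pick_key(keys, now)
        if key is not None:
            key.recent.append(now)
            key.in_flight += 1
            return key
        if now >= deadline:
            raise TimeoutError("no key slot available within Acquire Wait")
        sleep(min(poll, deadline - now))
```

A caller would decrement `in_flight` when the upstream request finishes. Passing `acquire_wait=0` makes the call fail immediately whenever no key has capacity.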
Key Health Check
Uses a minimal upstream chat completion probe. A `403` response is treated as an invalid or banned key, while timeouts and network errors are recorded only as probe failures.
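As a rough sketch of that classification, the function below maps a probe outcome to a result. `ProbeResult` and `classify_probe` are hypothetical names. The handling of statuses other than `403` and `2xx` is an assumption, since the text above does not specify it.

```python
from enum import Enum
from typing import Optional


class ProbeResult(Enum):
    HEALTHY = "healthy"
    INVALID_OR_BANNED = "invalid_or_banned"
    PROBE_FAILED = "probe_failed"


def classify_probe(status: Optional[int], error: Optional[Exception] = None) -> ProbeResult:
    """Classify one health probe; status is None when no response arrived."""
    # Timeouts and network errors are recorded as probe failures only.
    if error is not None or status is None:
        return ProbeResult.PROBE_FAILED
    # A 403 marks the key as invalid or banned.
    if status == 403:
        return ProbeResult.INVALID_OR_BANNED
    if 200 <= status < 300:
        return ProbeResult.HEALTHY
    # Assumption: any other status is inconclusive and counted as a probe failure.
    return ProbeResult.PROBE_FAILED
```

Keeping transport errors separate from `403` means a flaky network does not cause a working key to be flagged as banned.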
Add Key
Add Model Alias
Enable this for vision or multimodal target models.