Available on the Enterprise Self Hosting plan only. Requires Gateway version 1.17.0 or higher.
Policy Types
1. Usage Limit Policies
Control spend (cost) or token consumption with configurable limits and periodic resets.
2. Rate Limit Policies
Control request throughput with requests-per-minute (rpm), requests-per-hour (rph), or requests-per-day (rpd) limits.
Policy Structure
Conditions
Conditions determine which requests a policy applies to. All conditions must match (AND logic). Each condition supports:

| Field | Type | Description |
|---|---|---|
| key | string | The dimension to match against |
| value | string or string[] | Value(s) to match (OR logic for arrays). Use "*" for wildcard. |
| excludes | string or string[] | Value(s) to exclude from matching |
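The matching rules above can be sketched in a few lines of Python. This is an illustrative model of the documented semantics, not the Gateway's implementation: the function names are hypothetical, and two behaviours are assumptions rather than documented facts (a request that lacks the dimension never matches, and excludes takes precedence over value).

```python
from typing import Dict, List, Optional, Sequence, Union

Values = Union[str, Sequence[str]]


def as_list(values: Optional[Values]) -> List[str]:
    """Normalise a string-or-array field to a list of strings."""
    if values is None:
        return []
    return [values] if isinstance(values, str) else list(values)


def condition_matches(condition: Dict, attributes: Dict[str, str]) -> bool:
    """Check one condition against the attributes of a single request."""
    actual = attributes.get(condition["key"])
    if actual is None:
        return False  # assumption: a missing dimension never matches
    if actual in as_list(condition.get("excludes")):
        return False  # assumption: excludes wins over value
    allowed = as_list(condition["value"])
    return "*" in allowed or actual in allowed  # OR logic within the array


def policy_applies(conditions: List[Dict], attributes: Dict[str, str]) -> bool:
    """A policy applies only when every condition matches (AND logic)."""
    return all(condition_matches(c, attributes) for c in conditions)


conditions = [
    {"key": "provider", "value": ["openai", "anthropic"]},
    {"key": "api_key", "value": "*", "excludes": ["internal-key-id"]},
]
request = {"provider": "openai", "api_key": "key-1"}
print(policy_applies(conditions, request))
```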
Supported Condition Keys
| Key | Description | Example Value | Version |
|---|---|---|---|
| api_key | Match by API key ID | "uuid_of_api_key" | 1.17.0+ |
| metadata.* | Match by request metadata | "metadata._user" with value "user123" | 1.17.0+ |
| workspace_id | Match by workspace | "workspace id" | 1.17.0+ |
| virtual_key | Match by virtual key slug | "virtual key slug" | 2.0.0+ |
| provider | Match by provider | "openai", "anthropic", "azure-openai" | 2.0.0+ |
| config | Match by config slug | "config slug" | 2.0.0+ |
| prompt | Match by prompt slug | "prompt slug" | 2.0.0+ |
| model | Match by model (with wildcard support) | "@openai/gpt-4o", "@anthropic/*" | 2.0.0+ |
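As a rough client-side aid, the table above can be turned into a pre-flight check that a condition key is available on a given Gateway version, plus a wildcard test for model values. The helper names are hypothetical, and treating the model wildcard as a shell-style glob (via Python's fnmatch) is an assumption; the Gateway's own matching may differ.

```python
from fnmatch import fnmatchcase
from typing import Tuple

# Minimum Gateway version per condition key, copied from the table above.
MIN_VERSION = {
    "api_key": (1, 17, 0),
    "workspace_id": (1, 17, 0),
    "virtual_key": (2, 0, 0),
    "provider": (2, 0, 0),
    "config": (2, 0, 0),
    "prompt": (2, 0, 0),
    "model": (2, 0, 0),
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "2.0.0" into (2, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def key_supported(key: str, gateway_version: str) -> bool:
    """Return True if the condition key exists on this Gateway version."""
    if key.startswith("metadata."):
        required = (1, 17, 0)  # metadata.* is listed as 1.17.0+
    else:
        required = MIN_VERSION.get(key)
        if required is None:
            return False  # not a documented condition key
    return parse_version(gateway_version) >= required


def model_matches(pattern: str, model: str) -> bool:
    """Assumption: "@provider/*" behaves like a case-sensitive glob."""
    return fnmatchcase(model, pattern)


print(key_supported("model", "1.17.0"))
print(model_matches("@anthropic/*", "@anthropic/some-model"))
```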
Group By
Group by determines how usage/limits are bucketed. Each unique combination of group_by values gets its own counter. Example: grouping by metadata._user and model keeps one counter per (user, model) pair.
Authentication
All policy endpoints require authentication using:
- API Key: Include in the x-portkey-api-key header
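A minimal sketch of attaching the header with Python's standard library follows. Only the x-portkey-api-key header name comes from this page. The helper name, the example URL, the JSON body, and the use of POST are placeholders chosen for illustration, so substitute the real endpoint and method for your deployment. The request is built but not sent.

```python
import json
import urllib.request


def build_policy_request(url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # assumption: the HTTP method depends on the endpoint
        headers={
            "x-portkey-api-key": api_key,  # header name from this page
            "Content-Type": "application/json",
        },
    )


# Hypothetical URL; the real base URL is deployment specific.
req = build_policy_request(
    "https://gateway.example.invalid/your-policy-endpoint",
    "YOUR_PORTKEY_API_KEY",
    {"type": "requests", "unit": "rpm", "value": 100},
)
print(req.get_method(), req.full_url)
```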
Permissions
Policies require the following RBAC permissions:
- policies:create - Create policies
- policies:read - Read policies
- policies:update - Update policies
- policies:delete - Delete policies
- policies:list - List policies
Base URL
All policy endpoints are under:
Usage Limits Policies
Usage limits policies allow you to set maximum usage (cost or tokens) that can be consumed over a period. When the limit is reached, requests will be blocked until the limit resets.
When a usage limit is exceeded, Portkey returns a 412 Precondition Failed HTTP status code.
Policy Types
- cost: Limit based on total cost (in dollars)
- tokens: Limit based on total tokens consumed
Parameters
credit_limit (required)
The maximum usage limit that can be consumed before requests are blocked.
- Type: Number (integer or float)
- Minimum value:
  - For cost type: 1 (represents $1.00)
  - For tokens type: 100 tokens
- Units:
  - For cost type: Value is in USD (dollars)
  - For tokens type: Value is in tokens
- Behaviour: When the credit limit is reached, all matching requests will be blocked until the limit resets (if periodic_reset is configured)
alert_threshold (optional)
An optional threshold that triggers notifications before the credit limit is reached.
- Type: Number (integer or float)
- Minimum value: 1
- Units:
  - For cost type: Value is in USD (dollars)
  - For tokens type: Value is in tokens
- Validation: Must be less than credit_limit if provided
- Behaviour:
  - When usage reaches this threshold, email notifications are sent to configured recipients
  - The API key continues to function normally until the full credit_limit is reached
  - Useful for proactive monitoring and budget management
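The relationship between alert_threshold and credit_limit can be pictured as a three-state check. The sketch below is illustrative only: usage_state is a hypothetical helper, and it assumes a limit counts as reached once usage is greater than or equal to it.

```python
from typing import Optional


def usage_state(used: float, credit_limit: float,
                alert_threshold: Optional[float] = None) -> str:
    """Classify current usage against the configured limits."""
    if used >= credit_limit:
        return "blocked"  # matching requests are rejected until a reset
    if alert_threshold is not None and used >= alert_threshold:
        return "alert"    # notifications go out, requests still succeed
    return "ok"


# A $50 budget with an alert at $40.
for spent in (10.0, 42.5, 50.0):
    print(spent, usage_state(spent, credit_limit=50, alert_threshold=40))
```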
periodic_reset (optional)
Configures automatic reset of the usage limit at regular intervals.
- Type: String (enum)
- Valid values:
  - "weekly" - Budget limits automatically reset every week
  - "monthly" - Budget limits automatically reset every month
  - Omitted/not provided - No periodic reset (limit applies until exhausted)
- Reset timing:
  - Weekly: Resets occur every Monday at 12:00 AM UTC
  - Monthly: Resets occur on the 1st calendar day of each month at 12:00 AM UTC
- Behaviour: When a reset occurs, the usage counter resets to zero and the limit becomes available again
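To make the reset timing concrete, the following sketch computes the next reset instant from the rules above (Monday 12:00 AM UTC for weekly, the 1st of the month at 12:00 AM UTC for monthly). The function name next_reset is hypothetical, and the input is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_reset(now: datetime, periodic_reset: Optional[str]) -> Optional[datetime]:
    """Return the next reset instant (UTC), or None when no reset is set."""
    if periodic_reset is None:
        return None
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if periodic_reset == "weekly":
        # weekday() is 0 on Monday, so this always lands on the next Monday.
        return midnight + timedelta(days=7 - midnight.weekday())
    if periodic_reset == "monthly":
        if midnight.month == 12:
            return midnight.replace(year=midnight.year + 1, month=1, day=1)
        return midnight.replace(month=midnight.month + 1, day=1)
    raise ValueError(f"unsupported periodic_reset: {periodic_reset!r}")


now = datetime(2024, 5, 15, 13, 30, tzinfo=timezone.utc)  # a Wednesday
print(next_reset(now, "weekly"))   # 2024-05-20 00:00:00+00:00
print(next_reset(now, "monthly"))  # 2024-06-01 00:00:00+00:00
```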
Validation Rules
- Conditions: Each condition must have key and value fields.
- Group By: Each group must have a key field.
- Valid Keys: For both conditions and group_by, valid keys include api_key, workspace_id, virtual_key, provider, config, prompt, model, or any key starting with "metadata.".
- Alert Threshold: Must be less than credit_limit if provided.
- Workspace: Workspace ID is required (can be provided via API key or request body).
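These rules lend themselves to a client-side check before a policy is submitted. The sketch below is a hypothetical validator, not the server's logic. It assumes the policy is a plain dictionary whose usage type lives in a "type" field (the field name is an assumption for usage limit policies), and it skips the workspace rule because the workspace may come from the API key.

```python
from typing import Dict, List

BASE_KEYS = {"api_key", "workspace_id", "virtual_key", "provider",
             "config", "prompt", "model"}
MIN_CREDIT_LIMIT = {"cost": 1, "tokens": 100}  # minimums from this page


def is_number(value) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_valid_key(key) -> bool:
    return isinstance(key, str) and (key in BASE_KEYS or key.startswith("metadata."))


def validate_usage_policy(policy: Dict) -> List[str]:
    """Return a list of human-readable problems; empty means it looks valid."""
    errors = []
    kind = policy.get("type")  # assumption: field name for cost/tokens
    if kind not in MIN_CREDIT_LIMIT:
        errors.append('type must be "cost" or "tokens"')
    limit = policy.get("credit_limit")
    if not is_number(limit):
        errors.append("credit_limit is required and must be a number")
    elif kind in MIN_CREDIT_LIMIT and limit < MIN_CREDIT_LIMIT[kind]:
        errors.append(f"credit_limit must be at least {MIN_CREDIT_LIMIT[kind]}")
    alert = policy.get("alert_threshold")
    if alert is not None:
        if not is_number(alert) or alert < 1:
            errors.append("alert_threshold must be a number >= 1")
        elif is_number(limit) and alert >= limit:
            errors.append("alert_threshold must be less than credit_limit")
    if policy.get("periodic_reset") not in (None, "weekly", "monthly"):
        errors.append('periodic_reset must be "weekly", "monthly", or omitted')
    for cond in policy.get("conditions", []):
        if "key" not in cond or "value" not in cond:
            errors.append("each condition needs key and value")
        elif not is_valid_key(cond["key"]):
            errors.append(f"invalid condition key: {cond['key']}")
    for group in policy.get("group_by", []):
        if not is_valid_key(group.get("key")):
            errors.append(f"invalid group_by key: {group.get('key')}")
    return errors


policy = {
    "type": "cost",
    "credit_limit": 50,
    "alert_threshold": 40,
    "periodic_reset": "monthly",
    "conditions": [{"key": "provider", "value": "openai"}],
    "group_by": [{"key": "metadata._user"}],
}
print(validate_usage_policy(policy))  # []
```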
Rate Limits Policies
Rate limits policies allow you to control the rate of requests or tokens consumed per minute, hour, or day. When the rate limit is exceeded, requests will be throttled.
When a rate limit is exceeded, Portkey returns a 429 Too Many Requests HTTP status code.
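Because the two policy kinds surface as different status codes (412 for an exhausted usage limit, 429 for an exceeded rate limit), a client may want to treat them differently. The sketch below shows one common pattern, offered as an assumption rather than Portkey guidance: retry 429 with exponential backoff, and return 412 immediately since waiting a few seconds will not restore a budget. The send callable and its (status, body) return shape are hypothetical.

```python
import time
from typing import Callable, Tuple


def call_with_limit_handling(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry on 429 with exponential backoff; return anything else as-is."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        is_last = attempt == max_attempts - 1
        if status == 429 and not is_last:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            continue
        # 412 (usage limit exhausted) and every other status end the loop.
        return status, body
    raise RuntimeError("unreachable")


# Demo with a fake transport: two throttled replies, then success.
replies = iter([(429, "slow down"), (429, "slow down"), (200, "ok")])
print(call_with_limit_handling(lambda: next(replies), sleep=lambda s: None))
```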
Policy Types
- requests: Limit based on number of requests
- tokens: Limit based on number of tokens
Rate Units
- rpm: Requests/Tokens per minute
- rph: Requests/Tokens per hour
- rpd: Requests/Tokens per day
Parameters
type (required)
The type of rate limit to enforce.
- Type: String (enum)
- Valid values:
  - "requests" - Limit based on number of API requests
  - "tokens" - Limit based on number of tokens consumed
- Behaviour: Determines what metric is being rate-limited
unit (required)
The time interval unit for the rate limit.
- Type: String (enum)
- Valid values:
  - "rpm" - Requests/Tokens per minute
  - "rph" - Requests/Tokens per hour
  - "rpd" - Requests/Tokens per day
- Behaviour:
  - Defines the time window over which the rate limit is calculated
  - Limits reset automatically at the start of each time period
value (required)
The maximum number of requests or tokens allowed within the specified time unit.
- Type: Number (integer)
- Minimum value: 1
- Units:
  - For requests type: Value represents the number of API requests
  - For tokens type: Value represents the number of tokens
- Behaviour: When the rate limit is exceeded, subsequent requests are throttled/rejected until the time period resets
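One way to picture how type, unit, and value interact is a fixed-window counter. The sketch below is a teaching model under stated assumptions, not a description of the Gateway's algorithm: windows are aligned to the Unix epoch, the FixedWindowLimiter class is hypothetical, and amount stands for 1 request or a token count depending on the policy type.

```python
from collections import defaultdict
from typing import Dict, Tuple

WINDOW_SECONDS = {"rpm": 60, "rph": 3600, "rpd": 86400}


class FixedWindowLimiter:
    """Count usage per (bucket, window) and reject once value is exceeded."""

    def __init__(self, unit: str, value: int):
        self.window = WINDOW_SECONDS[unit]
        self.limit = value
        self.counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, bucket: str, now: float, amount: int = 1) -> bool:
        # A new window index means a fresh counter, i.e. an automatic reset.
        key = (bucket, int(now // self.window))
        if self.counts[key] + amount > self.limit:
            return False  # the caller would answer with HTTP 429
        self.counts[key] += amount
        return True


limiter = FixedWindowLimiter(unit="rpm", value=2)
print(limiter.allow("user-a", now=0))    # True
print(limiter.allow("user-a", now=10))   # True
print(limiter.allow("user-a", now=20))   # False, window is full
print(limiter.allow("user-a", now=60))   # True, next minute
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real caller would supply time.time().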
Validation Rules
- Conditions: Each condition must have key and value fields.
- Group By: Each group must have a key field.
- Valid Keys: For both conditions and group_by, valid keys include api_key, workspace_id, virtual_key, provider, config, prompt, model, or any key starting with "metadata.".
- Value: Must be a numeric value.
- Workspace: Workspace ID is required (can be provided via API key or request body).
Use Cases
Use Case 1: Global Workspace Rate Limit
Limit all requests in a workspace to 1000 requests per minute.
Use Case 2: Per User Rate Limit
Limit each user (identified by _user metadata) to 100 requests per minute.
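A per-user limit like this one might be expressed as shown below. The field names (type, unit, value, conditions, group_by) come from the parameter descriptions on this page, but the overall body shape, the placeholder workspace ID, and the bucket_key helper are assumptions made for illustration. The helper shows how group_by yields one counter per distinct user.

```python
from typing import Dict, Optional, Tuple

# Assumed payload shape; "workspace-id-placeholder" is not a real ID.
per_user_policy = {
    "type": "requests",
    "unit": "rpm",
    "value": 100,
    "conditions": [{"key": "workspace_id", "value": "workspace-id-placeholder"}],
    "group_by": [{"key": "metadata._user"}],
}


def bucket_key(policy: Dict, attributes: Dict[str, str]) -> Tuple[Optional[str], ...]:
    """One counter per unique combination of group_by values."""
    return tuple(attributes.get(group["key"]) for group in policy["group_by"])


alice = {"workspace_id": "workspace-id-placeholder", "metadata._user": "alice"}
bob = {"workspace_id": "workspace-id-placeholder", "metadata._user": "bob"}
print(bucket_key(per_user_policy, alice))  # ('alice',)
print(bucket_key(per_user_policy, bob))    # ('bob',)
```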
Use Case 3: Per User Monthly Spend Budget
Limit each user to $50/month in API costs.
Use Case 4: Provider Specific Rate Limit
Limit OpenAI requests to 500 RPM, separate from other providers.
Use Case 5: Model Specific Token Rate Limit
Limit GPT-4o to 100,000 tokens per minute.
Use Case 6: Limit All Models from a Provider (Wildcard)
Limit all Anthropic models to 50,000 tokens per minute.
Use Case 7: Per Virtual Key Weekly Budget
Track and limit spend per virtual key to $100/week.
Use Case 8: Config Specific Rate Limit
Limit requests using a specific gateway config to 200 RPM.
Use Case 9: Prompt Specific Usage Budget
Limit a specific prompt template to 1 million tokens per month.
Use Case 10: Multiple Allowed Values (OR Logic)
Allow only specific API keys to use expensive models, with a combined rate limit.
Use Case 11: Exclude Specific Models
Apply rate limit to all OpenAI models EXCEPT GPT-4o.
Use Case 12: Per User, Per Model Budget
Track spend separately for each user AND model combination.
Use Case 13: Team Based Provider Quota
Limit each team (from metadata) to specific token quotas per provider.
Use Case 14: Exclude Internal API Keys from Limits
Apply rate limits to all API keys except internal ones.
Use Case 15: Combined Conditions - Premium Users on Specific Models
Rate limit premium tier users only when using expensive models.
Important Notes
- Condition Matching: All conditions must match (AND). Within a condition, multiple values use OR logic.
- Model Format: Models are specified as @provider/model-name. Use @provider/* for wildcard matching.
- Metadata Keys: Use the "metadata." prefix followed by your metadata key name (e.g., metadata._user, metadata._team).
- Periodic Reset Options: "weekly", "monthly", or null for no reset.
- Rate Limit Units: "rpm" (per minute), "rph" (per hour), "rpd" (per day).
- Usage Limit Types: "cost" (in dollars) or "tokens".

