
What It Is

Token limits and budgets is the section of the Admin Panel where administrators see and control workspace spend on AI usage. It opens under Admin Panel → Token Limits and is the only place in PANTA OS where token related configuration happens. The page has three sections: Spending & Budget (the workspace level view), Model limits (optional per model cost caps), and Spend by model (a breakdown of spend by model in the current cycle).
This page handles the day to day administration of budgets. For the strategic overview of how PANTA OS approaches token management, see Token Management.

Why It Matters

AI cost grows with usage. Without limits, the growth is invisible until the bill arrives. The Token Limits tab gives administrators a fixed envelope to work inside, plus the optional tools to shape consumption by model.

Predictability

The plan budget is fixed by your PANTA OS plan, so spend cannot exceed the agreed envelope by surprise.

Visibility per cycle

The page shows what has been spent in the current billing cycle, against what is allowed, with a progress indicator.

Per model control

Optional Model limits cap consumption on individual models. Useful when one expensive model is driving the bill.

Automatic fallback

When a model limit is reached, requests can route automatically to a cheaper model so work continues without manual intervention.

How To Use It

Open the Admin Panel

Click Admin at the bottom of the sidebar. Visible to administrators only.

Open Token Limits

Switch to the Token Limits tab in the Admin Panel tab bar.

Read the current cycle status

The Spending & Budget section at the top shows the active billing period, the amount spent so far in the cycle, the plan budget, and the overage allowance.

Decide on automatic fallback

Turn on “Automatically switch to a cheaper model when a model limit is reached” to keep work flowing once a capped model hits its limit. Click Save to apply the change.

Set per model limits if needed

In the Model limits section, enter a euro limit and activate the toggle for any specific model whose consumption you want to cap.

Review actual model spend

The Spend by model section breaks down the current cycle by model, in euros and as a percentage of total spend.

Spending & Budget

The top section is the workspace level overview. It is the single source of truth for cycle spend.

Current billing period

The active billing cycle, shown as a date range at the top right of the section.

Spent in this cycle

The amount actually consumed so far in the cycle, in euros.

Plan budget

The workspace budget for the cycle, set by the PANTA OS plan. Read only; the lock icon indicates that this value is not editable in the UI.

Overage limit

An additional overage allowance on top of the plan budget. Also read only; the lock icon indicates that this value is not editable in the UI.

Progress bar

A visual indicator of cycle spend, showing the ratio of spent to allowed and the same figure as a percentage. The status line below the bar (for example “Within plan budget”) shows whether you are still inside the plan budget or have moved into the overage allowance.

Resets monthly

The billing cycle resets monthly on the calendar date your plan started.
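
The relationship between the values in this section can be sketched in a few lines of Python. This is only an illustration, not PANTA OS code: the CycleBudget class and both function names are hypothetical, the percentage is assumed to be measured against the plan budget (the page only says "spent over allowed"), and the two status labels other than “Within plan budget” are invented placeholders.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CycleBudget:
    """Hypothetical model of the values shown in Spending & Budget (euros)."""
    spent: Decimal          # "Spent in this cycle"
    plan_budget: Decimal    # read only, set by the plan
    overage_limit: Decimal  # read only allowance on top of the plan budget


def progress_percent(budget: CycleBudget) -> Decimal:
    """Spend as a percentage of the plan budget (assumed denominator)."""
    if budget.plan_budget <= 0:
        raise ValueError("plan budget must be positive")
    return (budget.spent / budget.plan_budget * 100).quantize(Decimal("0.1"))


def budget_status(budget: CycleBudget) -> str:
    """Classify the cycle; only the first label appears in the documented UI."""
    if budget.spent <= budget.plan_budget:
        return "Within plan budget"
    if budget.spent <= budget.plan_budget + budget.overage_limit:
        return "In overage allowance"  # hypothetical label
    return "Budget exhausted"          # hypothetical label
```

For example, a workspace that has spent 250 of a 1000 euro plan budget sits at 25.0 percent and is within plan budget; at 1100 euros with a 200 euro overage limit it would be in the overage allowance.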

The automatic fallback toggle

Below the budget overview sits a single toggle: “Automatically switch to a cheaper model when a model limit is reached”.

Automatic fallback

When a model limit is reached and this toggle is on, PANTA OS routes new requests to a cheaper model automatically. Users keep working without seeing an error. When the toggle is off, requests on a capped model fail once the limit is reached. Click Save at the right of the section to apply the change.
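
The behaviour described above can be read as a small decision procedure. The Python sketch below is an assumption-laden illustration, not the platform's implementation: ModelState, price_rank, and route are hypothetical names, and the page does not say how "cheaper" is determined or which cheaper model is chosen. Here a lower price_rank means cheaper, and the closest cheaper model that has not reached its own limit is picked.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ModelState:
    """Hypothetical per model state for the current cycle."""
    name: str
    price_rank: int                        # assumption: lower means cheaper
    spent_eur: Decimal
    limit_eur: Optional[Decimal] = None    # None means no model specific cap
    enabled: bool = False                  # the Enabled toggle

    def limit_reached(self) -> bool:
        return (
            self.enabled
            and self.limit_eur is not None
            and self.spent_eur >= self.limit_eur
        )


def route(requested: ModelState, models: list[ModelState], fallback_on: bool) -> Optional[str]:
    """Return the name of the model that serves the request, or None if it fails."""
    if not requested.limit_reached():
        return requested.name
    if not fallback_on:
        return None  # toggle off: requests on the capped model fail
    cheaper = [
        m for m in models
        if m.price_rank < requested.price_rank and not m.limit_reached()
    ]
    if not cheaper:
        return None  # no cheaper model available: the request is blocked
    # Assumption: prefer the cheaper model closest in price to the requested one.
    return max(cheaper, key=lambda m: m.price_rank).name
```

With three models ranked 3, 2, and 1, a request for the capped rank 3 model is served by the rank 2 model when the toggle is on, and returns None when it is off.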

Model Limits

The Model limits section sets optional caps per model. It is described in the UI as: “Optional: set a euro limit per cycle for a specific model. When the limit is reached, requests are automatically routed to a cheaper model (or blocked if none is available). Leave empty to use the shared org budget pool.”
The table lists every available model in the workspace with four columns:

Model

The model name (for example Claude Sonnet 4.5, GPT-4o, GPT-5, GPT-5 Mini, GPT-5.4 Mini).

In this cycle

The amount the workspace has spent on this model in the current cycle, in euros.

Limit

An input field for the euro limit per cycle for this model. Leave empty to use the shared workspace budget pool without a model specific cap.

Enabled

Toggle that activates the model limit. Use it together with the Limit value.
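
One way to read the four columns together is as a row whose cap only takes effect when both a Limit value and the Enabled toggle are set. The Python sketch below illustrates that reading; ModelLimitRow, effective_cap, and remaining are hypothetical names, and treating a limit with the toggle off as "no cap" is an assumption drawn from the column descriptions, not confirmed behaviour.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ModelLimitRow:
    """Hypothetical representation of one row in the Model limits table."""
    model: str
    in_this_cycle: Decimal             # spend on this model so far, in euros
    limit: Optional[Decimal] = None    # empty field means shared budget pool
    enabled: bool = False


def effective_cap(row: ModelLimitRow) -> Optional[Decimal]:
    """Return the active cap, or None when the model uses the shared pool."""
    if row.enabled and row.limit is not None:
        return row.limit
    return None


def remaining(row: ModelLimitRow) -> Optional[Decimal]:
    """Euros left under the cap (never negative), or None if uncapped."""
    cap = effective_cap(row)
    if cap is None:
        return None
    return max(cap - row.in_this_cycle, Decimal("0"))
```

A row with 30 euros spent against an enabled 100 euro limit has 70 euros remaining; the same row with the toggle off has no effective cap.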

Spend By Model (This Cycle)

The bottom section breaks down actual cycle spend by model.

Per model spend

Each model that has produced consumption in the cycle is listed with its euro amount and its percentage of total cycle spend.

Use it to find cost drivers

Models at the top of the list are the largest contributors to your bill. Compare against the Model limits table to decide which models should be capped.
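
If you export or copy the per model figures, the same breakdown is easy to reproduce. The Python sketch below assumes a plain mapping of model name to euros spent; spend_breakdown is a hypothetical helper and the model names in the example are placeholders, not real figures.

```python
from decimal import Decimal


def spend_breakdown(spend_by_model: dict[str, Decimal]) -> list[tuple[str, Decimal, Decimal]]:
    """Return (model, euros, percent of total) rows, largest spender first."""
    # Only models that produced consumption in the cycle are listed.
    used = {model: eur for model, eur in spend_by_model.items() if eur > 0}
    total = sum(used.values(), Decimal("0"))
    rows = [
        (model, eur, (eur / total * 100).quantize(Decimal("0.1")))
        for model, eur in used.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


cycle_spend = {
    "model-a": Decimal("60"),
    "model-b": Decimal("140"),
    "model-c": Decimal("0"),
}
for model, eur, pct in spend_breakdown(cycle_spend):
    print(f"{model}: {eur} EUR ({pct}%)")
```

The first rows of the result are the candidates to compare against the Model limits table.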

Key Settings or Options

Plan budget

Workspace cycle budget. Set by PANTA OS plan; not editable in the UI.

Overage limit

Additional overage allowance. Set by PANTA OS plan; not editable in the UI.

Automatic fallback toggle

Single workspace level toggle to route requests to a cheaper model when a model limit is reached.

Per model euro caps

Optional cap per model. Leave empty to use the shared workspace pool.

Monthly reset

The billing cycle resets monthly on the calendar date your plan started.

Per model spend breakdown

Euro and percentage per model in the current cycle, for direct visibility on cost drivers.
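
The monthly reset listed above can be estimated for planning purposes. The Python sketch below is a rough illustration under stated assumptions: next_reset is a hypothetical helper, the reset is taken to fall on the same day of the month as the plan start, and for shorter months the day is clamped to the last day of the month. The page does not document that edge case, so check the billing period shown in Spending & Budget for the authoritative dates.

```python
import calendar
from datetime import date


def next_reset(plan_start: date, today: date) -> date:
    """Return the first assumed reset date strictly after `today`."""

    def reset_in(year: int, month: int) -> date:
        # Assumption: clamp to the last day when the month is too short.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, min(plan_start.day, last_day))

    candidate = reset_in(today.year, today.month)
    if candidate > today:
        return candidate
    # Otherwise the next reset falls in the following month.
    if today.month == 12:
        return reset_in(today.year + 1, 1)
    return reset_in(today.year, today.month + 1)
```

Under these assumptions, a plan started on January 31 would next reset on February 28 when checked on February 10, 2025.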

Tips and Best Practices

  • Keep automatic fallback on by default. It is the difference between users hitting an error and users continuing to work. The platform only routes to a cheaper model that is still allowed, and blocks the request if none is available.
  • Cap only the models that drive cost. Use Spend by model to identify them; setting limits on cheap models adds friction without saving money.
  • Watch the progress bar near the end of the cycle. If you are close to the plan budget, expect the overage allowance to be touched in the last days.
  • Review the page weekly during rollout. Spotting an unexpected model spike early is cheaper than discovering it later.
  • If the plan budget feels tight, talk to your PANTA OS account contact; it cannot be changed in the UI. Plan budget and Overage limit are plan level decisions.
Token limits in PANTA OS are workspace and per model only. There are no per team, per user, or per assistant token caps. Spend control happens through the plan envelope, model limits, and the automatic fallback.

Help Center

To find the token settings, open the Admin Panel from the bottom of the sidebar, then switch to the Token Limits tab. This is the only place in PANTA OS where token configuration happens.
The plan budget and overage limit cannot be changed in the UI. Both are set by your PANTA OS plan and are read only (indicated by the lock icon). To raise either value, speak to your PANTA OS account contact about a plan change.
Token limits cannot be set per team, per user, or per assistant. They exist only at the workspace level (plan budget plus overage) and the per model level.
When a model limit is reached and the automatic fallback toggle is on, new requests route to a cheaper available model and work continues. If the toggle is off, requests on the capped model fail until the cycle resets.
When the plan budget and the overage allowance are both used up, the workspace limit is reached and new requests are blocked. To continue working in the same cycle, the plan envelope has to be raised with your PANTA OS account contact.
The billing cycle resets monthly on the calendar date your plan started.
To see what each model has cost, scroll to Spend by model at the bottom of the Token Limits tab. Each model that produced consumption in the current cycle is listed with its euro amount and percentage of total spend.
Last modified on June 5, 2026