
What It Is

Token Management is the way PANTA OS keeps AI spend predictable. The concept has three components: a workspace level budget (the plan budget plus an overage limit) set by the PANTA OS plan, optional per model caps configured by administrators, and an automatic fallback that routes requests to a cheaper model when a model limit is reached. A second tab, Analytics, complements Token limits with a usage dashboard for assistants and apps plus a CSV export. This page is the strategic overview. For the day to day configuration, see Token limits and budgets in the administration section.

Why It Matters

AI cost grows with adoption. Without a structure, that growth stays invisible until the next invoice. With one, leadership has a predictable envelope, administrators have the tools to shape consumption, and users can keep working without manual intervention when a limit is reached.

Predictable spend

The plan budget is fixed by your PANTA OS plan. Spend cannot exceed the agreed envelope by surprise.

Per model control

Optional caps per model let administrators slow down consumption on individual models, especially the most expensive ones.

Continuity for users

Automatic fallback routes requests to a cheaper model when a model limit is reached, so users do not see an error mid task.

Visibility in one place

The Token limits tab in the Admin Panel shows current cycle spend, per model breakdown, and the active billing period at a glance.

The Three Components

Plan budget

The workspace cycle budget set by the PANTA OS plan. Read only in the UI; changes happen through the PANTA OS account contact.

Overage limit

An additional allowance on top of the plan budget. Also plan level and read only in the UI.

Per model limits

Optional Euro caps per model per cycle, set by administrators. Useful when one model is driving the bill.

Automatic fallback

A workspace level toggle that routes requests to a cheaper model when a model limit is reached. Keeps work flowing without manual intervention.

How It Works In Practice

The plan defines the envelope

The plan budget and overage limit come with your PANTA OS plan. Administrators see the values in the Admin Panel under Token limits but do not edit them in the UI. Overage limits are confirmed by administrators but set by the PANTA OS team.

Administrators optionally cap individual models

For models that drive cost, administrators can set a Euro cap per cycle in the Model limits table. Setting a cap is optional; leaving it empty means the model draws from the shared workspace budget pool.

The automatic fallback decides what happens at the cap

When the toggle is on, a request that hits the model cap routes to a cheaper available model and the user does not see an error. When it is off, requests on the capped model fail until the cycle resets.
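
The decision at the cap can be sketched in a few lines of Python. This is an illustration only, not PANTA OS code: the names `ModelState`, `cost_rank`, and `route` are hypothetical, and the rule for choosing among cheaper models (here: the nearest cheaper one that is enabled and under its own cap) is an assumption, since this page only says that requests go to "a cheaper available model".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ModelState:
    name: str
    cost_rank: int            # hypothetical ordering: lower means cheaper
    spent_eur: float          # spend in the current cycle
    cap_eur: Optional[float]  # None = no cap, model draws from the shared pool
    enabled: bool = True


def is_at_cap(model: ModelState) -> bool:
    """A model without a cap can never be 'at cap'."""
    return model.cap_eur is not None and model.spent_eur >= model.cap_eur


def route(requested: str, models: Dict[str, ModelState], fallback_on: bool) -> str:
    """Return the name of the model that should serve the request."""
    target = models[requested]
    if not is_at_cap(target):
        return target.name
    if not fallback_on:
        # Toggle off: requests on the capped model fail until the cycle resets.
        raise RuntimeError(f"{requested} has reached its cap for this cycle")
    cheaper = [
        m for m in models.values()
        if m.enabled and not is_at_cap(m) and m.cost_rank < target.cost_rank
    ]
    if not cheaper:
        raise RuntimeError("no cheaper model is available")
    # Assumption: prefer the cheaper model closest in cost to the requested one.
    return max(cheaper, key=lambda m: m.cost_rank).name
```

With a capped "large" model and uncapped "medium" and "small" models, `route("large", models, fallback_on=True)` would return "medium" under these assumptions, while the same call with `fallback_on=False` raises an error.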

The cycle resets monthly

All counters reset each month on the calendar day your plan started.
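
As a rough illustration of the reset rule, the sketch below computes the next reset date from a plan start date. The function name `next_reset` is hypothetical. How PANTA OS handles start days that do not exist in a shorter month (for example the 31st) is not stated on this page; clamping to the last day of the month is an assumption made here.

```python
import calendar
from datetime import date


def next_reset(plan_start: date, today: date) -> date:
    """Return the first reset date strictly after `today`."""
    year, month = today.year, today.month
    while True:
        # Assumption: clamp the start day to the last day of shorter months.
        last_day = calendar.monthrange(year, month)[1]
        candidate = date(year, month, min(plan_start.day, last_day))
        if candidate > today:
            return candidate
        # Otherwise move on to the following month.
        month += 1
        if month > 12:
            year, month = year + 1, 1
```

For a plan that started on the 15th, a call made on June 8 yields June 15 of the same year; a call made on June 15 itself yields July 15.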

Where To See The Numbers

The Token limits tab in the Admin Panel shows three sections and one toggle:

Spend & Budget

Current cycle spend, plan budget, overage limit, and a progress indicator with the spent over allowed ratio.

Model limits

Per model table with current cycle consumption, an optional Euro cap, and an Enabled toggle.

Spend by model

A breakdown of cycle spend by model in euros and as a percentage of total spend. Useful for spotting cost drivers.

Automatic fallback toggle

A single workspace level toggle that controls behavior when a model limit is reached.
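
The two figures shown in Spend & Budget and Spend by model are simple arithmetic, sketched here for readers who want to reproduce them from their own numbers. The function names are hypothetical, and treating "allowed" as plan budget plus overage limit is an assumption; the page does not define the denominator of the progress indicator.

```python
from typing import Dict


def spend_ratio(spent_eur: float, plan_budget_eur: float,
                overage_limit_eur: float = 0.0) -> float:
    """Spent over allowed, assuming allowed = plan budget + overage limit."""
    allowed = plan_budget_eur + overage_limit_eur
    if allowed <= 0:
        raise ValueError("allowed spend must be positive")
    return spent_eur / allowed


def spend_share_by_model(spend_eur: Dict[str, float]) -> Dict[str, float]:
    """Each model's cycle spend as a percentage of total cycle spend."""
    total = sum(spend_eur.values())
    if total == 0:
        return {name: 0.0 for name in spend_eur}
    return {name: 100.0 * eur / total for name, eur in spend_eur.items()}
```

For example, 300 euros spent against a 1000 euro plan budget and a 200 euro overage limit gives a ratio of 0.25 under this assumption.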

Analytics Dashboard

The Analytics tab in the Admin Panel complements Token limits with usage views focused on adoption and value.

Top workflows by usage

A bar chart of the most used apps in the workspace.

Top assistants by usage

A ranking of the most used assistants with their request counts and total token values.

Time to first value

A chart showing how quickly new users reach productive usage, with the median time to first value (TTFV).

Quick actions

Two quick links: Manage users opens user management; Analytics export downloads a usage report as CSV.
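
Once exported, the CSV can be analyzed outside the Admin Panel. The sketch below ranks assistants by request count using only the Python standard library. The column names `assistant`, `requests`, and `tokens` are placeholders chosen for illustration; the actual export layout is not documented on this page, so adjust them to match your file.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, Tuple


def top_assistants(csv_text: str, limit: int = 5) -> List[Tuple[str, int, int]]:
    """Rank assistants by total requests; returns (name, requests, tokens)."""
    requests: Dict[str, int] = defaultdict(int)
    tokens: Dict[str, int] = defaultdict(int)
    # Hypothetical columns: "assistant", "requests", "tokens".
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["assistant"]
        requests[name] += int(row["requests"])
        tokens[name] += int(row["tokens"])
    ranked = sorted(requests, key=lambda n: requests[n], reverse=True)
    return [(n, requests[n], tokens[n]) for n in ranked[:limit]]
```

Pass the file contents as text, for example `top_assistants(open("export.csv", encoding="utf-8").read())`, where `export.csv` stands for wherever you saved the download.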

Tips and Best Practices

  • Keep automatic fallback on by default. With it on, users keep working instead of hitting an error, and the platform always picks a model that is allowed.
  • Cap only the models that drive cost. Setting limits on cheap models adds friction without saving money.
  • Review the Token limits tab weekly during rollout. This helps you spot an unexpected spike on a specific model early.
  • For plan envelope changes, talk to your PANTA OS account contact. Plan budget and overage are plan level decisions, not UI changes.

PANTA OS does not offer per team or per user token caps. Spend control happens through the plan envelope, per model caps, and automatic fallback. This approach is intentionally simple to keep administration manageable.

Help Center

Token limits cannot be set per team or per user. They exist at the workspace level (plan budget and overage) and at the per model level; there is no per team, per user, or per assistant cap.
When a model reaches its limit and automatic fallback is on, new requests on that model route to a cheaper available model and work continues. If the toggle is off, requests on the capped model fail until the cycle resets.
The plan budget and overage limit cannot be changed in the UI. Both are plan level values, shown with a lock icon. To change either, talk to your PANTA OS account contact about a plan change.
The billing cycle resets on the same calendar day each month based on the date your plan started.
Token limits are configured in the Admin Panel under Token limits. That page covers per model caps, the automatic fallback toggle, and the cycle view. See Token limits and budgets for the step by step guide.
Last modified on June 8, 2026