What It Is
Token Management is the way PANTA OS keeps AI spend predictable. The concept has three components: a workspace level budget set by the PANTA OS plan, optional per model caps configured by administrators, and an automatic fallback that routes requests to a cheaper model when a limit is reached. A second tab, Analytics, complements Token limits with a usage dashboard for assistants and apps plus a CSV export. This page is the strategic overview. For the day to day configuration, see Token limits and budgets in the administration section.
Why It Matters
AI cost grows with adoption. Without a structure, the growth becomes invisible until the next invoice. With one, leadership has a predictable envelope, administrators have the tools to shape consumption, and users can keep working without manual intervention when a limit is reached.
Predictable spend
The plan budget is fixed by your PANTA OS plan. Spend cannot exceed the agreed envelope by surprise.
Per model control
Optional caps per model let administrators slow down consumption on individual models, especially the most expensive ones.
Continuity for users
Automatic fallback routes requests to a cheaper model when a model limit is reached, so users do not see an error mid task.
Visibility in one place
The Token limits tab in the Admin Panel shows current cycle spend, per model breakdown, and the active billing period at a glance.
The Three Components
Plan budget
The workspace cycle budget set by the PANTA OS plan. Read only in the UI; changes happen through the PANTA OS account contact.
Overage limit
An additional allowance on top of the plan budget. Also plan level and read only in the UI.
Per model limits
Optional Euro caps per model per cycle, set by administrators. Useful when one model is driving the bill.
Automatic fallback
A workspace level toggle that routes requests to a cheaper model when a model limit is reached. Keeps work flowing without manual intervention.
How It Works In Practice
The plan defines the envelope
The plan budget and overage limit come with your PANTA OS plan. Administrators see the values in the Admin Panel under Token limits but do not edit them in the UI. The overage limit is set by the PANTA OS team and confirmed by administrators.
Administrators optionally cap individual models
For models that drive cost, administrators can set a Euro cap per cycle in the Model limits table. Setting a cap is optional; leaving it empty means the model draws from the shared workspace budget pool.
The automatic fallback decides what happens at the cap
When the toggle is on, a request that hits the model cap routes to a cheaper available model and the user does not see an error. When it is off, the request fails until the cycle resets.
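The decision at the cap can be sketched in a few lines of Python. This is an illustration of the behavior described above, not PANTA OS code: the `ModelBudget` type, the `cost_rank` ordering, the `RuntimeError` and the rule for choosing among several cheaper models are all assumptions, and the shared workspace budget is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelBudget:
    name: str
    cost_rank: int            # hypothetical ordering: lower means cheaper
    spent_eur: float          # spend on this model in the current cycle
    cap_eur: Optional[float]  # None means no per model cap is set


def has_headroom(model: ModelBudget) -> bool:
    """A model without a cap is never blocked by a per model limit."""
    return model.cap_eur is None or model.spent_eur < model.cap_eur


def route_request(requested: str, models: dict[str, ModelBudget],
                  fallback_enabled: bool) -> str:
    """Return the name of the model that should serve the request."""
    target = models[requested]
    if has_headroom(target):
        return target.name
    if not fallback_enabled:
        # Toggle off: the request fails until the cycle resets.
        raise RuntimeError(f"{requested} reached its cap for this cycle")
    cheaper = [m for m in models.values()
               if m.cost_rank < target.cost_rank and has_headroom(m)]
    if not cheaper:
        raise RuntimeError("no cheaper model with remaining budget")
    # Assumption: prefer the cheaper model closest in cost to the requested one.
    return max(cheaper, key=lambda m: m.cost_rank).name
```

With the toggle on, a request for a capped model resolves to a cheaper model that still has headroom; with it off, the same request raises an error, which mirrors the failure users would see.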
Where To See The Numbers
The Token limits tab in the Admin Panel shows the following sections:
Spend & Budget
Current cycle spend, plan budget, overage limit, and a progress indicator with the spent over allowed ratio.
Model limits
Per model table with current cycle consumption, an optional Euro cap, and an Enabled toggle.
Spend by model
A breakdown of cycle spend by model in euros and as a percentage of total spend. Useful for spotting cost drivers.
Automatic fallback toggle
A single workspace level toggle that controls behavior when a model limit is reached.
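The figures in these sections are simple arithmetic, and a short Python sketch may help when checking them against exported data. The function and field names are hypothetical, and the sketch assumes that the "allowed" amount in the progress indicator is the plan budget plus the overage limit, which this page does not state explicitly.

```python
def spend_summary(spend_by_model: dict[str, float],
                  plan_budget_eur: float,
                  overage_limit_eur: float) -> dict:
    """Summarize cycle spend: total, spent over allowed, share per model."""
    total = sum(spend_by_model.values())
    # Assumption: allowed spend is the plan budget plus the overage limit.
    allowed = plan_budget_eur + overage_limit_eur
    # Largest spenders first, so cost drivers appear at the top.
    ranked = sorted(spend_by_model.items(), key=lambda kv: kv[1], reverse=True)
    shares = {
        name: (round(100 * spent / total, 1) if total else 0.0)
        for name, spent in ranked
    }
    return {
        "total_eur": total,
        "allowed_eur": allowed,
        "spent_ratio": total / allowed if allowed else 0.0,
        "share_pct": shares,
    }
```

A model that accounts for most of `share_pct` is the natural candidate for a per model cap.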
Analytics Dashboard
The Analytics tab in the Admin Panel complements Token limits with usage views focused on adoption and value.
Top workflows by usage
A bar chart of the most used apps in the workspace.
Top assistants by usage
A ranking of the most used assistants with their request counts and total token values.
Time to first value
A chart showing how quickly new users reach productive usage, with a median TTFV value.
Quick actions
Two quick links: Manage users opens user management; Analytics export downloads a usage report as CSV.
Tips and Best Practices
- Keep automatic fallback on by default. With fallback on, users keep working instead of hitting an error, and the platform always picks a model that is allowed.
- Cap only the models that drive cost. Setting limits on cheap models adds friction without saving money.
- Review the Token limits tab weekly during rollout. An unexpected spike on a specific model is easiest to correct when it is caught early.
- For plan envelope changes, talk to your PANTA OS account contact. Plan budget and overage are plan level decisions, not UI changes.
Help Center
Can I set a budget for a single user or team?
No. Token limits exist at the workspace level (plan budget and overage) and at the per model level. There is no per team, per user, or per assistant cap.
What happens when a model limit is reached?
If automatic fallback is on, new requests on that model route to a cheaper available model and work continues. If the toggle is off, requests on the capped model fail until the cycle resets.
Can I change the plan budget myself?
No. Plan budget and overage limit are plan level values, shown in the UI with a lock icon. To change either, contact your PANTA OS account contact about a plan change.
When does the billing cycle reset?
The billing cycle resets on the same calendar day each month based on the date your plan started.
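For planning purposes, the next reset date can be estimated with a small Python sketch. The function name is hypothetical, and the handling of short months is an assumption: this page does not say what happens when the plan started on a day such as the 31st, so the sketch falls back to the last day of the month.

```python
import calendar
from datetime import date


def next_reset(plan_start: date, today: date) -> date:
    """Return the first reset date strictly after `today`."""

    def reset_in(year: int, month: int) -> date:
        # Assumption: clamp to the last day of shorter months.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, min(plan_start.day, last_day))

    candidate = reset_in(today.year, today.month)
    if candidate <= today:
        # This month's reset has passed, so move to the following month.
        if today.month == 12:
            candidate = reset_in(today.year + 1, 1)
        else:
            candidate = reset_in(today.year, today.month + 1)
    return candidate
```

The cycle view in the Admin Panel remains the authoritative source for the active billing period.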
Where is the practical configuration done?
In the Admin Panel under Token limits. That page covers per model caps, the automatic fallback toggle, and the cycle view. See Token limits and budgets for the step by step guide.
