> ## Documentation Index
> Fetch the complete documentation index at: https://help.pantaos.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Token Management

> How PANTA OS approaches AI spend: a plan envelope, optional per model caps, and automatic fallback to keep work flowing.

## What It Is

Token Management is the way PANTA OS keeps AI spend predictable. The concept has three components: a workspace level budget set by the PANTA OS plan, optional per model caps configured by administrators, and an automatic fallback that routes requests to a cheaper model when a limit is reached. A second tab, Analytics, complements Token limits with a usage dashboard for assistants and apps plus a CSV export.

This page is the strategic overview. For the day to day configuration, see [Token limits and budgets](/platform/administration/token-limits-and-budgets) in the administration section.

## Why It Matters

AI cost grows with adoption. Without a structure, that growth stays invisible until the next invoice. With one, leadership has a predictable envelope, administrators have the tools to shape consumption, and users can keep working without manual intervention when a limit is reached.

* The plan budget is fixed by your PANTA OS plan. Spend cannot exceed the agreed envelope by surprise.
* Optional caps per model let administrators slow down consumption on individual models, especially the most expensive ones.
* Automatic fallback routes requests to a cheaper model when a model limit is reached, so users do not see an error mid task.
* The Token limits tab in the Admin Panel shows current cycle spend, the per model breakdown, and the active billing period at a glance.

## The Three Components

The plan envelope covers the first two items below; per model caps and automatic fallback are the other two components.

* Plan budget: The workspace cycle budget set by the PANTA OS plan. Read only in the UI; changes happen through the PANTA OS account contact.
* Overage limit: An additional allowance on top of the plan budget. Also plan level and read only in the UI.
* Model limits: Optional Euro caps per model per cycle, set by administrators. Useful when one model is driving the bill.
* Automatic fallback: A workspace level toggle that routes requests to a cheaper model when a model limit is reached. Keeps work flowing without manual intervention.

## How It Works In Practice

* Plan budget and overage limit come with your PANTA OS plan. Administrators see the values in the Admin Panel under Token limits but do not edit them in the UI. Overage limits are confirmed by administrators but set by the PANTA OS team.
* For models that drive cost, administrators can set a Euro cap per cycle in the Model limits table. Setting a cap is optional; leaving it empty means the model draws from the shared workspace budget pool.
* When the automatic fallback toggle is on, a request that hits the model cap routes to a cheaper available model and the user does not see an error. When it is off, the request fails until the cycle resets.

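
The cap and fallback behavior described above can be sketched in a few lines. This is a minimal illustration, not PANTA OS code: the `ModelUsage` record, the `cost_rank` field, and the `route` function are hypothetical names, and picking the nearest cheaper model is an assumption, since the platform's actual selection rule is not documented here. The workspace budget and overage limit are left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelUsage:
    """Hypothetical per model record for one billing cycle."""
    name: str
    cost_rank: int             # assumption: lower rank means cheaper model
    spent_eur: float           # current cycle spend on this model
    cap_eur: Optional[float]   # None = no cap, model draws from the shared pool
    enabled: bool = True


def cap_reached(model: ModelUsage) -> bool:
    """A model without a cap never counts as capped."""
    return model.cap_eur is not None and model.spent_eur >= model.cap_eur


def route(requested: str, models: list[ModelUsage], fallback_on: bool) -> str:
    """Return the name of the model that should serve the request."""
    target = {m.name: m for m in models}[requested]
    if not cap_reached(target):
        return target.name
    if not fallback_on:
        # Toggle off: the request fails until the cycle resets.
        raise RuntimeError(f"{requested} reached its cap for this cycle")
    cheaper = [
        m for m in models
        if m.enabled and m.cost_rank < target.cost_rank and not cap_reached(m)
    ]
    if not cheaper:
        raise RuntimeError("no cheaper model is available")
    # Assumption: prefer the cheaper model closest in cost to the requested one.
    return max(cheaper, key=lambda m: m.cost_rank).name
```

With the toggle on, a request for a capped model is served by a cheaper one; with it off, the same request raises an error, which mirrors the failure users would see until the cycle resets.
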
All counters reset monthly on the calendar date your plan started.

## Where To See The Numbers

The Token limits tab in the Admin Panel shows the following:

* Current cycle spend, plan budget, overage limit, and a progress indicator with the ratio of spent to allowed budget.
* The Model limits table with current cycle consumption per model, an optional Euro cap, and an Enabled toggle.
* A breakdown of cycle spend by model in euros and as a percentage of total spend. Useful for spotting cost drivers.
* A single workspace level toggle that controls behavior when a model limit is reached.

## Analytics Dashboard

The Analytics tab in the Admin Panel complements Token limits with usage views focused on adoption and value.

* A bar chart of the most used apps in the workspace.
* A ranking of the most used assistants with their request counts and total token values.
* A chart showing how quickly new users reach productive usage, with a median TTFV (time to first value).
* Two quick links: Manage users opens user management; Analytics export downloads a usage report as CSV.

## Tips and Best Practices

* Keep automatic fallback on by default. The difference between users hitting an error and users continuing to work is significant; the platform always picks a model that is allowed.
* Cap only the models that drive cost. Setting limits on cheap models adds friction without saving money.
* Review the Token limits tab weekly during rollout. An unexpected spike on a specific model is easiest to handle when it is spotted early.
* For plan envelope changes, talk to your PANTA OS account contact. Plan budget and overage are plan level decisions, not UI changes.

PANTA OS does not offer per team or per user token caps. Spend control happens through the plan envelope, per model caps, and automatic fallback. The model is intentionally simple to keep administration manageable.

## Help Center

* Per team, per user, and per assistant caps are not available. Token limits exist at the workspace level (plan budget and overage) and at the per model level.

* When a model reaches its cap and automatic fallback is on, new requests on that model route to a cheaper available model and work continues. If the toggle is off, requests on the capped model fail until the cycle resets.
* Plan budget and overage limit cannot be edited in the UI. They are plan level values, shown with a lock icon. To change either, talk to your PANTA OS account contact about a plan change.
* The billing cycle resets on the same calendar day each month, based on the date your plan started.
* Token limits are configured in the Admin Panel under Token limits. That page covers per model caps, the automatic fallback toggle, and the cycle view. See [Token limits and budgets](/platform/administration/token-limits-and-budgets) for the step by step guide.
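
For planning purposes, the monthly reset rule can be turned into a small date calculation. The sketch below is an illustration only: `next_reset` is a hypothetical helper, and clamping to the last day of shorter months (for example a plan that started on the 31st) is an assumption, because this page does not say how PANTA OS handles that case.

```python
import calendar
from datetime import date


def next_reset(plan_start: date, today: date) -> date:
    """Return the first cycle reset date strictly after `today`."""

    def reset_in(year: int, month: int) -> date:
        # Assumption: clamp to the month's last day when the start day does not exist.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, min(plan_start.day, last_day))

    candidate = reset_in(today.year, today.month)
    if candidate > today:
        return candidate
    # This month's reset has already happened, so move to the next month.
    if today.month == 12:
        return reset_in(today.year + 1, 1)
    return reset_in(today.year, today.month + 1)
```

For a plan that started on 15 January 2024, calling `next_reset(date(2024, 1, 15), date(2024, 3, 10))` returns 15 March 2024.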