# Token Management

> How PANTA OS approaches AI spend: a plan envelope, optional per model caps, and automatic fallback to keep work flowing.

## What It Is

Token Management is the way PANTA OS keeps AI spend predictable. The concept has three components: a workspace level budget set by the PANTA OS plan, optional per model caps configured by administrators, and an automatic fallback that routes requests to a cheaper model when a limit is reached. A second tab, Analytics, complements Token limits with a usage dashboard for assistants and apps plus a CSV export.

This page is the strategic overview. For the day to day configuration, see [Token limits and budgets](/platform/administration/token-limits-and-budgets) in the administration section.

<Frame>
  <img src="https://mintcdn.com/panta/bxdWHin1js2LVo8f/images/Panta-Documentations-(EN)-(1).jpg?fit=max&auto=format&n=bxdWHin1js2LVo8f&q=85&s=1347f7980c7b96211eb73e8556daa87d" alt="Panta Documentations (EN) (1)" width="1920" height="1080" data-path="images/Panta-Documentations-(EN)-(1).jpg" />
</Frame>

## Why It Matters

AI cost grows with adoption. Without a structure, that growth stays invisible until the next invoice. With one, leadership has a predictable envelope, administrators have the tools to shape consumption, and users can keep working without manual intervention when a limit is reached.

<CardGroup cols={2}>
  <Card title="Predictable spend" icon="shield">
    The plan budget is fixed by your PANTA OS plan. Spend cannot exceed the agreed envelope by surprise.
  </Card>

  <Card title="Per model control" icon="brain">
    Optional caps per model let administrators slow down consumption on individual models, especially the most expensive ones.
  </Card>

  <Card title="Continuity for users" icon="play">
    Automatic fallback routes requests to a cheaper model when a model limit is reached, so users do not see an error mid task.
  </Card>

  <Card title="Visibility in one place" icon="eye">
    The Token limits tab in the Admin Panel shows current cycle spend, per model breakdown, and the active billing period at a glance.
  </Card>
</CardGroup>

## The Three Components

<CardGroup cols={2}>
  <Card title="Plan budget" icon="lock">
    The workspace cycle budget set by the PANTA OS plan. Read only in the UI; changes happen through the PANTA OS account contact.
  </Card>

  <Card title="Overage limit" icon="lock">
    An additional allowance on top of the plan budget; together the two form the plan envelope. Also plan level and read only in the UI.
  </Card>

  <Card title="Per model limits" icon="brain">
    Optional Euro caps per model per cycle, set by administrators. Useful when one model is driving the bill.
  </Card>

  <Card title="Automatic fallback" icon="rotate-cw">
    A workspace level toggle that routes requests to a cheaper model when a model limit is reached. Keeps work flowing without manual intervention.
  </Card>
</CardGroup>

## How It Works In Practice

<Steps>
  <Step title="The plan defines the envelope" icon="lock">
    Plan budget and overage limit come with your PANTA OS plan. Administrators see the values in the Admin Panel under Token limits but cannot edit them in the UI. Overage limits are set by the PANTA OS team and confirmed by administrators.
  </Step>

  <Step title="Administrators optionally cap individual models" icon="brain">
    For models that drive cost, administrators can set a Euro cap per cycle in the Model limits table. Setting a cap is optional; leaving it empty means the model draws from the shared workspace budget pool.
  </Step>

  <Step title="The automatic fallback decides what happens at the cap" icon="rotate-cw">
    When the toggle is on, a request that hits the model cap routes to a cheaper available model and the user does not see an error. When it is off, the request fails until the cycle resets.
  </Step>

  <Step title="The cycle resets monthly" icon="rotate-ccw">
    All counters reset each month on the same calendar day your plan started.
  </Step>
</Steps>
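
The steps above can be sketched as a small routing decision. This is an illustrative model only, not the PANTA OS implementation: the `ModelState` fields, the `cost_rank` ordering, and the choice of the closest cheaper model as the fallback target are all assumptions made for the example. The source only states that requests route to "a cheaper available model".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelState:
    name: str
    cost_rank: int            # hypothetical ordering: lower means cheaper
    spent_eur: float          # spend on this model in the current cycle
    cap_eur: Optional[float]  # None = no cap, model draws from the shared pool
    enabled: bool = True      # assumed to mirror the Enabled toggle


def under_cap(model: ModelState) -> bool:
    """A model without a cap is never blocked by a per model limit."""
    return model.cap_eur is None or model.spent_eur < model.cap_eur


def route(requested: str, models: dict[str, ModelState], fallback_on: bool) -> Optional[str]:
    """Return the name of the model that serves the request, or None if it fails."""
    target = models[requested]
    if under_cap(target):
        return target.name
    if not fallback_on:
        return None  # toggle off: the request fails until the cycle resets
    # Toggle on: look for a cheaper model that is enabled and still under its cap.
    cheaper = [
        m for m in models.values()
        if m.cost_rank < target.cost_rank and m.enabled and under_cap(m)
    ]
    if not cheaper:
        return None
    # Assumption: prefer the closest cheaper model rather than the cheapest one.
    return max(cheaper, key=lambda m: m.cost_rank).name
```

With a capped model that has used its full cap, `route` returns a cheaper model when `fallback_on` is true and `None` when it is false. Workspace level limits (plan budget and overage) are left out of this sketch.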

## Where To See The Numbers

The Token limits tab in the Admin Panel shows three sections plus the automatic fallback toggle:

<CardGroup cols={2}>
  <Card title="Spend & Budget" icon="trending-up">
    Current cycle spend, plan budget, overage limit, and a progress indicator with the spent over allowed ratio.
  </Card>

  <Card title="Model limits" icon="brain">
    Per model table with current cycle consumption, an optional Euro cap, and an Enabled toggle.
  </Card>

  <Card title="Spend by model" icon="coins">
    A breakdown of cycle spend by model in euros and as a percentage of total spend. Useful for spotting cost drivers.
  </Card>

  <Card title="Automatic fallback toggle" icon="rotate-cw">
    A single workspace level toggle that controls behavior when a model limit is reached.
  </Card>
</CardGroup>
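
The arithmetic behind the Spend & Budget and Spend by model views can be sketched as follows. This is a hedged illustration: the function name and inputs are hypothetical, and treating "allowed" as plan budget plus overage limit is an assumption, not something the UI documentation confirms.

```python
def spend_summary(spend_by_model: dict[str, float],
                  plan_budget_eur: float,
                  overage_limit_eur: float) -> tuple[float, float, dict[str, float]]:
    """Return total cycle spend, the spent over allowed ratio, and each model's share in percent."""
    total = sum(spend_by_model.values())
    # Assumption: the allowed amount is the plan budget plus the overage limit.
    allowed = plan_budget_eur + overage_limit_eur
    ratio = total / allowed if allowed else 0.0
    shares = {
        model: (100 * spend / total if total else 0.0)
        for model, spend in spend_by_model.items()
    }
    return total, ratio, shares
```

A model whose share is much larger than the others is a natural candidate for a per model cap.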

## Analytics Dashboard

The Analytics tab in the Admin Panel complements Token limits with usage views focused on adoption and value.

<CardGroup cols={2}>
  <Card title="Top workflows by usage" icon="arrow-big-up">
    A bar chart of the most used apps in the workspace.
  </Card>

  <Card title="Top assistants by usage" icon="list">
    A ranking of the most used assistants with their request counts and total token values.
  </Card>

  <Card title="Time to first value" icon="clock">
    A chart showing how quickly new users reach productive usage, with a median TTFV value.
  </Card>

  <Card title="Quick actions" icon="zap">
    Two quick links: Manage users opens user management; Analytics export downloads a usage report as CSV.
  </Card>
</CardGroup>

## Tips and Best Practices

* Keep automatic fallback on by default. With fallback on, users continue working instead of hitting an error, and the platform always picks a model that is allowed.
* Cap only the models that drive cost. Setting limits on cheap models adds friction without saving money.
* Review the Token limits tab weekly during rollout. An unexpected spike on a specific model is easiest to address when it is spotted early.
* For plan envelope changes, talk to your PANTA OS account contact. Plan budget and overage are plan level decisions, not UI changes.

<Tip>
  PANTA OS does not offer per team or per user token caps. Spend control happens through the plan envelope, per model caps, and automatic fallback. The model is intentionally simple to keep administration manageable.
</Tip>

## Help Center

<AccordionGroup>
  <Accordion title="Can I set a budget for a single user or team" icon="users">
    No. Token limits exist at the workspace level (plan budget and overage) and at the per model level. There is no per team, per user, or per assistant cap.
  </Accordion>

  <Accordion title="What happens when a model limit is reached" icon="rotate-cw">
    If automatic fallback is on, new requests on that model route to a cheaper available model and work continues. If the toggle is off, requests on the capped model fail until the cycle resets.
  </Accordion>

  <Accordion title="Can I change the plan budget myself" icon="lock">
    No. Plan budget and overage limit are plan level values, shown in the UI with a lock icon. To change either, contact your PANTA OS account contact about a plan change.
  </Accordion>

  <Accordion title="When does the billing cycle reset" icon="rotate-ccw">
    The billing cycle resets on the same calendar day each month based on the date your plan started.
  </Accordion>

  <Accordion title="Where is the practical configuration done" icon="settings">
    In the Admin Panel under Token limits. That page covers per model caps, the automatic fallback toggle, and the cycle view. See [Token limits and budgets](/platform/administration/token-limits-and-budgets) for the step by step guide.
  </Accordion>
</AccordionGroup>
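
As a worked example of the billing cycle answer above, the next reset date can be estimated from the plan start date. This is a sketch under assumptions: the function name is hypothetical, and clamping to the last day of a shorter month (for plans that started on the 29th, 30th, or 31st) is a guess at the behavior, which this page does not specify.

```python
import calendar
from datetime import date


def next_reset(plan_start: date, today: date) -> date:
    """Return the first reset date strictly after `today`."""

    def reset_in(year: int, month: int) -> date:
        # Assumption: clamp to the last day when the month is too short.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, min(plan_start.day, last_day))

    candidate = reset_in(today.year, today.month)
    if candidate > today:
        return candidate
    # Otherwise the next reset falls in the following month.
    if today.month == 12:
        return reset_in(today.year + 1, 1)
    return reset_in(today.year, today.month + 1)
```

For a plan that started on the 15th, calling `next_reset` on the 10th of a month returns the 15th of that same month, and calling it on the 15th returns the 15th of the following month.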
