AI is an operating cost. Token Management is how you keep that cost predictable, attributable, and aligned with the value AI is producing.

What you get

Real-time usage tracking

Live dashboards showing token spend by user, team, assistant, and workflow.

Budgets at every level

Caps on the workspace, per team, per user, per assistant — independently set and stacked.

Smart alerts

Notifications when a budget reaches 50%, 80%, and 100% of its cap, delivered by Slack, email, or in-app.

Attribution by default

Every token spent is attributed to a person, an assistant, and a project. No mystery bills.
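
One way to picture the 50% / 80% / 100% alerts is as a check that fires once each time spend crosses a threshold. The Python sketch below is illustrative only: `crossed_thresholds` and its parameters are hypothetical names, not part of any PANTA OS API, and the real alerting logic may differ.

```python
# Hypothetical illustration; not PANTA OS code.
ALERT_THRESHOLDS = (50, 80, 100)  # percent of budget, as listed above


def crossed_thresholds(spent_before: int, spent_after: int, budget: int) -> list[int]:
    """Return the percentage thresholds crossed by moving from spent_before to spent_after."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    # Integer arithmetic avoids float rounding: compare spend * 100 with pct * budget.
    return [
        pct for pct in ALERT_THRESHOLDS
        if spent_before * 100 < pct * budget <= spent_after * 100
    ]
```

With a budget of 1,000,000 tokens, moving from 400,000 to 850,000 tokens spent crosses both the 50% and 80% marks in one step, so both alerts would fire; a later move from 850,000 to 900,000 crosses none.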

How budgets work

Set the workspace budget

The workspace-level cap is the ceiling. Nothing goes above it without an explicit raise.

Allocate to teams

Each team gets a slice. Teams own their slice and can subdivide further.

Per-assistant guardrails

Expensive assistants (deep models, long context) get their own caps so they can’t drain a team’s budget.

Override when needed

A short-lived spend bump for a campaign or quarter — temporary, with an end date.
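
The stacking described above (workspace ceiling, team slices, per-assistant caps, and a temporary override with an end date) can be sketched as a chain of budgets in which a spend is allowed only if every level has room. This is a minimal Python model under stated assumptions: the `Budget` class, `try_spend`, and all field names are hypothetical and do not reflect how PANTA OS stores or enforces budgets.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterator, Optional


@dataclass
class Budget:
    """Hypothetical budget node: workspace, team, or assistant."""
    name: str
    cap: int                            # token cap for the current period
    spent: int = 0
    parent: Optional["Budget"] = None   # next level up, None for the workspace
    bump: int = 0                       # temporary extra allowance (override)
    bump_ends: Optional[date] = None    # last day the override applies

    def effective_cap(self, today: date) -> int:
        """Cap including the temporary bump while it is still active."""
        if self.bump_ends is not None and today <= self.bump_ends:
            return self.cap + self.bump
        return self.cap

    def chain(self) -> Iterator["Budget"]:
        """Yield this budget and every ancestor up to the workspace."""
        node: Optional[Budget] = self
        while node is not None:
            yield node
            node = node.parent


def try_spend(budget: Budget, tokens: int, today: date) -> bool:
    """Record the spend only if every level in the chain has room for it."""
    levels = list(budget.chain())
    if any(b.spent + tokens > b.effective_cap(today) for b in levels):
        return False
    for b in levels:
        b.spent += tokens
    return True


workspace = Budget("workspace", cap=1_000_000)
team = Budget("marketing", cap=300_000, parent=workspace)
assistant = Budget("deep-research", cap=100_000, parent=team)

first = try_spend(assistant, 90_000, date(2025, 1, 10))    # fits at every level
second = try_spend(assistant, 20_000, date(2025, 1, 11))   # exceeds the assistant cap
```

In this model the assistant's own cap stops the second request even though the team and workspace still have room, which is the guardrail behavior described above. Setting `bump` and `bump_ends` on any node raises its effective cap until the end date and then lapses without further action.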

What you see in analytics

Spend by team

Which department is using AI most. Often surprising — and useful.

Spend by assistant

The expensive assistants. The ones to optimize first.

Spend by user

Power users — usually your champions. Adoption signals.

Spend over time

Trends and anomalies, week over week.

Token Management is for organizational control. Individual users don’t see other users’ spend; they see their own and their team’s, depending on role.
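
Because each unit of spend is attributed to a user, a team, and an assistant, the analytics views above amount to grouping the same usage records by different keys. The sketch below shows that idea in Python; the record layout, the `spend_by` helper, and the sample values are all made up for illustration and are not an export format or API of PANTA OS.

```python
from collections import defaultdict

# Made-up sample records; field names are assumptions.
usage_records = [
    {"user": "ana", "team": "marketing", "assistant": "copywriter", "tokens": 1200},
    {"user": "ben", "team": "marketing", "assistant": "copywriter", "tokens": 800},
    {"user": "cho", "team": "support", "assistant": "ticket-triage", "tokens": 3000},
    {"user": "ana", "team": "marketing", "assistant": "deep-research", "tokens": 5000},
]


def spend_by(records: list[dict], key: str) -> dict[str, int]:
    """Total tokens per value of `key`, largest spender first."""
    totals: dict[str, int] = defaultdict(int)
    for record in records:
        totals[record[key]] += record["tokens"]
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


by_team = spend_by(usage_records, "team")
by_assistant = spend_by(usage_records, "assistant")
by_user = spend_by(usage_records, "user")
```

Sorting by total puts the most expensive assistant or the heaviest user at the top, which matches how the views are meant to be read: optimize the top of the assistant list first, and treat the top of the user list as an adoption signal.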

Tips that save real money

Most chats don’t need the biggest model. Default to a fast, capable middle-tier and reserve the top for assistants that need it.
Long-winded system prompts get charged on every turn. Tighten them.
For repeat questions on stable knowledge, cache. PANTA OS caches transparently when it can.
Assistants no one runs still cost storage and indexing. Archive what’s stale.
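
To make the system-prompt tip concrete, here is a small back-of-the-envelope calculation in Python. It assumes the full system prompt is counted as input on every turn with no caching discount; the function names and the example numbers are hypothetical, and actual billing depends on the model and provider.

```python
def system_prompt_tokens(prompt_tokens: int, turns: int) -> int:
    """Tokens consumed by the system prompt alone over a conversation."""
    return prompt_tokens * turns


def savings_from_trimming(before: int, after: int, turns: int) -> int:
    """Tokens saved per conversation by shortening the system prompt."""
    return system_prompt_tokens(before, turns) - system_prompt_tokens(after, turns)


# Example: trimming a 1,500-token prompt to 600 tokens in a 20-turn conversation.
saved = savings_from_trimming(before=1500, after=600, turns=20)
```

Under these assumptions the trimmed prompt saves 18,000 tokens per 20-turn conversation, and that saving repeats for every conversation the assistant handles.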