Tenxyte AIRS (AI Responsibility & Security)

Overview

Tenxyte AIRS is a suite of responsibility, security, and safety controls for AI agents integrated into your application. It addresses major risks that LLMs and agentic models pose in production environments (e.g., EchoLeak, Shadow Escape, runaway spend).

Core principle: An AI agent never acts on its own authority. It borrows a human user's identity and permissions via a scoped, time-limited token (AgentToken), and every action it takes is auditable, controllable, and suspendable.

1. Core Agentic Parity — AgentToken

An AgentToken is the identity token issued to an AI agent. It enables secure delegation: the agent acts on behalf of a human user, with a strict subset of their permissions, without ever handling user credentials.

Creation (API)

POST /ai/tokens/
Authorization: Bearer <user_jwt>
Content-Type: application/json

{
  "agent_id": "finance-agent-v2",
  "expires_in": 3600,
  "permissions": ["read:reports", "write:invoices"],
  "organization": "acme-corp",
  "budget_limit_usd": 5.00,
  "circuit_breaker": {
    "max_requests_per_minute": 30,
    "max_requests_total": 500
  },
  "dead_mans_switch": {
    "heartbeat_required_every": 300
  }
}

Response (201):

{
  "id": 42,
  "token": "eKj3...raw_token...Xz9",
  "agent_id": "finance-agent-v2",
  "status": "ACTIVE",
  "expires_at": "2024-01-20T16:00:00Z"
}

⚠️ The raw token value is returned only once, at creation. Store it securely: only its SHA-256 hash is persisted in the database, so the raw value cannot be retrieved later.

The agent then uses AgentBearer <token> in the Authorization header for all subsequent requests.
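
As a rough illustration of the two statements above (hash-only storage and the AgentBearer scheme), the following sketch shows how a presented token could be checked against a stored SHA-256 digest. The function names and the token generator are hypothetical and are not Tenxyte's actual implementation.

```python
import hashlib
import hmac
import secrets
from typing import Optional

SCHEME = "AgentBearer"  # scheme name taken from the text above


def generate_raw_token() -> str:
    """Hypothetical generator: a URL-safe random string shown to the caller once."""
    return secrets.token_urlsafe(32)


def hash_token(raw_token: str) -> str:
    """Return the SHA-256 hex digest, the only form that would be stored."""
    return hashlib.sha256(raw_token.encode("utf-8")).hexdigest()


def extract_raw_token(authorization_header: str) -> Optional[str]:
    """Return the token from 'AgentBearer <token>', or None if malformed."""
    scheme, _, value = authorization_header.partition(" ")
    value = value.strip()
    if scheme != SCHEME or not value:
        return None
    return value


def matches_stored_hash(authorization_header: str, stored_hash: str) -> bool:
    """Hash the presented token and compare digests in constant time."""
    raw = extract_raw_token(authorization_header)
    if raw is None:
        return False
    return hmac.compare_digest(hash_token(raw), stored_hash)
```

Because only the digest is kept, a database leak would not directly expose usable tokens, and a lost raw token has to be replaced by issuing a new one.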

Double RBAC Validation

Every request made with an AgentToken goes through two permission checks:

Agent scope check: Does the AgentToken include the required permission in its granted_permissions?
Human check: Does the delegating user still hold that permission in the database (or within the organization)?

If either check fails, the request is rejected with 403 Forbidden.
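
The double check can be pictured as an intersection test. This is a minimal sketch, assuming permissions are plain strings; check_double_rbac is a hypothetical name and does not reflect how Tenxyte loads the human's permissions from the database or organization.

```python
from http import HTTPStatus
from typing import Collection


def check_double_rbac(
    required: str,
    agent_granted: Collection[str],
    human_current: Collection[str],
) -> HTTPStatus:
    """Return OK only if both the agent scope and the human hold `required`."""
    # 1. Agent scope check: the token must have been granted the permission.
    if required not in agent_granted:
        return HTTPStatus.FORBIDDEN
    # 2. Human check: the delegating user must still hold it right now.
    if required not in human_current:
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK


# The user lost "write:invoices" after the token was issued.
agent = {"read:reports", "write:invoices"}
human = {"read:reports"}
print(check_double_rbac("read:reports", agent, human))    # HTTPStatus.OK
print(check_double_rbac("write:invoices", agent, human))  # HTTPStatus.FORBIDDEN
```

The second call shows why the human check matters: removing a permission from the user takes effect for the agent without touching the token.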

Token Lifecycle

Status	Description
`ACTIVE`	Token is valid and can be used
`SUSPENDED`	Disabled automatically (circuit breaker, budget, heartbeat) or manually by the delegating user
`REVOKED`	Manually revoked; permanent and irreversible
`EXPIRED`	Current time is past `expires_at`

2. Circuit Breaker & Rate Limiting

The circuit breaker is an autonomous firewall that protects against runaway agent behavior: infinite loops, data exfiltration spikes, or unexpected failure cascades.

Configurable thresholds (per token)

Field	Default	Description
`max_requests_per_minute`	60	Sliding window RPM limit (via cache)
`max_requests_total`	1000	Absolute request cap for the token lifetime
`max_failed_requests`	10	Max consecutive errors before suspension

If any threshold is exceeded, the token is automatically moved to SUSPENDED status and the reason is recorded. The possible suspension reasons, including those set outside the circuit breaker, are:

Reason	Trigger
`RATE_LIMIT`	RPM or total requests exceeded
`ANOMALY`	Max failed requests exceeded
`HEARTBEAT_MISSING`	Dead Man's Switch timeout
`BUDGET_EXCEEDED`	LLM cost exceeded budget limit
`MANUAL`	Manually suspended by the delegating user
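
The threshold logic can be sketched as follows. This is an illustrative in-memory version, assuming a 60-second sliding window kept in a deque (the text above mentions a cache instead), strict "greater than" comparisons, and a failure counter that resets on success; the CircuitBreaker class is hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class CircuitBreaker:
    max_requests_per_minute: int = 60
    max_requests_total: int = 1000
    max_failed_requests: int = 10
    total: int = 0
    consecutive_failures: int = 0
    suspended_reason: Optional[str] = None
    _window: Deque[float] = field(default_factory=deque)

    def record(self, now: float, failed: bool = False) -> Optional[str]:
        """Record one request at time `now` (seconds); return the suspension reason, if any."""
        if self.suspended_reason is not None:
            return self.suspended_reason  # already suspended, nothing more to count
        # Drop timestamps that have left the 60-second sliding window.
        while self._window and now - self._window[0] >= 60.0:
            self._window.popleft()
        self._window.append(now)
        self.total += 1
        # Assumption: a successful request resets the failure streak.
        self.consecutive_failures = self.consecutive_failures + 1 if failed else 0
        if (len(self._window) > self.max_requests_per_minute
                or self.total > self.max_requests_total):
            self.suspended_reason = "RATE_LIMIT"
        elif self.consecutive_failures > self.max_failed_requests:
            self.suspended_reason = "ANOMALY"
        return self.suspended_reason
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real deployment would read the clock and share the counters across workers.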

Dead Man's Switch

If heartbeat_required_every (seconds) is set, the agent must periodically call:

POST /ai/tokens/{id}/heartbeat/
Authorization: AgentBearer <raw_token>

If no heartbeat is received within the configured interval, the token is automatically suspended with HEARTBEAT_MISSING. This guarantees that if the orchestration container crashes or is compromised, the agent loses its access automatically.
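
The timeout rule above amounts to a single comparison. The sketch below assumes the server records the time of the last heartbeat; heartbeat_missing and its parameter names (other than heartbeat_required_every) are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def heartbeat_missing(
    last_heartbeat_at: datetime,
    heartbeat_required_every: Optional[int],
    now: datetime,
) -> bool:
    """Return True when the token should be suspended with HEARTBEAT_MISSING."""
    if heartbeat_required_every is None:
        return False  # Dead Man's Switch not configured for this token
    return now - last_heartbeat_at > timedelta(seconds=heartbeat_required_every)


last = datetime(2024, 1, 20, 15, 0, 0, tzinfo=timezone.utc)
print(heartbeat_missing(last, 300, last + timedelta(seconds=299)))  # False
print(heartbeat_missing(last, 300, last + timedelta(seconds=301)))  # True
```

On the agent side, sending heartbeats somewhat more often than the configured interval leaves headroom for network delays.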

Emergency Kill Switch

To immediately revoke all of a user's active tokens (a last resort, since revocation is permanent):

POST /ai/tokens/revoke-all/
Authorization: Bearer <user_jwt>

3. Human in the Loop (HITL)

Some actions are too sensitive for an AI agent to execute autonomously. HITL ensures a human must explicitly approve them before they execute.

How it works

Endpoints decorated with @require_agent_clearance(human_in_the_loop_required=True) behave differently when called by an agent:

The agent calls the endpoint normally.
Instead of executing, Tenxyte creates an AgentPendingAction and returns 202 Accepted (not 200):

{
  "status": "pending_confirmation",
  "message": "This action requires human approval.",
  "confirmation_token": "hitl_a1b2c3d4e5f6...",
  "expires_at": "2024-01-20T16:10:00Z"
}

The human is notified (email, webhook, etc.) with the confirmation_token.
The human confirms or denies via the API.
The agent retries the original call, passing the confirmed token in the X-Action-Confirmation header:

X-Action-Confirmation: hitl_a1b2c3d4e5f6...

Tenxyte validates the confirmation and lets the request through.
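
The round trip described in the steps above can be simulated in memory. This is a sketch under several assumptions: pending actions live in a dict rather than in AgentPendingAction rows, an unconfirmed or denied token yields 403, confirmations are single-use, and expiry is ignored. All function names are hypothetical.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class PendingAction:
    action: str
    state: str = "pending"  # pending -> confirmed | denied


PENDING: Dict[str, PendingAction] = {}


def call_endpoint(action: str, confirmation: Optional[str] = None) -> Tuple[int, dict]:
    """Simulate an agent calling a HITL-protected endpoint."""
    if confirmation is None:
        # First call: park the action and hand back a confirmation token.
        token = "hitl_" + secrets.token_hex(16)
        PENDING[token] = PendingAction(action)
        return 202, {"status": "pending_confirmation", "confirmation_token": token}
    pending = PENDING.get(confirmation)
    if pending is None or pending.action != action or pending.state != "confirmed":
        return 403, {"error": "confirmation unknown, denied, or not yet approved"}
    del PENDING[confirmation]  # assumption: a confirmation can be used once
    return 200, {"status": "executed"}


def confirm(token: str) -> None:
    """Simulate the human approving the pending action."""
    PENDING[token].state = "confirmed"


def deny(token: str) -> None:
    """Simulate the human denying the pending action."""
    PENDING[token].state = "denied"


status, body = call_endpoint("billing.refund")  # 202, action parked
confirm(body["confirmation_token"])             # human approves
print(call_endpoint("billing.refund", body["confirmation_token"]))  # (200, {'status': 'executed'})
```

Binding the confirmation to the original action keeps an approval for one operation from being replayed against another.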

Global HITL actions

Configure actions that always require human approval in settings.py:

TENXYTE_AIRS_CONFIRMATION_REQUIRED = [
    "users.delete",
    "billing.refund",
    "data.export_all",
]

Confirm/Deny endpoints

# Human approves
POST /ai/pending-actions/<confirmation_token>/confirm/
Authorization: Bearer <user_jwt>

# Human denies
POST /ai/pending-actions/<confirmation_token>/deny/
Authorization: Bearer <user_jwt>

The agent's pending actions can also be listed:

GET /ai/pending-actions/
Authorization: Bearer <user_jwt>

4. Guardrails: PII Redaction & Budget Tracking

PII Redaction

When TENXYTE_AIRS_REDACT_PII = True, a middleware intercepts all JSON responses to AgentBearer-authenticated requests and automatically anonymizes sensitive fields (emails, phone numbers, IBANs, etc.) by replacing them with ***REDACTED***.

This keeps personally identifiable information from your backend out of the LLM's context, where it could otherwise be logged, repeated, or memorized.
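
To make the idea concrete, here is a minimal sketch of recursive redaction over a decoded JSON payload. It assumes pattern-based detection with deliberately simple regular expressions; the patterns and the redact function are hypothetical, and Tenxyte's middleware may detect sensitive fields differently.

```python
import re
from typing import Any

REDACTED = "***REDACTED***"

# Simplistic illustrative patterns; real detectors need to be far more careful.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(value: Any) -> Any:
    """Walk dicts and lists, replacing PII-looking substrings in strings."""
    if isinstance(value, dict):
        return {key: redact(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        # IBANs are handled before phone numbers so their digits are not split up.
        for pattern in (EMAIL_RE, IBAN_RE, PHONE_RE):
            value = pattern.sub(REDACTED, value)
        return value
    return value  # numbers, booleans, and None pass through unchanged


payload = {"email": "alice@example.com", "iban": "DE89370400440532013000", "plan": "pro"}
print(redact(payload))
# {'email': '***REDACTED***', 'iban': '***REDACTED***', 'plan': 'pro'}
```

A pattern this loose for phone numbers also matches other digit runs such as dates, which is one reason production redaction usually combines patterns with field-name rules.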

Budget Tracking

Budget tracking allows you to cap the financial impact an agent can have via LLM API calls (OpenAI, Anthropic, Google, etc.).

Important: Tenxyte does not know LLM pricing. It is model-agnostic. Your code is responsible for converting token counts into a USD cost and reporting it. Tenxyte only accumulates the reported cost_usd and suspends the agent when the limit is reached.

Enable in settings

TENXYTE_AIRS_BUDGET_TRACKING_ENABLED = True

Create a token with a budget cap

from tenxyte.services.agent_service import AgentTokenService

service = AgentTokenService()
token = service.create(
    triggered_by=user,
    application=app,
    granted_permissions=[],
    budget_limit_usd=1.00,  # $1.00 maximum
)

Calculate the cost in your LLM wrapper

You are responsible for maintaining a pricing table and computing costs:

# Example pricing table (keep updated as providers change rates)
MODEL_PRICING = {
    "claude-sonnet-4-5":    {"input": 3.00,  "output": 15.00},  # per million tokens
    "gemini-1.5-pro":       {"input": 3.50,  "output": 10.50},
    "gpt-4o":               {"input": 5.00,  "output": 15.00},
}

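def calculate_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Convert token counts to USD using MODEL_PRICING (rates are per million tokens)."""
    if model not in MODEL_PRICING:
        # Fail loudly: defaulting to 0.0 would let unpriced models bypass the budget.
        raise KeyError(f"No pricing configured for model {model!r}")
    pricing = MODEL_PRICING[model]
    return (
        (prompt_tokens / 1_000_000) * pricing["input"]
        + (completion_tokens / 1_000_000) * pricing["output"]
    )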
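
As a worked check of the formula, the snippet below prices a single call of 1,250 prompt tokens and 450 completion tokens at the example claude-sonnet-4-5 rates from the table. The rates are the illustrative ones from this page, not current provider pricing.

```python
# Example rates copied from the table above (USD per million tokens).
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00

prompt_tokens = 1250
completion_tokens = 450

input_cost = prompt_tokens / 1_000_000 * INPUT_RATE        # about 0.00375
output_cost = completion_tokens / 1_000_000 * OUTPUT_RATE  # about 0.00675
total_cost = input_cost + output_cost                      # about 0.0105

print(f"{total_cost:.4f} USD")  # 0.0105 USD
```

At roughly one cent per call, a budget_limit_usd of 1.00 would allow on the order of a hundred such calls before suspension.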

Report usage after each LLM call

Via Python service:

# After calling your LLM (Anthropic, Google, OpenAI, etc.)
prompt_tokens     = response.usage.input_tokens    # from provider's response
completion_tokens = response.usage.output_tokens

cost = calculate_cost_usd("claude-sonnet-4-5", prompt_tokens, completion_tokens)

success = service.report_usage(token, cost_usd=cost)

if not success:
    # Budget exceeded: the token is now SUSPENDED and
    # subsequent requests made with it are rejected with 403
    raise RuntimeError("Agent budget exhausted")

Via REST API (from within the agent itself):

POST /ai/tokens/{id}/report-usage/
Authorization: AgentBearer <raw_token>
Content-Type: application/json

{
  "cost_usd": 0.042,
  "prompt_tokens": 1250,
  "completion_tokens": 450
}

Response when budget is exceeded (403):

{
  "error": "Budget exceeded",
  "status": "suspended"
}

What happens internally

report_usage(cost_usd=0.60, budget_limit=0.50)
    ↓
current_spend_usd += 0.60   →  0.60
current_spend_usd (0.60) >= budget_limit (0.50)
    ↓
token.status      = SUSPENDED
token.suspended_reason = BUDGET_EXCEEDED
    ↓
return False  (all future requests with this token → 403)
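
The same flow can be written as a small accumulator. This sketch assumes spend is tracked with Decimal to avoid floating-point drift and that a suspended token rejects further reports; TokenBudget is a hypothetical stand-in for the real token model.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class TokenBudget:
    budget_limit_usd: Decimal
    current_spend_usd: Decimal = Decimal("0")
    status: str = "ACTIVE"
    suspended_reason: Optional[str] = None

    def report_usage(self, cost_usd: Decimal) -> bool:
        """Add the reported cost; return False once the token is suspended."""
        if self.status != "ACTIVE":
            return False  # assumption: suspended tokens stop accumulating
        self.current_spend_usd += cost_usd
        if self.current_spend_usd >= self.budget_limit_usd:
            self.status = "SUSPENDED"
            self.suspended_reason = "BUDGET_EXCEEDED"
            return False
        return True


budget = TokenBudget(budget_limit_usd=Decimal("0.50"))
print(budget.report_usage(Decimal("0.60")))     # False
print(budget.status, budget.suspended_reason)   # SUSPENDED BUDGET_EXCEEDED
```

Note that the comparison is inclusive, matching the `>=` shown above: spending exactly the limit also suspends the token.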

5. Forensic Audit

Every agent request can carry an X-Prompt-Trace-ID header. This ID is:

Stored in AgentPendingAction.prompt_trace_id
Linked in the AuditLog

This allows precise traceability: "which user prompt triggered which backend action", enabling post-incident investigation and compliance reporting.

POST /ai/tokens/{id}/some-action/
Authorization: AgentBearer <raw_token>
X-Prompt-Trace-ID: trace_7f3a2b9c-...
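
On the agent side, attaching the header might look like the sketch below. The `trace_` prefix followed by a UUID mirrors the example value above but is an assumption about the format, and both helper functions are hypothetical.

```python
import uuid
from typing import Dict


def new_trace_id() -> str:
    """Generate one trace ID per user prompt (format is an assumption)."""
    return f"trace_{uuid.uuid4()}"


def agent_headers(raw_token: str, trace_id: str) -> Dict[str, str]:
    """Build the headers for an agent request tied to a prompt trace."""
    return {
        "Authorization": f"AgentBearer {raw_token}",
        "X-Prompt-Trace-ID": trace_id,
    }


# Reuse the same trace ID for every backend call triggered by one prompt.
trace_id = new_trace_id()
headers = agent_headers("<raw_token>", trace_id)
```

Generating the ID once per prompt, rather than once per request, is what lets an investigator group all backend actions caused by the same prompt.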

Configuration Reference

All settings are defined in settings.py. Defaults are managed via src/tenxyte/conf/airs.py.

Setting	Default	Description
`TENXYTE_AIRS_ENABLED`	`True`	Master switch for AIRS
`TENXYTE_AIRS_TOKEN_MAX_LIFETIME`	`86400`	Maximum token lifetime (seconds)
`TENXYTE_AIRS_DEFAULT_EXPIRY`	`3600`	Default token expiry if not specified (seconds)
`TENXYTE_AIRS_REQUIRE_EXPLICIT_PERMISSIONS`	`True`	Tokens must declare explicit permissions
`TENXYTE_AIRS_CIRCUIT_BREAKER_ENABLED`	`True`	Enable/disable circuit breaker
`TENXYTE_AIRS_DEFAULT_MAX_RPM`	`60`	Default max requests per minute
`TENXYTE_AIRS_DEFAULT_MAX_TOTAL`	`1000`	Default total request cap
`TENXYTE_AIRS_DEFAULT_MAX_FAILURES`	`10`	Default max failed requests before suspension
`TENXYTE_AIRS_CONFIRMATION_REQUIRED`	`[]`	List of permission codes always requiring HITL
`TENXYTE_AIRS_REDACT_PII`	`False`	Enable PII redaction for agent responses
`TENXYTE_AIRS_BUDGET_TRACKING_ENABLED`	`False`	Enable LLM budget tracking
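
For orientation, a settings.py fragment that tightens several defaults could look like this. The setting names come from the table above; the values are illustrative choices, not recommendations.

```python
# settings.py (fragment): illustrative values only
TENXYTE_AIRS_ENABLED = True

# Shorter-lived tokens than the defaults (seconds)
TENXYTE_AIRS_DEFAULT_EXPIRY = 900
TENXYTE_AIRS_TOKEN_MAX_LIFETIME = 3600

# Stricter circuit breaker defaults
TENXYTE_AIRS_CIRCUIT_BREAKER_ENABLED = True
TENXYTE_AIRS_DEFAULT_MAX_RPM = 30
TENXYTE_AIRS_DEFAULT_MAX_TOTAL = 500
TENXYTE_AIRS_DEFAULT_MAX_FAILURES = 5

# Actions that always require human approval
TENXYTE_AIRS_CONFIRMATION_REQUIRED = ["users.delete", "billing.refund"]

# Guardrails that are off by default
TENXYTE_AIRS_REDACT_PII = True
TENXYTE_AIRS_BUDGET_TRACKING_ENABLED = True
```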