Auth & Data Flows

This page traces every request path from the incoming HTTP request to the database and back: authentication, authorization, MCP tool calls, chat agent invocations, and app-to-platform communication. Use it to understand what the platform enforces, where your code sits in each flow, and which invariants you can rely on.

JWT Anatomy#

Every authenticated request carries an HS256-signed JWT. Two signing keys exist:

Key	Used for	Config var
Control plane key	Human user login tokens	`BOOTSTRAP_ADMIN_SECRET`
App token key	Service account OAuth tokens	`APP_TOKEN_SECRET` (falls back to `BOOTSTRAP_ADMIN_SECRET`)

Standard JWT payload:

json

{
  "sub":       "user-uuid or app_id",
  "tenant_id": "tenant-uuid",
  "roles":     ["user"],
  "groups":    ["group-uuid-1"],
  "scope":     "",
  "iat":       1716300000,
  "exp":       1716303600
}

Roles are one of: global_admin, tenant_admin, user, service, app_service_account.

Role hierarchy (what each role subsumes):

code

global_admin  →  everything
tenant_admin  →  tenant_admin, user
user          →  user
service       →  service          (lateral — no user access)
app_service_account → app_service_account  (lateral — no user access)
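
The hierarchy above can be read as a lookup table that a guard expands before comparing roles. The following is a minimal sketch under that reading; `ROLE_HIERARCHY`, `expand_roles`, and `has_roles` are hypothetical names chosen for illustration, not the platform's implementation, and `global_admin` is modeled as subsuming every role listed on this page.

```python
# Illustrative only: the names below are assumptions, not the platform's code.
ROLE_HIERARCHY = {
    "global_admin": {"global_admin", "tenant_admin", "user",
                     "service", "app_service_account"},
    "tenant_admin": {"tenant_admin", "user"},
    "user": {"user"},
    "service": {"service"},                          # lateral: no user access
    "app_service_account": {"app_service_account"},  # lateral: no user access
}


def expand_roles(roles):
    """Return the caller's roles plus every role they subsume."""
    expanded = set()
    for role in roles:
        expanded |= ROLE_HIERARCHY.get(role, set())  # unknown roles grant nothing
    return expanded


def has_roles(caller_roles, *required):
    """True when the expanded role set contains all required roles."""
    return set(required) <= expand_roles(caller_roles)
```

With this table, `has_roles(["tenant_admin"], "user")` holds, while `has_roles(["app_service_account"], "user")` does not.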

Flow 1 — Human User Login#

code

POST /api/v1/auth/login
Body: { "email": "...", "password": "..." }

code

Client
  │
  ▼
POST /api/v1/auth/login
  │
  ├─ auth/handlers/login.py → cp_login(email, password)
  │      │
  │      ▼
  │   POST {CONTROL_PLANE_URL}/auth/login          ← argon2 password verify
  │      │
  │      ▼
  │   returns { token, refresh_token, sub, tenant_id, roles, expires_at }
  │      │
  ├─ Decode JWT (no sig verify) → extract groups
  │   If groups absent → query core_group_members by sub
  │      │
  ├─ Cache token → auth_tokens (MongoDB, TTL = exp)
  ├─ Cache roles → auth_roles (MongoDB, keyed by sub + email)
  ├─ Stamp last_login_at → core_scim_users
  ├─ audit_emit("user.login")
  │
  ▼
Response: { token, refresh_token, sub, tenant_id, user_name, roles, expires_at }

Collections touched: auth_tokens, auth_roles, core_group_members, core_scim_users

Security controls:

Password hashing (argon2) is owned by the control plane. The backend forwards the submitted password to it and never hashes or stores the password itself
The token is cached in auth_tokens at login; subsequent validation is an in-process JWT decode with no network call

Flow 2 — App Service Account OAuth#

code

POST /api/v1/auth/token
Content-Type: application/x-www-form-urlencoded
Body: grant_type=client_credentials&client_id=<appId>&client_secret=<secret>

code

Client
  │
  ▼
POST /api/v1/auth/token
  │
  ├─ Validate grant_type == "client_credentials" → 400 otherwise
  │
  ├─ verify_client_credentials(client_id, client_secret)
  │      │
  │      ▼
  │   Query app_service_accounts WHERE app_id = client_id
  │   bcrypt.checkpw(client_secret, stored_hash) → 401 on mismatch
  │      │
  │      ▼
  │   returns { app_id, tenant_id, ... }
  │
  ├─ Build JWT payload:
  │   { sub: client_id, tenant_id, roles: ["app_service_account"],
  │     groups: [], scope, iat, exp: now+3600 }
  │
  ├─ Sign with APP_TOKEN_SECRET (fallback: BOOTSTRAP_ADMIN_SECRET)
  ├─ Cache in auth_tokens
  ├─ audit_emit("auth.token_issued")
  │
  ▼
Response: { access_token, token_type: "Bearer", expires_in: 3600, scope }

Token lifetime: 1 hour, no refresh token issued. Re-exchange client credentials to get a new token.
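
Because no refresh token is issued, a client has to track expiry itself and repeat the exchange. The sketch below builds the form-encoded request described above and keeps a small expiry-aware cache. `build_token_request`, `TokenCache`, and the 60-second safety margin are assumptions made for illustration; the request is constructed but not sent.

```python
import time
import urllib.parse
import urllib.request


def build_token_request(base_url, client_id, client_secret):
    """Build (but do not send) the client_credentials exchange request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/auth/token",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


class TokenCache:
    """Holds one access token and reports when it needs re-exchange."""

    def __init__(self, skew_seconds=60):
        self.skew_seconds = skew_seconds  # assumed safety margin before expiry
        self.access_token = None
        self.expires_at = 0.0

    def store(self, token_response, now=None):
        """Record the token and compute its absolute expiry from expires_in."""
        now = time.time() if now is None else now
        self.access_token = token_response["access_token"]
        self.expires_at = now + token_response["expires_in"]

    def needs_refresh(self, now=None):
        """True when there is no token or it is within the safety margin."""
        now = time.time() if now is None else now
        return self.access_token is None or now >= self.expires_at - self.skew_seconds
```

A caller would check `needs_refresh()` before each request and, when it returns true, send the built request with `urllib.request.urlopen` and pass the parsed JSON response to `store`.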

Secret storage: rotate_credentials generates a secret with secrets.token_urlsafe(32), bcrypt-hashes it, and stores only the hash. The plaintext is returned once.

Production requirement: set APP_TOKEN_SECRET to a dedicated random secret (openssl rand -hex 32). Without it, the backend falls back to BOOTSTRAP_ADMIN_SECRET, so rotating that secret silently invalidates every outstanding service account token.

Flow 3 — Token Validation (every authenticated request)#

Every protected endpoint depends on get_current_user:

code

Incoming request: Authorization: Bearer <token>
  │
  ├─ HTTPBearer extracts token → 401 if absent
  │
  ├─ has_sdk()? → 401 "Sidecar not available" if false
  │
  ├─ SidecarClient.authorize(token)
  │      │
  │      ▼
  │   pyjwt.decode(token, BOOTSTRAP_ADMIN_SECRET or APP_TOKEN_SECRET, HS256)
  │   → 401 on signature failure or expiry
  │      │
  │      ▼
  │   Claims { sub, tenant_id, roles, groups, exp }
  │
  ▼
Handler receives Claims object

No database lookup on the hot path. Validation is pure in-process JWT decode.
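
To make the "pure in-process decode" concrete, here is a standard-library sketch of HS256 verification that tries each candidate key in turn, as the diagram suggests for the two signing keys. It is not the sidecar's code: `sign_hs256` and `verify_hs256` are hypothetical helpers, and raising `PermissionError` stands in for the 401 response.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment):
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(payload, key):
    """Produce a compact HS256 JWT (used here only to exercise the verifier)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url_encode(json.dumps(payload).encode())
    sig = hmac.new(key.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url_encode(sig)}"


def verify_hs256(token, candidate_keys, now=None):
    """Return the claims if any candidate key verifies the token, else raise."""
    now = time.time() if now is None else now
    try:
        header_b64, body_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise PermissionError("malformed token") from exc
    # Pin the algorithm instead of trusting whatever the header requests.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise PermissionError("unexpected algorithm")
    signing_input = f"{header_b64}.{body_b64}".encode()
    for key in candidate_keys:
        expected = hmac.new(key.encode(), signing_input, hashlib.sha256).digest()
        if hmac.compare_digest(expected, signature):
            claims = json.loads(_b64url_decode(body_b64))
            if claims.get("exp", 0) <= now:
                raise PermissionError("token expired")
            return claims
    raise PermissionError("bad signature")
```

Nothing in this path touches the network or the database, which is the property the flow relies on.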

Three additional guards (applied per-endpoint):

Guard	What it checks
`require_roles("tenant_admin")`	Caller's roles (with hierarchy expansion) must include all specified roles
`require_groups("group-id")`	Caller's JWT `groups` claim must include the specified group
`require_policy(action, resource)`	gRPC sidecar evaluates Rego policy `data.cyberpod.rbac.allow`

Fail mode: CORESDK_FAIL_MODE controls what happens when the gRPC sidecar is unreachable during require_policy evaluation. Default is "open" (allow). Set to "closed" in production if policy enforcement must be strict.
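
The difference between the two fail modes only shows up when no decision can be obtained. A minimal sketch of that branch, assuming a hypothetical `SidecarUnavailable` exception and a caller-supplied `evaluate` function in place of the real gRPC client:

```python
import os


class SidecarUnavailable(Exception):
    """Raised by the (hypothetical) policy client when the sidecar is unreachable."""


def check_policy(evaluate, action, resource, fail_mode=None):
    """Return True when the call is allowed under the configured fail mode."""
    mode = (fail_mode or os.environ.get("CORESDK_FAIL_MODE", "open")).lower()
    try:
        # A reachable sidecar always decides, regardless of fail mode.
        return bool(evaluate(action, resource))
    except SidecarUnavailable:
        # "open" allows the request when no decision is available;
        # any other value denies it.
        return mode == "open"
```

Note that an explicit deny from a reachable sidecar is still a deny in both modes; the setting only covers the unreachable case.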

Flow 4 — App Registration#

code

POST /api/v1/apps/register
Authorization: Bearer <any-valid-token>
Body: { "name": "my-app", "description": "..." }

code

Client (any authenticated user or service account)
  │
  ▼
POST /api/v1/apps/register
  │
  ├─ get_current_user → Claims
  │
  ├─ create_app({
  │     name, description,
  │     tenant_id: claims.tenant_id,   ← always from token, never from body
  │     registered_by: claims.sub
  │   })
  │      │
  │      ▼
  │   Generate app_id (UUID), status: "active"
  │   Insert into apps collection (MongoDB)
  │
  ├─ audit_emit("app.registered")
  │
  ▼
Response: {
  appId, name, description,
  tenantId, registeredBy, status, createdAt
}

Tenant isolation: tenant_id is always taken from the Bearer token — the request body cannot specify it.
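
The rule is easy to state in code: identity fields are copied from the claims and any same-named fields in the body are ignored. This sketch assumes a hypothetical `build_app_document` helper and a `created_at` field name; only `tenant_id`, `registered_by`, `app_id`, and `status` come from the flow above.

```python
import uuid
from datetime import datetime, timezone


def build_app_document(body, claims):
    """Build the stored app record; identity fields come only from the token."""
    return {
        "app_id": str(uuid.uuid4()),
        "name": body["name"],
        "description": body.get("description", ""),
        "tenant_id": claims["tenant_id"],   # never read from the request body
        "registered_by": claims["sub"],
        "status": "active",
        "created_at": datetime.now(timezone.utc).isoformat(),  # assumed field name
    }
```

Even if a caller sends `"tenant_id": "other-tenant"` in the body, the stored record carries the tenant from the token.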

Flow 5 — Credentials Rotation#

code

POST /api/v1/apps/{appId}/credentials/rotate
Authorization: Bearer <token where sub == appId OR role is tenant_admin/global_admin>

code

  ├─ _assert_app_access(app_id, claims)
  │   → 403 unless claims.sub == app_id OR caller is admin
  │
  ├─ Look up app in apps collection → resolve tenant_id
  │
  ├─ Generate plaintext secret: secrets.token_urlsafe(32)
  ├─ bcrypt.hashpw(secret, bcrypt.gensalt())
  ├─ Upsert into app_service_accounts:
  │   { app_id, tenant_id, client_secret_hash, updated_at }
  │
  ├─ audit_emit("app.credentials_rotate")
  │
  ▼
Response: { clientId: appId, clientSecret: "<plaintext — shown once only>" }

Store the returned clientSecret immediately in a secrets manager. It is never retrievable again.
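
The "hash only, plaintext once" pattern can be sketched with the standard library. To keep the example runnable without third-party packages it swaps bcrypt for PBKDF2-HMAC-SHA256 from `hashlib`; that substitution, the iteration count, and the `rotate_secret` / `verify_secret` names are all assumptions for illustration and do not describe the platform's storage format.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor for the PBKDF2 stand-in


def rotate_secret():
    """Return (plaintext, stored_record); only stored_record is persisted."""
    plaintext = secrets.token_urlsafe(32)
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", plaintext.encode(), salt, ITERATIONS)
    return plaintext, {"salt": salt, "hash": digest}


def verify_secret(candidate, record):
    """Recompute the hash and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode(), record["salt"], ITERATIONS
    )
    return hmac.compare_digest(digest, record["hash"])
```

The plaintext exists only in the return value of `rotate_secret`; once the response has been sent, the server can verify a secret but cannot recover it.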

Flow 6 — MCP Tool Registration#

code

POST /api/v1/apps/{appId}/mcp/tools
Authorization: Bearer <token where sub == appId OR admin>
Body: { "tools": [{ name, description, endpoint, method, inputSchema, tags, maskResponse, secretHeaders }] }

code

  ├─ _assert_app_access(app_id, claims)
  │
  ├─ Validate each tool: name, endpoint, method required
  │
  ├─ For each tool, upsert into mcpresttools:
  │   {
  │     tenantId: claims.tenant_id,
  │     appId: app_id,
  │     name, description, endpoint, method,
  │     inputSchema, tags,
  │     mask_response,     ← stored snake_case, returned as maskResponse
  │     secretHeaders,
  │     updatedAt
  │   }
  │   Unique index: (tenantId, appId, name)
  │
  ▼
Response: [{ id, name, description, endpoint, method, inputSchema, ... }]

Upsert semantics: re-registering the same tool name updates it in place. Safe to call on every startup.
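
An in-memory sketch of the same semantics, using a dict keyed by the unique index fields in place of the MongoDB collection. `upsert_tool` is a hypothetical helper; the required-field check mirrors the validation step in the flow.

```python
from datetime import datetime, timezone


def upsert_tool(store, tenant_id, app_id, tool):
    """Insert or replace a tool keyed by (tenantId, appId, name).

    Returns True when the tool was newly created, False when it was updated.
    """
    for field in ("name", "endpoint", "method"):
        if not tool.get(field):
            raise ValueError(f"tool is missing required field: {field}")
    key = (tenant_id, app_id, tool["name"])
    created = key not in store
    store[key] = {
        **tool,
        "tenantId": tenant_id,
        "appId": app_id,
        "updatedAt": datetime.now(timezone.utc).isoformat(),
    }
    return created
```

Calling it twice with the same name leaves a single record holding the latest definition, which is why registering on every startup is safe.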

Flow 7 — MCP Proxy Call (external agent)#

This is the path for external callers: AI agents, scripts, other apps.

code

POST /api/v1/mcp/call
Authorization: Bearer <token>
Body: { "tool": "get_person", "arguments": { "id": "per-001" }, "appId": "app_01..." }

code

External caller (Claude, LangChain, curl, another app)
  │
  ▼
POST /api/v1/mcp/call
  │
  ├─ get_current_user → Claims                  [1] JWT validate
  │
  ├─ assert_rate_limit("mcp_call:{sub}", 120/60s) [2] rate limit per caller sub
  │
  ├─ Extract raw Bearer token from Authorization header
  │   (this token is forwarded to the upstream endpoint)
  │
  ├─ find_tool(tenant_id, tool_name, app_id)     [3] tenant-scoped lookup
  │   → 404 if not found
  │
  ├─ Path param substitution:                    [4] resolve {id} → "per-001"
  │   endpoint: /api/v1/people/persons/{id}
  │   → path:   /api/v1/people/persons/per-001
  │   remaining args → query params or body
  │
  ├─ Assemble URL: SELF_BASE_URL + path           [5] URL assembly
  │
  ├─ is_safe_url(url)                             [6] SSRF guard (DNS resolve)
  │   → 403 if resolves to RFC-1918 / loopback
  │
  ├─ resolve_secrets_for_claims(claims,           [7] secret header injection
  │     tool.secretHeaders.values())
  │   → gRPC sidecar → plaintext values
  │   → injected as HTTP headers on upstream request
  │
  ├─ httpx.AsyncClient.request(method, url,       [8] upstream HTTP call
  │     headers={
  │       Authorization: Bearer <forwarded token>,
  │       Content-Type: application/json,
  │       ...secret headers
  │     })
  │   → 502 on upstream 4xx/5xx
  │
  ├─ _maybe_mask(response_text, tool)             [9] PII masking
  │   if tool.mask_response → sidecar gRPC mask_string_rpc
  │
  ├─ audit_emit("mcp.tool_call", outcome)         [10] audit trail
  │
  ▼
Response: { "content": [{ "type": "text", "text": "<json>" }], "tool": "get_person" }

The forwarded token: the caller's own JWT is passed as Authorization: Bearer to the upstream endpoint. The upstream app validates it against the same sidecar — so the tool call runs with the caller's identity and tenant scope.

Secret headers: secretHeaders: { "X-Api-Key": "my-bundle" } — the bundle name is resolved by the sidecar at call time. The plaintext value is injected as an HTTP header into the upstream call but never logged or returned to the caller.
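
Step [4] of the flow, path parameter substitution, can be sketched as follows. `substitute_path_params` is a hypothetical helper; percent-encoding the substituted value is an added assumption, included so that a value cannot introduce extra path segments.

```python
import re
from urllib.parse import quote

_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def substitute_path_params(endpoint, arguments):
    """Fill {name} placeholders from arguments; return (path, leftover args)."""
    remaining = dict(arguments)  # do not mutate the caller's dict

    def fill(match):
        name = match.group(1)
        if name not in remaining:
            raise KeyError(f"missing path parameter: {name}")
        # Percent-encode so a value cannot add extra path segments.
        return quote(str(remaining.pop(name)), safe="")

    return _PLACEHOLDER.sub(fill, endpoint), remaining
```

For the example request above, the endpoint `/api/v1/people/persons/{id}` with arguments `{"id": "per-001"}` resolves to `/api/v1/people/persons/per-001`, and any leftover arguments would go to the query string or body.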

Flow 8 — Chat Agent → MCP Tool (internal path)#

This is the path when a user's chat session triggers a tool call. It does not go through POST /mcp/call.

code

User sends chat message
  │
  ▼
POST /api/v1/chat/completions
  │
  ├─ get_current_user → Claims
  ├─ Rate limit: 60 chat turns/min per user
  ├─ get_or_create chat_session (scoped to tenant_id)
  ├─ PipelineContext built from claims (tenant_id, user_id, request_id)
  │
  ├─ OrchestratorFactory.select(chat_type) → orchestrator
  │
  ├─ orchestrator.run_streaming / run_non_streaming
  │      │
  │      ▼
  │   ChatAgent.ReAct loop
  │      │
  │      ├─ resolve_tools(agent_config, tenant_id, user_id)
  │      │    │
  │      │    ├─ BUILTIN_TOOLS (memory_recall, generate_artifact, ...)
  │      │    └─ find_callable_servers(tenant_id, user_id)
  │      │         scope=platform → visible to all tenants
  │      │         scope=tenant   → this tenant only
  │      │         scope=user     → this user only
  │      │
  │      ├─ LLM issues tool call → dispatch to ToolSpec.fn
  │      │
  │      └─ McpInvoker.call(server_doc, tool_name, args, ctx)
  │              │
  │              ├─ _prepare_server_doc: resolve ${BUNDLE} in headers → sidecar
  │              │
  │              ├─ _merge_metadata: inject tenant_id, user_id, project_ids
  │              │   into args — OVERWRITES any LLM-supplied identity values
  │              │   (security boundary: LLM cannot override caller identity)
  │              │
  │              ├─ Transport selection:
  │              │   mTLS → fresh transport + X-Tenant-ID, X-User-UUID headers
  │              │   secrets → fresh transport
  │              │   normal → cached client by (serverUrl, transport)
  │              │
  │              ├─ client.call_tool(tool_name, merged_args)  ← FastMCP protocol
  │              │
  │              ├─ audit_emit("mcp.tool_call", source: "chat")
  │              │
  │              └─ _result_to_envelope → { content, isError, metadata }
  │
  ▼
SSE stream: tool result → next LLM turn → ... → final message

Key difference from /mcp/call: the chat path speaks the FastMCP protocol directly instead of HTTP REST. Identity is injected into the tool arguments by _merge_metadata rather than forwarded as a Bearer token. Because the injected values overwrite anything the LLM supplied, the LLM cannot override tenant_id, user_id, or other identity fields.
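
The overwrite step is what makes this a security boundary: platform-supplied identity is applied last, so it always wins. A minimal sketch, assuming a hypothetical `merge_metadata` function and a plain dict for the pipeline context:

```python
def merge_metadata(llm_args, ctx):
    """Return tool arguments with identity fields forced from the request context."""
    merged = dict(llm_args)  # start from whatever the LLM produced
    # Applied last, so these values replace any LLM-supplied ones.
    merged.update({
        "tenant_id": ctx["tenant_id"],
        "user_id": ctx["user_id"],
        "project_ids": list(ctx.get("project_ids", [])),
    })
    return merged
```

If the model emits `{"id": "per-001", "tenant_id": "someone-else"}`, the merged arguments still carry the caller's own `tenant_id`.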

Flow 9 — HITL (Human-in-the-Loop) Approval#

When a chat agent selects a tool marked requires_approval=True:

code

Agent selects tool requiring approval
  │
  ├─ hitl_manager.create_pending({
  │     approvalId (UUID), chatId, turnId, tenantId, userId,
  │     toolName, toolArgs, expiresAt (now + HITL_TIMEOUT_SECONDS)
  │   }) → written to hitl_pending collection
  │
  ├─ HITLRequiredEvent → SSE queue → client UI shows approval dialog
  │
  ├─ Agent enters polling loop (every 2s, max 2× timeout)
  │
  │                              ┌──────────────────────────────┐
  │                              │  Human reviews in UI         │
  │                              │  POST /api/v1/chat/hitl/     │
  │                              │    {approval_id}/resolve     │
  │                              │  Body: { status: "approved"  │
  │                              │          or "rejected" }     │
  │                              │                              │
  │                              │  Auth: get_current_user      │
  │                              │  Scoped to claims.tenant_id  │
  │                              │  Updates hitl_pending doc    │
  │                              └──────────────────────────────┘
  │
  ├─ Polling loop detects resolved status
  │
  ├─ "approved" → continue ReAct loop, call tool
  │   "rejected" → return rejection message to LLM
  │   expired    → TTL index removes doc, polling returns timeout error
  │
  ▼
Agent continues

Tenant isolation: every hitl_pending query includes tenantId filter. A user from tenant A cannot resolve approvals belonging to tenant B.
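
The agent-side wait can be sketched as a bounded polling loop. `wait_for_resolution`, the `"pending"` status value, and the injected `lookup`, `sleep`, and `clock` callables are assumptions made so the example runs without a database; the 2-second interval and the "2x timeout" ceiling come from the flow above.

```python
import time


def wait_for_resolution(lookup, approval_id, tenant_id, timeout_seconds,
                        interval=2.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the approval is resolved, has expired, or the deadline passes."""
    deadline = clock() + 2 * timeout_seconds  # hard ceiling on the wait
    while clock() < deadline:
        doc = lookup(approval_id, tenant_id)  # lookup is always tenant-scoped
        if doc is None:
            return "timeout"  # the TTL index already removed the document
        if doc["status"] in ("approved", "rejected"):
            return doc["status"]
        sleep(interval)
    return "timeout"
```

Passing `tenant_id` into every lookup keeps the agent's own polling under the same tenant filter as the resolve endpoint.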

Flow 10 — App → cPod EDM (service account calling the platform)#

A scaffolded app calling client.people.persons.list() from the TypeScript client:

code

Scaffolded app
  │
  ├─ CpodClient.fromEnv()
  │   reads CPOD_API_KEY (Bearer token — your service account JWT)
  │   reads CPOD_API_URL (defaults to https://api.cyberpod.app)
  │
  ├─ GET {CPOD_API_URL}/api/v1/people/persons
  │   Authorization: Bearer <service account JWT>
  │
  ▼
cpod-backend /api/v1/people/persons
  │
  ├─ get_current_user → decode JWT → Claims
  │   sub = your appId
  │   tenant_id = your tenant (from token, not from request)
  │   roles = ["app_service_account"]
  │
  ├─ Query scoped to claims.tenant_id
  │   → returns only your tenant's people
  │
  ▼
Response: { items: [...], total: N }

Tenant isolation is automatic. The JWT carries tenant_id; every EDM query filters by it. Your app cannot access another tenant's data even if it guesses their IDs.

Service account limits: the app_service_account role does not inherit user rights. Endpoints that require user or tenant_admin via require_roles will return 403.

Authorization Matrix#

Endpoint	Minimum role required
`POST /api/v1/auth/login`	None (public)
`POST /api/v1/auth/token`	None (public, credentials in body)
`POST /api/v1/auth/register`	None (public)
`POST /api/v1/auth/invite`	`tenant_admin`
`POST /api/v1/apps/register`	Any valid token
`POST /api/v1/apps/{id}/credentials/rotate`	App owner (`sub==appId`) or admin
`POST /api/v1/apps/{id}/mcp/tools`	App owner or admin
`GET /api/v1/apps/{id}/mcp/tools`	App owner or admin
`DELETE /api/v1/apps/{id}/mcp/tools/{name}`	App owner or admin
`GET /api/v1/apps/{id}/mcp/proxy`	App owner or admin
`POST /api/v1/mcp/call`	Any valid token
`POST /api/v1/chat/completions`	Any valid token
`POST /api/v1/chat/hitl/{id}/resolve`	Any valid token (scoped to tenant)
`GET /api/v1/people/persons`, etc. (EDM)	Any valid token (scoped to tenant)
Admin routes (`/admin/*`)	`global_admin`

Tenant Isolation — Where It's Enforced#

Tenant isolation is enforced at the repository layer, not in middleware. Every collection query that could return cross-tenant data filters on the tenant identifier (tenantId or tenant_id, depending on the collection):

Collection	Isolation field	Where enforced
`mcpresttools`	`tenantId`	`repos_rest.py` — all queries
`mcptools`	`tenantId`	`repos.py` — `find_callable_servers`
`chat_sessions`	`tenantId`	`session_manager`
`hitl_pending`	`tenantId`	`hitl_manager`
`apps`	`tenant_id`	`admin/repos.py`
`app_service_accounts`	`tenant_id`	`credentials.py`

What is NOT tenant-scoped: scope=platform MCP servers (tools registered as platform-wide, e.g., ARAI). These are intentionally visible to all tenants.

Security Controls Summary#

Control	Mechanism	Where
Password hashing	argon2 (control plane)	Login flow
Secret hashing	bcrypt	`credentials.py`
JWT signing	HS256	`client_credentials.py`
JWT verification	`pyjwt.decode` in-process	`sdk.py`
Rate limiting	gRPC sidecar (120 req/min MCP, 60 turns/min chat)	`ratelimit.py`
SSRF protection	DNS resolve + RFC-1918 blocklist	`url_guard.py`
PII masking	gRPC sidecar `mask_string_rpc`	`rest_tools.py`, `mcp_invoker.py`
Secret injection	gRPC sidecar `ResolveSecret`	`secrets_inject.py`
Identity injection	`_merge_metadata` overwrites LLM args	`mcp_invoker.py`
Tenant scoping	`tenantId` / `tenant_id` filter on tenant-scoped Mongo queries	Repository layer
Policy evaluation	OPA/Rego via gRPC sidecar	`require_policy` guard
Audit trail	`audit_emit` on all mutating ops	`shared/audit.py`

Known Operational Notes#

CORESDK_FAIL_MODE=open (default): if the gRPC sidecar is unavailable, require_policy calls fail-open (allow). Change to closed for strict production enforcement.

APP_TOKEN_SECRET: must be set independently of BOOTSTRAP_ADMIN_SECRET in production. Generate it with openssl rand -hex 32. Without it, rotating BOOTSTRAP_ADMIN_SECRET invalidates all service account tokens.

DNS rebinding: the SSRF guard resolves DNS at check time; httpx resolves again at connect time. There is a TOCTOU window. Mitigate by running the backend in a network namespace with no internal DNS reachable from SELF_BASE_URL.
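
One additional way to narrow this window, offered here as a sketch and not as a description of url_guard.py, is to resolve the host once, vet every returned address, and hand the vetted addresses to the connection layer so it does not resolve again. `resolve_public_addresses` is a hypothetical helper, and using `ipaddress`'s `is_global` as the block rule is an assumption that is stricter than an RFC-1918 plus loopback list.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolve_public_addresses(url):
    """Resolve the URL's host once and reject any non-public address."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("unsupported URL")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        # is_global is False for private, loopback, and link-local ranges.
        if not ip.is_global:
            raise PermissionError(f"blocked address: {ip}")
        addresses.append(str(ip))
    return addresses
```

The caller would then connect to one of the returned addresses directly while still sending the original hostname for TLS and the Host header; how to do that depends on the HTTP client and is not shown here.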

SELF_BASE_URL: if misconfigured to an internal host, tool endpoints assembled from it bypass the SSRF guard (the guard runs on the assembled URL). Always point this at the public-facing hostname or loopback-only ingress.

No rate limit on /auth/token: the client_credentials endpoint has no per-IP or per-client_id throttle. bcrypt is deliberately slow (roughly 100 ms per check, depending on the cost factor), but that is not a substitute for rate limiting against a distributed brute-force attempt.
