Design Decisions

This page records the key decisions made while designing the cPod platform and SDK. Each decision is documented with its rationale and trade-offs, in ADR (Architecture Decision Record) style.


DD-001: SDK Never Holds Database Credentials

Status: Accepted

Decision: All storage operations go through the cpod-backend REST API. The SDK holds only CPOD_CLIENT_ID and CPOD_CLIENT_SECRET. It never holds MongoDB connection strings, Redis passwords, or MinIO access keys.

Context: Early designs explored giving apps direct database access with per-app database users. This was rejected.

Rationale:

  • Credential rotation — platform can rotate database credentials without any app redeployment or SDK update
  • Audit trail — every storage operation flows through cpod-backend, where AuditService.EmitAuditEvent fires; impossible to bypass with a direct connection
  • Policy enforcement — Rego rules apply uniformly even to storage operations; direct DB access would bypass the sidecar entirely
  • No credential sprawl — one credential pair per app (client ID + secret); no per-developer database passwords, no credentials in .env files, no risk of developers committing DB URIs to git

Trade-off: Every storage operation takes an extra network hop compared with direct database access. The measured added latency is ~2–5 ms on the same network, which is acceptable for enterprise data platform use cases.
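
As a minimal sketch of what "one credential pair per app" means in practice, the loader below reads only CPOD_CLIENT_ID and CPOD_CLIENT_SECRET from an environment map and ignores everything else. The names CpodCredentials and loadCredentials are hypothetical and are not part of the documented SDK surface.

```typescript
// Hypothetical type: the only secrets an app ever holds.
interface CpodCredentials {
  clientId: string;
  clientSecret: string;
}

// Read the credential pair from an environment-like map.
// Anything else in the map (for example a database URI) is never copied.
function loadCredentials(
  env: Record<string, string | undefined>,
): CpodCredentials {
  const clientId = env["CPOD_CLIENT_ID"];
  const clientSecret = env["CPOD_CLIENT_SECRET"];
  if (!clientId || !clientSecret) {
    throw new Error("CPOD_CLIENT_ID and CPOD_CLIENT_SECRET must both be set");
  }
  return { clientId, clientSecret };
}
```

In a Node.js app this would typically be called as loadCredentials(process.env).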


DD-002: Tenant Isolation Enforced Server-Side

Status: Accepted

Decision: tenantId is extracted from the validated JWT by cpod-backend and stamped onto every database query. It is never read from the request body, query string, or any client-supplied header.

Context: Multi-tenant platforms can implement tenant isolation at the client (client passes its tenant ID), at middleware (server reads and validates a client-supplied ID), or at the auth layer (server derives tenant ID from the token). cPod uses the auth layer approach.

Rationale:

  • Clients cannot forge their tenant identity. Even if a client sends ?tenantId=tnt-other-org, the middleware ignores it — tenantId comes only from the JWT claim, which was set by the Control Plane at token issuance
  • Single enforcement point. All tenant isolation logic lives in one place (JWT extraction in cpod-backend middleware), not spread across every handler
  • Simpler SDK API. SDK consumers never pass tenantId anywhere — it is always automatic

Trade-off: Requires a valid JWT on every request — no anonymous or unauthenticated data access is possible. This is intentional: cPod is an authenticated enterprise platform.
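
The auth-layer approach can be illustrated with a short sketch. It assumes the JWT has already been validated and that its claims carry a tenantId field. The types and the function name buildTenantScopedFilter are hypothetical and do not reflect the actual cpod-backend middleware.

```typescript
// Hypothetical shape of claims taken from an already-validated JWT.
interface ValidatedClaims {
  sub: string;
  tenantId: string; // set by the Control Plane at token issuance
}

// Hypothetical minimal request shape.
interface IncomingRequest {
  query: Record<string, string>;
}

// Build the database filter for a request. Client-supplied values are
// spread first and tenantId is written last, so a forged tenantId in the
// query string is always overwritten by the value from the token.
function buildTenantScopedFilter(
  claims: ValidatedClaims,
  req: IncomingRequest,
): Record<string, unknown> {
  return { ...req.query, tenantId: claims.tenantId };
}
```

With this shape, a request carrying ?tenantId=tnt-other-org still produces a filter scoped to the tenant named in the token.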


DD-003: Storage Tiers Instead of Per-Collection Permissions

Status: Accepted

Decision: Storage is divided into three tiers (private, user, shared) rather than offering fine-grained ACLs on a per-collection or per-document basis.

Context: Alternative designs included: per-document ACL lists (like Firestore security rules), per-collection permissions, and role-based access to named collections.

Rationale:

  • Simple mental model. Three tiers cover the overwhelming majority of real use cases: “only my app”, “the current user”, “everyone in the org”
  • Direct mapping to OAuth scopes. Each tier corresponds to exactly two scopes (storage.<tier>.read / .write), making it easy for developers to reason about what a given token can access
  • 95% coverage. Analysis of internal app patterns showed that three tiers cover about 95% of the use cases encountered in practice. Fine-grained ACLs would have been over-engineered for the actual access patterns
  • Audit simplicity. Three tiers produce deterministic, human-readable audit log entries — easier to review than fine-grained ACL evaluations

Trade-off: Less flexibility than per-resource ACLs. Teams needing per-document permissions must implement their own access control layer on top of storage.db, using private or shared tier plus application-level checks.
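
The tier-to-scope mapping described in the rationale can be sketched as follows. The scope format storage.<tier>.read / .write comes from this page. The helper names requiredScope and tokenAllows are hypothetical.

```typescript
type StorageTier = "private" | "user" | "shared";
type StorageOp = "read" | "write";

// Each tier maps to exactly two scopes: storage.<tier>.read and
// storage.<tier>.write.
function requiredScope(tier: StorageTier, op: StorageOp): string {
  return `storage.${tier}.${op}`;
}

// True when the token's scope list contains the scope for this operation.
function tokenAllows(
  tokenScopes: readonly string[],
  tier: StorageTier,
  op: StorageOp,
): boolean {
  const needed = requiredScope(tier, op);
  return tokenScopes.some((scope) => scope === needed);
}
```

A token holding only storage.shared.read can read the shared tier but cannot write to it, and cannot touch the private or user tiers at all.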


DD-004: “Organization” as the SDK Term for Tenant

Status: Accepted

Decision: The SDK exposes sdk.organizations.getCurrent() and uses “Organization” in all developer-facing docs. The internal platform term “Tenant” is used in JWT claims, database keys, ID prefixes (tnt-), and internal APIs — but is not surfaced directly in the SDK.

Context: The platform uses “Tenant” as its isolation concept (standard SaaS/PaaS terminology). Enterprise developer audiences more commonly think in terms of “Organization” (as used by GitHub, Slack, Okta, etc.).

Rationale:

  • Familiarity. “Organization” is the term enterprise developers already use from GitHub, Okta, Slack, and similar tools. Reducing cognitive load increases adoption
  • Abstraction stability. The SDK/developer-facing term is stable even if internal platform terminology changes
  • Docs clarity. “Your organization’s data” reads more naturally than “your tenant’s data” in developer guides

Trade-off: The SDK and platform internals use different names for the same entity. Developers reading audit logs or debug output will see tnt- prefixed IDs and need to understand these map to their “Organization”. This is documented prominently in the Multi-Tenancy page.


DD-005: Audit Events Auto-Instrumented by the Backend

Status: Accepted

Decision: cpod-backend automatically emits audit events for all EDM mutations (create, update, delete), all storage writes, all auth decisions, and all platform calls. SDK consumers do not write audit instrumentation code for these built-in operations.

Context: Alternative designs required SDK consumers to call sdk.audit.emit(...) explicitly at each business operation.

Rationale:

  • Zero missed events. Auto-instrumentation via middleware means every operation is audited regardless of developer intent or oversight
  • Compliance by default. Enterprise customers (regulated industries, SOC 2 requirements) get a complete audit trail without any additional implementation effort
  • Consistent event shape. All auto-emitted events use the same schema (actor, action, resource, outcome, tenantId, timestamp), making audit log analysis uniform

Trade-off: Less control over auto-generated event shape and metadata. Developers cannot add custom fields to built-in audit events — they can only emit additional custom events via sdk.audit.emit(...). Custom events appear alongside auto-events in the audit log.
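
For orientation, the shared event schema named above (actor, action, resource, outcome, tenantId, timestamp) might be modeled as shown below. The field types, the outcome values, and the buildAuditEvent helper are assumptions made for illustration. The page lists only the field names.

```typescript
// Assumed outcome values; the page does not enumerate them.
type AuditOutcome = "success" | "failure";

// Field names follow the schema listed above; the types are assumptions.
interface AuditEvent {
  actor: string;
  action: string;
  resource: string;
  outcome: AuditOutcome;
  tenantId: string;
  timestamp: string; // ISO 8601, assumed
}

// Hypothetical helper: stamps the timestamp so callers cannot omit it.
function buildAuditEvent(
  fields: Omit<AuditEvent, "timestamp">,
  now: Date = new Date(),
): AuditEvent {
  return { ...fields, timestamp: now.toISOString() };
}
```

Because every event carries the same six fields, a log reader can filter on any of them without knowing which service or operation produced the entry.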


DD-006: Base64 Encoding for File Uploads

Status: Accepted

Decision: File content is sent as a base64-encoded string inside a JSON body rather than as multipart/form-data.

Context: REST APIs commonly use multipart/form-data for file uploads. cPod uses base64 JSON instead.

Rationale:

  • Simpler SDK client implementation. JSON parsing is a solved problem in every language; multipart encoding has edge cases (boundary collision, field ordering, streaming) that vary across HTTP client libraries
  • Works with all HTTP clients. Base64 JSON works identically with fetch, axios, requests, net/http, HttpClient, and any other HTTP client — no special multipart handling needed
  • Consistent content-type. All SDK calls use application/json — no mixed content-type handling in middleware or serialization layers
  • Easier testing. Curl, Postman, and httpie can all send base64 JSON without multipart complexity

Trade-off: Approximately 33% payload overhead for binary files (base64 encodes every 3 bytes as 4 characters). For files larger than 100 MB, the SDK provides a multipart upload API (createMultipartUpload) that uses chunked transfer, so the base64 overhead applies only to the single-call upload path.
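
The overhead figure follows directly from the encoding: every group of up to 3 bytes becomes 4 characters. The sketch below computes the encoded size and builds a JSON upload body. The body field names (fileName, contentBase64) are hypothetical and are not the documented request schema. btoa is used because it is available as a global in browsers and in current Node.js versions.

```typescript
// Encoded size, in characters, of byteCount raw bytes in padded base64:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Encode raw bytes as a base64 string.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Hypothetical request body; the field names are illustrative only.
function buildUploadBody(fileName: string, bytes: Uint8Array): string {
  return JSON.stringify({ fileName, contentBase64: toBase64(bytes) });
}
```

For a 3,000,000-byte file, base64Length returns 4,000,000 characters, which is the one-third increase quoted above.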


DD-007: Emulator Accepts Any Bearer Token

Status: Accepted

Decision: The local emulator (cpod-emulator) runs in fully permissive mode: it accepts any Authorization: Bearer <anything> header without signature verification, scope enforcement, or tenant isolation.

Context: Local development environments need fast iteration. Requiring a full OAuth setup (Control Plane running, app registration, token issuance) creates friction for new developers.

Rationale:

  • Zero-friction start. A developer can run cpod-emulator start and make SDK calls immediately with any placeholder credentials. No auth server setup, no app registration, no token exchange
  • Focus on business logic. Local development should test data shapes, API behavior, and application logic — not auth infrastructure
  • Consistent API surface. The emulator exposes the same REST API as production; only auth enforcement is disabled. SDK calls look the same in dev and prod

Trade-off: The emulator is not a security test environment. It does not validate that your scopes are correct, that your tenant isolation works, or that your Rego policy allows your operations. Always test auth behavior against a staging environment with a real CoreSDK sidecar before production deployment.

🚫 Never run the emulator in production or staging with real data. The permissive auth mode bypasses all tenant isolation, so any caller can read and write any data.
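
To make "fully permissive" concrete, the check below is a rough sketch of what accepting any bearer token amounts to. It is not the emulator's actual code. The function name is hypothetical, and rejecting a missing or non-Bearer header is an assumption, since this page does not describe that case.

```typescript
// Permissive check in the spirit of the emulator's auth mode: any
// non-empty bearer token is accepted. No signature verification, scope
// enforcement, or tenant isolation is performed.
// Assumption: a missing or non-Bearer header is rejected.
function emulatorAcceptsAuthHeader(header: string | undefined): boolean {
  if (header === undefined) {
    return false;
  }
  // "Bearer", whitespace, then at least one non-whitespace character.
  return /^Bearer\s+\S/.test(header);
}
```

Note what the function never inspects: the token's contents. That is the reason scopes, tenant isolation, and Rego policy have to be tested against staging instead.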


DD-008: TypeScript as the Canonical SDK

Status: Accepted

Decision: SDK types are defined in TypeScript first. Python, Go, and .NET types derive from the same EDM JSON Schema definitions, with TypeScript as the implementation reference.

Context: The platform needed consistent types across four language SDKs. Options included: separate implementations in each language, a shared IDL (Protobuf/Thrift/OpenAPI), or one language as canonical.

Rationale:

  • EDM is JSON Schema. All entities are defined in JSON Schema Draft 2020-12 — TypeScript types are generated directly from these schemas with minimal transformation
  • Fastest iteration. TypeScript type changes catch API drift at compile time; the other language SDKs are generated from the same schemas, so schema changes propagate automatically via scripts/sync-version.ts
  • Largest enterprise SDK audience. TypeScript/Node.js is the most common runtime for enterprise tooling and integrations in the target market

Trade-off: TypeScript SDK features sometimes land slightly before the other language SDKs, creating brief feature gaps. The SDK coverage table in the overview documents current parity.