Security Checklist

Everything that must be verified before going to production. Each item maps to a known failure mode from the security audit of the platform.


Network Boundaries

🚫 These are not optional. A misconfigured network boundary is a complete auth bypass.

  • Sidecar gRPC port (:50051) binds to loopback only. Verify: ss -tlnp | grep 50051 shows a loopback address such as [::1]:50051 (or 127.0.0.1:50051), not a wildcard such as 0.0.0.0:50051 or [::]:50051.
  • Control Plane (:8080) is firewalled — not reachable from the internet. Verify at infra level (security group, VPC ACL, or iptables rule).
  • Sidecar health port (:9091) is not publicly exposed. Scrape from internal Prometheus only.
  • cpod-backend is behind a TLS-terminating reverse proxy (nginx, Caddy, AWS ALB). No plain HTTP from the internet.
  • HSTS header set on the reverse proxy: Strict-Transport-Security: max-age=63072000; includeSubDomains.
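
The loopback check on the sidecar port can be automated in a deploy script. The following is a minimal sketch, assuming the listen address arrives in the host:port form that ss prints; is_loopback_bind is a hypothetical helper name, not part of CoreSDK.

```python
import ipaddress


def is_loopback_bind(listen_addr: str) -> bool:
    """Return True only if a 'host:port' listen address is a loopback IP.

    Hypothetical helper. Accepts forms such as '[::1]:50051' or
    '127.0.0.1:50051'. Wildcards ('0.0.0.0', '[::]', '*') and anything
    that does not parse as an IP address are treated as NOT loopback.
    """
    host, _, _port = listen_addr.rpartition(":")
    host = host.strip("[]")
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Unparseable host: fail closed rather than assume it is safe.
        return False
```

Treating anything unparseable as unsafe keeps the check conservative: a hostname such as localhost is flagged for manual review instead of being trusted.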

JWT Validation

  • CORESDK_JWKS_URI is set in all non-dev environments. Without it the sidecar fails open — every token is accepted. Value should be https://your-idp/.well-known/jwks.json or the Control Plane’s /api/v1/jwks.
  • CORESDK_ENV is NOT development in staging or production. Dev mode disables all JWT signature verification.
  • aud claim is validated. The JWT aud must contain your API’s identifier (set CORESDK_EXPECTED_AUDIENCE to that identifier). Without audience validation, a token issued for one service can be replayed against any other service.
  • iss claim is validated. Set CORESDK_EXPECTED_ISSUER to your Control Plane’s issuer URL.
  • Algorithm allowlist enforced. CoreSDK rejects alg: none and HS256 (for externally-issued tokens). Only RS256, ES256, PS256 are accepted.
  • Token expiry is short. Default is 1 h — do not increase this. Short TTL limits the blast radius of a leaked token.
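
The algorithm, issuer, audience, and expiry items above can be pictured as a set of checks on an already-decoded token. This is a sketch under stated assumptions: check_token_metadata is a hypothetical function, it is not CoreSDK's implementation, and it deliberately does not verify the signature, which must still be checked against the keys from CORESDK_JWKS_URI.

```python
import time

# Asymmetric algorithms named in the checklist; 'none' and HS256 are excluded.
ALLOWED_ALGS = frozenset({"RS256", "ES256", "PS256"})


def check_token_metadata(header, claims, expected_iss, expected_aud, now=None):
    """Return a list of rejection reasons; an empty list means all checks passed.

    Hypothetical helper. Signature verification is NOT performed here.
    """
    now = time.time() if now is None else now
    problems = []

    if header.get("alg") not in ALLOWED_ALGS:
        problems.append("alg not allowed")

    if claims.get("iss") != expected_iss:
        problems.append("unexpected iss")

    # 'aud' may be a single string or a list of strings.
    aud = claims.get("aud")
    if isinstance(aud, str):
        audiences = [aud]
    elif isinstance(aud, list):
        audiences = aud
    else:
        audiences = []
    if expected_aud not in audiences:
        problems.append("aud mismatch")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        problems.append("expired or missing exp")

    return problems
```

Returning every failed check, instead of stopping at the first, makes rejected tokens easier to diagnose in logs.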

OAuth App Configuration

  • Every service has its own client_id. Never share credentials between services — leaked credentials must be rotatable independently.
  • Declared scopes follow least privilege. Each app declares only the scopes it actually needs. Tokens cannot exceed declared scopes.
  • client_secret is stored in a secrets manager (Vault, AWS Secrets Manager, Doppler, GCP Secret Manager). Never in source code, .env files committed to git, or CI environment variable logs.
  • client_secret is rotated on a schedule (90 days maximum) or immediately on any suspected leak. Use POST /v1/oauth/apps/:id/rotate-secret.
  • redirect_uri is exact-match registered for all PKCE/auth_code apps. Wildcard or prefix matching is not allowed — it enables open redirect attacks.
  • state parameter is validated on the PKCE callback — compare against the value stored in the user’s session before exchanging the code. Missing state check = CSRF vulnerability.
  • Auth codes are single-use and short-lived (60 s). Reject any attempt to replay an already-exchanged code.
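
The redirect_uri and state items above reduce to two small checks. A minimal sketch, assuming the registered URIs are available as a set and the state value was stored in the user's session when the flow started; the URI and all function names here are hypothetical.

```python
import hmac
import secrets

# Hypothetical registration data for one app.
REGISTERED_REDIRECT_URIS = {"https://app.example.com/oauth/callback"}


def redirect_uri_allowed(uri: str) -> bool:
    """Exact string match only: no wildcard, prefix, or substring matching."""
    return uri in REGISTERED_REDIRECT_URIS


def new_state() -> str:
    """Generate an unguessable state value to store in the user's session."""
    return secrets.token_urlsafe(32)


def state_matches(session_state, returned_state) -> bool:
    """Compare the callback's state to the session's copy in constant time."""
    if not session_state or not returned_state:
        # A missing value on either side is a failed check, not a pass.
        return False
    return hmac.compare_digest(session_state.encode(), returned_state.encode())
```

The code exchange should only proceed when state_matches returns True; a missing state is handled the same way as a wrong one.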

Refresh Token Security

  • Refresh tokens are stored in HttpOnly + Secure + SameSite=Lax cookies for browser-facing apps. Never in localStorage or sessionStorage.
  • Refresh token rotation is enabled. Every /v1/oauth/token?grant_type=refresh_token call invalidates the old token and issues a new one.
  • Token replay detection is monitored. A second use of the same refresh token triggers family revocation. Monitor token-events for refresh_replay — it indicates credential theft.
  • Logout revokes all tokens. Call POST /v1/oauth/revoke for both access and refresh tokens on user logout.
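
Rotation with replay detection can be illustrated with a small in-memory model. This is a sketch only: RefreshTokenStore is a hypothetical class, it does not describe how the /v1/oauth/token endpoint is implemented, and a real store would be persistent and would also emit a refresh_replay event.

```python
import secrets


class RefreshTokenStore:
    """In-memory illustration of rotation and family revocation (hypothetical)."""

    def __init__(self):
        self._tokens = {}              # token -> {"family": str, "used": bool}
        self._revoked_families = set()

    def issue(self, family_id=None):
        """Issue a token, starting a new family unless one is given."""
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {"family": family_id, "used": False}
        return token

    def rotate(self, token):
        """Exchange a refresh token for a new one; return None if rejected."""
        record = self._tokens.get(token)
        if record is None or record["family"] in self._revoked_families:
            return None
        if record["used"]:
            # Second use of the same token: treat it as a replay and
            # revoke every token descended from the same login.
            self._revoked_families.add(record["family"])
            return None
        record["used"] = True
        return self.issue(record["family"])
```

After a replay, even the newest token in the family is rejected, so both the attacker and the legitimate client must re-authenticate.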

cpod-backend Middleware

  • Entitlement middleware fails closed. The current implementation in app/middleware/entitlement.py fails open (quota bypass possible). Fix before production.
  • Every route is covered by auth middleware. There must be no unauthenticated routes except public endpoints that are intentionally open (e.g., health checks, login page).
  • cpod-arai AUTH_PROVIDER is not trust. The default, AUTH_PROVIDER=trust, disables all verification. Set AUTH_PROVIDER=coresdk in all environments.
  • X-Tenant-ID header is always set on internal calls to cpod-arai. When the header is omitted, the tenant falls back to "default", which bypasses tenant isolation.
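
Failing closed means that any error or missing data in the entitlement lookup results in a denial. The sketch below shows the pattern only; it is not the code in app/middleware/entitlement.py, and is_entitled and lookup_quota are hypothetical names.

```python
def is_entitled(tenant_id, lookup_quota):
    """Fail-closed entitlement check (hypothetical sketch).

    lookup_quota is any callable that returns the tenant's remaining quota
    as an integer. Exceptions, None, and non-integer results all deny.
    """
    try:
        remaining = lookup_quota(tenant_id)
    except Exception:
        # Lookup failed (timeout, database error, ...): deny, do not allow.
        return False
    return isinstance(remaining, int) and remaining > 0
```

A fail-open version would return True in the except branch, which is what makes a quota bypass possible when the backing store is unavailable.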

Secret Management

  • model_configs.api_key is encrypted at rest. It is currently stored in plaintext in MongoDB. Wrap it with Vault transit encryption or AES-256-GCM field-level encryption.
  • WebhookEndpoint.secret is encrypted at rest. It is currently stored in plaintext. Apply the same encryption as above.
  • mcptools.headers base64 encoding is replaced with real encryption. Base64 is not encryption — these headers may contain API keys.
  • No hardcoded JWT secrets in source code. Search: grep -r "JWT_SECRET\|dev-secret\|change-in-production" . — zero results expected in production config.
  • Vault is configured and accessible. SecretsService.ResolveSecret depends on Vault KV v2. Verify Vault is reachable and the sidecar has the correct AppRole credentials.
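
To make the base64 point concrete: decoding requires no key, so anyone who can read the stored field can recover the original header. The header value below is made up for illustration.

```python
import base64

# Made-up header value standing in for an entry in mcptools.headers.
header_value = "Authorization: Bearer sk-example-not-a-real-key"

# What "base64 encoding" stores in the database.
stored = base64.b64encode(header_value.encode()).decode()

# Recovering the secret needs no key and no secret material.
recovered = base64.b64decode(stored).decode()
assert recovered == header_value
```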

Audit & Compliance

  • Audit log is tamper-evident. CoreSDK uses hash-chained audit records (previous_hash + record_hash). Periodically verify the chain: GET /api/v1/audit?verify_chain=true.
  • Audit log has an off-box sink. The hash chain living in the same database is insufficient for compliance. Pipe token-events and audit records to an append-only store (S3 Object Lock, CloudTrail, Splunk).
  • Retention policy meets compliance requirements. HIPAA requires 6 years, PCI-DSS requires 1 year, SOC 2 typically requires 1 year. Default TTL in some apps is 90–365 days — verify against your framework.
  • PII masking is active on all sensitive fields. Verify MaskingService is called before any user data is stored in logs or returned in API responses.
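
Hash chaining can be sketched as follows. The field names previous_hash and record_hash come from the checklist; the hashing scheme (SHA-256 over the previous hash plus canonical JSON), the genesis value, and the function names are assumptions for illustration and may differ from what CoreSDK actually does.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed previous_hash for the first record


def compute_record_hash(previous_hash, payload):
    """Hash the previous record's hash together with this record's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + body).encode()).hexdigest()


def append_record(chain, payload):
    """Append a record that links to the current tail of the chain."""
    previous_hash = chain[-1]["record_hash"] if chain else GENESIS_HASH
    chain.append({
        "payload": payload,
        "previous_hash": previous_hash,
        "record_hash": compute_record_hash(previous_hash, payload),
    })


def verify_chain(chain):
    """Return True if every record links to its predecessor and hashes correctly."""
    expected_previous = GENESIS_HASH
    for record in chain:
        if record["previous_hash"] != expected_previous:
            return False
        recomputed = compute_record_hash(record["previous_hash"], record["payload"])
        if record["record_hash"] != recomputed:
            return False
        expected_previous = record["record_hash"]
    return True
```

Editing any earlier record changes its hash and breaks every later link. An attacker with write access to the whole database can still recompute the entire chain, which is why the off-box sink is required.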

Apps Bypassing the Sidecar

🚫 12 of 30 platform apps currently bypass cpod-backend and access the database directly. Rego policies cannot be enforced on these apps. This is the highest-priority security gap.

  • Identify all direct-DB callers. See data-model/SECURITY-MODEL.md Section F for the full list.
  • Route every service through cpod-backend. No app should have database credentials — only service tokens (client_credentials).
  • Track migration progress under Phase 1 of MIGRATION-PLAN.md.

Production Environment Variables

Minimum required for a production deployment:

# cpod-backend
CPOD_API_URL=https://api.yourdomain.com       # public URL
CORESDK_SIDECAR_ADDR=[::1]:50051              # loopback — must not be 0.0.0.0
CORESDK_CP_URL=http://coresdk-cp:8080         # internal — firewall this
CORESDK_ENV=production                         # never "development"
CORESDK_JWKS_URI=https://api.yourdomain.com/api/v1/jwks
CORESDK_EXPECTED_ISSUER=https://api.yourdomain.com
CORESDK_EXPECTED_AUDIENCE=api.yourdomain.com
 
# CoreSDK Control Plane
DATABASE_URL=postgresql://user:pass@postgres:5432/coresdk   # not sqlite in prod
CORESDK_SIGNING_KEY_PATH=/secrets/signing-key.pem           # RS256 private key
CPOD_ADMIN_TOKEN=<long-lived-admin-token-from-secrets-manager>
 
# Optional — enable mTLS between cpod-backend and sidecar
CORESDK_TLS_CERT_FILE=/secrets/client.crt
CORESDK_TLS_KEY_FILE=/secrets/client.key
CORESDK_TLS_CA_FILE=/secrets/ca.crt
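
A startup check can refuse to boot when these variables are unsafe. This is a minimal sketch: check_production_env is a hypothetical function, and treating an unset CORESDK_ENV as development is an assumption made here to stay conservative.

```python
import ipaddress

REQUIRED_VARS = (
    "CORESDK_JWKS_URI",
    "CORESDK_EXPECTED_ISSUER",
    "CORESDK_EXPECTED_AUDIENCE",
)


def check_production_env(env):
    """Return a list of problems found in a mapping of environment variables."""
    problems = []

    # Assumption: an unset CORESDK_ENV is treated like development.
    if env.get("CORESDK_ENV", "development") == "development":
        problems.append("CORESDK_ENV is unset or 'development'")

    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    # The sidecar address must be a loopback IP, never a wildcard.
    host = env.get("CORESDK_SIDECAR_ADDR", "").rpartition(":")[0].strip("[]")
    try:
        if not ipaddress.ip_address(host).is_loopback:
            problems.append("CORESDK_SIDECAR_ADDR is not loopback")
    except ValueError:
        problems.append("CORESDK_SIDECAR_ADDR is missing or not an IP address")

    return problems
```

Calling check_production_env(os.environ) at process start and exiting when the list is non-empty turns the checklist items into a hard gate.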

Quick Verification Commands

# 1. Sidecar is loopback only
ss -tlnp | grep 50051
# Expected: [::1]:50051    NOT 0.0.0.0:50051 or [::]:50051
 
# 2. Control plane not reachable from outside (run from a host outside your network)
curl --max-time 3 http://<public-ip>:8080/healthz
# Expected: connection refused or timeout
 
# 3. Dev mode is off (print only the HTTP status code)
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer fake-token" https://api.yourdomain.com/v1/skills
# Expected: 401    NOT 200
 
# 4. Audit chain integrity (URL quoted so the shell does not interpret "?")
curl -H "Authorization: Bearer $CPOD_ADMIN_TOKEN" \
  "https://api.yourdomain.com/api/v1/audit?verify_chain=true"
# Expected: {"chain_valid": true}
 
# 5. No hardcoded secrets
grep -r "dev-secret-change-in-production\|JWT_SECRET.*=.*dev" . \
  --include="*.py" --include="*.ts" --include="*.js" --include="*.env"
# Expected: no results