Automate REST API Docs from Swagger

Goal

Auto-generate docs/docs/reference/rest-api.md from the live OpenAPI spec at https://api.deeplake.ai/docs/doc.json so the docs never drift from the actual API.


Plan

1. Generator Script

File: docs/scripts/generate_rest_api.py

  • Fetch doc.json from https://api.deeplake.ai/docs/doc.json
  • Parse the OpenAPI 3.x spec
  • Group endpoints by tag (Organizations, Workspaces, Tables, etc.)
  • For each endpoint generate:
      • Header (### Summary)
      • Method + path (`GET /organizations/{id}`)
      • Parameters table (path, query, header params with type/required/description)
      • Request body fields (from $ref schema resolution)
      • Response codes with example bodies (from schema or examples)
      • curl example (auto-generated from method + path + params)
  • Write frontmatter (seo_title, description)
  • Write the endpoint summary table at the bottom
  • Output to docs/docs/reference/rest-api.md

Dependencies: requests (to fetch the spec) and pyyaml (to read the optional YAML config). Both are common packages, so no unusual dependencies are needed.
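
The fetch and group-by-tag steps above could be sketched roughly as follows. This is a minimal illustration, not the final script: it uses urllib from the standard library in place of requests so that it runs without installs, and the names fetch_spec, group_by_tag, render_endpoint_header and the "Untagged" fallback are assumptions made for the example.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

SPEC_URL = "https://api.deeplake.ai/docs/doc.json"
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def fetch_spec(url: str = SPEC_URL) -> dict:
    """Download and decode the OpenAPI document."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def group_by_tag(spec: dict) -> dict:
    """Return {tag: [endpoint, ...]} in the order the paths appear in the spec."""
    groups = defaultdict(list)
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            # Use the first tag; "Untagged" is a fallback assumed for this sketch
            tag = (op.get("tags") or ["Untagged"])[0]
            groups[tag].append({
                "tag": tag,
                "method": method.upper(),
                "path": path,
                "summary": op.get("summary", ""),
            })
    return dict(groups)


def render_endpoint_header(ep: dict) -> str:
    """Render the '### Summary' heading followed by the method + path line."""
    return f"### {ep['summary']}\n\n`{ep['method']} {ep['path']}`\n"
```

Calling group_by_tag(fetch_spec()) would then give one list of endpoints per section, ready for the per-endpoint rendering steps.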

2. Endpoint Filtering

Not all endpoints belong in user-facing docs. The script should support a config section (top of file or separate YAML) to:

  • Include/exclude by tag: e.g. skip Billing Webhooks
  • Include/exclude by path prefix: e.g. skip /billing/webhooks/stripe
  • Custom ordering: define the section order (Health → Auth → Organizations → ...)
  • Override descriptions: allow hand-written notes per endpoint (e.g. the ?confirm= safety note on DELETE org)

Example config (inline dict or docs/scripts/rest_api_config.yaml):

exclude_paths:
  - /billing/webhooks/stripe

section_order:
  - Health
  - Authentication
  - Users & Tokens
  - Organizations
  - Workspaces
  - Tables
  - SQL
  - Credentials
  - Repositories
  - Billing
  - DB Credentials

overrides:
  "DELETE /organizations/{id}":
    note: "The `confirm` query parameter must match the organization name exactly."
  "PATCH /organizations/{id}/workspaces/type":
    note: "Valid types: `agentic_loops`, `physical_ai`, `generic`, `generative_media`."
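
One way the generator might apply this config is sketched below. It assumes each endpoint is a dict with tag, method and path keys, and that the config has already been loaded into a plain dict. The exclude_tags key is hypothetical: it does not appear in the example config above and is shown only to illustrate tag-based filtering.

```python
def select_endpoints(endpoints, config):
    """Drop excluded endpoints, attach override notes, sort by section order."""
    # str.startswith accepts a tuple of prefixes (an empty tuple matches nothing)
    exclude_paths = tuple(config.get("exclude_paths", []))
    exclude_tags = set(config.get("exclude_tags", []))  # hypothetical config key
    order = {tag: i for i, tag in enumerate(config.get("section_order", []))}
    overrides = config.get("overrides", {})

    kept = []
    for ep in endpoints:
        if ep["tag"] in exclude_tags:
            continue
        if ep["path"].startswith(exclude_paths):
            continue
        # Override keys have the form "METHOD /path", as in the config above
        key = f"{ep['method']} {ep['path']}"
        note = overrides.get(key, {}).get("note")
        kept.append({**ep, "note": note})

    # Tags missing from section_order sort after the listed ones;
    # list.sort is stable, so spec order is kept within each section.
    kept.sort(key=lambda ep: order.get(ep["tag"], len(order)))
    return kept
```

Placing unknown tags last rather than dropping them is a design choice for this sketch: a newly added API section would still show up in the output and in the diff.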

3. Curl Example Generation

Auto-generate curl snippets using conventions:

$API_URL          → base URL variable
$DEEPLAKE_API_KEY → auth token
$DEEPLAKE_ORG_ID  → org ID header

  • GET: curl -s "$API_URL/path" -H "Authorization: Bearer $DEEPLAKE_API_KEY" -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID"
  • POST/PUT/PATCH: add -X METHOD, -H "Content-Type: application/json", -d '{...}' with placeholder body from schema
  • DELETE: add -X DELETE
  • Skip auth headers for /health, /ready
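
The conventions above might translate into a helper along these lines. It is a sketch under assumptions: the function name curl_example is made up for illustration, path placeholders such as {id} are left in the output for the reader to substitute, and the body is assumed to contain no single quotes.

```python
import json
from typing import Optional

NO_AUTH_PATHS = {"/health", "/ready"}


def curl_example(method: str, path: str, body: Optional[dict] = None) -> str:
    """Build a curl snippet using the $API_URL / $DEEPLAKE_* conventions."""
    method = method.upper()
    parts = ["curl -s"]
    if method != "GET":
        parts.append(f"-X {method}")
    parts.append(f'"$API_URL{path}"')
    if path not in NO_AUTH_PATHS:
        parts.append('-H "Authorization: Bearer $DEEPLAKE_API_KEY"')
        parts.append('-H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID"')
    if method in {"POST", "PUT", "PATCH"}:
        parts.append('-H "Content-Type: application/json"')
        # Placeholder body; in the generator this would come from the schema
        parts.append(f"-d '{json.dumps(body or {})}'")
    # One argument per line, joined with shell line continuations
    return " \\\n  ".join(parts)


print(curl_example("DELETE", "/organizations/{id}"))
```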

4. Schema Resolution

The spec uses $ref pointers (e.g. #/components/schemas/CreateOrganizationRequest). The script must:

  • Resolve all $ref recursively
  • Extract field names, types, required flags
  • Generate parameter/body tables from resolved schemas
  • Handle oneOf, allOf, anyOf if present
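
A minimal sketch of recursive $ref resolution is shown below, assuming all references are local to the document (they start with "#/"). The function names are illustrative. Cycles are left unexpanded, and only allOf is merged when flattening fields; oneOf and anyOf would need a separate rendering decision that this sketch does not make.

```python
def resolve(node, spec, _seen=()):
    """Return a copy of `node` with every local $ref replaced by its target."""
    if isinstance(node, list):
        return [resolve(item, spec, _seen) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str) and ref.startswith("#/"):
        if ref in _seen:
            return {"$ref": ref}  # recursive schema: stop instead of looping
        target = spec
        for key in ref[2:].split("/"):
            # Undo JSON Pointer escaping (~1 is "/", ~0 is "~")
            target = target[key.replace("~1", "/").replace("~0", "~")]
        return resolve(target, spec, _seen + (ref,))
    return {k: resolve(v, spec, _seen) for k, v in node.items()}


def schema_fields(schema):
    """Flatten a resolved object schema into (name, type, required) rows."""
    props = dict(schema.get("properties", {}))
    required = set(schema.get("required", []))
    for part in schema.get("allOf", []):  # allOf: the object must satisfy every part
        for name, typ, is_required in schema_fields(part):
            props.setdefault(name, {"type": typ})
            if is_required:
                required.add(name)
    return [(name, p.get("type", "object"), name in required)
            for name, p in props.items()]
```

The rows returned by schema_fields map directly onto the columns of a parameter or request body table.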

5. Integration with mkdocs

Two options (pick one):

A) Manual run (recommended to start):

python docs/scripts/generate_rest_api.py
# then commit the updated rest-api.md

B) mkdocs hook (later): Add to mkdocs.yml:

hooks:
  - docs/scripts/generate_rest_api_hook.py

The hook runs the generator from its on_pre_build event handler.
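
For option B, the hook file could be as small as the sketch below. It assumes mkdocs is run from the repository root, so that the relative script path from option A resolves, and it runs the generator as a subprocess rather than importing it.

```python
# docs/scripts/generate_rest_api_hook.py (sketch)
import subprocess
import sys

# Assumed to be relative to the directory mkdocs is run from
GENERATOR = "docs/scripts/generate_rest_api.py"


def on_pre_build(config, **kwargs):
    """MkDocs event handler: runs once before each build starts."""
    # check=True fails the build if the generator exits non-zero
    subprocess.run([sys.executable, GENERATOR], check=True)
```

With `mkdocs serve`, rewriting a file inside the docs directory on every build may trigger repeated rebuilds, so the generator should probably write only when the content has actually changed.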

6. Preserving Hand-Written Content

Some sections have hand-written context that shouldn't be overwritten:

  • Authentication intro (env var setup, token instructions)
  • Notes/admonitions (e.g. !!! note blocks)
  • test-context blocks for doc testing

Strategy: use markers in the generated file:

<!-- AUTO-GENERATED BELOW - do not edit manually -->
...
<!-- END AUTO-GENERATED -->

Hand-written content lives outside the markers. The script only replaces content between them.
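
The marker strategy might be implemented roughly like this. The function name is illustrative, and the sketch assumes each marker appears exactly once in the file.

```python
BEGIN = "<!-- AUTO-GENERATED BELOW - do not edit manually -->"
END = "<!-- END AUTO-GENERATED -->"


def replace_generated(document: str, generated: str) -> str:
    """Swap the text between the markers, leaving everything else untouched."""
    start = document.find(BEGIN)
    end = document.find(END, start + len(BEGIN)) if start != -1 else -1
    if start == -1 or end == -1:
        raise ValueError("auto-generated markers not found or out of order")
    head = document[: start + len(BEGIN)]  # hand-written intro plus BEGIN marker
    tail = document[end:]                  # END marker plus hand-written footer
    return f"{head}\n{generated.strip()}\n{tail}"
```

Raising an error when the markers are missing, instead of appending the generated text, keeps a damaged or renamed file from being silently overwritten.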

7. Diffing / Safety

  • Before overwriting, print a diff summary (endpoints added/removed/changed)
  • --dry-run flag: show what would change without writing
  • --check flag: exit non-zero if the file is out of date (for CI)
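
These safety features could be wired up along the following lines. The function names are assumptions for illustration, and old/new in endpoint_changes are assumed to map "METHOD /path" keys to rendered text.

```python
import argparse
import difflib
from pathlib import Path


def endpoint_changes(old: dict, new: dict) -> dict:
    """Summarize which endpoints were added, removed, or changed."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


def write_if_changed(target: Path, new_text: str,
                     dry_run: bool = False, check: bool = False) -> int:
    """Print a diff; write unless --dry-run/--check. Returns the exit code."""
    old_text = target.read_text(encoding="utf-8") if target.exists() else ""
    if old_text == new_text:
        return 0
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile=f"{target} (current)", tofile=f"{target} (generated)",
        lineterm="",
    )
    print("\n".join(diff))
    if check:
        return 1  # out of date: fail CI without touching the file
    if not dry_run:
        target.write_text(new_text, encoding="utf-8")
    return 0


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Generate rest-api.md")
    parser.add_argument("--dry-run", action="store_true",
                        help="show what would change without writing")
    parser.add_argument("--check", action="store_true",
                        help="exit non-zero if the file is out of date")
    return parser.parse_args(argv)
```

The script's entry point would then pass the return value of write_if_changed to sys.exit so that CI can act on it.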

Discovered Endpoints Not Currently Documented

| Category | Count | Endpoints |
|---|---|---|
| Auth (device flow) | 2 | POST /auth/device/code, POST /auth/device/token |
| API Tokens | 3 | GET/POST/DELETE /users/me/tokens |
| Org - update | 1 | PUT /organizations/{id} |
| Org - leave | 1 | POST /organizations/{id}/leave |
| Org - permissions | 1 | PUT /organizations/{id}/permissions |
| Org - invite (correct path) | 1 | POST /organizations/{id}/members/invite (not /members) |
| Workspace types | 1 | GET /workspace-types |
| Credentials | 5 | GET/POST/DELETE /api/v1/orgs/{org_id}/credentials, PUT .../creds, POST .../generate |
| Repositories | 5 | GET/POST/DELETE /api/v1/orgs/{org_id}/repositories, GET/POST .../default |
| Billing | 11 | account, estimate, portal, regions, tier, top-up, transactions, usage (compute/storage/transfer), webhook |
| DB Credentials | 2 | GET /db-credentials, POST /db-credentials/rotate |
| Dataset Creds | 1 | GET /api/org/{org_id}/ds/{ds_name}/creds |

Total in swagger: ~50 endpoints. Currently documented: ~22.


Implementation Steps

  1. Write docs/scripts/generate_rest_api.py with fetch + parse + render
  2. Write docs/scripts/rest_api_config.yaml with section order, exclusions, overrides
  3. Run it, diff against current rest-api.md, review
  4. Add --check mode to CI (optional)
  5. Update DOCUMENTATION_TEST_CHECKLIST.md to note auto-generation