
Working with Videos

Deeplake handles videos as a native column type. You can ingest video files, access individual frames without decompressing the full video, and stream large videos directly from cloud storage.

Objective

Ingest video files into a managed table, add annotations, access frames by index, and query your video dataset.

Prerequisites

  • pip install deeplake
  • A Deeplake API token.

Set credentials first

export DEEPLAKE_API_KEY="your-token-here"
export DEEPLAKE_WORKSPACE="your-workspace"  # optional, defaults to "default"

Complete Code

from deeplake import Client

client = Client()

# Ingest videos from local files
client.ingest("my_videos", {
    "video": ["./videos/clip1.mp4", "./videos/clip2.mp4"],
    "label": ["traffic", "parking"],
})

# Query your videos
results = client.table("my_videos").where("label = 'traffic'").execute()
print(results)

# Access the underlying dataset for frame-level operations
ds = client.open_table("my_videos")
print(f"Dataset has {len(ds)} samples")

To control the column types yourself, create the table with SQL and append raw bytes:

import os
from deeplake import Client

client = Client()
WORKSPACE = os.environ.get("DEEPLAKE_WORKSPACE", "default")

# Create a table with typed columns
client.query(f"""
    CREATE TABLE IF NOT EXISTS "{WORKSPACE}"."my_videos" (
        video VIDEO,
        label TEXT,
        boxes FLOAT4[]
    ) USING deeplake
""")

# Open the table and append data (video column requires raw bytes)
ds = client.open_table("my_videos")

with open("./videos/clip1.mp4", "rb") as f:
    vid1 = f.read()
with open("./videos/clip2.mp4", "rb") as f:
    vid2 = f.read()

ds.append({
    "video": [vid1, vid2],
    "label": ["traffic", "parking"],
    "boxes": [[10, 20, 100, 150], [30, 40, 200, 250]],
})
ds.commit()

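print(f"Dataset has {len(ds)} samples")

A table with a VIDEO column can also be created over the REST API. This request assumes API_URL and DEEPLAKE_ORG_ID are set in your environment in addition to the credentials above:

# Create a table with a VIDEO column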
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "CREATE TABLE IF NOT EXISTS \"'$DEEPLAKE_WORKSPACE'\".\"my_videos\" (id SERIAL PRIMARY KEY, video VIDEO, label TEXT) USING deeplake"
  }'
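
When the clips live in a folder tree rather than a hand-written list, the column dictionary passed to the ingest call can be assembled from the file system. The sketch below is one possible way to do that, not part of the Deeplake API: the helper name `build_ingest_payload` is hypothetical, and it assumes each clip's label is the name of its parent folder (for example `./videos/traffic/clip1.mp4`).

```python
from pathlib import Path


def build_ingest_payload(root, extensions=(".mp4",)):
    """Collect video paths under root, labelling each by its parent folder name.

    Hypothetical helper: the folder-name-as-label convention is an assumption.
    """
    root = Path(root)
    # Sort so the row order is stable between runs.
    paths = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    return {
        "video": [str(p) for p in paths],
        "label": [p.parent.name for p in paths],
    }
```

The returned dictionary has the same two keys as the ingest example above, so it could be passed in place of the literal, e.g. `client.ingest("my_videos", build_ingest_payload("./videos"))`.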

Frame-Level Access

Deeplake decompresses only the frames you request, not the entire video:

ds = client.open_table("my_videos")

# Get video shape: (num_frames, height, width, channels)
print(ds["video"][0].shape)  # e.g. (400, 360, 640, 3)

# Access a range of frames; only these frames are decompressed
frames = ds["video"][0, 100:200].numpy()  # shape: (100, 360, 640, 3)

# Access with step (every 5th frame)
sampled = ds["video"][0, 0:200:5].numpy()  # shape: (40, 360, 640, 3)

# Single frame
last_frame = ds["video"][0, -1].numpy()  # shape: (360, 640, 3)
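
Before requesting a large frame range, it can help to estimate how many frames a slice selects and how much memory the decoded array will need. The following is a minimal sketch in plain Python, independent of Deeplake. The helper names are hypothetical, and the byte estimate assumes one byte per value (uint8 frames), which may not hold for every video.

```python
def decoded_shape(video_shape, frame_slice):
    """Shape obtained by applying frame_slice to the frame axis of video_shape."""
    num_frames, height, width, channels = video_shape
    # slice.indices() resolves negative and open-ended bounds like indexing does.
    n = len(range(*frame_slice.indices(num_frames)))
    return (n, height, width, channels)


def decoded_nbytes(shape, bytes_per_value=1):
    """Bytes needed for the decoded array (assumes uint8 frames by default)."""
    total = bytes_per_value
    for dim in shape:
        total *= dim
    return total


shape = decoded_shape((400, 360, 640, 3), slice(0, 200, 5))
print(shape)                  # (40, 360, 640, 3)
print(decoded_nbytes(shape))  # 27648000
```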

Timestamps

Access presentation timestamps (in seconds) for precise temporal alignment:

# Get timestamps for a frame range
ts = ds["video"][0, 10:15].timestamp
print(ts)  # e.g. array([0.367, 0.400, 0.434, 0.467, 0.500])

# Get both frames and timestamps together
data = ds["video"][0, 15:20].data()
print(data["frames"].shape)      # (5, 360, 640, 3)
print(data["timestamps"])        # array of 5 timestamps
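
A common use of these timestamps is aligning an external event (a sensor reading, a subtitle cue) with the closest frame. The sketch below shows one way to do that with the standard library. The helper `nearest_frame` is hypothetical, and it assumes the timestamps are sorted in ascending order and passed as a plain list (for example `list(ts)`).

```python
from bisect import bisect_left


def nearest_frame(timestamps, t):
    """Index of the timestamp closest to t; timestamps must be sorted ascending."""
    if len(timestamps) == 0:
        raise ValueError("timestamps is empty")
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[i - 1], timestamps[i]
    # Ties go to the earlier frame.
    return i if (after - t) < (t - before) else i - 1


# Sample values taken from the example output above.
ts = [0.367, 0.400, 0.434, 0.467, 0.500]
print(nearest_frame(ts, 0.41))  # 1
```

The returned index is relative to the requested range, so add the range's start (10 in the example above) to get the frame index within the video.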

Video Metadata

info = ds["video"][0].sample_info
print(info)
# {'duration': 13.33, 'fps': 30.0, 'format': 'mp4', ...}
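
The `fps` value in this metadata can be used to turn a time window in seconds into a frame range. This is a rough sketch that assumes a constant frame rate. For variable-frame-rate videos the presentation timestamps are the more reliable source. The helper name `seconds_to_frame_slice` is hypothetical.

```python
import math


def seconds_to_frame_slice(start_s, end_s, fps):
    """Frame slice covering [start_s, end_s), assuming a constant frame rate."""
    if fps <= 0 or start_s < 0 or end_s < start_s:
        raise ValueError("expected fps > 0 and 0 <= start_s <= end_s")
    # Round first so float noise such as 3.0000000000000004 does not shift a bound.
    start = math.floor(round(start_s * fps, 6))
    stop = math.ceil(round(end_s * fps, 6))
    return slice(start, stop)


print(seconds_to_frame_slice(1.0, 2.0, 30.0))  # slice(30, 60, None)
```

The resulting slice can then be used as the frame index when reading from the video column.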

Streaming

Videos larger than 16 MB are automatically streamed from storage. Only the packets needed for the requested frames are fetched, so no full download is required. This works for both uploaded and linked videos.

What to try next