Akina AI
Zendo · Getting started

Set up your capture volume.

Zendo runs locally on your machine. Start with one camera, or scale up to two or three for millimeter-range accuracy. Get three things right — hardware, cameras, calibration — and you're capturing.

Hardware bundles · On request

Want the rig, not just the software?

Akina ships pre-configured Zendo system bundles — computer, cameras, active USB-C extension cables and a rigid checkerboard — sized and tuned to your application. Cameras arrive with intrinsic calibration files ready to import, so you're capturing on day one.

Request a bundle →

01

Hardware requirements

Zendo is a native desktop app for macOS and Linux. It does the perception work on your machine, so the hardware needs to keep up with one to three parallel HD streams plus pose tracking and rendering. The specs below assume a full three-camera rig — drop the GPU and memory floor a notch if you're running fewer cameras.

Supported platforms

macOS
Apple Silicon only — M2 or newer. Intel Macs are not supported. Mac mini M4 with 24 GB RAM is the sweet spot.
Linux
x86_64 only. Requires a dedicated NVIDIA GPU — RTX 3060 minimum, RTX 4060 or 4070 for headroom on three concurrent streams.
Memory
16 GB or more.

Accessories you'll need

  • MX Brio 705 webcams — one per camera position.
  • Active extension cables — one per camera. Passive cables drop signal over distance and will desync your streams.
  • A checkerboard — the linked size suits a 6 m × 6 m workspace. See the Checkerboard section below for sizing rules of thumb.
  • A tripod or wall mount per camera (standard 1/4″-20 thread). If the room has ferromagnetic parts on the walls or ceiling, a magnetic mount is a good option.
  • One USB-C 3.0 port per camera, or a powered USB-C 3.0 hub. USB 2.0 doesn't have the bandwidth for the UHD streams Zendo uses during calibration.
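To see why USB 2.0 falls short, here is a rough back-of-the-envelope estimate of the raw bitrate of one UHD stream. The frame rate (30 fps) and pixel format (2 bytes per pixel, uncompressed) are assumptions for illustration, not values published for Zendo; real cameras often compress video, so treat the result as an upper bound on what the link has to carry.

```python
# Rough bitrate estimate for one uncompressed UHD stream.
# ASSUMPTIONS (not from the Zendo docs): 30 fps, 2 bytes per pixel.
UHD_WIDTH, UHD_HEIGHT = 3840, 2160
ASSUMED_FPS = 30
ASSUMED_BYTES_PER_PIXEL = 2

USB2_SIGNALING_MBPS = 480  # USB 2.0 High Speed signaling rate


def stream_mbps(width, height, fps, bytes_per_pixel):
    """Return the raw video bitrate in megabits per second."""
    return width * height * fps * bytes_per_pixel * 8 / 1_000_000


uhd_mbps = stream_mbps(UHD_WIDTH, UHD_HEIGHT, ASSUMED_FPS, ASSUMED_BYTES_PER_PIXEL)
print(f"Uncompressed UHD stream: {uhd_mbps:.0f} Mbit/s")
print(f"USB 2.0 signaling rate:  {USB2_SIGNALING_MBPS} Mbit/s")
print(f"Ratio: {uhd_mbps / USB2_SIGNALING_MBPS:.1f}x")
```

Under these assumptions a single stream needs several times the USB 2.0 signaling rate, before protocol overhead is even counted.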

02

Cameras

Zendo supports a single camera model today: the Logitech MX Brio 705. It's the only camera we've validated against the full UHD/HD capture path, and the only one we'll guarantee calibration accuracy on. Other webcams will not work — even ones that look spec-compatible on paper.

How many cameras you need

Zendo runs with one camera or more. What you're using the capture for decides the count.

1 camera
Monocular pose estimation. Approximate 3D only — not recommended where accuracy matters.
2 cameras
Stereo capture with millimeter-range accuracy after extrinsic calibration. Suitable for human–robot interaction and movement analysis.
3+ cameras
Same accuracy as two, with reduced occlusion in dense scenes. Our recommended setup.

Use one camera only when approximate 3D is enough. For human–robot interaction, biomechanics, or any workflow that depends on accurate data, use two cameras or more.

With a single camera, the output is a hip-centered skeleton in normalized — not metric — coordinates. Translation in world space isn't recovered, so you can't say where the person is or how far they moved. Joint angles still work for posture and shape analysis; absolute positions and distances need two cameras or more.
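A small sketch of why joint angles survive the single-camera case: an angle between two limb segments does not change when the whole skeleton is uniformly scaled or shifted, so it can be computed from normalized coordinates. The joint names and coordinates below are made-up illustration values, not Zendo output.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at joint b, in degrees, between segments b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def scale_and_shift(p, scale, offset):
    """Apply a uniform scale and a translation to a 3D point."""
    return tuple(scale * p[i] + offset[i] for i in range(3))


# Hypothetical hip, knee and ankle positions in normalized units.
hip, knee, ankle = (0.0, 0.0, 0.0), (0.0, -0.45, 0.05), (0.0, -0.85, -0.10)
normalized = joint_angle_deg(hip, knee, ankle)

# The same pose with an unknown scale and world offset applied.
moved = [scale_and_shift(p, 1.8, (2.0, 0.0, 5.0)) for p in (hip, knee, ankle)]
metric = joint_angle_deg(*moved)

print(f"knee angle (normalized): {normalized:.3f} deg")
print(f"knee angle (rescaled):   {metric:.3f} deg")
```

Both prints show the same knee angle. Distances between joints, by contrast, scale with the unknown factor, which is why absolute positions and distances need a metric multi-camera setup.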

Camera placement

How you place the cameras depends on how many you're using.

1 camera
Point it at the capture area from the most informative angle for your task. Frame the subject so they stay in view across the full motion.
2 cameras
Position them roughly 90° apart around the subject — close enough to overlap on the same volume, far enough apart to triangulate well.
3 cameras
Position them 120° apart around the capture volume, roughly equidistant from the centre. This spreads the views evenly around the subject.
4 cameras
Position them 90° apart around the capture volume, roughly equidistant from the centre. This covers the subject from every side.
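The angular spacings above can be turned into floor-plan coordinates with a little trigonometry. This is a planning sketch only: the coordinate frame (centre of the volume at the origin, z pointing up), the radius, and the 1.5 m mounting height are assumptions, and `camera_positions` is a hypothetical helper, not part of Zendo.

```python
import math


def camera_positions(n_cameras, radius_m, height_m=1.5):
    """Return (x, y, z) camera positions on a circle around the volume centre."""
    if n_cameras < 2:
        raise ValueError("placement by angle needs at least two cameras")
    # Two cameras sit roughly 90 degrees apart; three or more are spread evenly.
    step_deg = 90.0 if n_cameras == 2 else 360.0 / n_cameras
    positions = []
    for i in range(n_cameras):
        angle = math.radians(i * step_deg)
        positions.append(
            (radius_m * math.cos(angle), radius_m * math.sin(angle), height_m)
        )
    return positions


for x, y, z in camera_positions(3, radius_m=4.0):
    print(f"x={x:+.2f} m, y={y:+.2f} m, z={z:.2f} m")
```

Treat the output as a starting layout and adjust each position for walls, mounts, and sight lines in the actual room.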

When placing two or more, think about what each camera can't see. Stagger the angles so no two cameras share the same occlusion at the same time. If a hand disappears behind a torso in camera A, it should be visible in the others.
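One way to reason about shared occlusions is to tabulate, for a given pose, which cameras can see each joint and flag any joint seen by fewer than two. Two views is the general minimum for triangulating a 3D point; the data layout and the function name below are hypothetical and only illustrate the check.

```python
def under_observed_joints(visibility, min_views=2):
    """Return the joints seen by fewer than min_views cameras.

    visibility maps joint name -> {camera name: True if the joint is visible}.
    """
    return sorted(
        joint
        for joint, cameras in visibility.items()
        if sum(cameras.values()) < min_views
    )


# Hypothetical visibility table for one pose in a three-camera rig.
pose = {
    "right_hand": {"A": False, "B": True, "C": True},   # hidden from A only
    "left_hand": {"A": False, "B": False, "C": True},   # A and B share the occlusion
    "head": {"A": True, "B": True, "C": True},
}

print(under_observed_joints(pose))
```

Here `left_hand` is flagged because cameras A and B lose it at the same time, which is the situation staggered angles are meant to avoid.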

03

Calibration

Calibration turns raw camera feeds into accurate 3D, and there's less of it than you might expect. There are two steps — intrinsic and extrinsic — and both run against a printed checkerboard.

You run intrinsic calibration once per camera, then never again. Extrinsic calibration takes about 30 seconds and only needs repeating when a camera moves. Each step below is tagged with its duration and how often it applies.

Checkerboard

Zendo calibrates against a printed checkerboard. The board itself matters more than you'd think — any flex changes the geometry, and any change in geometry degrades calibration.

Backing
Aluminium, LDPE, or ceramic plate. Rigid by design. Avoid paper, foam-core, or anything that can warp.
Square size
40 mm squares — sized to stay visible during calibration at 5–6 m from each camera. Scale up for larger workspaces.
Pattern dimensions
Configurable in Zendo when you start calibration. Set the number of inner columns and rows to match the board you've printed.

Size the board to the volume — its footprint should be a meaningful fraction of your capture area. The bigger the workspace, the larger the checkerboard you need; small boards in large volumes produce poor extrinsics because the corners cover too little of each image.
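To get a feel for the sizing rule, the sketch below estimates how many pixels one checkerboard square spans in the image, using a simple pinhole model with the board facing the camera near the image centre. The 80° horizontal field of view is an assumed placeholder, not a published figure for the MX Brio 705; substitute the value for your own camera settings.

```python
import math


def square_size_px(square_mm, distance_m, image_width_px, hfov_deg):
    """Approximate on-image width of one square, in pixels (pinhole model)."""
    # Focal length in pixels from the horizontal field of view.
    focal_px = (image_width_px / 2) / math.tan(math.radians(hfov_deg) / 2)
    return focal_px * (square_mm / 1000.0) / distance_m


ASSUMED_HFOV_DEG = 80.0  # assumption: replace with your camera's actual value

for distance in (3.0, 6.0, 12.0):
    px = square_size_px(40, distance, image_width_px=3840, hfov_deg=ASSUMED_HFOV_DEG)
    print(f"40 mm square at {distance:>4.1f} m: about {px:.1f} px wide")
```

The on-image size of a square falls in proportion to distance, so doubling the working distance calls for roughly doubling the square size to keep the corners equally easy to detect.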

Intrinsic calibration

~4 min per camera · One-time only

Intrinsic calibration measures each camera's lens — focal length and distortion. Run it first. If you're using more than one camera, run extrinsic calibration afterwards.

Run it yourself

  1. Open Zendo. Confirm all your cameras are detected.
  2. Click Calibrate → Intrinsic calibration.
  3. Hold the checkerboard in front of each camera in turn.
  4. Tilt, rotate and move the board so the checkerboard fills different regions of the frame and is captured at varied angles.
  5. Zendo collects samples until the per-camera intrinsics converge. Follow the on-screen prompts.
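For context on what the intrinsics describe, here is a minimal sketch of the standard pinhole camera model with a single radial distortion term. It is a textbook model, not Zendo's internal implementation, and the focal lengths, principal point, and distortion coefficient are made-up example values.

```python
def project(point_cam, fx, fy, cx, cy, k1=0.0):
    """Project a 3D point in camera coordinates to pixel coordinates.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels;
    k1: first radial distortion coefficient (0 means no distortion).
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    xn, yn = x / z, y / z            # normalized image coordinates
    r2 = xn * xn + yn * yn           # squared distance from the optical axis
    distortion = 1.0 + k1 * r2       # radial scaling grows with r2
    return (fx * xn * distortion + cx, fy * yn * distortion + cy)


# Example values only: a 1920x1080 image with a 1000 px focal length.
point = (1.0, 0.0, 2.0)
ideal = project(point, fx=1000, fy=1000, cx=960, cy=540)
distorted = project(point, fx=1000, fy=1000, cx=960, cy=540, k1=-0.1)
print("without distortion:", ideal)
print("with k1 = -0.1:    ", distorted)
```

Calibration works in the opposite direction: it observes where known checkerboard corners land in the image and solves for the parameters that best explain them, which is why the board needs to cover varied regions and angles.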

Extrinsic calibration

~30 sec · Once per camera placement

Extrinsic calibration measures where the cameras sit relative to each other, so it only applies to multi-camera setups — with a single camera, you're done after the intrinsic step.

It depends on every camera's intrinsics already being known, so finish intrinsic calibration for each camera first. During the step, all cameras must see the same checkerboard at the same time.

  1. Place the checkerboard flat on the floor, in the middle of your capture volume. Its top-right inner corner becomes the origin (0, 0, 0) of Zendo's world coordinate system — orient the board with that in mind, since every 3D position Zendo reports afterwards is relative to that point.
  2. In each camera's live preview, confirm the full board is visible and well-lit.
  3. Click Calibrate → Extrinsic calibration.
  4. Follow the on-screen procedure. Don't move the board or the cameras until Zendo confirms calibration is complete.
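Conceptually, the result of this step is a rigid transform per camera between its own coordinate frame and the world frame anchored at the checkerboard. The sketch below uses the common convention X_cam = R · X_world + t; that convention, the function names, and the numbers are assumptions for illustration, not Zendo's documented data format.

```python
def world_to_camera(point_world, rotation, translation):
    """Apply X_cam = R * X_world + t (rotation is a 3x3 row-major matrix)."""
    return tuple(
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def camera_to_world(point_cam, rotation, translation):
    """Invert the transform: X_world = R^T * (X_cam - t)."""
    d = [point_cam[i] - translation[i] for i in range(3)]
    return tuple(sum(rotation[i][j] * d[i] for i in range(3)) for j in range(3))


# Example extrinsics: a 90 degree rotation about z, with the world origin
# sitting 3 m in front of the camera along its z axis.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 3.0)

origin_in_camera = world_to_camera((0.0, 0.0, 0.0), R, t)
print("board origin seen from the camera:", origin_in_camera)
print("and mapped back to world:", camera_to_world(origin_in_camera, R, t))
```

Because every camera's transform refers to the same board origin, moving the board or a camera during the procedure invalidates the shared frame.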

You're set

You're ready to capture.

Run a quick sanity check by moving through the capture volume and watching the 3D skeleton. If something looks off, redo the extrinsic step before tearing down the rig.
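If you want a numeric complement to watching the skeleton, one simple heuristic is to check that bone lengths stay nearly constant across frames, since a forearm does not change length while a person moves. This is a suggested check under assumptions, not a Zendo feature: the frame layout is hypothetical, and any pass or fail threshold is yours to choose. It only makes sense for metric, multi-camera output.

```python
import math
import statistics


def bone_length_spread(frames, joint_a, joint_b):
    """Return (max - min) / mean of the distance between two joints over frames."""
    lengths = [math.dist(frame[joint_a], frame[joint_b]) for frame in frames]
    return (max(lengths) - min(lengths)) / statistics.fmean(lengths)


# Hypothetical frames: joint name -> (x, y, z) in metres.
frames = [
    {"elbow": (0.0, 0.0, 0.0), "wrist": (0.25, 0.0, 0.0)},
    {"elbow": (1.0, 0.0, 0.0), "wrist": (1.0, 0.25, 0.0)},
    {"elbow": (2.0, 0.0, 0.0), "wrist": (2.0, 0.0, 0.30)},
]

spread = bone_length_spread(frames, "elbow", "wrist")
print(f"forearm length varies by {spread:.1%} of its mean")
```

A spread that grows as the subject moves toward one side of the volume is a hint that the extrinsics deserve another pass.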

Hit a snag? See troubleshooting below →

04

Troubleshooting

Most setup problems trace back to connection bandwidth or a camera losing sight of the checkerboard. If something isn't working, start here.

A camera isn't detected, calibration fails, or the extrinsics are inaccurate
Every camera needs a USB-C 3.0 connection. USB 2.0, including USB-C ports running at USB 2.0 speeds, doesn't have the bandwidth for the UHD resolution Zendo uses during calibration, so a camera on a slower port will fail to calibrate or produce inaccurate extrinsics. If you're short on ports, use a powered USB-C 3.0 hub; bus-powered hubs aren't reliable under multiple concurrent UHD streams. For runs longer than a metre or two, use active USB-C extension cables, because passive cables drop signal and desync your streams. If you're using a USB-A to USB-C adapter, make sure it supports 10 Gbps (USB 3.1 Gen 2 or better); adapters that cap at 5 Gbps won't pass enough bandwidth for UHD streaming.
Extrinsic calibration fails
Two things cause this. First, extrinsic calibration depends on every camera's intrinsics being known — if intrinsic calibration hasn't been completed for each camera, it fails before it starts. Second, every camera must keep the full checkerboard in view for the whole run — if one loses sight of the board, calibration fails. Confirm intrinsics are done, make sure the full board is visible and well-lit in every camera's live preview, then retry.
The 3D skeleton looks wrong when you sanity-check the capture
Move through the capture volume and watch the skeleton. If joints sit in the wrong place or the skeleton drifts, the extrinsics are likely off — redo extrinsic calibration before tearing down the rig.

Support

Still stuck?

Reach our engineers directly — no ticket queue, no chatbot triage. We promise you'll talk to a human who's set up a Zendo rig themselves.

Talk to us →

05

Build with the SDK

Once your capture volume is running, you can stream live motion data directly into your own application using the Zendo SDK. Zendo handles the capture; the SDK is a lightweight client that connects to a running session and gives you 80-DOF joint positions, orientations, and velocities at up to 60 fps — ready to wire into a robot controller, a biomechanics pipeline, or anything else.

Zendo SDK

Your capture volume, piped into your code.

Available for Rust and Python. Install from the package registry and you're connecting to Zendo in a few lines.