Lifting API Reference

The lifting module exposes tasks that convert 2D image data into 3D representations — depth maps, point clouds, and meshes.


DepthEstimation

The primary entry point for the depth estimation task. Instantiate once and call .run() with a DepthEstimationCommand.

vizion3d.lifting.DepthEstimation

Facade for the Depth Estimation task.

This class serves as the primary entry point for triggering monocular depth estimation inference via direct Python import.

Example

```python
from vizion3d.lifting import (
    DepthEstimation,
    DepthEstimationAdvanceConfig,
    DepthEstimationCommand,
)

cmd = DepthEstimationCommand(
    image_input=b"...",
    return_point_cloud=True,
    advanced_config=DepthEstimationAdvanceConfig(
        fx=615.0, fy=615.0, cx=320.0, cy=240.0, depth_trunc=5.0
    ),
)
result = DepthEstimation().run(cmd)
```
Source code in vizion3d/lifting/__init__.py

````python
class DepthEstimation:
    """
    Facade for the Depth Estimation task.

    This class serves as the primary entry point for triggering monocular depth
    estimation inference via direct Python import.

    Example:
        ```python
        from vizion3d.lifting import (
            DepthEstimation,
            DepthEstimationAdvanceConfig,
            DepthEstimationCommand,
        )

        cmd = DepthEstimationCommand(
            image_input=b"...",
            return_point_cloud=True,
            advanced_config=DepthEstimationAdvanceConfig(
                fx=615.0, fy=615.0, cx=320.0, cy=240.0, depth_trunc=5.0
            ),
        )
        result = DepthEstimation().run(cmd)
        ```
    """

    experimental: bool = False

    def run(self, command: DepthEstimationCommand) -> DepthEstimationResult:
        """
        Dispatches the provided command through the CQRS bus to the registered handler.

        Args:
            command (DepthEstimationCommand): The inference parameters and flags.

        Returns:
            DepthEstimationResult: The resultant depth map and optional generated files.
        """
        return command_bus.dispatch(command)
````

run(command)

Dispatches the provided command through the CQRS bus to the registered handler.

Parameters:

- command (DepthEstimationCommand, required): The inference parameters and flags.

Returns:

- DepthEstimationResult: The resultant depth map and optional generated files.

Source code in vizion3d/lifting/__init__.py

```python
def run(self, command: DepthEstimationCommand) -> DepthEstimationResult:
    """
    Dispatches the provided command through the CQRS bus to the registered handler.

    Args:
        command (DepthEstimationCommand): The inference parameters and flags.

    Returns:
        DepthEstimationResult: The resultant depth map and optional generated files.
    """
    return command_bus.dispatch(command)
```
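run() simply forwards the command to a process-wide command bus, which routes it to whichever handler was registered for that command type. A minimal sketch of that dispatch pattern (the CommandBus and register names here are illustrative, not the actual vizion3d internals):

```python
class CommandBus:
    """Minimal command bus: one registered handler per command type."""

    def __init__(self):
        self._handlers = {}

    def register(self, command_type, handler):
        self._handlers[command_type] = handler

    def dispatch(self, command):
        # Look up the handler registered for this command's type and invoke it.
        handler = self._handlers[type(command)]
        return handler(command)


# Toy usage with a stand-in command class:
class EchoCommand:
    def __init__(self, payload):
        self.payload = payload


bus = CommandBus()
bus.register(EchoCommand, lambda cmd: cmd.payload.upper())
result = bus.dispatch(EchoCommand("depth"))  # → "DEPTH"
```

The indirection means DepthEstimation itself stays free of inference logic; swapping the registered handler swaps the backend without touching the facade.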

DepthEstimationCommand

Input contract for the depth estimation task. All inference parameters are declared here.

vizion3d.lifting.commands.DepthEstimationCommand dataclass

Bases: Command[DepthEstimationResult]

Command payload to trigger a depth estimation inference task.

Attributes:

- image_input (str | bytes): The input image. Pass a file-path string or raw image bytes. The handler auto-detects which form is supplied.
- model_backend (str): Model backend to use for inference.
  • Default value is the vizion3D release checkpoint URL (depth_anything_v2_vitb.pth), which is downloaded on first use and cached under ~/.cache/vizion3d/models/. Set VIZION3D_MODEL_CACHE to override the cache directory.
  • A local .pth or .pt path is loaded directly as a Depth Anything V2 checkpoint — no download occurs.
  • Any HTTPS URL is downloaded to the cache directory and loaded as a checkpoint.
- return_depth_image (bool): When True, the result includes a 16-bit grayscale open3d.geometry.Image (dtype uint16) mapping [min_depth, max_depth] to the full 0–65535 range. Requires Open3D (Python 3.12).
- return_point_cloud (bool): When True, the result includes an open3d.geometry.PointCloud unprojected from the RGB-D image using the camera intrinsics in advanced_config. Point coordinates are in metres. Requires Open3D (Python 3.12).
- advanced_config (DepthEstimationAdvanceConfig): Camera intrinsics and depth range settings. Override any field to customise — e.g. advanced_config=DepthEstimationAdvanceConfig(fx=615.0, fy=615.0). Unspecified fields keep their defaults (PrimeSense values).

Source code in vizion3d/lifting/commands.py

```python
@dataclass
class DepthEstimationCommand(Command[DepthEstimationResult]):
    """
    Command payload to trigger a depth estimation inference task.

    Attributes:
        image_input: The input image. Pass a file-path string or raw image bytes.
            The handler auto-detects which form is supplied.
        model_backend: Model backend to use for inference.

            - Default value is the vizion3D release checkpoint URL
              (`depth_anything_v2_vitb.pth`), which is downloaded on first use and
              cached under `~/.cache/vizion3d/models/`.
              Set `VIZION3D_MODEL_CACHE` to override the cache directory.
            - A local `.pth` or `.pt` path is loaded directly as a Depth Anything V2
              checkpoint — no download occurs.
            - Any HTTPS URL is downloaded to the cache directory and loaded as a
              checkpoint.

        return_depth_image: When `True`, the result includes a 16-bit grayscale
            `open3d.geometry.Image` (dtype `uint16`) mapping `[min_depth, max_depth]`
            to the full 0–65535 range. Requires Open3D (Python 3.12).
        return_point_cloud: When `True`, the result includes an
            `open3d.geometry.PointCloud` unprojected from the RGB-D image using
            the camera intrinsics in `advanced_config`. Point coordinates are in metres.
            Requires Open3D (Python 3.12).
        advanced_config: Camera intrinsics and depth range settings. Override any
            field to customise — e.g.
            ``advanced_config=DepthEstimationAdvanceConfig(fx=615.0, fy=615.0)``.
            Unspecified fields keep their defaults (PrimeSense values).
    """

    image_input: str | bytes
    model_backend: str = DEFAULT_DEPTH_MODEL_URL
    return_depth_image: bool = False
    return_point_cloud: bool = False
    advanced_config: DepthEstimationAdvanceConfig = field(
        default_factory=DepthEstimationAdvanceConfig
    )
```
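The model_backend resolution rules above can be sketched as a small helper. This is illustrative only, not the vizion3d implementation: resolve_backend is a hypothetical name, the example.com URL is a stand-in, and the actual download step is elided.

```python
from pathlib import Path
from urllib.parse import urlparse


def resolve_backend(model_backend: str, cache_dir: Path) -> Path:
    """Mirror the documented rules: HTTPS URLs resolve into the cache
    directory (download elided); local .pth/.pt paths load directly."""
    if model_backend.startswith("https://"):
        # The cached file is named after the last path segment of the URL.
        filename = Path(urlparse(model_backend).path).name
        return cache_dir / filename
    # Local checkpoint path: used as-is, no download occurs.
    return Path(model_backend)


cache = Path.home() / ".cache" / "vizion3d" / "models"
# Hypothetical release URL: resolves to cache / "depth_anything_v2_vitb.pth"
resolve_backend("https://example.com/releases/depth_anything_v2_vitb.pth", cache)
resolve_backend("./checkpoints/my_model.pt", cache)  # local path, returned unchanged
```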

DepthEstimationAdvanceConfig

Camera intrinsics and depth range settings. Pass an instance of this model as advanced_config on DepthEstimationCommand to override the PrimeSense defaults used for point cloud unprojection.

vizion3d.lifting.models.DepthEstimationAdvanceConfig

Bases: BaseModel

Camera intrinsics and depth range settings for depth estimation.

All fields are optional overrides — unspecified fields retain their defaults, which match the Open3D PrimeSense preset (640×480 RGB-D sensor).

Attributes:

- fx (float): Horizontal focal length in pixels. Controls the horizontal field of view: a larger value means a narrower FOV and more perspective compression.
- fy (float): Vertical focal length in pixels. Usually equal to fx for square pixels; differs on sensors with non-square pixels.
- cx (float): Principal point x — the pixel column of the optical axis, typically near the horizontal image centre.
- cy (float): Principal point y — the pixel row of the optical axis, typically near the vertical image centre.
- depth_scale (float): Divisor applied to raw uint16 depth values to convert them to metres. 1000 means the raw values are in millimetres (the standard for RealSense, Kinect, and PrimeSense sensors).
- depth_trunc (float): Maximum depth in metres. Points beyond this distance are discarded from the point cloud.

Source code in vizion3d/lifting/models.py

```python
class DepthEstimationAdvanceConfig(BaseModel):
    """
    Camera intrinsics and depth range settings for depth estimation.

    All fields are optional overrides — unspecified fields retain their defaults,
    which match the Open3D PrimeSense preset (640×480 RGB-D sensor).

    Attributes:
        fx: Horizontal focal length in pixels. Controls the horizontal field of
            view: a larger value means a narrower FOV and more perspective compression.
        fy: Vertical focal length in pixels. Usually equal to ``fx`` for square
            pixels; differs on sensors with non-square pixels.
        cx: Principal point x — the pixel column of the optical axis, typically
            near the horizontal image centre.
        cy: Principal point y — the pixel row of the optical axis, typically near
            the vertical image centre.
        depth_scale: Divisor applied to raw uint16 depth values to convert them to
            metres. ``1000`` means the raw values are in millimetres (the standard
            for RealSense, Kinect, and PrimeSense sensors).
        depth_trunc: Maximum depth in metres. Points beyond this distance are
            discarded from the point cloud.
    """

    fx: float = 525.0
    fy: float = 525.0
    cx: float = 319.5
    cy: float = 239.5
    depth_scale: float = 1000.0
    depth_trunc: float = 10.0
```
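To see how these fields interact during point cloud unprojection, here is a sketch of the standard pinhole back-projection for a single pixel, using the PrimeSense defaults above. The function unproject_pixel is illustrative and not part of the vizion3d API; the real unprojection is performed by Open3D over the whole image.

```python
def unproject_pixel(u, v, raw_depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5,
                    depth_scale=1000.0, depth_trunc=10.0):
    """Back-project pixel (u, v) with a raw uint16 depth value into metric 3D
    using the pinhole camera model and the PrimeSense defaults."""
    z = raw_depth / depth_scale          # raw sensor units -> metres
    if z <= 0 or z > depth_trunc:
        return None                      # invalid or truncated point
    x = (u - cx) * z / fx                # pixel offset from principal point,
    y = (v - cy) * z / fy                # scaled by depth over focal length
    return (x, y, z)


unproject_pixel(319.5, 239.5, 1000)   # at the principal point, 1 m away → (0.0, 0.0, 1.0)
unproject_pixel(0, 0, 20000)          # 20 m exceeds depth_trunc → None
```

This makes the roles concrete: depth_scale fixes the metric unit, fx/fy/cx/cy control the ray direction per pixel, and depth_trunc culls far points.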

DepthEstimationResult

Output contract returned by DepthEstimation.run(). All fields are always present; optional geometry fields are None when the corresponding return_* flag was not set.

vizion3d.lifting.models.DepthEstimationResult

Bases: BaseModel

Result payload returned after a depth estimation inference task.

Attributes:

- depth_map (list[list[float]]): Raw floating-point depth array, shape [H][W]. Values are relative (not metric) for monocular models — closer objects have higher values for inverse-depth outputs.
- min_depth (float): Minimum value in depth_map.
- max_depth (float): Maximum value in depth_map. Guaranteed max_depth >= min_depth.
- backend_used (str): Resolved model identifier that processed the request (local file path).
- depth_image (Image | None): 16-bit grayscale open3d.geometry.Image (dtype uint16), present when return_depth_image=True was set on the command. The full 0–65535 range maps linearly to [min_depth, max_depth].
- point_cloud (PointCloud | None): Coloured open3d.geometry.PointCloud unprojected from the RGB-D image, present when return_point_cloud=True. Coordinates are already in metres; no rescaling is needed (see point_cloud_scale).
- point_cloud_scale (float): Scale factor for the point cloud coordinate space: multiplying a distance measured between two points in the returned point cloud by this value gives the equivalent distance in metres. Always 1.0, since Open3D produces point cloud coordinates directly in metres.

Source code in vizion3d/lifting/models.py

```python
class DepthEstimationResult(BaseModel):
    """
    Result payload returned after a depth estimation inference task.

    Attributes:
        depth_map: Raw floating-point depth array, shape `[H][W]`. Values are
            relative (not metric) for monocular models — closer objects have
            higher values for inverse-depth outputs.
        min_depth: Minimum value in `depth_map`.
        max_depth: Maximum value in `depth_map`. Guaranteed `max_depth >= min_depth`.
        backend_used: Resolved model identifier that processed the request
            (local file path).
        depth_image: 16-bit grayscale `open3d.geometry.Image` (dtype `uint16`),
            present when `return_depth_image=True` was set on the command.
            The full 0–65535 range maps linearly to `[min_depth, max_depth]`.
        point_cloud: Coloured `open3d.geometry.PointCloud` unprojected from the
            RGB-D image, present when `return_point_cloud=True`. Coordinates are
            in metres — multiply distances by `point_cloud_scale` (always `1.0`)
            to confirm the unit.
        point_cloud_scale: Scale factor for the point cloud coordinate space.
            Multiply any distance measured between two points in the returned
            point cloud by this value to get the equivalent distance in metres.
            Always `1.0` — Open3D produces point cloud coordinates directly in metres.
    """

    depth_map: list[list[float]]
    min_depth: float
    max_depth: float
    backend_used: str
    depth_image: O3dImage | None = None
    point_cloud: O3dPointCloud | None = None
    point_cloud_scale: float = 1.0

    model_config = ConfigDict(arbitrary_types_allowed=True)
```
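The linear quantisation behind depth_image can be reproduced from depth_map alone, without Open3D. A minimal sketch in pure Python; depth_to_uint16 is illustrative, not a vizion3d function:

```python
def depth_to_uint16(depth_map, min_depth, max_depth):
    """Linearly map [min_depth, max_depth] onto the full 0-65535 uint16 range,
    mirroring how the returned depth_image quantises the float depth_map."""
    span = max_depth - min_depth
    if span == 0:
        # Degenerate flat depth map: every pixel maps to 0.
        return [[0 for _ in row] for row in depth_map]
    return [
        [round((value - min_depth) / span * 65535) for value in row]
        for row in depth_map
    ]


depth_to_uint16([[0.5, 1.0], [1.5, 2.5]], 0.5, 2.5)
# → [[0, 16384], [32768, 65535]]
```

Because the mapping is relative to [min_depth, max_depth], pixel intensities are only comparable within a single result, not across images.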