Skip to content

Verification using Type Checking

Recommendations

Type Annotations in the ADL

The abstract dataloader specification makes heavy use of generic types to allow type checkers to work even when indirected by various transformations. For example, consider the definition of Sensor:

# Definition simplified for readability
class Sensor(Protocol, Generic[TSample, TMetadata]):

    metadata: TMetadata

    def __getitem__(self, index: int | np.integer) -> TSample:
        ...
  • TSample and TMetadata declare the types of the metadata and data sample type.
  • Implementations which provide a TSample and/or TMetadata allow type checkers to infer that .metadata should refer to a TMetadata object, and indexing via __getitem__ should return a TSample type.

Note

In this particular example, TMetadata is expected to implement Metadata:

TMetadata = TypeVar("TMetadata", bound=Metadata)
where Metadata is a protocol which requires a .timestamps attribute.

Now, suppose we are declaring an interface which takes any Sensor which loads a (statically and dynamically checkable) CameraFrame type and is associated with VideoMetadata. We can define this interface as such:

read_frame_with_cond(
    video: spec.Sensor[CameraFrame, VideoMetadata], cond: dict
) -> CameraFrame:
    # `video` must have `metadata`, since `video` is a `Sensor`
    # `video.metadata` must be a `VideoMetadata` due to the type parameter
    metadata: VideoMetadata = video.metadata
    idx: int = some_specific_way_to_search_the_metadata(metadata, cond)
    # `video[idx]` must be a `CameraFrame` due to the type parameter
    return video[idx]

Static Type Checking

Static type checking via tools such as pyright, which is included with the VSCode python plugin, provides a first line of defense against violating the ADL specification, either by incorrectly using compliant components, or declaring a ADL-compatible component that isn't actually compliant.

Mypy does not fully support PEP 695

The spec currently makes heavy use of backported (via typing_extensions) PEP 695 TypeVar(..., infer_variance=True) syntax. Unfortunately, Mypy does not support this yet, so Mypy will always raise type errors.

Static type checking cannot verify array shapes or data types

Since array shapes and types are determined at runtime, static type checkers cannot verify jaxtyping array annotations.

Runtime Type Checking

The protocols declared in the spec are runtime_checkable, and can be isinstance or issubclass checked at runtime. As a last line of defense, this allows type checking by runtime type checkers such as beartype, which can also deeply check the data types used as inputs and returns by the various MDL components if they have a well-defined type system.

Notably, since python is fundamentally a dynamically typed language, this enables checking of many types (or aspects of types) which cannot be checked statically. This is especially true for data with runtime-dynamic shapes.

Runtime type checking does not deeply verify protocols

Runtime type checkers are based on isinstance and issubclass calls; however, these builtin functions verify protocols only by checking the presence of declared methods and parameters. In typing jargon, this means that objects can only be checked against the "type-erased" versions of protocols at runtime.

In practice, this means that it is still possible for false positives: objects which appear to match a protocol when checked (e.g., at initialization), but cause something to explode later (e.g., when an incompatible method is called).