Verification using Type Checking¶
Recommendations
- Set up a type system with jaxtyping array annotations, and provide type parameters accordingly.
- Use pyright for static type checking (just install the vscode Python extension).
- Use beartype for runtime checking - don't forget to use the jaxtyping hooks!
Type Annotations in the ADL¶
The abstract dataloader specification makes heavy use of generic types to allow type checkers to work even when indirected by various transformations. For example, consider the definition of Sensor
:
# Definition simplified for readability
class Sensor(Protocol, Generic[TSample, TMetadata]):
metadata: TMetadata
def __getitem__(self, index: int | np.integer) -> TSample:
...
TSample
andTMetadata
declare the types of the metadata and data sample type.- Implementations which provide a
TSample
and/orTMetadata
allow type checkers to infer that.metadata
should refer to aTMetadata
object, and indexing via__getitem__
should return aTSample
type.
Note
In this particular example, TMetadata
is expected to implement Metadata
:
Metadata
is a protocol which requires a .timestamps
attribute.
Now, suppose we are declaring an interface which takes any Sensor
which loads a (statically and dynamically checkable) CameraFrame
type and is associated with VideoMetadata
. We can define this interface as such:
read_frame_with_cond(
video: spec.Sensor[CameraFrame, VideoMetadata], cond: dict
) -> CameraFrame:
# `video` must have `metadata`, since `video` is a `Sensor`
# `video.metadata` must be a `VideoMetadata` due to the type parameter
metadata: VideoMetadata = video.metadata
idx: int = some_specific_way_to_search_the_metadata(metadata, cond)
# `video[idx]` must be a `CameraFrame` due to the type parameter
return video[idx]
Static Type Checking¶
Static type checking via tools such as pyright, which is included with the VSCode python plugin, provides a first line of defense against violating the ADL specification, either by incorrectly using compliant components, or declaring a ADL-compatible component that isn't actually compliant.
Mypy does not fully support PEP 695
The spec currently makes heavy use of backported (via typing_extensions
) PEP 695 TypeVar(..., infer_variance=True)
syntax. Unfortunately, Mypy does not support this yet, so Mypy will always raise type errors.
Static type checking cannot verify array shapes or data types
Since array shapes and types are determined at runtime, static type checkers cannot verify jaxtyping array annotations.
Runtime Type Checking¶
The protocols declared in the spec
are runtime_checkable
, and can be isinstance
or issubclass
checked at runtime. As a last line of defense, this allows type checking by runtime type checkers such as beartype, which can also deeply check the data types used as inputs and returns by the various MDL components if they have a well-defined type system.
Notably, since python is fundamentally a dynamically typed language, this enables checking of many types (or aspects of types) which cannot be checked statically. This is especially true for data with runtime-dynamic shapes.
Runtime type checking does not deeply verify protocols
Runtime type checkers are based on isinstance
and issubclass
calls; however, these builtin functions verify protocols only by checking the presence of declared methods and parameters. In typing jargon, this means that objects can only be checked against the "type-erased" versions of protocols at runtime.
In practice, this means that it is still possible for false positives: objects which appear to match a protocol when checked (e.g., at initialization), but cause something to explode later (e.g., when an incompatible method is called).