# Strict Dataclasses

The `huggingface_hub` package provides a utility to create **strict dataclasses**. These are enhanced versions of Python's standard `dataclass` with additional validation features. Strict dataclasses ensure that fields are validated both during initialization and assignment, making them ideal for scenarios where data integrity is critical.

## Overview

Strict dataclasses are created using the `@strict` decorator. They extend the functionality of regular dataclasses by:

- Validating field types based on type hints
- Supporting custom validators for additional checks
- Optionally allowing arbitrary keyword arguments in the constructor
- Validating fields both at initialization and during assignment

## Benefits

- **Data Integrity**: Ensures fields always contain valid data
- **Ease of Use**: Integrates seamlessly with Python's `dataclass` module
- **Flexibility**: Supports custom validators for complex validation logic
- **Lightweight**: Requires no additional dependencies such as Pydantic, attrs, or similar libraries

## Usage

### Basic Example

```python
from dataclasses import dataclass
from huggingface_hub.dataclasses import strict, as_validated_field

# Custom validator to ensure a value is positive
@as_validated_field
def positive_int(value: int):
    if not value > 0:
        raise ValueError(f"Value must be positive, got {value}")

@strict
@dataclass
class Config:
    model_type: str
    hidden_size: int = positive_int(default=16)
    vocab_size: int = 32  # Default value

    # Methods named `validate_xxx` are treated as class-wise validators
    def validate_big_enough_vocab(self):
        if self.vocab_size < self.hidden_size:
            raise ValueError(f"vocab_size ({self.vocab_size}) must be greater than hidden_size ({self.hidden_size})")
```

> [!WARNING]
> Method `.validate()` is a reserved name on strict dataclasses.
> To prevent unexpected behaviors, a `StrictDataclassDefinitionError` error will be raised if your class already defines one.
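
The overview states that fields are validated both at initialization and on assignment. As a rough illustration of how that can work with the standard library alone, the sketch below wraps `__setattr__` so that every write to a field is type-checked. `simple_strict` is a hypothetical name used only here; it is not part of `huggingface_hub`, handles only plain class annotations such as `int` or `str`, and is not how `@strict` is actually implemented.

```python
from dataclasses import dataclass, fields


def simple_strict(cls):
    """Hypothetical stand-in for `@strict`: type-check plain fields on every write."""
    # Map each field name to its annotation, keeping only plain classes (int, str, ...).
    expected = {f.name: f.type for f in fields(cls) if isinstance(f.type, type)}
    original_setattr = cls.__setattr__

    def checked_setattr(self, name, value):
        expected_type = expected.get(name)
        if expected_type is not None and not isinstance(value, expected_type):
            raise TypeError(
                f"Field '{name}' expected {expected_type.__name__}, got {type(value).__name__}"
            )
        original_setattr(self, name, value)

    cls.__setattr__ = checked_setattr
    return cls


@simple_strict
@dataclass
class Config:
    model_type: str
    hidden_size: int = 16


# The dataclass-generated __init__ assigns through __setattr__, so it is checked too.
config = Config(model_type="bert")
config.hidden_size = 32  # valid assignment

try:
    config.hidden_size = "32"  # wrong type, rejected on assignment
except TypeError as err:
    print(err)
```

Because a regular (non-frozen) dataclass assigns its fields through `__setattr__` in the generated `__init__`, a single hook covers both construction and later mutation.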

## API Reference

### `@strict`[[huggingface_hub.dataclasses.strict]]

The `@strict` decorator enhances a dataclass with strict validation.

#### huggingface_hub.dataclasses.strict[[huggingface_hub.dataclasses.strict]]

[Source](https://github.com/huggingface/huggingface_hub/blob/v1.7.0/src/huggingface_hub/dataclasses.py#L58)

Decorator to add strict validation to a dataclass.

This decorator must be used on top of `@dataclass` to ensure IDEs and static typing tools
recognize the class as a dataclass.

Can be used with or without arguments:
- `@strict`
- `@strict(accept_kwargs=True)`

Example:
```py
>>> from dataclasses import dataclass
>>> from huggingface_hub.dataclasses import as_validated_field, strict, validated_field

>>> @as_validated_field
... def positive_int(value: int):
...     if not value >= 0:
...         raise ValueError(f"Value must be positive, got {value}")

>>> @strict(accept_kwargs=True)
... @dataclass
... class User:
...     name: str
...     age: int = positive_int(default=10)

# Initialize
>>> User(name="John")
User(name='John', age=10)

# Extra kwargs are accepted
>>> User(name="John", age=30, lastname="Doe")
User(name='John', age=30, *lastname='Doe')

# Invalid type => raises
>>> User(name="John", age="30")
huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
    TypeError: Field 'age' expected int, got str (value: '30')

# Invalid value => raises
>>> User(name="John", age=-1)
huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
    ValueError: Value must be positive, got -1
```

**Parameters:**

cls : The class to convert to a strict dataclass.

accept_kwargs (`bool`, *optional*) : If True, allows arbitrary keyword arguments in `__init__`. Defaults to False.

**Returns:**

The enhanced dataclass with strict validation on field assignment.

### `validate_typed_dict`[[huggingface_hub.dataclasses.validate_typed_dict]]

Function to validate that a dictionary conforms to the types defined in a `TypedDict` class.

This is the equivalent of dataclass validation, but for `TypedDict`s. Since typed dicts are never instantiated (they are only used by static type checkers), the validation step must be called manually.

#### huggingface_hub.dataclasses.validate_typed_dict[[huggingface_hub.dataclasses.validate_typed_dict]]

[Source](https://github.com/huggingface/huggingface_hub/blob/v1.7.0/src/huggingface_hub/dataclasses.py#L290)

Validate that a dictionary conforms to the types defined in a TypedDict class.

Under the hood, the typed dict is converted to a strict dataclass and validated using the `@strict` decorator.

Example:
```py
>>> from typing import Annotated, TypedDict
>>> from huggingface_hub.dataclasses import validate_typed_dict

>>> def positive_int(value: int):
...     if not value >= 0:
...         raise ValueError(f"Value must be positive, got {value}")

>>> class User(TypedDict):
...     name: str
...     age: Annotated[int, positive_int]

>>> # Valid data
>>> validate_typed_dict(User, {"name": "John", "age": 30})

>>> # Invalid type for age
>>> validate_typed_dict(User, {"name": "John", "age": "30"})
huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
    TypeError: Field 'age' expected int, got str (value: '30')

>>> # Invalid value for age
>>> validate_typed_dict(User, {"name": "John", "age": -1})
huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age':
    ValueError: Value must be positive, got -1
```

**Parameters:**

schema (`type[TypedDictType]`) : The TypedDict class defining the expected structure and types.

data (`dict`) : The dictionary to validate.
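
To make the idea of validating a plain dictionary against a `TypedDict` more concrete, here is a minimal standard-library sketch. `check_typed_dict` is a hypothetical helper, not the library function: it treats every key as required, supports only plain classes and `Annotated[...]` hints, and raises the underlying `TypeError`/`ValueError` directly instead of wrapping it in `StrictDataclassFieldValidationError`.

```python
from typing import Annotated, TypedDict, get_args, get_origin, get_type_hints


def positive_int(value: int) -> None:
    if not value >= 0:
        raise ValueError(f"Value must be positive, got {value}")


class User(TypedDict):
    name: str
    age: Annotated[int, positive_int]


def check_typed_dict(schema, data: dict) -> None:
    """Hypothetical helper: check plain types and run `Annotated` validators."""
    # include_extras=True keeps the Annotated[...] metadata instead of stripping it.
    hints = get_type_hints(schema, include_extras=True)
    for key, hint in hints.items():
        if key not in data:
            raise TypeError(f"Missing key '{key}'")
        validators = []
        if get_origin(hint) is Annotated:
            # First argument is the base type, the rest are the attached validators.
            hint, *validators = get_args(hint)
        if isinstance(hint, type) and not isinstance(data[key], hint):
            raise TypeError(
                f"Field '{key}' expected {hint.__name__}, got {type(data[key]).__name__}"
            )
        for validator in validators:
            validator(data[key])


check_typed_dict(User, {"name": "John", "age": 30})  # passes silently

try:
    check_typed_dict(User, {"name": "John", "age": -1})
except ValueError as err:
    print(err)
```

The type check runs before the custom validators, so a validator such as `positive_int` can assume it receives a value of the annotated type.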

### `as_validated_field`[[huggingface_hub.dataclasses.as_validated_field]]

Decorator to create a `validated_field`. Recommended for fields with a single validator to avoid boilerplate code.

#### huggingface_hub.dataclasses.as_validated_field[[huggingface_hub.dataclasses.as_validated_field]]

[Source](https://github.com/huggingface/huggingface_hub/blob/v1.7.0/src/huggingface_hub/dataclasses.py#L430)

Decorates a validator function as a `validated_field` (i.e. a dataclass field with a custom validator).

**Parameters:**

validator (`Callable`) : A method that takes a value as input and raises ValueError/TypeError if the value is invalid.

### `validated_field`[[huggingface_hub.dataclasses.validated_field]]

Creates a dataclass field with custom validation.

#### huggingface_hub.dataclasses.validated_field[[huggingface_hub.dataclasses.validated_field]]

[Source](https://github.com/huggingface/huggingface_hub/blob/v1.7.0/src/huggingface_hub/dataclasses.py#L387)

Create a dataclass field with a custom validator.

Useful to apply several checks to a field. If only applying one rule, check out the `as_validated_field` decorator.

**Parameters:**

validator (`Callable` or `list[Callable]`) : A method that takes a value as input and raises ValueError/TypeError if the value is invalid. Can be a list of validators to apply multiple checks.

`**kwargs` : Additional arguments to pass to `dataclasses.field()`.

**Returns:**

A field with the validator attached in metadata.
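
As an illustration of what "attached in metadata" can look like, the sketch below stores one or several validators in the `metadata` mapping of a standard `dataclasses.field()` and reads them back with `dataclasses.fields()`. The helper names (`field_with_validators`, `run_validators`) and the `"validators"` metadata key are assumptions made for this example, not the names used inside `huggingface_hub`.

```python
from dataclasses import dataclass, field, fields


def not_empty(value: str) -> None:
    if not value:
        raise ValueError("Value must not be empty")


def lowercase(value: str) -> None:
    if value != value.lower():
        raise ValueError(f"Value must be lowercase, got {value!r}")


def field_with_validators(validators, **kwargs):
    """Hypothetical stand-in for `validated_field`; the metadata key is an assumption."""
    if callable(validators):
        validators = [validators]  # accept a single validator or a list
    return field(metadata={"validators": list(validators)}, **kwargs)


@dataclass
class Repo:
    name: str = field_with_validators([not_empty, lowercase], default="demo")


def run_validators(instance) -> None:
    """Run every validator found in the metadata of each field."""
    for f in fields(instance):
        for validator in f.metadata.get("validators", []):
            validator(getattr(instance, f.name))


run_validators(Repo())  # default value passes both checks

try:
    run_validators(Repo(name="Demo"))
except ValueError as err:
    print(err)
```

Keeping validators in field metadata means the class stays a regular dataclass: tools that do not know about the validators simply ignore the extra metadata.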

### Errors[[huggingface_hub.errors.StrictDataclassError]]

#### huggingface_hub.errors.StrictDataclassError[[huggingface_hub.errors.StrictDataclassError]]

[Source](https://github.com/huggingface/huggingface_hub/blob/v1.7.0/src/huggingface_hub/errors.py#L421)

Base exception for strict dataclasses.

#### huggingface_hub.errors.StrictDataclassDefinitionError[[huggingface_hub.errors.StrictDataclassDefinitionError]]

[Source](https://github.com/huggingface/huggingface_hub/blob/v1.7.0/src/huggingface_hub/errors.py#L425)

Exception thrown when a strict dataclass is defined incorrectly.

#### huggingface_hub.errors.StrictDataclassFieldValidationError[[huggingface_hub.errors.StrictDataclassFieldValidationError]]

[Source](https://github.com/huggingface/huggingface_hub/blob/v1.7.0/src/huggingface_hub/errors.py#L429)

Exception thrown when a strict dataclass fails validation for a given field.

## Why Not Use `pydantic`? (or `attrs`? or `marshmallow_dataclass`?)

- See discussion in https://github.com/huggingface/transformers/issues/36329 regarding adding Pydantic as a dependency. It would be a heavy addition and require careful logic to support both v1 and v2.
- We don't need most of Pydantic's features, especially those related to automatic casting, jsonschema, serialization, aliases, etc.
- We don't need the ability to instantiate a class from a dictionary.
- We don't want to mutate data. In `@strict`, "validation" means "checking if a value is valid." In Pydantic, "validation" means "casting a value, possibly mutating it, and then checking if it's valid."
- We don't need blazing-fast validation. `@strict` isn't designed for heavy loads where performance is critical. Common use cases involve validating a model configuration (performed once and negligible compared to running a model). This allows us to keep the code minimal.
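
The distinction between checking and casting in the list above can be shown in a few lines. Both functions below are hypothetical and deliberately simplified: `check_int` mirrors the check-only meaning of validation described for `@strict`, while `coerce_int` is a generic stand-in for coercion-style validation and does not reproduce Pydantic's actual rules.

```python
def check_int(value) -> None:
    """Check-only validation: reject anything that is not already an int."""
    if not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__} (value: {value!r})")


def coerce_int(value) -> int:
    """Coercion-style validation: convert first, so the result may differ from the input."""
    return int(value)


raw = "30"

converted = coerce_int(raw)  # the string is silently turned into the integer 30

try:
    check_int(raw)  # the string is rejected and left untouched
except TypeError as err:
    print(err)
```

With the check-only approach the caller always keeps the exact object that was passed in, which is the behavior the bullet on not mutating data asks for.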

