Diffusers documentation
ErnieImageTransformer2DModel
You are viewing the main version, which requires installation from source. If you'd like
a regular pip install, check out the latest stable version (v0.38.0).
A Transformer model for image-like data from ERNIE-Image and ERNIE-Image-Turbo.
ErnieImageTransformer2DModel
class diffusers.ErnieImageTransformer2DModel(
    hidden_size: int = 3072,
    num_attention_heads: int = 24,
    num_layers: int = 24,
    ffn_hidden_size: int = 8192,
    in_channels: int = 128,
    out_channels: int = 128,
    patch_size: int = 1,
    text_in_dim: int = 2560,
    rope_theta: int = 256,
    rope_axes_dim: typing.Tuple[int, int, int] = (32, 48, 48),
    eps: float = 1e-06,
    qk_layernorm: bool = True,
)
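The default values in the signature above are related by simple arithmetic, which the following sketch makes explicit. It does not import diffusers: `ErnieImageTransformerDefaults` is a hypothetical helper that only mirrors the documented defaults, and the `head_dim` property assumes the common convention that the hidden size is split evenly across attention heads. Under that assumption, the per-head dimension matches the sum of `rope_axes_dim`, which suggests (but does not confirm) that the three rotary axes together cover each head.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ErnieImageTransformerDefaults:
    """Hypothetical helper mirroring the documented defaults; not part of diffusers."""

    hidden_size: int = 3072
    num_attention_heads: int = 24
    num_layers: int = 24
    ffn_hidden_size: int = 8192
    in_channels: int = 128
    out_channels: int = 128
    patch_size: int = 1
    text_in_dim: int = 2560
    rope_theta: int = 256
    rope_axes_dim: Tuple[int, int, int] = (32, 48, 48)
    eps: float = 1e-06
    qk_layernorm: bool = True

    @property
    def head_dim(self) -> int:
        # Assumption: hidden_size is divided evenly across attention heads.
        return self.hidden_size // self.num_attention_heads


cfg = ErnieImageTransformerDefaults()

# Per-head dimension implied by the defaults: 3072 // 24.
print(cfg.head_dim)  # 128

# Sum of the three rotary axis dimensions: 32 + 48 + 48.
print(sum(cfg.rope_axes_dim))  # 128
```

A check like this can be useful before overriding `hidden_size`, `num_attention_heads`, or `rope_axes_dim` in a custom configuration, since changing one without the others would break the equality shown here. Whether the model itself enforces that equality is not stated in this reference.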