DevRadar
🤗 HuggingFace | Significant

Diffusers 0.38.0: New Pipelines and Flash Attention 4 Infrastructure

The Diffusers 0.38.0 release introduces several new pipelines (Ace-Step 1.5, LongCat-AudioDiT, Ernie-Image) alongside performance infrastructure: Flash Attention 4 support, the FlashPack loading mechanism, and Ring Anything as a new backend for context parallelism. It also includes a practical example for profiling a DiffusionPipeline to identify optimization opportunities. The release is substantial, adding concrete technical features rather than incremental updates.

Sayak Paul | Wednesday, May 13, 2026


Summary

Diffusers 0.38.0 adds three new diffusion pipelines (Ace-Step 1.5, LongCat-AudioDiT, Ernie-Image) with expanded audio capabilities, alongside critical performance infrastructure including Flash Attention 4 support, FlashPack loading optimization, and Ring Anything as a context parallelism backend. The release also includes a profiling example to help developers identify optimization opportunities in their pipelines.
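
The release's own profiling example is not reproduced here. As a rough illustration of the underlying idea (measure each stage, then see where the time goes), the sketch below times named stages using only the Python standard library. The `StageTimer` class and the stage names are hypothetical and are not part of Diffusers. For GPU work the device would also need to be synchronized before reading the clock (for example with `torch.cuda.synchronize()`), and a real investigation would more likely rely on `torch.profiler`.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Hypothetical helper: accumulates wall-clock time per named stage."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the stage raises.
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def slowest(self):
        """Return the stage name with the largest accumulated time."""
        return max(self.totals, key=self.totals.get)


if __name__ == "__main__":
    timer = StageTimer()
    # Stand-ins for pipeline stages; replace with real calls such as
    # text encoding, the denoising loop, and VAE decoding.
    with timer.stage("text_encode"):
        time.sleep(0.01)
    with timer.stage("denoise"):
        time.sleep(0.05)
    print(timer.slowest())
```

Whichever stage dominates is the natural first candidate for the attention-backend or loading optimizations described in this release.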

Integration Strategy

When to Use This?

Ideal for:

  • Audio generation pipelines requiring DiT-based architectures (LongCat-AudioDiT)
  • High-resolution image generation workloads exceeding single-GPU memory (Ring Anything context parallelism)
  • Projects migrating from Diffusers 0.37.x seeking inference optimizations (Flash Attention 4, FlashPack)
  • Applications requiring Baidu Ernie model family integration (Ernie-Image)

Less relevant for:

  • Simple text-to-image tasks adequately served by existing Stable Diffusion pipelines
  • CPU-only deployment scenarios (Flash Attention 4 requires compatible CUDA hardware)

How to Integrate?

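# Upgrade Diffusers (shell)
pip install --upgrade diffusers

# Verify the installed version (shell; expect 0.38.0 or later)
python -c "import diffusers; print(diffusers.__version__)"

# Flash Attention 4: assumed to be auto-detected on compatible hardware; this is
# not confirmed, so check the release notes for how the backend is selected.
# FlashPack: loading is expected to go through the standard pipeline.from_pretrained(),
# possibly with new parameters; the exact arguments are not given here.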
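
For scripts that depend on the 0.38.0 features, the version check can be automated. The following is a minimal sketch using only the standard library; the helper names (`parse_release`, `meets_minimum`, `installed_diffusers_version`) are hypothetical, and pre-release suffixes are handled crudely, so `packaging.version.Version` would be the more robust choice in production code.

```python
from importlib import metadata


def parse_release(version):
    """Turn '0.38.0' into (0, 38, 0); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum="0.38.0"):
    """Return True if the installed release is at least the minimum."""
    return parse_release(installed) >= parse_release(minimum)


def installed_diffusers_version():
    """Return the installed diffusers version string, or None if absent."""
    try:
        return metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_diffusers_version()
    if found is None:
        print("diffusers is not installed")
    elif meets_minimum(found):
        print(f"diffusers {found} is new enough")
    else:
        print(f"diffusers {found} is older than 0.38.0; upgrade first")
```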

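Migration Complexity: Low. New features appear to be additive, and existing pipelines should remain compatible. Flash Attention 4 support likely requires hardware with compute capability ≥8.0 (Ampere or newer), but this threshold is an estimate: upstream Flash Attention 4 work has so far focused on newer architectures such as Hopper and Blackwell, so confirm the requirement against the release notes.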
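
The hardware condition above can be checked before enabling any accelerated attention path. The sketch below assumes the 8.0 threshold quoted in this article, which is an estimate and not a confirmed requirement. The helper names are hypothetical; `torch.cuda.is_available()` and `torch.cuda.get_device_capability()` are existing PyTorch calls, and the code falls back to `None` when PyTorch or a CUDA device is missing.

```python
def capability_at_least(capability, minimum=(8, 0)):
    """Compare (major, minor) compute capability tuples.

    The default minimum of (8, 0) is this article's estimate only.
    """
    return tuple(capability) >= tuple(minimum)


def current_gpu_capability():
    """Return (major, minor) for the default CUDA device, or None."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = current_gpu_capability()
    if cap is None:
        print("No CUDA device detected; stay on the default attention path")
    elif capability_at_least(cap):
        print(f"Compute capability {cap[0]}.{cap[1]} meets the assumed minimum")
    else:
        print(f"Compute capability {cap[0]}.{cap[1]} is below the assumed minimum")
```

Keeping the threshold as a parameter makes it easy to adjust once the actual requirement is confirmed.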

SDK Availability: Native Diffusers library (pip-installable). No separate SDK required.

Compatibility

Component | Status
PyTorch | Expected 2.0+ (Flash Attention 4 requirement)
CUDA | Compute capability 8.0+ recommended
Diffusers core API | Backward compatible with 0.37.x
Existing SDXL/SD pipelines | Likely compatible (not explicitly confirmed)

Source: @huggingface
Reference: Diffusers 0.38.0 Release (GitHub/Hub)
Published: 2026
DevRadar Analysis Date: 2026-05-13