Diffusers 0.38.0: New Pipelines and Flash Attention 4 Infrastructure
The Diffusers 0.38.0 release introduces three new pipelines (Ace-Step 1.5, LongCat-AudioDiT, Ernie-Image), expanding its audio capabilities, alongside performance infrastructure: Flash Attention 4 support, the FlashPack loading mechanism, and Ring Anything as a new backend for context parallelism. It also includes a practical example of profiling a DiffusionPipeline to identify optimization opportunities. The release adds concrete technical features rather than incremental updates.
Integration Strategy
When to Use This?
Ideal for:
- Audio generation pipelines requiring DiT-based architectures (LongCat-AudioDiT)
- High-resolution image generation workloads exceeding single-GPU memory
- Projects migrating from Diffusers 0.37.x seeking inference optimizations
- Applications requiring Baidu Ernie model family integration (Ernie-Image)
Less relevant for:
- Simple text-to-image tasks adequately served by existing Stable Diffusion pipelines
- CPU-only deployment scenarios (Flash Attention 4 requires compatible CUDA hardware)
How to Integrate?
# Shell: upgrade to the latest Diffusers release
pip install --upgrade diffusers
# Python: verify the installed version
import diffusers
print(diffusers.__version__)  # Expected: 0.38.0
# Flash Attention 4 may be auto-detected on compatible hardware; check the release notes for the exact opt-in
# FlashPack loading presumably goes through the standard DiffusionPipeline.from_pretrained() call, possibly with new parameters
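The version check above can be turned into a small guard that fails early when an older release is installed. The following is a minimal sketch using only the standard library. The helper names (`parse_release`, `meets_minimum`, `installed_diffusers_version`) are illustrative and not part of Diffusers, and the comparison looks only at the leading numeric components of the version string, which is a simplification.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading purely numeric components of a version string."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at the first component with a non-numeric suffix
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, required: str = "0.38.0") -> bool:
    """True when the installed release is at least the required one."""
    return parse_release(installed) >= parse_release(required)


def installed_diffusers_version() -> Optional[str]:
    """Return the installed diffusers version, or None if it is not installed."""
    try:
        return metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_diffusers_version()
    if found is None:
        print("diffusers is not installed")
    elif meets_minimum(found):
        print(f"diffusers {found} is new enough")
    else:
        print(f"diffusers {found} is older than 0.38.0; upgrade first")
```

Reading the version through `importlib.metadata` avoids importing the library itself, so the guard stays cheap enough to run at application startup.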
Migration Complexity: Low. The new features appear to be additive, so existing pipelines should remain compatible. Flash Attention 4 support likely requires hardware with compute capability ≥ 8.0 (Ampere or newer); this threshold is an estimate and should be confirmed against the release notes.
SDK Availability: Native Diffusers library (pip-installable). No separate SDK required.
Compatibility
| Component | Status |
|---|---|
| PyTorch | Expected 2.0+ (assumed Flash Attention 4 requirement, not confirmed) |
| CUDA | Compute capability 8.0+ recommended |
| Diffusers core API | Backward compatible with 0.37.x |
| Existing SDXL/SD pipelines | Likely compatible (not explicitly confirmed) |
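The hardware row in the table can be checked before enabling GPU-specific features. This is a rough sketch, assuming the 8.0 compute capability threshold quoted above, which is itself an estimate rather than a confirmed requirement. `capability_ok` and `detect_capability` are hypothetical helper names; the only library call used is PyTorch's `torch.cuda.get_device_capability`, and the script degrades gracefully when PyTorch or a CUDA device is absent.

```python
from typing import Optional, Tuple

# Threshold taken from the compatibility table; an estimate, not a confirmed requirement.
MIN_CAPABILITY = (8, 0)


def capability_ok(capability: Tuple[int, int],
                  minimum: Tuple[int, int] = MIN_CAPABILITY) -> bool:
    """True when a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= tuple(minimum)


def detect_capability() -> Optional[Tuple[int, int]]:
    """Return the compute capability of CUDA device 0, or None if unavailable."""
    try:
        import torch  # optional dependency for this check
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    elif capability_ok(cap):
        print(f"Compute capability {cap[0]}.{cap[1]} meets the assumed minimum.")
    else:
        print(f"Compute capability {cap[0]}.{cap[1]} is below the assumed minimum.")
```

Keeping the comparison in a pure function separates the policy (the threshold) from detection, so the threshold can be adjusted once the actual requirement is confirmed.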
Source: @huggingface
Reference: Diffusers 0.38.0 Release (GitHub/Hub)
Published: 2026
DevRadar Analysis Date: 2026-05-13