DevRadar
🤗 HuggingFace · Significant

Datadog Toto 2.0: Open-Weights Time Series Foundation Model

Datadog released Toto 2.0, an open-weights time series foundation model (TSFM) family spanning 4M to 2.5B parameters under the Apache 2.0 license. The key differentiator is a demonstration of scaling laws in time series: unlike other TSFM families, where different model sizes perform similarly, Toto 2.0 shows consistent improvement with scale from a single hyperparameter configuration. The family achieves state-of-the-art results on the BOOM, GIFT-Eval, and TIME benchmarks, and weights for both the smallest (4M) and largest (2.5B) variants are available on Hugging Face. This is a significant advance for time series ML, providing the predictable compute/data/parameter/performance relationship that enabled progress in NLP and computer vision.

clem 🤗 · Thursday, May 14, 2026 · Original source


Summary

Datadog's Toto 2.0 is an open-weights time series foundation model family (4M–2.5B parameters) released under Apache 2.0. Unlike prior TSFM families where model sizes perform similarly, Toto 2.0 demonstrates genuine scaling laws—each size consistently outperforms the last from a single hyperparameter configuration. State-of-the-art on BOOM, GIFT-Eval, and TIME benchmarks.

Integration Strategy

When to Use This?

Toto 2.0 is positioned for practitioners who need general-purpose time series understanding across diverse domains:

  • Infrastructure monitoring: Anomaly detection, metric forecasting, capacity planning
  • Financial time series: Price prediction, volatility modeling, algorithmic trading signals
  • Industrial IoT: Sensor anomaly detection, predictive maintenance, process optimization
  • Healthcare: Patient monitoring, clinical time series analysis
  • Retail/e-commerce: Demand forecasting, inventory optimization

Open-weights availability under the permissive Apache 2.0 license makes Toto 2.0 suitable for commercial applications, subject only to the license's standard attribution and notice requirements.
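As a sketch of the anomaly-detection pattern a forecasting model like this enables: compare observed values against model forecasts and flag large residuals. The `flag_anomalies` helper, the z-score threshold, and the sample series below are illustrative assumptions, not part of Toto's API.

```python
import statistics

def flag_anomalies(observed, forecast, z_thresh=2.0):
    """Flag indices where |observed - forecast| deviates from the mean
    residual by more than z_thresh (population) standard deviations.
    Generic pattern for pairing any forecaster with threshold-based
    anomaly detection; robust statistics (median/MAD) are often
    preferable in production."""
    residuals = [o - f for o, f in zip(observed, forecast)]
    mu = statistics.mean(residuals)
    sigma = statistics.pstdev(residuals) or 1.0  # avoid div-by-zero on flat series
    return [i for i, r in enumerate(residuals) if abs(r - mu) / sigma > z_thresh]

obs = [10.0, 10.2, 9.9, 10.1, 25.0, 10.0]
fc  = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
print(flag_anomalies(obs, fc))  # → [4]
```

In practice the forecast array would come from the model's prediction over a rolling window, with the threshold tuned to the acceptable false-positive rate.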

How to Integrate?

Access Points:

  • HuggingFace: Datadog/Toto-2.0-4m and Datadog/Toto-2.0-2.5B
  • Blogpost: datadoghq.com/blog/ai/toto-2
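Since Toto 2.0's inference API is not documented here, the most that can be sketched with confidence is fetching the weights via the standard `huggingface_hub` client. The wrapper function below is an assumption for illustration; the repo ids are those listed above.

```python
def download_toto_weights(repo_id="Datadog/Toto-2.0-4m", revision=None):
    """Download a model snapshot from the Hugging Face Hub and return
    the local directory path. Requires `pip install huggingface_hub`;
    the import is deferred so the module loads without the dependency."""
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, revision=revision)

# Usage (performs a network download):
# local_dir = download_toto_weights("Datadog/Toto-2.0-2.5B")
```

How the downloaded weights are loaded and invoked for forecasting depends on the (undocumented) model card; consult it before building serving code.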

Practical Integration Considerations:

  1. Model selection: The presence of genuine scaling laws means practitioners can reliably predict quality vs. latency tradeoffs across sizes. Start with the smallest model that meets your quality requirements.

  2. Fine-tuning path: A single hyperparameter configuration across sizes suggests consistent initialization and training procedures, which is valuable for practitioners adapting the models to domain-specific data.

  3. API complexity: Not publicly documented at time of analysis. Evaluate SDK availability and inference serving requirements against your infrastructure.

  4. Commercial use: The Apache 2.0 license permits commercial use, modification, and redistribution, provided the license text and notices are preserved; it also includes an express patent grant.
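The model-selection guidance in point 1 can be expressed as a simple policy: because error falls monotonically with scale, pick the smallest model that meets your quality target. The candidate names and error figures below are hypothetical placeholders, not measured Toto 2.0 results.

```python
# Hypothetical (name, parameter count, benchmark error) triples -- illustrative only.
CANDIDATES = [
    ("toto-4m", 4e6, 0.92),
    ("toto-150m", 1.5e8, 0.84),
    ("toto-2.5b", 2.5e9, 0.78),
]

def smallest_meeting_target(candidates, max_error):
    """Return the name of the smallest model whose error is at or below
    the target, relying on the scaling-law property that error falls
    monotonically with parameter count."""
    for name, params, error in sorted(candidates, key=lambda c: c[1]):
        if error <= max_error:
            return name
    return None  # nothing qualifies; consider the largest model or fine-tuning

print(smallest_meeting_target(CANDIDATES, 0.85))  # → toto-150m
```

Without genuine scaling laws, this greedy policy would be unsafe: a mid-size model could underperform a smaller one, forcing exhaustive evaluation of every size.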

Compatibility

Not Specified in Available Documentation:

  • PyTorch/TensorFlow support
  • CUDA requirements
  • ONNX export availability
  • Serving framework integration (vLLM, TensorRT, etc.)
  • Minimum hardware requirements

Practitioners should consult the HuggingFace model cards and Datadog blogpost for implementation-specific guidance.

Source: @huggingface · Reference: Datadog Toto 2.0 Blog Post · Published: November 2025 · DevRadar Analysis Date: 2026-05-14