DevRadar
🌐 Mistral AI · Significant

Mistral Mini Transcribe 2: Open-Weight Speech-to-Text at $0.003/min

Mistral releases Mini Transcribe 2, a speech-to-text model available via API at $0.003/min with a realtime tier at $0.006/min. The model ships with open weights, enabling self-hosting and local deployment. Console link provided for API access.

Mistral AI · Friday, April 24, 2026 · Original source


Summary

Mistral AI launches Mini Transcribe 2, a compact speech-to-text model with API access at $0.003/min and a realtime tier at $0.006/min. The model ships with open weights, enabling developers to self-host or deploy locally without vendor lock-in. This positions Mini Transcribe 2 as a cost-effective alternative to proprietary ASR services for developers prioritizing flexibility and transparency.
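At these rates, monthly spend is easy to estimate up front. A quick back-of-the-envelope calculation (the $0.003/min and $0.006/min figures come from the announcement; the volume is illustrative):

```python
# Estimate monthly transcription cost at Mini Transcribe 2's announced rates.
BATCH_RATE = 0.003     # $/min, standard tier (from the announcement)
REALTIME_RATE = 0.006  # $/min, realtime tier (from the announcement)

def monthly_cost(hours_per_month: float, rate_per_min: float) -> float:
    """Dollar cost for a given monthly audio volume at a per-minute rate."""
    return hours_per_month * 60 * rate_per_min

# 1,000 hours of recorded audio per month:
print(monthly_cost(1000, BATCH_RATE))     # 1000 h * 60 min * $0.003 = $180
print(monthly_cost(1000, REALTIME_RATE))  # $360 on the realtime tier
```

Even at four-figure monthly hours, batch transcription stays in the hundreds of dollars, which is the core of the cost-optimization pitch above.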

Integration Strategy

When to Use This?

Strong Fit:

  • Application embedding: Any product requiring transcription as a feature (note-taking apps, video platforms, accessibility tools)
  • Data pipelines: Batch transcription of recorded audio with cost-sensitive volume requirements
  • Domain-specific deployment: Healthcare, legal, or financial applications requiring local data processing
  • Offline/captive network: Mobile apps, desktop tools, or enterprise environments with restricted internet access
  • Cost optimization: Teams currently using premium ASR services looking to reduce per-minute costs

Consider Alternatives If:

  • Maximum accuracy on challenging audio (multiple speakers, heavy accents, technical jargon) is paramount; larger models such as Whisper-large or managed cloud services may perform better
  • Real-time conversational AI with sub-500ms latency is required (verify realtime tier specs against your SLA needs)

How to Integrate?

API Integration Path:

  1. Access the Mistral console at console.mistral.ai/build/audio/speech-to-text
  2. Generate API credentials
  3. Submit audio via REST API or official SDK

Note: SDK availability and language support are not confirmed in the announcement. Check Mistral's documentation for Python, JavaScript, and other SDK options upon release.
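Until SDK details land, a raw HTTP request is the safest integration assumption. A minimal sketch of the steps above: the endpoint path, multipart field names, and response shape here are guesses, not documented API, so verify every name against Mistral's docs:

```python
API_KEY = "your-key"
# Hypothetical endpoint: the real path is not given in the announcement.
ENDPOINT = "https://api.mistral.ai/v1/audio/transcriptions"

def build_request(audio_bytes: bytes,
                  filename: str = "audio.wav",
                  model: str = "mini-transcribe-2"):
    """Assemble headers and a multipart payload for a transcription request.

    Field names ("file", "model") mirror common ASR APIs and are assumptions.
    """
    headers = {"Authorization": f"Bearer {API_KEY}"}
    files = {"file": (filename, audio_bytes)}
    data = {"model": model}
    return headers, files, data

# Sending the request (requires the third-party `requests` package):
# import requests
# headers, files, data = build_request(open("meeting.wav", "rb").read())
# resp = requests.post(ENDPOINT, headers=headers, files=files, data=data)
# print(resp.json().get("text"))
```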

Self-Hosting Path:

  1. Download model weights from Mistral's model hub (specific location not specified in source)
  2. Deploy using compatible inference infrastructure (llama.cpp, vLLM, or Mistral's own deployment toolkit if provided)
  3. Hardware requirements: Likely 4-8GB VRAM for a compact model, enabling single-GPU deployment
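If the weights follow Mistral's usual release pattern (Hugging Face Hub), loading them through a standard ASR pipeline is one plausible self-hosting path. Everything here is an assumption: the `mistralai/mini-transcribe-2` model id, Transformers compatibility, and the pipeline task are all unconfirmed:

```python
def load_transcriber(model_id: str = "mistralai/mini-transcribe-2"):
    """Build a local speech-to-text pipeline.

    The model id is a placeholder; check Mistral's model hub for the real one.
    Requires the third-party `transformers` package (pip install transformers).
    """
    from transformers import pipeline
    return pipeline(
        "automatic-speech-recognition",  # assumed task type for this model
        model=model_id,
        device_map="auto",  # a compact model should fit on a single GPU
    )

# Hypothetical usage once weights are published:
# asr = load_transcriber()
# print(asr("recording.wav")["text"])
```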

Typical Integration Code Pattern (inferred):

# Conceptual API usage (verify against official documentation)
import mistral

client = mistral.AudioClient(api_key="your-key")

# Standard transcription
result = client.transcribe(
    audio_url="gs://bucket/recording.wav",
    model="mini-transcribe-2"
)

# Realtime streaming
for chunk in client.stream_transcribe(
    audio_stream=microphone_input,
    model="mini-transcribe-2-realtime"
):
    print(chunk.text)

Compatibility

Inference Infrastructure:

  • Self-hosted deployment should be compatible with standard LLM serving frameworks
  • ONNX export may follow, consistent with prior Mistral releases, though this is unconfirmed
  • GPU acceleration: CUDA 11.8+ expected for optimal performance

API Integration:

  • REST API ensures language-agnostic compatibility
  • WebSocket support implied for realtime tier
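If the realtime tier does use WebSockets, a client along these lines is one plausible shape. The URL, auth header, and message schema below are all guesses pending official docs:

```python
# Hypothetical realtime client: endpoint URL, auth scheme, and message
# format are assumptions, not documented by Mistral.
import json

async def stream_transcribe(audio_chunks, api_key: str):
    """Send audio chunks over a WebSocket and yield partial transcripts.

    Requires the third-party `websockets` package (pip install websockets).
    """
    import websockets

    url = "wss://api.mistral.ai/v1/audio/transcriptions/realtime"  # placeholder
    async with websockets.connect(
        url, additional_headers={"Authorization": f"Bearer {api_key}"}
    ) as ws:
        for chunk in audio_chunks:
            await ws.send(chunk)                 # raw audio bytes
            msg = json.loads(await ws.recv())
            yield msg.get("text", "")            # assumed response field

# Hypothetical usage:
# import asyncio
# async def main():
#     async for text in stream_transcribe(mic_chunks(), "your-key"):
#         print(text)
# asyncio.run(main())
```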

Source: @MistralAI · Reference: Console announcement (link in original tweet) · Published: April 2026 · DevRadar Analysis Date: 2026-04-24