Mistral Mini Transcribe 2: Open-Weight Speech-to-Text at $0.003/min
Mistral AI launches Mini Transcribe 2, a compact speech-to-text model with API access at $0.003/min and a realtime tier at $0.006/min. The model ships with open weights, enabling developers to self-host or deploy locally without vendor lock-in. This positions Mini Transcribe 2 as a cost-effective alternative to proprietary ASR services for developers prioritizing flexibility and transparency.
Integration Strategy
When to Use This?
Strong Fit:
- Application embedding: Any product requiring transcription as a feature (note-taking apps, video platforms, accessibility tools)
- Data pipelines: Batch transcription of recorded audio with cost-sensitive volume requirements
- Domain-specific deployment: Healthcare, legal, or financial applications requiring local data processing
- Offline/captive network: Mobile apps, desktop tools, or enterprise environments with restricted internet access
- Cost optimization: Teams currently using premium ASR services looking to reduce per-minute costs
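For the cost-optimization case, the savings are easy to estimate up front. The sketch below uses the announced $0.003/min rate; the "premium" rate is a hypothetical placeholder, so substitute your current provider's pricing.

```python
# Back-of-envelope cost comparison for batch transcription.
# $0.003/min comes from the announcement; the premium rate below is
# an illustrative placeholder, not a quote from any specific vendor.

MINI_TRANSCRIBE_2_PER_MIN = 0.003  # USD, from the announcement
PREMIUM_ASR_PER_MIN = 0.006        # USD, hypothetical comparison rate

def monthly_cost(hours_per_month: float, rate_per_min: float) -> float:
    """Total transcription spend for a given monthly audio volume."""
    return hours_per_month * 60 * rate_per_min

hours = 10_000  # e.g., a podcast platform's monthly ingest
baseline = monthly_cost(hours, PREMIUM_ASR_PER_MIN)
candidate = monthly_cost(hours, MINI_TRANSCRIBE_2_PER_MIN)
print(f"premium: ${baseline:,.0f}/mo, mini-transcribe-2: ${candidate:,.0f}/mo")
print(f"savings: ${baseline - candidate:,.0f}/mo")
```

At these example rates the per-minute price halves, so savings scale linearly with audio volume.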
Consider Alternatives If:
- Maximum accuracy on challenging audio (multiple speakers, heavy accents, technical jargon) is paramount; larger models like Whisper-large or cloud services may perform better
- Real-time conversational AI with sub-500ms latency is required (verify realtime tier specs against your SLA needs)
How to Integrate?
API Integration Path:
- Access the Mistral console at console.mistral.ai/build/audio/speech-to-text
- Generate API credentials
- Submit audio via REST API or official SDK
Note: SDK availability and language support are not confirmed in the announcement. Check Mistral's documentation for Python, JavaScript, and other SDK options upon release.
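Until an SDK is confirmed, a raw REST call is the safest assumption. The sketch below only constructs the request; the endpoint path and the `audio_url`/`model` field names are guesses modeled on typical speech-to-text APIs, so verify them against Mistral's API reference before sending anything.

```python
# Hypothetical REST call construction -- endpoint path and request
# field names are assumptions, not confirmed by the announcement.
import json
import urllib.request

API_KEY = "your-key"
ENDPOINT = "https://api.mistral.ai/v1/audio/transcriptions"  # assumed path

def build_transcription_request(audio_url: str) -> urllib.request.Request:
    """Construct (but do not send) the HTTP request for one recording."""
    body = json.dumps({
        "model": "mini-transcribe-2",
        "audio_url": audio_url,  # assumed field name
    }).encode()
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_transcription_request("gs://bucket/recording.wav")
print(req.get_method(), req.full_url)
# Send with urllib.request.urlopen(req) once credentials are configured.
```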
Self-Hosting Path:
- Download model weights from Mistral's model hub (specific location not specified in source)
- Deploy using compatible inference infrastructure (llama.cpp, vLLM, or Mistral's own deployment toolkit if provided)
- Hardware requirements: Likely 4-8GB VRAM for a compact model, enabling single-GPU deployment
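The announcement does not state a parameter count, so the 4-8GB figure is an estimate. The arithmetic behind it can be sanity-checked as below, using assumed model sizes that bracket what "compact" usually means.

```python
# Rough VRAM sizing for self-hosting. The parameter counts are
# assumptions (model size is not stated in the announcement); the
# formula shows how the 4-8GB single-GPU estimate is derived.

def vram_estimate_gb(params_billion: float, bytes_per_param: int = 2,
                     overhead: float = 1.3) -> float:
    """Weights in fp16 plus ~30% headroom for activations and buffers."""
    weights_gb = params_billion * 1e9 * bytes_per_param / 1024**3
    return weights_gb * overhead

# Hypothetical sizes bracketing a "compact" model:
for size in (1.5, 3.0):
    print(f"{size}B params (fp16): ~{vram_estimate_gb(size):.1f} GB VRAM")
```

Quantized (int8/int4) weights would roughly halve or quarter these numbers, which is why even the upper bound stays comfortably on a single consumer GPU.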
Typical Integration Code Pattern (inferred):

# Conceptual API usage (verify against official documentation)
import mistral

client = mistral.AudioClient(api_key="your-key")

# Standard transcription
result = client.transcribe(
    audio_url="gs://bucket/recording.wav",
    model="mini-transcribe-2",
)

# Realtime streaming
for chunk in client.stream_transcribe(
    audio_stream=microphone_input,
    model="mini-transcribe-2-realtime",
):
    print(chunk.text)
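For the batch-pipeline use case, long recordings typically need splitting before upload. Per-request duration limits are not stated in the announcement, so the 60-second chunk size below is an arbitrary example; the sketch uses only Python's standard-library `wave` module.

```python
# Split a long WAV recording into independent chunk files before
# uploading. The 60s chunk length is an arbitrary example -- the
# API's actual per-request limit is not stated in the announcement.
import io
import wave

def split_wav(data: bytes, chunk_seconds: int = 60) -> list[bytes]:
    """Split a WAV file into standalone files of at most chunk_seconds."""
    with wave.open(io.BytesIO(data)) as src:
        params = src.getparams()
        frames_per_chunk = params.framerate * chunk_seconds
        chunks = []
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                dst.setparams(params)
                dst.writeframes(frames)
            chunks.append(buf.getvalue())
    return chunks

# Demo: synthesize a 150-second silent mono file at 16 kHz and split it.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16_000)
    w.writeframes(b"\x00\x00" * 16_000 * 150)
chunks = split_wav(buf.getvalue())
print(len(chunks))  # 3 chunks: 60s + 60s + 30s
```

Each chunk is a complete WAV file, so chunks can be submitted and retried independently.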
Compatibility
Inference Infrastructure:
- Self-hosted deployment should be compatible with standard LLM serving frameworks
- ONNX export likely supported given Mistral's historical pattern
- GPU acceleration: CUDA 11.8+ expected for optimal performance
API Integration:
- REST API ensures language-agnostic compatibility
- WebSocket support implied for realtime tier
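Since WebSocket transport is implied rather than documented, any message schema is a guess. The framing sketch below shows one plausible client-side shape (a JSON session-start message followed by base64-encoded PCM chunks); treat every field name here as an assumption to adapt once the realtime protocol is published.

```python
# Plausible client-side framing for the realtime tier. The message
# types and field names are entirely assumed -- a shape to adapt
# against the published protocol, not a specification.
import base64
import json

def session_start(model: str = "mini-transcribe-2-realtime",
                  sample_rate: int = 16_000) -> str:
    """First message: declare the model and audio format (assumed schema)."""
    return json.dumps({
        "type": "session.start",
        "model": model,
        "audio": {"encoding": "pcm16", "sample_rate": sample_rate},
    })

def audio_frame(pcm: bytes) -> str:
    """Subsequent messages: base64-encoded PCM chunks (assumed schema)."""
    return json.dumps({
        "type": "audio.chunk",
        "data": base64.b64encode(pcm).decode(),
    })

start = json.loads(session_start())
frame = json.loads(audio_frame(b"\x00\x00" * 320))  # 20 ms at 16 kHz
print(start["model"], frame["type"])
```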
Source: @MistralAI
Reference: Console announcement (link in original tweet)
Published: April 2026
DevRadar Analysis Date: 2026-04-24