Whisper
Open-source AI-powered speech recognition system developed by OpenAI.
Model variants
Fast automatic speech recognition with word-level timestamps and speaker diarization.
Faster reimplementation of Whisper using CTranslate2.
JAX implementation of Whisper for up to 70x speed-up on TPU.
Adds word-level timestamps and confidence scores.
Whisper running on OpenVINO.
Whisper running on TensorFlow Lite.
Whisper variant that recognizes non-speech audio events in addition to speech.
Apps
Audio transcription and translation macOS app.
Local speech-to-text transcription for macOS and Windows with system-wide dictation.
Audio transcription Linux app.
Android app for transcription and translation. (FOSS)
Dictation and transcription macOS app. (FOSS)
AI voice dictation for Mac. (FOSS)
Dictation app for macOS. (FOSS)
Web apps
Self-hosted
CLI tools
YouTube subtitle generation.
Generate captions for videos.
Standalone Windows executable for Whisper and Faster Whisper.
Whisper command-line tool based on CTranslate2, compatible with the original.
Achieves transcription speeds near 30x real time through several optimizations.
Automatic speech recognition with speaker diarization.
On-device speech-to-text CLI using faster-whisper with automatic clipboard copy.