Asynchronous Speech-to-Text API for pre-recorded audio, powered by the world's leading speech recognition engine

Rev.ai's asynchronous Speech-to-Text engine unlocks the power of voice for your business. Whether you're extracting insights from audio or transcribing content at scale, get it all done with a platform that is engineered for growth.

Get more out of your audio and video with our unmatched accuracy in an easy-to-use API


Rev.ai serves your industry

Companies use Rev.ai for a multitude of use cases, including business intelligence, market and user research, meeting transcription, and scaling of manual tasks.

Media and entertainment
Caption your videos at scale, increase the accessibility and searchability of your content, and improve video editing efficiency.
Legal and compliance
Use automated speech recognition for digital depositions, eDiscovery, call recording, risk analysis, and court reporting.
Education
Increase the accessibility of your pre-recorded lectures, webinars, and events.
Call centers and analytics
Monitor agent quality, train agents, classify calls, and conduct post-call analytics to improve the customer experience while reducing operational costs.
Languages
Introducing Global Voice Recognition: Rev.ai in four more major world languages. We're bringing everything you already love about our English asynchronous ASR model to four new languages: Spanish, French, German, and Portuguese.
Share additional language requests here.
English
Hello
German
Hallo
French
Bonjour
Spanish
Hola
Portuguese
Olá
Asynchronous speech recognition features

Our global accent model supports major accents from around the world, eliminating the need to pay extra and switch models for different speakers and conversations. We provide you with the best ASR results out-of-the-box, regardless of who is speaking - now in Spanish, French, German, and Portuguese.

Fast delivery
We won't slow you down. Transcribe hour-long files in less than a minute.
Advanced punctuation and capitalization
We use natural language processing to produce transcripts that are highly accurate, fully punctuated, context-aware, and actually readable.
Timestamps
See the start time and end time for each word.
Speaker channel support
Seamlessly process audio on up to eight distinct speaker channels.
Custom vocabulary
Share unique names, industry-specific terminology, and more to improve the accuracy of your transcripts.
Accurate speaker separation
Recognize multiple speakers and attribute text to each in mono- and dual-channel audio.
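
To make the timestamp and speaker separation features concrete, here is a minimal sketch of how a transcript with per-word start and end times might be turned into speaker turns. The transcript shape used here (the `monologues`, `elements`, `ts`, and `end_ts` field names) and the sample data are assumptions for illustration, not something this page specifies; check the documentation for the actual response format.

```python
# Hypothetical transcript shape, assumed for illustration only.
# Each monologue belongs to one speaker; each word carries a start
# time ("ts") and an end time ("end_ts") in seconds.
sample_transcript = {
    "monologues": [
        {"speaker": 0, "elements": [
            {"type": "text", "value": "Hello", "ts": 0.50, "end_ts": 0.92},
            {"type": "punct", "value": ","},
            {"type": "punct", "value": " "},
            {"type": "text", "value": "world", "ts": 1.00, "end_ts": 1.40},
            {"type": "punct", "value": "."},
        ]},
        {"speaker": 1, "elements": [
            {"type": "text", "value": "Hi", "ts": 2.10, "end_ts": 2.35},
            {"type": "punct", "value": "."},
        ]},
    ]
}


def speaker_turns(transcript):
    """Return one (speaker, start, end, text) tuple per monologue."""
    turns = []
    for monologue in transcript["monologues"]:
        elements = monologue["elements"]
        # Only word elements carry timestamps; punctuation does not.
        words = [e for e in elements if e["type"] == "text"]
        if not words:
            continue
        text = "".join(e["value"] for e in elements)
        turns.append(
            (monologue["speaker"], words[0]["ts"], words[-1]["end_ts"], text)
        )
    return turns


for speaker, start, end, text in speaker_turns(sample_transcript):
    print(f"[{start:.2f}-{end:.2f}] Speaker {speaker}: {text}")
```

The same per-word times could drive caption timing or search-by-timestamp, depending on what your application needs.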
Get a transcript in minutes
Once you receive your access token, send us your audio file using the /post endpoint. Retrieve your transcript in minutes with the /get endpoint.
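
The submit-then-retrieve flow above might look roughly like this in Python, using only the standard library. Treat it as a sketch under stated assumptions: the base URL, the `/jobs` and `/jobs/{id}/transcript` paths, the `media_url` field, and the bearer-token header are not given on this page, so confirm them against the documentation before relying on them. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; confirm against the official documentation.
BASE_URL = "https://api.rev.ai/speechtotext/v1"


def build_submit_request(access_token, media_url):
    """Build (but do not send) a request that submits audio by URL."""
    # Assumption: the job is described by a JSON body with a "media_url" field.
    body = json.dumps({"media_url": media_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/jobs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def build_transcript_request(access_token, job_id):
    """Build (but do not send) a request that fetches a finished transcript."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/transcript",
        method="GET",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "text/plain",  # assumption: plain text is an accepted format
        },
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example usage (requires a real access token and network access):
# job = json.loads(send(build_submit_request("YOUR_TOKEN", "https://example.com/audio.mp3")))
# print(send(build_transcript_request("YOUR_TOKEN", job["id"])))
```

Because the job is asynchronous, the transcript is only available once processing finishes, so a real client would wait or check the job's status before the second call.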
Need more help? Check out our SDKs or chat with one of our technical experts.
Explore Documentation
Python
Node
curl