Accurate Speech-to-Text APIs for all of your speech recognition needs

Our suite of speech-to-text APIs allows businesses to build powerful downstream applications. We train our speech engine on 50,000+ hours of human-transcribed content from a wide range of topics, industries, and accents. The result? You get access to the most accurate speech recognition products on the market.

Get more out of your audio and video with our unmatched accuracy in an easy-to-use API.


Pay as you go
$0.035 / minute
Rounded to the nearest 15 seconds
  • Your first 5 hours free
  • No usage limits
  • Handles all media types
  • Email, chat, and phone support

Contact us
  • For large volumes
  • Custom pricing
  • Dedicated account reps
  • Priority support
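
To get a feel for the pay-as-you-go pricing, here is a minimal sketch that estimates the charge for a single file. It uses only the $0.035 per minute rate and the 15-second rounding listed above. Everything else is an assumption for illustration: the function names are hypothetical, rounding is applied per file, exact midpoints round up, and the 5 free hours are not modeled.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_MINUTE = Decimal("0.035")  # pay-as-you-go price listed above
INCREMENT_SECONDS = 15              # billing granularity listed above


def billable_seconds(duration_seconds: float) -> int:
    """Round a duration to the nearest 15-second increment.

    Assumption: exact midpoints (for example 7.5 s) round up.
    """
    increments = (Decimal(str(duration_seconds)) / INCREMENT_SECONDS).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    return int(increments) * INCREMENT_SECONDS


def estimate_cost(duration_seconds: float) -> Decimal:
    """Estimated charge in dollars for one file, ignoring the free hours."""
    minutes = Decimal(billable_seconds(duration_seconds)) / 60
    return minutes * RATE_PER_MINUTE


print(estimate_cost(100))  # cost of a 100-second file
```

Under these assumptions, a 100-second file rounds to 105 billable seconds (1.75 minutes), which comes to roughly $0.06, and a 10-minute file costs $0.35.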

Simple Integration

Our easy-to-use API is designed by developers for developers. We provide you with SDKs, comprehensive documentation, and expert support so you can get started in minutes. All you need to generate your first transcript is an access token.
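
As a rough illustration of what a token-authenticated request could look like, the sketch below builds (but does not send) a job submission using only the Python standard library. The endpoint URL, the Bearer scheme, the JSON body, and the `media_url` field are all hypothetical placeholders, not the actual API; check the documentation for the real values.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the URL from the official documentation.
API_URL = "https://api.example.com/v1/jobs"


def build_transcription_request(access_token: str, media_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits media for transcription.

    Assumptions: the Bearer scheme, the JSON body, and the "media_url" field
    name are illustrative and not taken from the real API.
    """
    body = json.dumps({"media_url": media_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_transcription_request(
    "YOUR_ACCESS_TOKEN", "https://example.com/interview.mp3"
)
# To send it, pass the object to urllib.request.urlopen(request)
# and parse the JSON response.
```

Keeping the request construction separate from the network call makes it easy to inspect the headers and body before any audio is submitted.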

API Features

Punctuation & Capitalization

Automatically add punctuation (commas, question marks, periods, etc.) and capitalization for an easy-to-read transcript.

Speaker Diarization

Recognize multiple speakers and attribute text to each.

Timestamp Generation

Receive a timestamp for each word.
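
Speaker labels and per-word timestamps are most useful together. The sketch below shows one way to merge consecutive words from the same speaker into readable turns. The `Word` and `Turn` types and their field names are hypothetical stand-ins for illustration; the real transcript format may differ.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the audio
    end: float
    speaker: int  # speaker label assigned by diarization


@dataclass
class Turn:
    speaker: int
    start: float
    end: float
    text: str


def group_into_turns(words: list[Word]) -> list[Turn]:
    """Merge consecutive words from the same speaker into one turn."""
    turns: list[Turn] = []
    for word in words:
        if turns and turns[-1].speaker == word.speaker:
            last = turns[-1]
            last.end = word.end
            last.text += " " + word.text
        else:
            turns.append(Turn(word.speaker, word.start, word.end, word.text))
    return turns


# Sample data made up for this example.
words = [
    Word("Hello,", 0.0, 0.4, 0),
    Word("everyone.", 0.5, 1.1, 0),
    Word("Hi!", 1.6, 1.9, 1),
]
for turn in group_into_turns(words):
    print(f"[{turn.start:.1f}-{turn.end:.1f}] Speaker {turn.speaker}: {turn.text}")
```

Each turn keeps the start time of its first word and the end time of its last, so the output can be rendered as a captioned, speaker-attributed transcript.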

Speaker Channel Support

Process multi-channel audio with each distinct channel handled separately.

Custom Dictionaries

Customize vocabulary with names, industry-specific terminology, and more to increase transcript accuracy.

Live Streaming

Transcribe speech to text in real time.