Rev.ai, Google, and Amazon – Which ASR Is Easiest to Use?
We’ve previously discussed several metrics and considerations for evaluating Automatic Speech Recognition (ASR) services. We’ve covered word error rate and talked about features like diarization and speaker identification. But other factors should be taken into account – particularly how easy the service is to set up and use.
We spoke with Rev software engineer John Stephens, who’s worked with the transcription APIs of Rev.ai, Google, and Amazon, among others. He discussed a few key points that he found particularly helpful while doing his work.
Making Signup a Breeze
Part of John’s job involves interfacing with different applications, both internal and customer-facing ones. Even for someone who’s adept at coding, these applications and services sometimes require lengthy, cumbersome setup processes.
Other services – like Rev.ai – don’t require much to get started. That ease of use sets Rev.ai apart from other transcription services, including Amazon, Google, and Speechmatics.
“It was so much easier interfacing with Rev.ai than any of the other services,” John says. “You can’t just get a transcript from [the others]; you have to get set up in their own little world. You have to make accounts in their environment and interface with their other services.”
It’s understandable why other providers want you to create full accounts in their cloud environments. After all, it ties you more closely to their products, and they can use the data they collect for other products and services, or to try to customize your experience down the road.
If you have the time to set up different accounts, there may be benefits in using other services. But developers are usually short on time. For pure simplicity, John recommends Rev.ai.
“I signed up, generated a token, and then used the token to get a transcript,” John says. “That’s it. I didn’t have to worry about anything else.”
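To make that last step concrete, here is a minimal Python sketch of what "using the token" might look like. It only builds the HTTP request and does not send it. The endpoint URL and the "media_url" field name are assumptions drawn from Rev.ai's public API, and build_job_request is a hypothetical helper, so check the current API reference before relying on any of them.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the current Rev.ai API reference.
REV_AI_JOBS_URL = "https://api.rev.ai/speechtotext/v1/jobs"


def build_job_request(access_token: str, media_url: str) -> urllib.request.Request:
    """Build (but do not send) a request that submits a media URL for transcription."""
    # The "media_url" field name is an assumption, not confirmed by this article.
    payload = json.dumps({"media_url": media_url}).encode("utf-8")
    return urllib.request.Request(
        REV_AI_JOBS_URL,
        data=payload,
        method="POST",
        headers={
            # The access token generated after signup goes in a bearer header.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_job_request("YOUR_ACCESS_TOKEN", "https://example.com/interview.mp3")
    # Sending the job would be one more call: urllib.request.urlopen(request)
    print(request.get_method(), request.full_url)
```

The point of the sketch is how little is involved: one token, one header, one request, with no additional cloud account or storage bucket to configure first.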
Turnaround Time with Longer Files
John also notes how helpful it is to have a quick turnaround time, especially when dealing with longer files.
“For a 30-second file, Google is the fastest,” John says.
It was only when the files got longer that John noticed Google’s turnaround time starting to lag.
“For longer files, Rev.ai is actually much faster than Google,” he says. “I’ve found with Google that the amount of time to get your transcript back is about half the length of your media file. So it really becomes a problem with files that are an hour or two long. With Rev.ai it was five to ten minutes tops, even for those longer files.”
If you’ve just got one file, you might be okay with waiting. However, if you’re working with several files simultaneously, that wait can put a serious damper on productivity.
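To put rough numbers on John's rule of thumb, here is a small Python sketch. The 0.5 ratio and the ten-minute ceiling are simply the anecdotal figures from his quote, not measured benchmarks, and google_wait_minutes is a hypothetical helper named for illustration.

```python
# Rough figures quoted in the interview, not measured benchmarks.
GOOGLE_WAIT_RATIO = 0.5        # wait is "about half the length of your media file"
REV_AI_MAX_WAIT_MINUTES = 10   # "five to ten minutes tops"


def google_wait_minutes(media_minutes: float) -> float:
    """Estimate Google's turnaround from the half-the-media-length rule of thumb."""
    return media_minutes * GOOGLE_WAIT_RATIO


if __name__ == "__main__":
    for media_minutes in (60, 90, 120):
        print(
            f"{media_minutes}-minute file: about "
            f"{google_wait_minutes(media_minutes):.0f} min "
            f"vs. at most {REV_AI_MAX_WAIT_MINUTES} min"
        )
```

By this estimate, a two-hour recording means roughly an hour of waiting under the half-length rule, compared with the five to ten minutes John reports for Rev.ai.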
On top of turnaround speed, John considers documentation an important facet of an ASR service.
Thorough Documentation to Get Started Quickly
Think about when you’ve tried to put furniture or equipment together. Were the instructions more confusing than helpful? Did you feel like giving up and throwing everything out the window? API documentation works the same way – if it isn’t easy to follow, it can lead to frustration while setting up and trying to accomplish tasks. We’d recommend not throwing your computer out the window, though.
John likes Rev.ai’s documentation because it makes setup so easy. For the more complex services, he thinks Google offers better support than Amazon.
Google also benefits from a recognizable user interface, as there are similarities between its ASR service and, say, Google Search or Gmail. Since he’d already done the legwork to get set up on Google’s API, John found it to be an enjoyable experience.
“It’s nice to have a nice UI,” John adds. “Actually having to integrate with their services could be viewed as both a pro and a con, depending on whether you’re already set up with them or not.”
How Do the Services Stack Up?
So then, between Rev.ai, Google, and Amazon, which one does John choose as his go-to?
“I would recommend Rev.ai over Amazon and Google,” John says. “Especially for the person who just wants to do a personal project and needs ASR output easily and simply. It’s perfect for them.”
Rev.ai provides transcripts as either .json or .txt files. The .json transcript provides four things: the text, the relevant speaker, timestamps (both starting and ending), and a confidence score for every word that gets transcribed.
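As an illustration of what those four pieces make possible, the Python sketch below flags words that fall under a confidence threshold so a human can review them. The sample transcript is hand-written for this example, and its field names ("monologues", "elements", "ts", "end_ts", "confidence") are assumptions about the layout, so compare them with a real transcript before reusing the code.

```python
import json

# Hand-written sample; the field names are assumptions, not a guaranteed schema.
SAMPLE_TRANSCRIPT = """
{
  "monologues": [
    {
      "speaker": 0,
      "elements": [
        {"type": "text", "value": "Hello", "ts": 0.5, "end_ts": 0.9, "confidence": 0.98},
        {"type": "punct", "value": " "},
        {"type": "text", "value": "world", "ts": 1.0, "end_ts": 1.4, "confidence": 0.61}
      ]
    }
  ]
}
"""


def low_confidence_words(transcript: dict, threshold: float = 0.8) -> list:
    """Return (speaker, word, start, end) for each word scored below the threshold."""
    flagged = []
    for monologue in transcript["monologues"]:
        for element in monologue["elements"]:
            # Skip punctuation and spacing, which carry no confidence score here.
            if element.get("type") != "text":
                continue
            if element["confidence"] < threshold:
                flagged.append(
                    (monologue["speaker"], element["value"], element["ts"], element["end_ts"])
                )
    return flagged


if __name__ == "__main__":
    transcript = json.loads(SAMPLE_TRANSCRIPT)
    for speaker, word, start, end in low_confidence_words(transcript):
        print(f"Speaker {speaker}: '{word}' ({start}s to {end}s) needs review")
```

With the sample above, only "world" is flagged, since its score of 0.61 falls below the default threshold of 0.8.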
“If Rev.ai was available to me sooner, I definitely would have used it,” John says. “It’s just as accurate or more accurate than Amazon or Google, and it’s super easy to set up. It was at most a 50-line Python script, and I was already getting transcripts from Rev.ai.”
John isn’t exaggerating – it really is simple to use Rev.ai. Receiving quality speech-to-text recognition is just three quick steps away:
- Create a free account
- Generate your access token
- Submit your first API job
The best part? You can give Rev.ai a thorough test run with 300 minutes of free credit.