At WWDC 2018, Apple confirmed over 550,000 podcast publications on iTunes. With inexpensive production costs and ease of access, podcasts have become a favorite way to inform and entertain listeners across the globe.

A shortcoming? Podcasts can’t be searched easily. This inability to search makes it difficult for listeners to come back to a snippet they may have heard while commuting or running on a treadmill.

Text-based show notes are a good first step, but listeners often neglect them because they're difficult to sift through. Devin Pigera, the winner of the 2018 Austin DeveloperWeek Hackathon, aimed to solve this problem in a 12-hour timebox. The result was Roxyy, a simple way to transcribe and annotate your podcasts.

See the winning pitch here (1 min 38 seconds):

Building the Roxyy Player

Devin’s aim with Roxyy was to enable producers to easily upload a podcast and receive a transcript they can annotate with important notes and links.

Podcast creators frequently publish their list of episodes via an RSS XML feed, so Devin was able to use episodes from The Tim Ferriss Show and Joe Rogan Experience to develop his product. His approach involved four components:

1. A NodeJS web server that scraped podcast RSS feeds and saved episode links to a Heroku Cloud DB

2. A 1-click integration with a speech-to-text API to transcribe the audio file

3. An EmberJS web content management platform enabling podcast creators to view a transcript, make small spelling/grammar corrections, and tag content at crucial timestamps (e.g., links to Amazon goods, Netflix shows, and quotes)

4. An EmberJS player for subscribers to listen to podcasts and view annotations
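As a rough illustration of the first component, the sketch below pulls episode titles and audio links out of an RSS feed. Devin's server was written in NodeJS; this Python version only shows the idea. The sample feed, the `extract_episodes` function, and the output field names are all hypothetical, and it assumes episodes expose their audio through the standard RSS `<enclosure>` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; real podcast feeds carry many more fields.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="456"/>
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>"""


def extract_episodes(feed_xml: str) -> list[dict[str, str]]:
    """Return the title and audio URL of every feed item that has an enclosure."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None or not enclosure.get("url"):
            continue  # skip items that carry no audio file
        episodes.append({
            "title": item.findtext("title", default="").strip(),
            "audio_url": enclosure.get("url"),
        })
    return episodes


if __name__ == "__main__":
    for episode in extract_episodes(SAMPLE_FEED):
        print(episode["title"], episode["audio_url"])
```

In a real pipeline the returned audio URLs would be saved to the database and later handed to the transcription step.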

Now, with Roxyy, listeners can easily find that book referenced in their favorite podcast. At present, Devin is testing a beta version of the platform with content creators based in Austin, TX. See a demo of Roxyy here.

Implementing the API for Podcast Transcription

There are many transcription services available, so why did Devin choose this one? In a time-constrained hackathon, the ease of integration made it the clear winner.

“The way [the API] solved integrations from a developer standpoint is the best I’ve seen so far. Even at the hackathon, every developer was able to get started right away,” Devin says.

To use the API, all you need is an access token and a quick read through the documentation.

In his hackathon implementation, Devin used only three queries to get a transcript of his podcast. He submitted the URL of his podcast file, queried for the job status (e.g., IN_PROGRESS, COMPLETED, etc.), and then queried for the actual transcript.
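The three-query flow can be sketched as a submit, poll, and fetch loop. This is a minimal illustration, not the API's actual client. The `TranscriptionClient` interface, its method names, and the `FakeClient` stand-in are hypothetical, and treating any status other than `IN_PROGRESS` or `COMPLETED` as a failure is an assumption.

```python
import time
from typing import Callable, Protocol


class TranscriptionClient(Protocol):
    """Hypothetical interface; the method names are placeholders, not a real SDK."""

    def submit_job(self, media_url: str) -> str: ...
    def get_status(self, job_id: str) -> str: ...
    def get_transcript(self, job_id: str) -> str: ...


def transcribe(
    client: TranscriptionClient,
    media_url: str,
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Submit a media URL, poll until the job completes, then fetch the transcript."""
    job_id = client.submit_job(media_url)          # query 1: submit the file URL
    for _ in range(max_polls):
        status = client.get_status(job_id)         # query 2: check job status
        if status == "COMPLETED":
            return client.get_transcript(job_id)   # query 3: fetch the transcript
        if status != "IN_PROGRESS":
            # Assumption: any other status means the job will not finish.
            raise RuntimeError(f"job {job_id} ended with status {status}")
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not complete after {max_polls} polls")


class FakeClient:
    """In-memory stand-in so the sketch runs without network access."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.calls = []

    def submit_job(self, media_url: str) -> str:
        self.calls.append("submit")
        return "job-1"

    def get_status(self, job_id: str) -> str:
        self.calls.append("status")
        return self._statuses.pop(0)

    def get_transcript(self, job_id: str) -> str:
        self.calls.append("transcript")
        return "hello world"


if __name__ == "__main__":
    demo = FakeClient(["IN_PROGRESS", "COMPLETED"])
    print(transcribe(demo, "https://example.com/ep1.mp3", sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the polling loop testable without real delays. A production client would replace `FakeClient` with authenticated HTTP calls using the access token.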

Optionally, developers can also provide a callback URL that the API will automatically notify when the transcript is ready. The API provides transcripts as either .json or .txt files. For each word, the .json transcript provides:

  1. The text
  2. The relevant speaker
  3. Starting and ending timestamps
  4. A confidence score out of 1 for every word transcribed
Example properties in the .json transcript
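Per-word timestamps are what make a podcast searchable. The sketch below looks up every mention of a term and returns where it was spoken. The JSON layout and the field names (`words`, `text`, `speaker`, `start`, `end`, `confidence`) are assumptions chosen to mirror the four properties listed above; the API's real schema may differ.

```python
import json

# Hypothetical transcript; field names are assumed, not taken from the API docs.
SAMPLE_TRANSCRIPT = json.dumps({
    "words": [
        {"text": "Read", "speaker": 0, "start": 12.0, "end": 12.3, "confidence": 0.99},
        {"text": "Dune", "speaker": 0, "start": 12.3, "end": 12.8, "confidence": 0.95},
        {"text": "twice.", "speaker": 0, "start": 12.8, "end": 13.2, "confidence": 0.97},
        {"text": "Dune?", "speaker": 1, "start": 15.0, "end": 15.4, "confidence": 0.60},
    ]
})


def find_mentions(transcript_json: str, term: str, min_confidence: float = 0.0) -> list[dict]:
    """Return speaker and timestamps for each word matching `term`, ignoring case."""
    words = json.loads(transcript_json)["words"]
    needle = term.lower()
    return [
        {"speaker": w["speaker"], "start": w["start"], "end": w["end"]}
        for w in words
        # Strip trailing punctuation so "Dune?" still matches "dune".
        if w["text"].lower().strip(".,!?") == needle
        and w["confidence"] >= min_confidence
    ]


if __name__ == "__main__":
    for hit in find_mentions(SAMPLE_TRANSCRIPT, "dune"):
        print(f"speaker {hit['speaker']} at {hit['start']}s")
```

The `min_confidence` threshold is one way to use the per-word score, for example to hide uncertain matches until a producer has reviewed the transcript.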

Devin was able to integrate with the API during the hackathon in less than 5 minutes by using Paw to quickly construct his queries.

Try It in Your Application

The API delivers quality speech-to-text recognition in three steps:

1. Create a free account

2. Generate your access token

3. Submit your first API job

Discover the beauty and simplicity of the API for yourself with 300 minutes of free credit.
