How to Set Up Auphonic Speech Recognition for Automatic Transcription

Auphonic is well known in the podcasting field for tools that make audio editing and podcast cleanup a breeze. What some people don't know is that Auphonic's features are not limited to audio cleanup and post-production processing. It also offers extras that some of you are probably paying $1 per minute for elsewhere.

One of its great features you should test out is Auphonic Speech Recognition. If you want to automate your podcast transcription and turn your audio or video files into text transcripts, this feature is worth a look.

Table of Contents

1 How does Auphonic Speech Recognition Work?

1.1 Google Speech API

2 Transcription Quality

How does Auphonic Speech Recognition Work?

Auphonic's speech recognition feature harnesses the power of speech recognition services like Google Speech API and Wit.ai. You will have to add one of these automatic speech recognition services to the “External Services” section of your Auphonic account.

How to create a Google Cloud Speech API Account:

1. Optional: Sign up for an account at cloud.google.com/freetrial (credit card required).
2. Create a new project in the Cloud Console.
3. Enable the Cloud Speech API, select your project, continue and click on the Go to credentials button.
4. Answer “Are you using Google App Engine or Google Compute Engine?” with Yes.
5. Click on “What credentials do I need?” below, then on Done.
6. Continue and click on Create credentials / API key.
7. Copy your API key and paste it into the Google Cloud Speech API form in Auphonic's “External Services” (this might take a minute).

Google Speech API

Google's Speech API can be accessed by signing up for Google's Cloud Platform. There's also a 365-day free trial (yes, it's a whole year) plus $300 in credits.

Once the trial period ends and you have used up your credit, you still get 60 minutes of audio per month for free and are only charged for usage beyond that.
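
To get a feel for what the free allowance means for your own show, here is a minimal sketch of the arithmetic. The 60 free minutes come from the paragraph above. The per-minute price is not stated in this article, so `price_per_minute` is a hypothetical input that you would fill in from Google's current pricing page.

```python
def billable_minutes(minutes_used, free_minutes=60):
    """Minutes charged after subtracting the monthly free allowance."""
    return max(0, minutes_used - free_minutes)


def monthly_cost(minutes_used, price_per_minute, free_minutes=60):
    """Estimated monthly charge; price_per_minute is a value you supply."""
    return billable_minutes(minutes_used, free_minutes) * price_per_minute


# Example: four 45-minute episodes in one month.
total_minutes = 4 * 45
print(billable_minutes(total_minutes))  # minutes beyond the free 60
```

With four 45-minute episodes you would transcribe 180 minutes, of which 120 fall outside the free allowance. A month with 60 minutes or less costs nothing.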

Transcription Quality

Based on the tests that I've done, the transcription can be really great, but there are some ifs to that.

You can achieve great quality if:

Your audio quality is great, meaning the words are spoken clearly and without a heavy accent.
There is little or no background noise.
You add word and phrase hints.
You add additional words to the vocabulary of the recognition task.
You choose the correct language variety. For example, English-US is different from English-UK or English-AU.
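
Auphonic sends these settings to Google for you, so you never have to write code. Still, it can help to see roughly what phrase hints and the language variety look like on the Google side. The sketch below builds a request body for the Cloud Speech `speech:recognize` REST method, where hints go into `speechContexts` and the variety into `languageCode`. The helper name `build_recognize_request`, the audio encoding, and the sample rate are assumptions for illustration, and this is not necessarily how Auphonic calls the API internally.

```python
import base64
import json

# Public REST endpoint for synchronous recognition (Cloud Speech v1).
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US", phrase_hints=()):
    """Return a JSON-serializable request body (hypothetical helper)."""
    config = {
        "encoding": "LINEAR16",    # assumption: 16-bit PCM audio
        "sampleRateHertz": 16000,  # assumption: 16 kHz recording
        "languageCode": language_code,  # e.g. en-US, en-GB, en-AU
    }
    if phrase_hints:
        # Word and phrase hints bias recognition toward these terms.
        config["speechContexts"] = [{"phrases": list(phrase_hints)}]
    return {
        "config": config,
        # Inline audio must be base64-encoded in the JSON body.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


# Placeholder silence instead of a real recording, just to show the shape.
body = build_recognize_request(b"\x00\x00" * 160, "en-GB", ["Auphonic", "Wit.ai"])
print(json.dumps(body["config"], indent=2))
```

To actually run a transcription you would POST this body to `RECOGNIZE_URL` with your API key appended as the `key` query parameter. Names of guests, brands, and show-specific jargon are the kind of terms worth adding as hints.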

For further details, check the Google Cloud Speech API documentation.
