
Auphonic is well known among podcasters for its tool that makes audio editing and podcast cleanup a breeze. What some people don't know is that Auphonic isn't limited to audio cleanup and post-production processing: it also comes with other bells and whistles that some of you are probably paying $1 per minute for elsewhere.
One of its great features you should test out is Auphonic Speech Recognition. If you want to automate your podcast transcription and turn your audio/video files into text transcripts, then this speech recognition feature from Auphonic is something you might be interested in.
How does Auphonic Speech Recognition Work?
Auphonic's speech recognition feature harnesses the power of external speech recognition services like the Google Cloud Speech API and Wit.ai. To use it, you will have to add one of these automatic speech recognition services to your Auphonic account under “External Services”.
How to create a Google Cloud Speech API Account:
1. Optional: Sign up for an account at cloud.google.com/freetrial (credit card required).
2. Create a new project in the Cloud Console.
3. Enable the Cloud Speech API, select your project, continue and click on the Go to credentials button.
4. Answer “Are you using Google App Engine or Google Compute Engine?” with Yes.
5. Click on “What credentials do I need?” below, then on Done.
6. Continue and click on Create credentials / API key.
7. Copy your API key and paste it into the “External Services” form in Auphonic (this might take a minute).
Google Speech API
Google's Speech API can be accessed by signing up for Google's Cloud Platform. There's also a 365-day free trial (yes, it's a whole year) plus $300 in credits.
Once the trial period is over and you have used up your credit, you still get 60 minutes of audio per month for free and will only be charged for usage beyond that.
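To get a feel for what the free allowance means for your own show, the billing rule above can be sketched as simple arithmetic. This is only an illustration: the function name is my own, and the per-minute rate is a made-up placeholder, since the actual price isn't covered here and may change.

```python
FREE_MINUTES_PER_MONTH = 60  # free monthly allowance mentioned above


def estimate_monthly_cost(minutes_transcribed, rate_per_minute):
    """Estimate the monthly charge once the trial and credits are used up.

    rate_per_minute is a placeholder you must fill in from Google's
    current price list; it is not taken from this article.
    """
    if minutes_transcribed < 0 or rate_per_minute < 0:
        raise ValueError("minutes and rate must be non-negative")
    # Only the minutes beyond the free allowance are billed.
    billable = max(0, minutes_transcribed - FREE_MINUTES_PER_MONTH)
    return billable * rate_per_minute


# Four 45-minute episodes in a month at a hypothetical rate of 0.02 per minute:
# 180 minutes in total, 120 of them billable.
print(estimate_monthly_cost(4 * 45, 0.02))
```

If you publish an hour or less of audio per month, the function returns 0, which matches the "free 60 minutes" rule described above.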
Transcription Quality
Based on the tests that I've done, the transcription can be really great, but there are some ifs.
You can achieve great quality if:
- Your audio quality is great, meaning the words are spoken clearly and without a heavy accent
- There is little or no background noise
- You add word and phrase hints
- You add additional words to the vocabulary of the recognition task
- You choose the correct language variety; for example, English-US is different from English-UK or English-AU
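For readers curious what the last three points look like at the API level, here is a minimal sketch of a request body for Google's Cloud Speech REST API (`speech:recognize`), where phrase hints go into `speechContexts` and the language variety into `languageCode`. Auphonic builds and sends the real request for you, and I don't know exactly how it does so; the helper name, sample phrases, encoding and sample rate below are assumptions for illustration only.

```python
import base64
import json


def build_recognize_request(audio_bytes, language_code="en-US", phrase_hints=None):
    """Build a JSON-serializable body for a speech:recognize request.

    Hypothetical helper: it only assembles the payload, it does not send it.
    """
    config = {
        "encoding": "LINEAR16",          # assumed: uncompressed 16-bit PCM audio
        "sampleRateHertz": 16000,        # assumed sample rate of the recording
        "languageCode": language_code,   # e.g. "en-US", "en-GB", "en-AU"
    }
    if phrase_hints:
        # Words and phrases the recognizer should favor (names, jargon, ...)
        config["speechContexts"] = [{"phrases": list(phrase_hints)}]
    return {
        "config": config,
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


body = build_recognize_request(
    b"\x00\x00" * 160,  # placeholder silence instead of a real recording
    language_code="en-AU",
    phrase_hints=["Auphonic", "Wit.ai"],
)
print(json.dumps(body["config"], indent=2))
```

You would send this body as JSON to the `speech:recognize` endpoint together with your API key. Check Google's documentation for the current endpoint and field names before relying on it.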
You can also find more details in the Google Cloud Speech API documentation.
