Google has announced plans to update its Cloud Speech-to-Text API with the same speech recognition technology that powers Google Search and Assistant. According to TechCrunch, the updated API is expected to reduce transcription errors by approximately 54 percent.
Google’s new and improved Cloud Speech-to-Text API allows developers to choose one of four machine learning models based on their needs. The options are voice commands, phone calls, video transcription and a default model.
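To make the model choice concrete, here is a minimal sketch of how a request body for the API's REST interface might be assembled. The identifiers `command_and_search`, `phone_call`, `video` and `default` are the values I understand the API to accept in its `model` field, but treat them, the helper name `build_recognize_request` and the `gs://example-bucket/...` path as assumptions for illustration rather than details confirmed by Google's announcement. The sketch only builds the JSON and does not send it.

```python
import json

# Assumed mapping from the four use cases in the article to model identifiers.
MODELS = {
    "voice commands": "command_and_search",
    "phone calls": "phone_call",
    "video": "video",
    "default": "default",
}


def build_recognize_request(audio_uri, use_case="default", language_code="en-US"):
    """Build a JSON-serializable request body that selects a model by use case."""
    if use_case not in MODELS:
        raise ValueError(f"unknown use case: {use_case!r}")
    return {
        "config": {
            "languageCode": language_code,
            "model": MODELS[use_case],
        },
        # Hypothetical Cloud Storage location of the recording.
        "audio": {"uri": audio_uri},
    }


request_body = build_recognize_request("gs://example-bucket/call.flac", "phone calls")
print(json.dumps(request_body, indent=2))
```

A developer transcribing call-center recordings would pass "phone calls", while anything that doesn't fit the other three categories falls back to the default model.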
Furthermore, the new API features an updated punctuation framework. When speaking about the Cloud Speech-to-Text API, Google acknowledged that transcriptions don’t always have accurate punctuation. This shouldn’t come as a surprise to anyone who’s used a voice-to-text transcription service, whether it’s Google’s Cloud Speech-to-Text API or one offered by another company. These services are often accurate at converting spoken words into text, but they fail to provide proper punctuation. This is something that Google is trying to fix with its new API. Google says its new Cloud Speech-to-Text API is significantly more accurate at transcribing punctuation like periods, commas, quotation marks and exclamation marks.
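For developers, punctuation is typically something you opt into per request. The sketch below assumes the recognition config accepts a boolean field named `enableAutomaticPunctuation`, which is my understanding of the REST interface rather than something stated in the announcement. The helper `with_punctuation` is a hypothetical name.

```python
def with_punctuation(config, enabled=True):
    """Return a copy of a recognition config with automatic punctuation toggled."""
    updated = dict(config)  # shallow copy so the caller's config is not mutated
    updated["enableAutomaticPunctuation"] = enabled  # assumed field name
    return updated


base_config = {"languageCode": "en-US", "model": "video"}
punctuated_config = with_punctuation(base_config)
print(punctuated_config)
```

Returning a copy instead of modifying the config in place makes it easy to compare transcripts with and without punctuation from the same starting settings.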
Finally, Google is updating its Cloud Speech-to-Text API so that developers can tag their audio or video with metadata. Currently, metadata for transcriptions doesn’t offer any real benefits. According to Google, however, it will use this developer-provided data to improve and optimize the API with new features in the future. So, while adding metadata won’t initially improve the API’s accuracy or effectiveness, it will help Google better understand developers’ needs.
If you’re thinking about using Google’s new Cloud Speech-to-Text API, you should take note of its new pricing. Audio transcriptions using the API will cost the same as before, at $0.006 for every 15 seconds. Beginning May 31, though, the cost of video transcriptions will increase from $0.006 to $0.012 for every 15 seconds.
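To see what those rates mean in practice, here is a rough cost estimate in Python. The per-increment rates come from the figures above. Rounding a partial 15-second increment up to a full one is an assumption on my part, and the sketch ignores any free usage tier.

```python
import math

# Rates in US dollars per 15-second increment, as quoted in the article.
RATE_AUDIO = 0.006
RATE_VIDEO_NEW = 0.012
INCREMENT_SECONDS = 15


def transcription_cost(duration_seconds, rate_per_increment):
    """Estimate the cost of one transcription.

    Assumes a partial 15-second increment is billed as a full one.
    """
    increments = math.ceil(duration_seconds / INCREMENT_SECONDS)
    return round(increments * rate_per_increment, 6)


one_hour = 60 * 60  # 3,600 seconds, or 240 increments
audio_cost = transcription_cost(one_hour, RATE_AUDIO)
video_cost = transcription_cost(one_hour, RATE_VIDEO_NEW)
print(f"One hour of audio: ${audio_cost:.2f}")
print(f"One hour of video: ${video_cost:.2f}")
```

Under these assumptions, an hour of audio works out to $1.44, while an hour of video doubles from $1.44 to $2.88 once the new rate takes effect.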
The Mountain View company first unveiled its Cloud Speech-to-Text API back in June 2016, when it was available to select developers in open beta. It wasn’t until a year later that Google launched the API under general availability, allowing all developers to access it. This latest update builds upon that release, with Google describing it as the most significant overhaul for business owners who use the service.