How to Build a Liv.ai-like Speech-to-text Conversion Platform

This article explores the technologies that can be used to build platforms and service offerings similar to Liv.ai.

First and foremost, congratulations to the Liv.ai team for leveraging existing cloud-based AI and speech recognition (speech-to-text conversion) technologies to come up with a set of offerings that create great value for businesses. The founding team (IIT KGP alumni Subodh Kumar, Sanjeev Kumar and Kishore Mundra) nailed it by doing the right thing at the right time in the right place.

Liv.ai enables developers to convert speech to text using powerful neural network models with exceptional accuracy and minimal latency. At this point, the platform supports 9 languages: Hindi, English, Bengali, Gujarati, Telugu, Tamil, Marathi, Punjabi and Kannada.

Liv.ai's technologies appear to focus on the following areas:

Speech analytics
Voice keyboard
Assistant
Customer care automation

Business Use Cases for Speech-to-text Conversion

The following are some of the business use cases for speech-to-text conversion technology:

Voice input: Let consumers talk to your website to search for products and services. This could be very useful for e-commerce websites.
Conversational speech-to-text: Audio files are converted into text files. This can be very useful for extracting intelligence from voice recordings of customer care phone calls. Millions of customers call customer care centres to get issues resolved, and the calls are recorded. Imagine this service tagging each recorded call. The tag information can then be used for purposes such as the following:
- Segmentation, such as issue classification and customer types
- Customer churn
- Rewards & recognition for customer care executives
- Product feature identification
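
The tagging idea above can be sketched as a simple keyword matcher over call transcripts. This is only an illustration: the tag names, keyword lists and function names below are hypothetical placeholders, not anything Liv.ai is known to use, and a production system would more likely rely on a trained text classifier.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical tag -> keyword rules; replace with rules or a classifier
# built from your own call transcripts.
TAG_KEYWORDS: Dict[str, List[str]] = {
    "billing_issue": ["refund", "charged", "invoice"],
    "churn_risk": ["cancel", "switch to", "close my account"],
    "feature_request": ["would be nice", "please add", "wish"],
}


def tag_transcript(transcript: str) -> List[str]:
    """Return the sorted tags whose keywords appear in the transcript."""
    text = transcript.lower()
    return sorted(
        tag
        for tag, keywords in TAG_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )


def count_tags(transcripts: List[str]) -> Counter:
    """Count how many transcripts carry each tag, e.g. for segmentation reports."""
    counts: Counter = Counter()
    for transcript in transcripts:
        counts.update(tag_transcript(transcript))
    return counts
```

Counting tags across many transcripts gives a rough view of issue categories and churn signals, which is the kind of segmentation described in the list above.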

How to Build Liv.ai-like Platforms

The following are some of the key building blocks of a platform like Liv.ai that leverages speech-to-text conversion technology:

Voice capture
- A way to capture audio data in real time
Speech-to-text conversion
- Convert streaming speech in real time. This would be useful when consumers call out product names during search, or when customers call out commands to the software.
- Convert batches of long audio files to text asynchronously. This would be useful for transcribing recorded customer care calls.
API-based integrations
- APIs for converting audio to text by applying neural network models
Deep learning algorithms to recognize the speech and convert it to text
App for doing some of the following:
- Accessing the audio files
- Viewing analytics reports

All of the above can be achieved using the following:

A web or mobile app for accessing audio files and analytics reports
Integration with the Google Cloud Speech API for speech-to-text conversion
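
As a rough illustration of the integration step, the sketch below builds a synchronous request for the Google Cloud Speech REST endpoint (`speech:recognize`) using only the Python standard library. The function names are hypothetical, and the sketch assumes 16-bit PCM (LINEAR16) audio at 16 kHz and API-key authentication; check the current API documentation before relying on any of these choices.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, List

RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes: bytes, language_code: str = "hi-IN",
                            sample_rate_hertz: int = 16000) -> Dict[str, Any]:
    """Build the JSON body for a synchronous recognition request."""
    return {
        "config": {
            "encoding": "LINEAR16",  # assumption: 16-bit PCM audio
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # Inline audio is sent as a base64-encoded string.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def extract_transcripts(response: Dict[str, Any]) -> List[str]:
    """Pull the top-ranked transcript out of each result in the response."""
    return [
        result["alternatives"][0]["transcript"]
        for result in response.get("results", [])
        if result.get("alternatives")
    ]


def recognize(audio_bytes: bytes, api_key: str, language_code: str = "hi-IN") -> List[str]:
    """Send the audio to the API and return transcripts (needs network access)."""
    body = json.dumps(build_recognize_request(audio_bytes, language_code)).encode("utf-8")
    request = urllib.request.Request(
        f"{RECOGNIZE_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_transcripts(json.load(reply))
```

Keeping the request builder and the response parser separate from the network call makes both easy to unit test without credentials.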

Google Cloud Speech API

Among the languages spoken in India, the Google Cloud Speech API supports speech-to-text conversion for the following (the same languages supported by Liv.ai):

Hindi
English
Bengali
Gujarati
Telugu
Tamil
Marathi
Punjabi
Kannada
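
In API requests, each language is selected with a BCP-47 language code. The mapping below is a sketch of how an application might store those codes. The codes are assumptions based on common usage (the Punjabi code in particular should be checked), so verify each one against the provider's current list of supported languages. The helper name is hypothetical.

```python
# Assumed BCP-47 codes for the nine languages listed above; verify each one
# against the provider's current supported-languages list before relying on it.
LANGUAGE_CODES = {
    "Hindi": "hi-IN",
    "English": "en-IN",
    "Bengali": "bn-IN",
    "Gujarati": "gu-IN",
    "Telugu": "te-IN",
    "Tamil": "ta-IN",
    "Marathi": "mr-IN",
    "Punjabi": "pa-Guru-IN",
    "Kannada": "kn-IN",
}


def language_code_for(name: str) -> str:
    """Look up the language code for a language name, ignoring case and spaces."""
    normalized = name.strip().title()
    if normalized not in LANGUAGE_CODES:
        raise ValueError(f"Unsupported language: {name!r}")
    return LANGUAGE_CODES[normalized]
```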

The following are some of the salient features of the Google Cloud Speech API:

Speech-to-text conversion in real time (streaming recognition) based on deep learning models
Greater accuracy in noisy environments
Context-aware recognition; word hints can be supplied to help the API recognize expected words and phrases
Easy-to-integrate APIs with support for both REST and gRPC
Asynchronous processing for long audio files
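
Two of these features, word hints and asynchronous processing, can be combined in a single request. The sketch below builds the JSON body for the REST `speech:longrunningrecognize` endpoint, passing word hints through the `speechContexts` field. The function name and bucket path are hypothetical, the audio format is assumed to be LINEAR16 at 16 kHz, and as a simplification this sketch only accepts audio referenced by a Cloud Storage URI.

```python
from typing import Any, Dict, List

LONG_RUNNING_URL = "https://speech.googleapis.com/v1/speech:longrunningrecognize"


def build_long_running_request(gcs_uri: str, language_code: str,
                               phrase_hints: List[str]) -> Dict[str, Any]:
    """Build the JSON body for asynchronous recognition with optional word hints."""
    if not gcs_uri.startswith("gs://"):
        # Design choice of this sketch: long audio is referenced, not inlined.
        raise ValueError("Long audio is expected to be referenced by a gs:// URI")
    config: Dict[str, Any] = {
        "encoding": "LINEAR16",  # assumption: 16-bit PCM audio
        "sampleRateHertz": 16000,
        "languageCode": language_code,
    }
    if phrase_hints:
        # Word hints bias recognition toward expected terms such as product names.
        config["speechContexts"] = [{"phrases": list(phrase_hints)}]
    return {"config": config, "audio": {"uri": gcs_uri}}


# Example body for a recorded call stored in a hypothetical bucket.
example_body = build_long_running_request(
    "gs://my-bucket/call.wav", "en-IN", ["Redmi Note", "OnePlus"]
)
```

The response to such a request is an operation that has to be polled until the transcription is complete, which suits batch transcription of recorded calls.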

All of the above comes at reasonable pricing from Google (as listed at the time of writing):

Monthly Usage	Price per 15 seconds
0-60 minutes	Free
61-1,000,000 minutes	$0.006
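
To make the table concrete, here is a small cost estimator. It is a sketch that assumes the free tier and the per-15-second price apply to total monthly usage exactly as listed above, and it ignores any per-request rounding the provider may apply. The function name is hypothetical.

```python
import math
from decimal import Decimal

FREE_MINUTES = 60                       # free monthly usage, from the table
PRICE_PER_INCREMENT = Decimal("0.006")  # USD per 15 seconds, from the table
INCREMENT_SECONDS = 15


def monthly_cost_usd(total_minutes: float) -> Decimal:
    """Estimate the monthly bill in USD for the given minutes of audio."""
    billable_minutes = max(0, total_minutes - FREE_MINUTES)
    # Round partial increments up to the next 15-second unit.
    increments = math.ceil(billable_minutes * 60 / INCREMENT_SECONDS)
    return PRICE_PER_INCREMENT * increments
```

Under these assumptions, 1,000 minutes in a month means 940 billable minutes, or 3,760 increments of 15 seconds, which comes to $22.56.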

Other Cloud Speech APIs (Azure, AWS)

One can also try other cloud speech APIs, such as those offered by Microsoft Azure and AWS.

If you would like to share your thoughts on using Google or other cloud speech APIs to build speech-to-text conversion platforms such as Liv.ai, please feel free to do so.

Author

Ajitesh Kumar

I have been recently working in the area of Data analytics including Data Science and Machine Learning / Deep Learning. I am also passionate about different technologies including programming languages such as Java/JEE, Javascript, Python, R, Julia, etc, and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data, etc. I would love to connect with you on Linkedin.
Check out my latest book titled as First Principles Thinking: Building winning products using first principles thinking.
