Enable Google Cloud Text-to-Speech Service
Google Cloud Text-to-Speech is a text-to-speech conversion service that Google Cloud launched a few days ago. It was one of the most important services missing from the Google Cloud AI portfolio; with its launch, Google Cloud now offers both text-to-speech and speech-to-text services. Over the next few weeks, you will learn about different ways of using the Google Cloud Text-to-Speech service with other Google Cloud services.
In this post, you will learn about some of the following:
The following are some of the key aspects of setting up the development environment using Eclipse IDE:
Figure 1. Enable Google Cloud Text-to-Speech Service
Figure 2. Google Cloud Service – Create Service Account Key
Figure 3. Google Cloud Text to Speech – Setting Environment Variable
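Before running the sample app, it can help to confirm that the environment variable from the previous step is actually visible to the JVM. The following is a minimal sketch that uses only the Java standard library. It assumes the variable is GOOGLE_APPLICATION_CREDENTIALS, the name the Google Cloud client libraries read for the service account key file; the class name CredentialsCheck and the method findProblem are hypothetical and are not part of any Google library.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper (not part of the Google Cloud SDK): checks that the
// service account key file referenced by the environment is usable.
final class CredentialsCheck {

    // Assumption: the variable set in the step above has this name.
    static final String ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS";

    /** Returns null when the key file looks usable, otherwise a short problem description. */
    static String findProblem(Map<String, String> env) {
        String value = env.get(ENV_VAR);
        if (value == null || value.trim().isEmpty()) {
            return ENV_VAR + " is not set";
        }
        Path keyFile = Paths.get(value.trim());
        if (!Files.isRegularFile(keyFile)) {
            return ENV_VAR + " points to a missing file: " + keyFile;
        }
        if (!Files.isReadable(keyFile)) {
            return ENV_VAR + " points to an unreadable file: " + keyFile;
        }
        return null;
    }

    public static void main(String[] args) {
        // Check the real process environment and report the result.
        String problem = findProblem(System.getenv());
        System.out.println(problem == null ? "Credentials look OK" : problem);
    }
}
```

If the variable was set inside an Eclipse run configuration, run this check from the same run configuration; a variable set only in a terminal session is not visible to an IDE that was started elsewhere.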
The following are the two key steps needed to create a sample program/app that demonstrates the Google Cloud Text-to-Speech service:
The following are some of the artifacts that need to be included to work with the Google Cloud Text-to-Speech APIs:
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>24.1-jre</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.threeten/threetenbp -->
<dependency>
    <groupId>org.threeten</groupId>
    <artifactId>threetenbp</artifactId>
    <version>1.3.6</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.http-client/google-http-client -->
<dependency>
    <groupId>com.google.http-client</groupId>
    <artifactId>google-http-client</artifactId>
    <version>1.22.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.cloud/google-cloud-texttospeech -->
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-texttospeech</artifactId>
    <version>0.42.0-beta</version>
</dependency>
Pay attention to the following aspects, which need to be handled to achieve text-to-speech conversion:
The following code implements the above steps:
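import java.io.FileOutputStream;
import java.io.OutputStream;

import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Assumption: 0.42.0-beta ships the client under the v1beta1 package;
// later releases of the library use com.google.cloud.texttospeech.v1.
import com.google.cloud.texttospeech.v1beta1.AudioConfig;
import com.google.cloud.texttospeech.v1beta1.AudioEncoding;
import com.google.cloud.texttospeech.v1beta1.SsmlVoiceGender;
import com.google.cloud.texttospeech.v1beta1.SynthesisInput;
import com.google.cloud.texttospeech.v1beta1.SynthesizeSpeechResponse;
import com.google.cloud.texttospeech.v1beta1.TextToSpeechClient;
import com.google.cloud.texttospeech.v1beta1.VoiceSelectionParams;
import com.google.protobuf.ByteString;

@SpringBootApplication
public class GCloudText2SpeechApplication implements CommandLineRunner {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(GCloudText2SpeechApplication.class);
        app.run(args);
    }

    @Override
    public void run(String... arg0) throws Exception {
        String text = "Hello World! How are you doing today? This is Google Cloud Text-to-Speech Demo!";
        String outputAudioFilePath = "/home/support/Documents/output.mp3";

        // The client is closed automatically at the end of the try block
        try (TextToSpeechClient textToSpeechClient = TextToSpeechClient.create()) {
            // Set the text input to be synthesized
            SynthesisInput input = SynthesisInput.newBuilder().setText(text).build();

            // Build the voice request; languageCode = "en-US"
            VoiceSelectionParams voice = VoiceSelectionParams.newBuilder()
                    .setLanguageCode("en-US")
                    .setSsmlGender(SsmlVoiceGender.FEMALE)
                    .build();

            // Select the type of audio file you want returned (MP3 audio)
            AudioConfig audioConfig = AudioConfig.newBuilder()
                    .setAudioEncoding(AudioEncoding.MP3)
                    .build();

            // Perform the text-to-speech request
            SynthesizeSpeechResponse response = textToSpeechClient.synthesizeSpeech(input, voice, audioConfig);

            // Get the audio contents from the response
            ByteString audioContents = response.getAudioContent();

            // Write the response to the output file
            try (OutputStream out = new FileOutputStream(outputAudioFilePath)) {
                out.write(audioContents.toByteArray());
                System.out.println("Audio content written to file \"output.mp3\"");
            }
        }
    }
}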
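The program above writes to a hard-coded path (/home/support/Documents/output.mp3), which fails on a machine where that directory does not exist. As one possible alternative, the sketch below writes the audio bytes with java.nio and creates any missing parent directories first. The class AudioFileWriter and its methods are hypothetical helpers introduced only for illustration; they are not part of the Text-to-Speech client library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: saves synthesized audio without depending on a
// machine-specific absolute path.
final class AudioFileWriter {

    /** Builds <user.home>/<fileName>, e.g. for "output.mp3". */
    static Path defaultOutputPath(String fileName) {
        return Paths.get(System.getProperty("user.home"), fileName);
    }

    /** Writes the audio bytes to target, creating missing parent directories first. */
    static Path write(byte[] audioBytes, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // Files.write creates the file, or truncates it if it already exists.
        return Files.write(target, audioBytes);
    }
}
```

In the run method, the inner try block could then be replaced by a call such as AudioFileWriter.write(audioContents.toByteArray(), AudioFileWriter.defaultOutputPath("output.mp3")), assuming the user's home directory is an acceptable location for the output file.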
In this post, you learned how to get started with the Google Cloud Text-to-Speech service using a Java/Spring Boot app.
Did you find this article useful? Do you have any questions or suggestions about this article? Leave a comment and ask your questions and I shall do my best to address your queries.