Amazon Polly is an AWS text-to-speech service that can be used to meet requirements such as delivering business or security alerts over a phone call. When integrated with communication providers such as Twilio, Amazon Polly can be used to build several value-added services.
In this post, you will learn how to create a sample Java app that uses Amazon Polly to convert text to speech. The program given below lets you listen to the following text: "Hello World! How are you doing? This is Polly. I am happy to talk with you." The post covers the following:
- Create a Spring Boot app
- Set up pom.xml for Amazon Polly and Audio Player
- Sample Code for Amazon Polly with Spring Boot and Java
- Run and Test the AmazonPolly
Create a Spring Boot App
Create a Spring Boot app in the Eclipse IDE by creating a new Spring Starter Project. This assumes you have installed the Spring Tools Eclipse plugin (aka Spring IDE and Spring Tool Suite), which you can find by going to Help > Eclipse Marketplace… and searching for the keyword “spring tools”.
Setup POM.xml for AmazonPolly and Audio Player
Add the following dependencies to your pom.xml file to work with the code given later in this post:
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-polly -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-polly</artifactId>
    <version>1.11.282</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.googlecode.soundlibs/jlayer -->
<dependency>
    <groupId>com.googlecode.soundlibs</groupId>
    <artifactId>jlayer</artifactId>
    <version>1.0.1.4</version>
</dependency>
Sample Code for Amazon Polly with Spring Boot and Java
The sample app uses Amazon Polly to play the following text: "Hello World! How are you doing? This is Polly. I am happy to talk with you." The code has three key parts:
- Main app which would invoke the custom class invoking Amazon Polly
- Custom class representing Amazon Polly
- Audio player for playing the speech stream generated using Amazon Polly
Main App invoking Custom Amazon Polly Class
This is where the custom class representing Amazon Polly is invoked.
import java.io.IOException;

import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.vflux.rbot.texttospeech.CustomPolly;

import javazoom.jl.decoder.JavaLayerException;

@SpringBootApplication
public class RecruiterbotApplication implements CommandLineRunner {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(RecruiterbotApplication.class);
        app.run(args);
    }

    @Override
    public void run(String... arg0) throws IOException, JavaLayerException {
        // Sample Hello World text
        String sampleText = "Hello World! How are you doing? This is Polly. I am happy to talk with you.";

        // Create the CustomPolly instance for the chosen region
        CustomPolly customPolly = new CustomPolly(Region.getRegion(Regions.US_EAST_1));

        // Have CustomPolly convert the text to speech and play it
        customPolly.play(sampleText);
    }
}
Custom Class representing Amazon Polly
This class wraps the Amazon Polly client. It does the following:
- An instance of AmazonPolly is created. You need to get your access key ID and secret access key from the AWS console.
- A voice is obtained for reading out the speech stream. In the code sample given below, the first voice in the list of available voices is used. I will present a code sample in a later post to get specific voices based on the region.
- The speech stream is synthesized from the text and the voice.
// Package taken from the import used in the main app
package com.vflux.rbot.texttospeech;

import java.io.IOException;
import java.io.InputStream;

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Region;
import com.amazonaws.services.polly.AmazonPolly;
import com.amazonaws.services.polly.AmazonPollyClientBuilder;
import com.amazonaws.services.polly.model.DescribeVoicesRequest;
import com.amazonaws.services.polly.model.DescribeVoicesResult;
import com.amazonaws.services.polly.model.OutputFormat;
import com.amazonaws.services.polly.model.SynthesizeSpeechRequest;
import com.amazonaws.services.polly.model.SynthesizeSpeechResult;
import com.amazonaws.services.polly.model.Voice;

import javazoom.jl.decoder.JavaLayerException;

public class CustomPolly {

    private AmazonPolly amazonPolly;

    public CustomPolly(Region region) {
        // Replace the placeholders with your own access key ID and secret access key
        // (obtain them from the AWS console); never commit real keys to source control
        BasicAWSCredentials awsCredentials = new BasicAWSCredentials("YOUR_ACCESS_KEY_ID", "YOUR_SECRET_ACCESS_KEY");

        // Create an Amazon Polly client in a specific region
        this.amazonPolly = AmazonPollyClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(awsCredentials))
                .withRegion(region.getName())
                .build();
    }

    public void play(String text) throws IOException, JavaLayerException {
        // Get the audio stream created from the text
        InputStream speechStream = this.synthesize(text, OutputFormat.Mp3);

        // Play the audio
        AudioPlayer.play(speechStream);
    }

    public InputStream synthesize(String text, OutputFormat format) throws IOException {
        // Get the default voice
        Voice voice = this.getVoice();

        // Create the speech synthesis request from the text, the voice
        // and the output format; these details are used to create the speech
        SynthesizeSpeechRequest synthReq = new SynthesizeSpeechRequest()
                .withText(text)
                .withVoiceId(voice.getId())
                .withOutputFormat(format);

        // Create the speech
        SynthesizeSpeechResult synthRes = this.amazonPolly.synthesizeSpeech(synthReq);

        // Return the audio stream
        return synthRes.getAudioStream();
    }

    public Voice getVoice() {
        // Create the describe voices request
        DescribeVoicesRequest describeVoicesRequest = new DescribeVoicesRequest();

        // Synchronously ask Amazon Polly to describe the available TTS voices
        DescribeVoicesResult describeVoicesResult = this.amazonPolly.describeVoices(describeVoicesRequest);

        // Use the first voice in the list
        return describeVoicesResult.getVoices().get(0);
    }
}
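The constructor above takes the access key pair as string literals, which is easy to leak once the code is shared. One possible alternative, shown here only as a minimal sketch, is to read the pair from environment variables. The `EnvCredentials` class is a hypothetical helper and not part of the AWS SDK; it assumes the variables are named `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.

```java
import java.util.Map;

// Hypothetical helper (not part of the AWS SDK): holds an access key pair
// that was read from environment variables instead of from source code.
final class EnvCredentials {

    final String accessKeyId;
    final String secretAccessKey;

    private EnvCredentials(String accessKeyId, String secretAccessKey) {
        this.accessKeyId = accessKeyId;
        this.secretAccessKey = secretAccessKey;
    }

    // Takes the environment as a Map so the lookup can be exercised without
    // setting real variables; pass System.getenv() in the application.
    static EnvCredentials from(Map<String, String> env) {
        String id = env.get("AWS_ACCESS_KEY_ID");          // assumed variable name
        String secret = env.get("AWS_SECRET_ACCESS_KEY");  // assumed variable name
        if (id == null || id.trim().isEmpty() || secret == null || secret.trim().isEmpty()) {
            // Fail fast with a clear message rather than sending empty keys to AWS
            throw new IllegalStateException(
                    "Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY before starting the app");
        }
        return new EnvCredentials(id.trim(), secret.trim());
    }
}
```

In the constructor, you could then call `EnvCredentials.from(System.getenv())` and pass its `accessKeyId` and `secretAccessKey` fields to `BasicAWSCredentials`. The AWS SDK also ships its own default credentials provider chain, which may make a helper like this unnecessary.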
AudioPlayer Class to play the Speech Stream
This class has a method that plays the speech stream created by the Amazon Polly speech synthesis API (synthesizeSpeech).
// Same package as CustomPolly, which calls AudioPlayer without an import
package com.vflux.rbot.texttospeech;

import java.io.InputStream;

import javazoom.jl.decoder.JavaLayerException;
import javazoom.jl.player.FactoryRegistry;
import javazoom.jl.player.advanced.AdvancedPlayer;
import javazoom.jl.player.advanced.PlaybackEvent;
import javazoom.jl.player.advanced.PlaybackListener;

public class AudioPlayer {

    public static void play(InputStream speechStream) throws JavaLayerException {
        // Create an MP3 player on the system's default audio device
        AdvancedPlayer player = new AdvancedPlayer(speechStream,
                FactoryRegistry.systemRegistry().createAudioDevice());

        // Log when playback starts and finishes
        player.setPlayBackListener(new PlaybackListener() {
            @Override
            public void playbackStarted(PlaybackEvent evt) {
                System.out.println("Playback started");
            }

            @Override
            public void playbackFinished(PlaybackEvent evt) {
                System.out.println("Playback finished");
            }
        });

        // Play the speech stream
        player.play();
    }
}
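Playing the stream needs an audio device, which may not be available on a server or in a container. As a rough sketch, the same speech stream could instead be written to an MP3 file using only the Java standard library. The `SpeechFileWriter` class and its `save` method are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: writes a speech stream to a file instead of playing it.
final class SpeechFileWriter {

    // Copies the whole stream to the target file, replacing any existing file,
    // closes the stream afterwards, and returns the number of bytes written.
    static long save(InputStream speechStream, Path target) throws IOException {
        try (InputStream in = speechStream) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

For example, `SpeechFileWriter.save(customPolly.synthesize(text, OutputFormat.Mp3), Paths.get("hello.mp3"))` would store the audio in a file that any MP3 player can open; the file name here is only an illustration.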
Run and Test the AmazonPolly
Right-click on RecruiterbotApplication and run it as a Java Application. Alternatively, right-click on the project and run it as a Spring Boot App. You should hear the following text: "Hello World! How are you doing? This is Polly. I am happy to talk with you."
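The voice you hear is whichever one the list of voices happens to return first, because `getVoice()` takes the first entry. If you want a particular language or voice, one possible selection rule is sketched below. `VoicePicker` and `VoiceInfo` are hypothetical; `VoiceInfo` is a stand-in for the SDK's `Voice` model so that the sketch runs without AWS access, and the voice names in the usage note are only sample data.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper: chooses a voice by language, preferring a named voice.
final class VoicePicker {

    // Stand-in for the SDK's Voice model, reduced to the two fields used here.
    static final class VoiceInfo {
        final String id;
        final String languageCode;

        VoiceInfo(String id, String languageCode) {
            this.id = id;
            this.languageCode = languageCode;
        }
    }

    // Returns the preferred voice if it exists in the given language,
    // otherwise the first voice in that language, otherwise an empty Optional.
    static Optional<VoiceInfo> pick(List<VoiceInfo> voices, String languageCode, String preferredId) {
        VoiceInfo firstInLanguage = null;
        for (VoiceInfo voice : voices) {
            if (!voice.languageCode.equalsIgnoreCase(languageCode)) {
                continue; // wrong language, skip
            }
            if (voice.id.equalsIgnoreCase(preferredId)) {
                return Optional.of(voice); // exact match on the preferred voice
            }
            if (firstInLanguage == null) {
                firstInLanguage = voice; // remember the fallback
            }
        }
        return Optional.ofNullable(firstInLanguage);
    }
}
```

With a list containing `Lotte` (`nl-NL`), `Joanna` (`en-US`) and `Matthew` (`en-US`), calling `pick(voices, "en-US", "Matthew")` selects `Matthew`, while an unknown preferred name falls back to `Joanna`. In the real app, the list would be built from the voices returned by `describeVoices`.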
Summary
In this post, you learned about getting started with Amazon Polly (AWS cloud service) with Spring Boot and Java.
Did you find this article useful? Do you have any questions or suggestions about this article in relation to creating a Spring Boot app that uses AmazonPolly AWS service? Leave a comment and ask your questions and I shall do my best to address your queries.