Amazon Polly can be combined with the Twilio phone service and AWS S3 to build an automated alert system that does the following:
- Convert text to speech (using Amazon Polly)
- Upload the audio (speech stream) created by Polly to an AWS S3 bucket
- Use the Twilio Call service to play the audio to the destination phone number
The following application architecture diagram (communication flow viewpoint) shows how the Spring Boot app communicates with Amazon Polly, Amazon S3 and Twilio to place automated phone alerts based on text-to-speech conversion.
This approach can be used to build an automated alert/notification system that phones the concerned personnel and informs them about incidents such as the following:
- Security alerts
- Production downtime
- Custom alerts related to one or more custom events
In this post, you will learn how to create a Spring Boot app that uses Twilio and Amazon services (S3 and Polly) to make automated phone calls to end users. The following topics are covered:
- Spring Boot App for invoking Amazon S3, Polly and Twilio Services
- Custom Class for invoking Amazon Polly Service
- Custom Class for invoking AWS S3 Storage APIs
- Custom Class for invoking Twilio Phone Service
- Application properties file
- Configuration beans created using application properties files
Spring Boot App for invoking Amazon S3, Polly and Twilio Services
Pay attention to the following steps, which are needed to make a phone call to an end user:
- Create speech stream using Amazon Polly service
- Upload the audio file to AWS S3 storage
- Invoke Twilio service to make the phone call and play the audio to the end user
import java.io.IOException;
import java.io.InputStream;
import java.util.UUID;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import com.vflux.rbot.storage.CloudStorage;
import com.vflux.rbot.texttospeech.CustomPolly;
import com.vflux.rbot.voice.VoiceService;

@SpringBootApplication
public class RecruiterbotApplication implements CommandLineRunner {

    @Autowired
    VoiceService twilioVoiceService;

    @Autowired
    CustomPolly customPolly;

    @Autowired
    CloudStorage awsCloudStorage;

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(RecruiterbotApplication.class);
        app.run(args);
    }

    @Override
    public void run(String... arg0) throws IOException {
        String text = "Hello Ajitesh. This is message from Microsoft talent acquisition team. We are happy to inform you that you have been shortlisted for next round of interview. Good Bye!";
        //
        // Create speech stream using Amazon Polly service
        //
        InputStream speechStream = this.customPolly.synthesisSpeechMP3Format(text);
        //
        // Upload the audio file to AWS S3 storage
        //
        String s3Key = UUID.randomUUID().toString() + ".mp3";
        String s3URL = this.awsCloudStorage.uploadAudioStream(s3Key, speechStream);
        //
        // Invoke Twilio service to make the phone call and play the audio to the end user
        //
        this.twilioVoiceService.playVoice("+9198877761234", s3URL);
    }
}
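The three steps above can also be wired behind small interfaces, so that the flow can be exercised without AWS or Twilio accounts. The following is only a sketch and is not part of the app's source code: the names SpeechSynthesizer, AudioStorage, PhoneCaller and AlertCaller are hypothetical, and the interfaces merely mirror the call shapes used in the listing above. The phone number and URL in main are placeholders.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical interfaces; they only mirror the call shapes used in the app.
interface SpeechSynthesizer {
    InputStream synthesize(String text) throws IOException;
}

interface AudioStorage {
    // Returns a URL from which the uploaded audio can be fetched.
    String upload(String key, InputStream audio) throws IOException;
}

interface PhoneCaller {
    void play(String toNumber, String audioUrl);
}

class AlertCaller {
    private final SpeechSynthesizer synthesizer;
    private final AudioStorage storage;
    private final PhoneCaller caller;

    AlertCaller(SpeechSynthesizer synthesizer, AudioStorage storage, PhoneCaller caller) {
        this.synthesizer = synthesizer;
        this.storage = storage;
        this.caller = caller;
    }

    // Runs the three steps in order and returns the URL of the uploaded audio.
    String alert(String toNumber, String text) throws IOException {
        String key = UUID.randomUUID() + ".mp3";
        try (InputStream speech = synthesizer.synthesize(text)) {
            String url = storage.upload(key, speech);
            caller.play(toNumber, url);
            return url;
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for Polly, S3 and Twilio.
        AlertCaller alerts = new AlertCaller(
                text -> new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)),
                (key, audio) -> "https://example.invalid/" + key,
                (to, url) -> System.out.println("Calling " + to + " with " + url));
        alerts.alert("+15550100", "Production is down");
    }
}
```

With this shape, the real Polly, S3 and Twilio classes would each sit behind one of the interfaces, and a unit test can substitute lambdas as main does here.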
Custom Class for invoking Amazon Polly Service
Pay attention to some of the following:
- A configurable language code and voice ID, which are used to create the audio in MP3 format (OutputFormat.Mp3)
- A BasicSessionCredentials instance, which is injected when the CustomPolly instance is constructed. This allows the Amazon services to be instantiated with temporary security credentials obtained from AWS STS
- The voice is retrieved based on the configured language code and voice ID. This code was written for the language code en-IN. You can set the language code and voice ID based on the details provided in the Amazon Polly Voice List page
- The Amazon Polly synthesizeSpeech API, which is used to create the speech stream
import java.io.IOException;
import java.io.InputStream;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicSessionCredentials;
import com.amazonaws.regions.Region;
import com.amazonaws.services.polly.AmazonPolly;
import com.amazonaws.services.polly.AmazonPollyClientBuilder;
import com.amazonaws.services.polly.model.DescribeVoicesRequest;
import com.amazonaws.services.polly.model.DescribeVoicesResult;
import com.amazonaws.services.polly.model.OutputFormat;
import com.amazonaws.services.polly.model.SynthesizeSpeechRequest;
import com.amazonaws.services.polly.model.SynthesizeSpeechResult;
import com.amazonaws.services.polly.model.Voice;

import javazoom.jl.decoder.JavaLayerException;

// AudioPlayer is the project's own helper class; its import is omitted here.

@Component
public class CustomPolly {

    @Autowired
    String pollyLanguageCode;

    @Autowired
    String pollyVoiceId;

    private AmazonPolly amazonPolly;

    public CustomPolly(@Autowired Region awsRegion, @Autowired BasicSessionCredentials sessionCredentials) {
        this.amazonPolly = AmazonPollyClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(sessionCredentials))
                .withRegion(awsRegion.getName())
                .build();
    }

    public InputStream synthesisSpeechMP3Format(String text) throws IOException {
        return this.synthesize(text, OutputFormat.Mp3);
    }

    public void play(String text) throws IOException, JavaLayerException {
        //
        // Get the audio stream created using the text
        //
        InputStream speechStream = this.synthesize(text, OutputFormat.Mp3);
        //
        // Play the audio
        //
        AudioPlayer.play(speechStream);
    }

    public InputStream synthesize(String text, OutputFormat format) throws IOException {
        //
        // Get the configured voice (or a fallback for the language code)
        //
        Voice voice = this.getVoice();
        //
        // Create the speech synthesis request from the text, the voice and the output format
        //
        SynthesizeSpeechRequest synthReq = new SynthesizeSpeechRequest()
                .withText(text)
                .withVoiceId(voice.getId())
                .withOutputFormat(format);
        //
        // Create the speech
        //
        SynthesizeSpeechResult synthRes = this.amazonPolly.synthesizeSpeech(synthReq);
        //
        // Return the audio stream
        //
        return synthRes.getAudioStream();
    }

    public Voice getVoice() {
        //
        // Create describe voices request for the configured language code.
        //
        DescribeVoicesRequest voicesRequest = new DescribeVoicesRequest().withLanguageCode(this.pollyLanguageCode);
        //
        // Synchronously ask Amazon Polly to describe available TTS voices.
        //
        DescribeVoicesResult voicesResult = this.amazonPolly.describeVoices(voicesRequest);
        //
        // Fall back to "raveena" when no voice id is configured
        //
        String configuredVoiceId = this.pollyVoiceId.trim();
        String pollyVoiceIdLower = configuredVoiceId.isEmpty() ? "raveena" : configuredVoiceId.toLowerCase();
        for (Voice candidate : voicesResult.getVoices()) {
            if (candidate.getId().toLowerCase().equals(pollyVoiceIdLower)) {
                return candidate;
            }
        }
        //
        // No match: use the first voice available for the language code
        //
        return voicesResult.getVoices().get(0);
    }
}
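The voice lookup can be reduced to a small pure function that is easy to test without calling Amazon Polly. The sketch below works on plain voice ID strings instead of the SDK's Voice objects; VoicePicker and its parameters are hypothetical names, and the sample IDs in main are only illustrative values for en-IN.

```java
import java.util.Arrays;
import java.util.List;

class VoicePicker {
    // Returns the available id that matches `preferred` (trimmed, case-insensitive).
    // A blank `preferred` is replaced by `fallbackId`; if nothing matches,
    // the first available id is returned.
    static String pick(List<String> availableIds, String preferred, String fallbackId) {
        if (availableIds.isEmpty()) {
            throw new IllegalArgumentException("No voices available for the language code");
        }
        String wanted = (preferred == null || preferred.trim().isEmpty())
                ? fallbackId
                : preferred.trim();
        return availableIds.stream()
                .filter(id -> id.equalsIgnoreCase(wanted))
                .findFirst()
                .orElse(availableIds.get(0));
    }

    public static void main(String[] args) {
        List<String> voices = Arrays.asList("Aditi", "Raveena");
        System.out.println(pick(voices, " raveena ", "Raveena")); // Raveena
        System.out.println(pick(voices, "Joanna", "Raveena"));    // Aditi (no match, first id)
    }
}
```

Throwing on an empty list makes a wrong language code fail with a clear message rather than with an index error.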
Custom Class for invoking AWS S3 Storage APIs
Pay attention to some of the following:
- An instance of AccessControlList, which is used to grant read permission on the file uploaded to S3
AccessControlList acl = new AccessControlList();
acl.grantPermission(GroupGrantee.AllUsers, Permission.Read);
- An instance of ObjectMetadata, which must be configured with the content type and content length of the content uploaded to the S3 bucket
ObjectMetadata objectMetaData = new ObjectMetadata();
byte[] bytes = IOUtils.toByteArray(inputStream);
objectMetaData.setContentLength(bytes.length);
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(bytes);
objectMetaData.setContentType("audio/mpeg");
- An instance of PutObjectRequest, which is instantiated with the AccessControlList and ObjectMetadata instances
PutObjectRequest putObjectRequest = new PutObjectRequest(this.awsS3DataBucket, keyName, byteArrayInputStream, objectMetaData)
        .withAccessControlList(acl);
- The putObject API, which is invoked with the PutObjectRequest instance
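import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import com.amazonaws.AmazonServiceException;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicSessionCredentials;
import com.amazonaws.regions.Region;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.AccessControlList;
import com.amazonaws.services.s3.model.GroupGrantee;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.Permission;
import com.amazonaws.services.s3.model.PutObjectRequest;
// Assumption: IOUtils is the AWS SDK utility; Apache Commons IO offers the same method.
import com.amazonaws.util.IOUtils;

@Component("awsCloudStorage")
public class AWSCloudStorage implements CloudStorage {

    public static final String S3_BASE_URL = "http://s3.amazonaws.com/";

    @Autowired
    String awsS3DataBucket;

    private AmazonS3 amazonS3;

    public AWSCloudStorage(@Autowired Region awsRegion, @Autowired BasicSessionCredentials sessionCredentials) {
        // The injected region is now passed to the builder, as in CustomPolly
        this.amazonS3 = AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(sessionCredentials))
                .withRegion(awsRegion.getName())
                .build();
    }

    public void uploadFile(String keyName, String filePath) {
        try {
            // Pass a File: putObject(String, String, String) would store the path text itself
            this.amazonS3.putObject(this.awsS3DataBucket, keyName, new File(filePath));
        } catch (AmazonServiceException e) {
            System.err.println(e.getErrorMessage());
        }
    }

    public String uploadAudioStream(String keyName, InputStream inputStream) throws IOException {
        try {
            //
            // Create Read permission for audio file
            //
            AccessControlList acl = new AccessControlList();
            acl.grantPermission(GroupGrantee.AllUsers, Permission.Read);
            //
            // Create ObjectMetadata for setting content length and content type
            //
            ObjectMetadata objectMetaData = new ObjectMetadata();
            byte[] bytes = IOUtils.toByteArray(inputStream);
            objectMetaData.setContentLength(bytes.length);
            ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(bytes);
            objectMetaData.setContentType("audio/mpeg");
            //
            // Put the object in the AWS S3 bucket
            //
            PutObjectRequest putObjectRequest = new PutObjectRequest(this.awsS3DataBucket, keyName,
                    byteArrayInputStream, objectMetaData).withAccessControlList(acl);
            this.amazonS3.putObject(putObjectRequest);
        } catch (AmazonServiceException e) {
            System.err.println(e.getErrorMessage());
        }
        return getS3URL(keyName);
    }

    private String getS3URL(String key) {
        return S3_BASE_URL + this.awsS3DataBucket + "/" + key;
    }
}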
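The speech stream is copied into a byte array before the upload because the content length has to be known before the object is sent, and a stream of unknown size cannot provide it. The following sketch shows that buffering step using only the JDK; BufferedAudio is a hypothetical helper and not part of the app's source code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedAudio {
    private final byte[] bytes;

    private BufferedAudio(byte[] bytes) {
        this.bytes = bytes;
    }

    // Reads the whole stream into memory so that its length is known.
    static BufferedAudio readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            out.write(chunk, 0, read);
        }
        return new BufferedAudio(out.toByteArray());
    }

    // Value that would be passed to ObjectMetadata.setContentLength.
    long contentLength() {
        return bytes.length;
    }

    // A fresh stream over the buffered bytes; can be requested more than once.
    InputStream newStream() {
        return new ByteArrayInputStream(bytes);
    }
}
```

Buffering is reasonable for short alert messages; for large audio files, holding everything in memory may not be acceptable.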
Custom Class for invoking Twilio Phone Service
Pay attention to some of the following:
- The Twimlet URL, which is built from the URL of the voice file. Twilio accesses it to play the voice file to the destination phone number
- The TwilioRestClient, which is used to invoke the Twilio Call service
- The From and To phone numbers, which are required for making phone calls
import java.net.URI;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import com.twilio.http.TwilioRestClient;
import com.twilio.rest.api.v2010.account.Call;
import com.twilio.type.PhoneNumber;

@Component
public class TwilioVoiceService implements VoiceService {

    private static final String APIVERSION = "2010-04-01";
    private static final String TWIMLET_BASE_URL = "http://twimlets.com/message?Message%5B0%5D=";

    private TwilioRestClient twilioRestClient;
    private String fromPhoneNumber;

    public TwilioVoiceService(@Autowired TwilioRestClient twilioRestClient, @Autowired String fromPhoneNumber) {
        this.twilioRestClient = twilioRestClient;
        this.fromPhoneNumber = fromPhoneNumber;
    }

    @Override
    public void playVoice(String toNumber, String voicePath) {
        String url = TWIMLET_BASE_URL + voicePath;
        PhoneNumber to = new PhoneNumber(toNumber); // Replace with your phone number
        PhoneNumber from = new PhoneNumber(this.fromPhoneNumber); // Replace with a Twilio number
        URI uri = URI.create(url);

        // Make the call
        Call call = Call.creator(to, from, uri).create(this.twilioRestClient);
        System.out.println(call.getSid());
    }
}
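Because the audio URL is passed as a query parameter of the Twimlet URL, characters such as "&", "=", "#" or spaces in a bucket name or key would break the query string. A cautious variant is to URL-encode the audio URL before appending it. The sketch below uses only the JDK; TwimletUrls and messageUrl are hypothetical names, and the bucket and key in the example are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

class TwimletUrls {
    private static final String TWIMLET_BASE_URL = "http://twimlets.com/message?Message%5B0%5D=";

    // Builds the Twimlet URL with the audio URL encoded as a query parameter value.
    static URI messageUrl(String audioUrl) {
        try {
            return URI.create(TWIMLET_BASE_URL + URLEncoder.encode(audioUrl, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(messageUrl("http://s3.amazonaws.com/my-bucket/abc.mp3"));
        // http://twimlets.com/message?Message%5B0%5D=http%3A%2F%2Fs3.amazonaws.com%2Fmy-bucket%2Fabc.mp3
    }
}
```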
Application properties file
Pay attention to some of the following in the code given below:
- AWS access key properties. These are stored as empty strings because the code uses AWS temporary security credentials to access AWS services
- AWS S3 bucket
- Amazon Polly language code and voice ID
- Twilio account SID, auth token and the phone number from which the call is made. Note that this phone number needs to be purchased from the Twilio web console
# AWS Properties
#
aws.access.key.id =
aws.access.key.secret =
aws.region = us-east-1
aws.s3.data.bucket = remainders-11032018
aws.temporary.credentials.validity.duration =
#
# Amazon Polly Properties
#
aws.polly.language.code = en-IN
aws.polly.voice.id = Raveena
#
# Twilio properties
#
twilio.account.sid = BD1a1234f87beee26b9876c8e8513e432
twilio.auth.token = zzz12341df987654c7a71b9c1234xz87
twilio.from.phone.number = +13111116667
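Since several of these properties may legitimately be left empty while others must be set, it can help to check the mandatory ones at startup. The sketch below is not part of the app: RequiredProperties is a hypothetical helper, and the choice of which keys count as required is an assumption based on how the properties are used in this post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

class RequiredProperties {
    // Assumption: these keys have no sensible default in this app.
    static final List<String> REQUIRED = Arrays.asList(
            "aws.region",
            "aws.s3.data.bucket",
            "twilio.account.sid",
            "twilio.auth.token",
            "twilio.from.phone.number");

    // Returns the required keys that are absent or blank.
    static List<String> missing(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader("aws.region = us-east-1\naws.s3.data.bucket =\n"));
        // Prints the keys that still need a value
        System.out.println(missing(props));
    }
}
```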
Configuration Beans created using Application Properties Files
Pay attention to the separate configuration classes used for creating the configuration beans for the AWS and Twilio services.
AWS Properties Configuration Beans
Pay attention to some of the following:
- The BasicSessionCredentials bean, which is used to create the Amazon service clients with AWS temporary security credentials
- The AWS credentials provider bean, which is created from the AWS access key ID and secret access key
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;

import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.auth.BasicSessionCredentials;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.Credentials;
import com.amazonaws.services.securitytoken.model.GetSessionTokenRequest;
import com.amazonaws.services.securitytoken.model.GetSessionTokenResult;

@Configuration
@PropertySource("classpath:application.properties")
public class AWSAppConfig {

    private static final Integer TEMPORARY_CREDENTIALS_DURATION_DEFAULT = 7200;

    @Value("${aws.access.key.id}")
    String awsKeyId;

    @Value("${aws.access.key.secret}")
    String awsKeySecret;

    @Value("${aws.region}")
    String awsRegion;

    @Value("${aws.s3.data.bucket}")
    String awsS3DataBucket;

    @Value("${aws.temporary.credentials.validity.duration}")
    String credentialsValidityDuration;

    @Value("${aws.polly.language.code}")
    String pollyLanguageCode;

    @Value("${aws.polly.voice.id}")
    String pollyVoiceId;

    @Bean(name = "awsKeyId")
    public String getAWSKeyId() {
        return awsKeyId;
    }

    @Bean(name = "awsKeySecret")
    public String getAWSKeySecret() {
        return awsKeySecret;
    }

    @Bean(name = "awsRegion")
    public Region getAWSPollyRegion() {
        return Region.getRegion(Regions.fromName(awsRegion));
    }

    @Bean(name = "awsCredentialsProvider")
    public AWSCredentialsProvider awsCredentialsProvider() {
        BasicAWSCredentials awsCredentials = new BasicAWSCredentials(this.awsKeyId, this.awsKeySecret);
        return new AWSStaticCredentialsProvider(awsCredentials);
    }

    @Bean(name = "sessionCredentials")
    public BasicSessionCredentials sessionCredentials() {
        // The default client resolves credentials through the SDK's default provider chain;
        // the interface type is used so that no cast is needed
        AWSSecurityTokenService stsClient = AWSSecurityTokenServiceClientBuilder.defaultClient();
        GetSessionTokenRequest sessionTokenRequest = new GetSessionTokenRequest();
        // Use the default duration when none is configured
        if (this.credentialsValidityDuration == null || this.credentialsValidityDuration.trim().equals("")) {
            sessionTokenRequest.setDurationSeconds(TEMPORARY_CREDENTIALS_DURATION_DEFAULT);
        } else {
            sessionTokenRequest.setDurationSeconds(Integer.parseInt(this.credentialsValidityDuration.trim()));
        }
        GetSessionTokenResult sessionTokenResult = stsClient.getSessionToken(sessionTokenRequest);
        Credentials sessionCreds = sessionTokenResult.getCredentials();
        return new BasicSessionCredentials(
                sessionCreds.getAccessKeyId(),
                sessionCreds.getSecretAccessKey(),
                sessionCreds.getSessionToken());
    }

    @Bean(name = "awsS3DataBucket")
    public String awsS3DataBucket() {
        return awsS3DataBucket;
    }

    @Bean(name = "pollyLanguageCode")
    public String pollyLanguageCode() {
        return pollyLanguageCode;
    }

    @Bean(name = "pollyVoiceId")
    public String pollyVoiceId() {
        return pollyVoiceId;
    }
}
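The handling of the validity duration can be isolated into a small function that applies the default and rejects malformed values early. The sketch below is only an illustration: SessionDuration is a hypothetical helper, and the lower and upper bounds are an assumption based on the range that STS GetSessionToken accepts for IAM users (900 to 129600 seconds), which should be verified against the current AWS documentation.

```java
class SessionDuration {
    static final int DEFAULT_SECONDS = 7200;  // same default as in the configuration class
    static final int MIN_SECONDS = 900;       // assumed lower bound (15 minutes)
    static final int MAX_SECONDS = 129_600;   // assumed upper bound (36 hours)

    // Resolves the configured value to a duration in seconds:
    // blank means default, non-numeric is rejected, out-of-range is clamped.
    static int resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_SECONDS;
        }
        int seconds;
        try {
            seconds = Integer.parseInt(configured.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "aws.temporary.credentials.validity.duration must be a whole number of seconds: "
                            + configured, e);
        }
        return Math.max(MIN_SECONDS, Math.min(MAX_SECONDS, seconds));
    }

    public static void main(String[] args) {
        System.out.println(resolve(""));       // 7200
        System.out.println(resolve(" 3600 ")); // 3600
        System.out.println(resolve("60"));     // 900
    }
}
```

Clamping is one possible policy; failing fast on out-of-range values would be an equally valid choice.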
Twilio Properties Configuration Beans
Pay attention to the creation of the bean of type TwilioRestClient, which can be used for invoking Twilio services.
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;

import com.twilio.http.TwilioRestClient;
import com.twilio.http.TwilioRestClient.Builder;

@Configuration
@PropertySource("classpath:application.properties")
public class TwilioAppConfig {

    @Value("${twilio.account.sid}")
    String accountSID;

    @Value("${twilio.auth.token}")
    String authToken;

    @Value("${twilio.from.phone.number}")
    String fromPhoneNumber;

    @Bean(name = "twilioRestClient")
    public TwilioRestClient twilioRestClient() {
        return (new Builder(this.accountSID, this.authToken)).build();
    }

    @Bean(name = "fromPhoneNumber")
    public String fromPhoneNumber() {
        return this.fromPhoneNumber;
    }

    @Bean(name = "twilioAccountSID")
    public String twilioAccountSID() {
        return this.accountSID;
    }

    @Bean(name = "twilioAuthToken")
    public String twilioAuthToken() {
        return this.authToken;
    }
}
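Twilio expects phone numbers in E.164 format (a plus sign followed by the country code and number), so a malformed value of twilio.from.phone.number only shows up when the first call is attempted. A simple format check when the bean is created can surface the problem earlier. The sketch below is a hypothetical helper, not part of the app, and it checks the format only, not whether the number belongs to the Twilio account.

```java
import java.util.regex.Pattern;

class PhoneNumbers {
    // E.164: a plus sign, a non-zero leading digit, at most 15 digits in total.
    private static final Pattern E164 = Pattern.compile("^\\+[1-9]\\d{1,14}$");

    // Returns the trimmed number, or throws if it is not in E.164 format.
    static String requireE164(String number) {
        String trimmed = number == null ? "" : number.trim();
        if (!E164.matcher(trimmed).matches()) {
            throw new IllegalArgumentException("Not an E.164 phone number: " + number);
        }
        return trimmed;
    }

    public static void main(String[] args) {
        System.out.println(requireE164(" +13111116667 ")); // +13111116667
    }
}
```

The fromPhoneNumber bean method could return requireE164(this.fromPhoneNumber) so that a bad value stops the application at startup.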
Summary
In this post, you learned how to invoke the Twilio API from a Spring Boot Java app to play audio to a destination phone number, with the audio created using the Amazon Polly service and uploaded to an AWS S3 bucket.
Did you find this article useful? Do you have any questions or suggestions about this article in relation to integrating Amazon Polly with AWS S3 and Twilio API using Spring Boot Java app? Leave a comment and ask your questions and I shall do my best to address your queries.