Malek-Zaag/Serverless-text-to-speech-api

Serverless text-to-speech API with AWS API Gateway and CloudFront

Introduction :

In this blog, we will set up a Lambda function that calls Amazon Polly, a managed text-to-speech service, to convert text into a pleasant-sounding, customizable human voice.

[Figure: Project workflow]

We start by defining the technologies we used in this mini project :

  • AWS Lambda is a compute service that lets you run code without provisioning or managing servers. Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, and logging.

  • Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. Customers of all sizes and industries can store and protect any amount of data for virtually any use case, such as data lakes, cloud-native applications, and mobile apps.

  • Amazon Polly is a Text-to-Speech (TTS) cloud service that converts text into lifelike speech. You can use Amazon Polly to develop applications that increase engagement and accessibility.

  • Amazon API Gateway is an AWS service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale.

  • Amazon CloudFront is a web service that speeds up distribution of your static and dynamic web content, such as .html, .css, .js, and image files, to your users. CloudFront delivers your content through a worldwide network of data centers called edge locations.

Setting up our Lambda function :

[Figure: Lambda setup]

Now that we have a Lambda function with the default code, we add a custom test event and run a test to try the function out :

IAM Execution Role Setup :

A Lambda function's execution role is an AWS Identity and Access Management (IAM) role that grants the function permission to access AWS services and resources. For example, you might create an execution role that has permission to send logs to Amazon CloudWatch and upload trace data to AWS X-Ray.

We start by going to the IAM section in the console :

We add the desired permissions :

[Figure: Permissions]

We give a name for our role :
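The exact permissions chosen in the console are not listed in this post. As a rough sketch, a least-privilege version of the role could look like the policy documents below. The bucket name and the `audios/` prefix come from the Lambda code in this post; the helper names and the choice of actions (`polly:SynthesizeSpeech`, `s3:PutObject`, and the CloudWatch Logs actions) are assumptions, not the author's actual configuration.

```python
import json

# Bucket name taken from the Lambda code in this post.
BUCKET_NAME = "serverless-api-bucket24"


def build_trust_policy() -> dict:
    """Trust policy that lets the Lambda service assume the execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_permissions_policy(bucket_name: str) -> dict:
    """Assumed minimal permissions: call Polly, write audio to S3, write logs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "polly:SynthesizeSpeech",
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Only the prefix the function writes to.
                "Resource": f"arn:aws:s3:::{bucket_name}/audios/*",
            },
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_permissions_policy(BUCKET_NAME), indent=2))
```

The printed JSON can be pasted into the IAM policy editor; AWS managed policies would also work but usually grant more than the function needs.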

Now it is time to add some code that sends the request to Polly so that it produces the desired audio :

import base64

import boto3

# Create the clients once so that warm invocations reuse them.
polly = boto3.client('polly')
s3 = boto3.client('s3')

BUCKET_NAME = "serverless-api-bucket24"
AUDIO_KEY = "audios/audio.mp3"

def generateAudioFromText(text):
    # Ask Polly to synthesize the text as an MP3 stream.
    # 'Bianca' is an Italian voice; choose another VoiceId for other languages.
    response = polly.synthesize_speech(Text=text,
                                       TextType="text",
                                       OutputFormat='mp3',
                                       VoiceId='Bianca')
    audio_stream = response['AudioStream'].read()
    # Keep a copy of the audio in the S3 bucket.
    s3.put_object(Bucket=BUCKET_NAME, Key=AUDIO_KEY, Body=audio_stream)
    # Binary data has to be returned as base64-encoded text.
    return base64.b64encode(audio_stream).decode('utf-8')

def lambda_handler(event, context):
    audio_base64 = generateAudioFromText("Hello world!")
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'audio/mpeg',
        },
        'body': audio_base64,
        'isBase64Encoded': True
    }

The function stores the audio in an S3 bucket that we had already created :

We trigger our function and we see the output in the appropriate folder :

We also tried to add a longer text to convert :

The result was satisfactory.
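One thing to keep in mind with longer input: a single `synthesize_speech` call accepts only a limited amount of text (3000 billed characters at the time of writing, as far as I know; check the current Amazon Polly quotas). A minimal sketch of splitting a long text into chunks on word boundaries is shown below. The `split_text` helper and its default limit are assumptions for illustration and are not part of the original function.

```python
from typing import List


def split_text(text: str, max_chars: int = 3000) -> List[str]:
    """Split text into chunks of at most max_chars, breaking on whitespace."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into fixed-size pieces.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The word does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be passed to `generateAudioFromText` separately, producing one audio stream per chunk.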

API Gateway :

Now we add an entry point for our serverless API through API Gateway :

Now that API Gateway is set up, we have a URL that we can use to trigger our Lambda function :

We can send a simple GET request to get our output file :
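For readers who prefer a script over a browser, the GET request could be sketched in Python as follows. The URL is a placeholder, not the real endpoint, and the sketch assumes that API Gateway decodes the base64 body and returns the raw MP3 bytes. The `opener` parameter exists only so the function can be exercised without network access.

```python
import urllib.request
from typing import Callable

# Hypothetical invoke URL: replace it with the one shown in the API Gateway console.
API_URL = "https://example.execute-api.us-east-1.amazonaws.com/"


def fetch_audio(url: str, opener: Callable = urllib.request.urlopen) -> bytes:
    """Send a GET request and return the raw response body (the MP3 bytes)."""
    with opener(url) as response:
        return response.read()


def save_audio(url: str, path: str = "audio.mp3",
               opener: Callable = urllib.request.urlopen) -> int:
    """Download the audio and write it to path; return the number of bytes written."""
    data = fetch_audio(url, opener)
    with open(path, "wb") as audio_file:
        audio_file.write(data)
    return len(data)
```

Calling `save_audio(API_URL)` with the real invoke URL should leave an `audio.mp3` file in the current directory.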

Now that this works, we can send our own text to the converter by adding custom routes in the gateway and integrating them with the Lambda function :

We added integrations :

After changing the Lambda code to handle the different routes, we can test these routes :
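The updated handler is not reproduced in this post. A minimal sketch of how such route handling could look is given below. The `/speech` route, the `text` query parameter, and the `handle_request` helper are hypothetical names chosen for illustration; the event fields (`rawPath` for HTTP APIs, `path` for REST APIs, and `queryStringParameters`) are those of the Lambda proxy integration.

```python
import base64
import json


def synthesize(text: str) -> bytes:
    """Call Amazon Polly and return the MP3 bytes."""
    import boto3  # imported lazily so the routing logic can be tested without AWS

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=text, TextType="text", OutputFormat="mp3", VoiceId="Bianca"
    )
    return response["AudioStream"].read()


def _json_response(status: int, payload: dict) -> dict:
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handle_request(event: dict, synthesize_fn=synthesize) -> dict:
    """Dispatch on the request path and read the text from the query string."""
    path = event.get("rawPath") or event.get("path") or "/"
    params = event.get("queryStringParameters") or {}
    if path == "/speech":  # hypothetical route name
        text = (params.get("text") or "").strip()
        if not text:
            return _json_response(400, {"error": "missing 'text' query parameter"})
        audio = synthesize_fn(text)
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "audio/mpeg"},
            "body": base64.b64encode(audio).decode("utf-8"),
            "isBase64Encoded": True,
        }
    return _json_response(404, {"error": f"no route for {path}"})


def lambda_handler(event, context):
    return handle_request(event)
```

Keeping the Polly call behind `synthesize_fn` lets the routing be checked locally with a stub before deploying.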

We can retrieve our audio as the response to the GET request :

CloudFront :

Since we want a nice-looking front page for our API, we use CloudFront to serve a static HTML file hosted in the same S3 bucket created previously :

Now we configure our CloudFront distribution with the appropriate origin :

And we add a bucket policy to the S3 bucket to allow CloudFront to access the HTML file :
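The policy itself is not shown in this post. Assuming the distribution uses origin access control (OAC), the bucket policy could be sketched as below. The account ID and distribution ID in the usage example are placeholders, and the helper name is hypothetical.

```python
import json


def build_cloudfront_bucket_policy(bucket_name: str, account_id: str,
                                   distribution_id: str,
                                   key_pattern: str = "*") -> dict:
    """Bucket policy that lets one CloudFront distribution read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/{key_pattern}",
                "Condition": {
                    "StringEquals": {
                        # Restrict access to this specific distribution.
                        "AWS:SourceArn": (
                            f"arn:aws:cloudfront::{account_id}"
                            f":distribution/{distribution_id}"
                        )
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account and distribution IDs.
    policy = build_cloudfront_bucket_policy(
        "serverless-api-bucket24", "111122223333", "EDFDVBD6EXAMPLE"
    )
    print(json.dumps(policy, indent=2))
```

Passing a narrower `key_pattern` (for example the name of the HTML file) limits CloudFront to that object instead of the whole bucket.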

Final Result :

Now we have a nice-looking front page that sends text to the serverless API and gets back the generated audio file, which can also be downloaded :

Was this helpful? Confusing? If you have any questions, feel free to contact me!
