Python Archives - ab

Generative AI With Amazon Bedrock and Unity3D

Alexandre Bruffa — Thu, 19 Oct 2023 16:08:19 +0000

A few days ago, Amazon made a big announcement: the service Bedrock was finally released.

If you missed the news, here it is: Amazon decided to enter the generative AI field in a big way and invested $4 billion in the company Anthropic in order to compete with popular AI services like ChatGPT, Dall-E, MidJourney, etc. Amazon’s first offering is Bedrock, a fully managed AI service that offers multiple foundation models such as Claude, Jurassic, and Stable Diffusion XL, among others.

In my previous articles, I created Unity3D applications communicating with ChatGPT and Dall-E. Let’s do the same with Bedrock!

Generating AI Images With DALL-E, AWS, and Unity3D

Do you prefer something more interactive instead of reading? Check my video!

General Architecture

Here is the general architecture of our application:

We use 3 Amazon services:

API Gateway to create, expose, and consume endpoints.
Lambda to receive the client’s requests and communicate with Bedrock.
Bedrock to generate answers based on the client’s prompts.

If you want to go further and add a login mechanism with Cognito, please refer to my previous article:

I made ChatGPT talk using Unity3D and AWS

Bedrock

First, we have to enable the foundation models we want to use. By default, all the models are disabled. Go to the Bedrock console and enable the models you will use in the model access section. If your payment method is up to date, activating the models only takes a few minutes.

Model access section in the Bedrock console

If you don’t enable the models in the Bedrock console, you will receive this error in Lambda:

An error occurred (AccessDeniedException) when calling the InvokeModel operation: Your account is not authorized to invoke this API operation.

Lambda, Boto3 and Bedrock

If you follow my work, you know I’m a Python guy, so I will use the Python SDK for AWS, Boto3. Good news: Bedrock is available on Boto3; bad news: Lambda runs an old version of Boto3 in which Bedrock does not exist yet.

Here is a workaround: creating a new Lambda layer with an up-to-date version of Boto3. This is how to do it:

1. In PyCharm, create a new project and install boto3 in a new virtual environment. Boto3 and all its dependencies will be installed.

PyCharm project

2. Go to the venv folder of your project and copy the lib folder into a new folder called python. Then, zip it.

venv folder

3. Go to the Layers section of the Lambda console and create a new layer. Upload the zip file we have created before and choose the same Python version as your local Python virtual environment for the runtime.

Creating a new Layer

The Lambda Functions

We create 2 Lambda functions, one for text generation and the other for image generation. Both use the same runtime version as the layer (in my case, Python 3.9), and we add the layer we created before to each of them.

Adding a new layer to the Lambda function

The text generation function

For the text generation, I use the Claude model and implement it as described in the Amazon documentation.
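As a minimal sketch of what such a function could look like, assuming the Claude text-completion request format (a "prompt" wrapped in Human/Assistant turns and a "max_tokens_to_sample" limit). The model ID, the event shape, and the helper names are assumptions for illustration, not the article's exact listing:

```python
import json

# Assumed model ID; check the Bedrock documentation for the current value.
CLAUDE_MODEL_ID = "anthropic.claude-v2"


def build_claude_body(prompt, max_tokens=300):
    """Build the JSON body for Claude's text-completion format on Bedrock."""
    return json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    })


def generate_text(bedrock_runtime, prompt):
    """Invoke Claude through a bedrock-runtime client and return the answer."""
    response = bedrock_runtime.invoke_model(
        modelId=CLAUDE_MODEL_ID,
        contentType="application/json",
        accept="application/json",
        body=build_claude_body(prompt),
    )
    payload = json.loads(response["body"].read())
    return payload["completion"].strip()


def lambda_handler(event, context):
    # boto3 is imported lazily so the helpers above can be tested without AWS.
    import boto3

    client = boto3.client("bedrock-runtime")
    # Hypothetical event shape: {"prompt": "..."}
    return {"answer": generate_text(client, event["prompt"])}
```

Passing the client as a parameter keeps the Bedrock call separate from the Lambda plumbing, which makes the request body easy to check locally.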

The image generation function

For the image generation, I use the Stable Diffusion XL model and implement it as described in the Amazon documentation.

Note that the model IDs can be found in the Bedrock documentation.
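A rough sketch of the image function, assuming the Stable Diffusion XL request format on Bedrock ("text_prompts", "cfg_scale", "steps", "seed") and a response whose first artifact carries the base64 image. The model ID, default values, and event shape are assumptions; use the ID listed in the Bedrock documentation:

```python
import json

# Assumed model ID; the exact version suffix depends on what your account offers.
SDXL_MODEL_ID = "stability.stable-diffusion-xl-v0"


def build_sdxl_body(prompt, cfg_scale=10, steps=50, seed=0):
    """Build the JSON body for a Stable Diffusion XL text-to-image request."""
    return json.dumps({
        "text_prompts": [{"text": prompt}],
        "cfg_scale": cfg_scale,
        "steps": steps,
        "seed": seed,
    })


def generate_image_base64(bedrock_runtime, prompt):
    """Invoke the model and return the first generated image as base64 text."""
    response = bedrock_runtime.invoke_model(
        modelId=SDXL_MODEL_ID,
        contentType="application/json",
        accept="application/json",
        body=build_sdxl_body(prompt),
    )
    payload = json.loads(response["body"].read())
    return payload["artifacts"][0]["base64"]


def lambda_handler(event, context):
    # boto3 is imported lazily so the helpers above can be tested without AWS.
    import boto3

    client = boto3.client("bedrock-runtime")
    # Hypothetical event shape: {"prompt": "..."}
    return {"image": generate_image_base64(client, event["prompt"])}
```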

The Lambda Policy

In order to allow Lambda to invoke a Bedrock model, we create a new managed policy and add it to the Lambda functions’ roles:
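Such a policy could look like the document built below. This is a sketch that allows the bedrock:InvokeModel action on every resource; the helper name is hypothetical, and in practice you may prefer to restrict "Resource" to the ARNs of the models you enabled:

```python
import json


def bedrock_invoke_policy(resource="*"):
    """Return an IAM policy document allowing Bedrock model invocation."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": resource,
            }
        ],
    }


# Print the JSON to paste into the IAM policy editor.
print(json.dumps(bedrock_invoke_policy(), indent=2))
```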

API Gateway

In the API Gateway console, we create a new REST API with 2 resources: image and text. Each resource has a POST method integrated with the Lambda functions we created before.

API resources

When deploying the API, you will be asked to create a new stage. After the creation of the stage, the final endpoint will be shown. You will have to use this URL in the Unity client to perform the request.

API stages

The Unity Client

The Unity client allows the user to send a message and receive a generated text or image as an answer:

Handling the request

First, we need to create serializable classes to hold the data we send and receive:

Sending a request in Unity is an asynchronous process. Therefore, we create a coroutine that sends the request to the server:

Dealing with the generated image

If we ask for an image generation, the result sent by Bedrock is a base64 image. To show it on the Unity UI, we must convert it into bytes and create a Texture2D. Then, we assign the texture to the RawImage component of the UI.

Result

Here is the result with text and image generations:

Costs

Let’s check with the AWS Calculator and the Bedrock pricing sheet what our system cost would be for a very pessimistic scenario: you love the app, and you perform 2,000 requests a month. 1,000 for text generation with the Claude model and 1,000 for image generation with Stable Diffusion XL.

API Gateway: With 2,000 requests to our REST API, the monthly cost is 0.01 USD.
Lambda: With 2,000 requests with an average time of 5 seconds and 512 MB of memory allocation, the monthly cost is 0.08 USD.
Bedrock: 1,000 text generations per month with a maximum of 300 output tokens and 20 input tokens per request would cost, in the worst case, $0.2204 (input) + $9.804 (output). 1,000 image generations would give us a bill of $18.

Total: The total bill for our system would be approximately 28 USD monthly. It’s quite affordable!
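The arithmetic behind this total can be checked with a few lines. The per-1,000-token prices and the per-image price below are not quoted in the article; they are the values implied by the $0.2204, $9.804, and $18 figures above:

```python
# Monthly figures taken from the bullets above (USD).
api_gateway = 0.01
lambda_cost = 0.08

requests = 1_000
input_tokens, output_tokens = 20, 300

# Prices per 1,000 tokens implied by the article's Bedrock figures (assumption).
price_in_per_1k = 0.01102
price_out_per_1k = 0.03268
# Price per image implied by $18 for 1,000 images (assumption).
price_per_image = 0.018

claude_input = requests * input_tokens / 1000 * price_in_per_1k
claude_output = requests * output_tokens / 1000 * price_out_per_1k
images = 1_000 * price_per_image

total = api_gateway + lambda_cost + claude_input + claude_output + images
print(f"{total:.2f}")  # prints 28.11
```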

Final Thoughts

Using Bedrock is very intuitive, and integrating it with other Amazon services is absolutely straightforward. The fact that Bedrock is an in-house product makes the integration easier: there is no need for API keys or external requests to make it work, resulting in greater performance and security.

All the code in this article has been tested using Unity 2021.3.3 and Visual Studio Community 2022 for Mac. The mobile device I used to run the Unity app is a Galaxy Tab A7 Lite with Android 11.

Thanks for reading until the end!

Resources

Here are the resources mentioned in the article:

The Unity package of the application
The text generation function in Python
The image generation function in Python

The post Generative AI With Amazon Bedrock and Unity3D appeared first on ab.

Detecting Unsafe Content With Unity3D and Amazon Rekognition

Alexandre Bruffa — Wed, 19 Jul 2023 17:31:08 +0000

This article was initially published on my Medium Page.

If you prefer watching a video instead of reading, here it is:

My Brief Journey on TikTok

A few weeks ago, I finally decided to try TikTok. I downloaded the app and created a new account. The first thing I always do just after registering on a social network is upload a profile picture. Strangely, TikTok did not let me do it. I tried two more times the same day, without success. After that, I also tried to add friends and change my username, but still without success. I waited a couple of days, reopened the app, and got this message:

Permanently banned message in Spanish.

I have been permanently banned from TikTok, and I don’t know why! Then I started to think. What did I do wrong? I did not post any publications or messages. I used a valid email to register, entered my real name, and tried to upload a real picture of myself. I don’t believe my email and my name are a problem. What about my profile picture? Did TikTok detect it as inappropriate? Is it because I have no hair? How could I find out?

The System I built

Then I remembered that Amazon has a service for unsafe content detection: Rekognition; let’s try it! Additionally, you may remember this previous article of mine where I created a chat in Unity showing images from Dall-E. I will reuse the interface for this article.

This is how it works:

The Unity client logs in through a Cognito User Pool.
The client then uploads a picture to S3.
Finally, the client calls a Lambda function that works directly with Rekognition to determine whether the picture contains unsafe content.

Amazon Services

For Cognito, I set up authenticated access using a User Pool coupled with an Identity Pool. A policy attached to the authenticated role of the Identity Pool allows users to access S3 (PutObject) and Lambda (InvokeFunction). If you want to know in detail how to do it, please check this previous article of mine:

Connecting Unity3D With AWS Services

The S3 Bucket is private, so nobody but the authenticated users can access it. I called it censorimagesbucket.

I wrote the Lambda function in Python:

Notes:

I used boto3, the AWS SDK for Python, to connect with Rekognition.
I used all of Rekognition’s top-level moderation label categories. My face may be offensive in many ways; we never know.
I used Rekognition’s DetectModerationLabels API through the detect_moderation_labels function and the S3Object parameter.
I used the very cool Pythonic any function.
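The notes above could translate into something like the following sketch. The category list, the confidence threshold, and the "filename" payload key are assumptions for illustration; check the Rekognition documentation for the current moderation taxonomy:

```python
# Assumed set of top-level moderation categories (the taxonomy may have changed).
UNSAFE_CATEGORIES = {
    "Explicit Nudity", "Suggestive", "Violence", "Visually Disturbing",
    "Rude Gestures", "Drugs", "Tobacco", "Alcohol", "Gambling", "Hate Symbols",
}

BUCKET = "censorimagesbucket"


def is_unsafe(rekognition, bucket, filename, min_confidence=50):
    """Return True if Rekognition flags the S3 image with any unsafe category."""
    response = rekognition.detect_moderation_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": filename}},
        MinConfidence=min_confidence,
    )
    # A label counts if it is a top-level category or a child of one.
    return any(
        label["Name"] in UNSAFE_CATEGORIES
        or label.get("ParentName") in UNSAFE_CATEGORIES
        for label in response["ModerationLabels"]
    )


def lambda_handler(event, context):
    # boto3 is imported lazily so is_unsafe can be tested without AWS.
    import boto3

    rekognition = boto3.client("rekognition")
    # Hypothetical payload shape sent by the client: {"filename": "picture.jpg"}
    return {"unsafe": is_unsafe(rekognition, BUCKET, event["filename"])}
```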

Don’t forget to attach to the function’s role a policy allowing access to Rekognition and S3:

Unity

The connection between Unity3D and Amazon services is done through the AWS SDK for .NET, as described in my previous article. However, the function invocation is different this time because we have to send the filename to Lambda. We do it thanks to the Payload parameter.

Result

I performed three tests: one with a picture of Bibi, my daughter’s favorite teddy bear; one with a picture of a girl wearing a bikini; and one with a picture of myself.

It’s working great! The only unsafe content (the girl in a bikini) has been filtered successfully, and the following moderation labels have been detected:

The result is very accurate, with high confidence rates.

Conclusion

Apparently, my face is not unsafe content! I still don’t know why I have been permanently banned from TikTok, but thanks to this misadventure, I had the opportunity to discover Amazon Rekognition and build an implementation with Unity3D.

All ids and tokens shown in this article are fake or expired; if you try to use them, you will not be able to establish any connections.

All the code in this article has been tested using Unity 2021.3.3 and Visual Studio Community 2022 for Mac. The mobile device I used to run the Unity app is a Galaxy Tab A7 Lite with Android 11.

Resources

Here are all the resources mentioned in the article:

The Lambda function in Python
The Lambda policy attached to the Lambda role
The Cognito policy attached to the Cognito role
The Unity package of the application

The post Detecting Unsafe Content With Unity3D and Amazon Rekognition appeared first on ab.

Building a Real-Time Multiplayer Game With Unity3D and GameLift

Alexandre Bruffa — Tue, 18 Jul 2023 19:28:27 +0000

This article was initially published on my Medium Page.

You are talented with Unity and game creation in general, and it’s time for you to move up a gear by creating a real-time multiplayer game, but you have no idea how to do it, and you know very little about cloud computing. Don’t worry, we will see how to do it step by step.

This article is based on the excellent tutorial by Chris Blackwell. Please take a look at it!

Disclaimer: This is not an article about game design. I will focus on the technical aspect of creating a real-time multiplayer game.

Here is the final result:

Preamble

When one starts to work on a real-time multiplayer game, a lot of questions come to mind:

How can I match the players in one or multiple rooms (game sessions)? How can my players communicate with each other? How can I achieve very low latency for my game?

Representation of a game server

Short answer: Amazon GameLift.

Amazon GameLift is a managed service for multiplayer games with the following characteristics:

It deploys a set of EC2 instances (fleet) for you, so you won’t have to worry about setting up EC2 instances, securing them, etc.
It operates the EC2 instances for you: you will never connect to the instances nor install anything. A game server is automatically set up when you run a new GameLift fleet.
It scales the service for you. No matter how many users play your game, GameLift automatically scales up or down according to the demand.
It connects the players through the game server, creating game and player sessions.

The Architecture

We can build a real-time multiplayer game with just a few components. Here is our architecture:

This is how it works:

The client (Unity application on a mobile device) joins the game by calling a Lambda function.
The Lambda function instructs GameLift to create a new game session if no game session is available and to create a new player session. The function returns the necessary information to connect to the game server (IP, port, credentials, etc.).
The client establishes a new connection with the game server, and the user is ready to play.

Lambda

The Lambda function (Python 3.10) will work directly with GameLift and will have the following behavior:

The original tutorial presented a Node.js function, but since I’m a Python guy, I rewrote the function in Python. You can find the whole function in the last section of this article.
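The core of that behavior might be sketched as follows, using the boto3 GameLift calls search_game_sessions, create_game_session, and create_player_session. The function name, its parameters, and the returned dictionary are assumptions for illustration, not the article's full function:

```python
def join_game(gamelift, fleet_id, player_id, max_players):
    """Find (or create) a game session and reserve a player slot in it."""
    # Look for a session on the fleet that still has room for a player.
    found = gamelift.search_game_sessions(
        FleetId=fleet_id,
        FilterExpression="hasAvailablePlayerSessions=true",
    )
    sessions = found["GameSessions"]

    if sessions:
        game_session = sessions[0]
    else:
        # No session available: ask GameLift to start a new one.
        created = gamelift.create_game_session(
            FleetId=fleet_id,
            MaximumPlayerSessionCount=max_players,
        )
        game_session = created["GameSession"]

    player_session = gamelift.create_player_session(
        GameSessionId=game_session["GameSessionId"],
        PlayerId=player_id,
    )["PlayerSession"]

    # Connection details the client needs to reach the game server.
    return {
        "IpAddress": player_session["IpAddress"],
        "Port": player_session["Port"],
        "PlayerSessionId": player_session["PlayerSessionId"],
    }
```

A freshly created game session is not immediately ready to accept players, so a real function would likely wait until it becomes active (for example, by polling its status) before creating the player session; that wait is left out of this sketch.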

Also, the Lambda function needs the following permissions to access the GameLift service: CreateGameSession, DescribeGameSessionDetails, CreatePlayerSession, and SearchGameSessions. You can create a new managed IAM policy with those permissions and add it to the Lambda function role.

You can find the policy at the end of this article.

Cognito

In previous articles of mine, I performed user authentication through IAM users or Cognito User Pools. We don’t need user authentication this time: we will use guest access to avoid any restrictions; everyone who downloads our game can play immediately.

This time, the Unity client invokes a Lambda function, and the best way to give access to AWS resources without credentials (guest access) is through a Cognito Identity Pool. A Cognito Identity Pool allows both authenticated and guest users, so if you want to enable some authentication feature in the future (for example, linking users’ Facebook or Twitter accounts to save game progress), you could do it easily.

Creating a Cognito Identity Pool with guest access

We must attach the necessary permissions (Lambda Invoke) to the Identity Pool we created. You can edit the guest role associated with the Identity Pool and create a new policy to achieve it. You can find the policy at the end of this article.

GameLift

Before running a new GameLift fleet, we need to write a script that will be attached to the fleet. The first step is defining the behavior of the game server. This is how a game session will be managed:

Based on the above flow, we can write and upload a JavaScript script to GameLift. Then, we can create a new fleet and include our script during the fleet creation. Check the end of this article for the script.

GameLift Script

GameLift Fleet using the script

Before configuring the auto-scaling of your fleet, you first need to set limits for it. Imagine that your game has an unexpected success; you don’t want to run out of budget in a few days! In the scaling capacity section, you can set a minimum and a maximum number of instances for your fleet:

Scaling options in Amazon GameLift

Now, let’s create an auto-scaling policy. An EC2 instance usually takes one or two minutes to start, which can inconvenience the players joining the game. It is therefore better to start an instance preemptively, before reaching the player limit:

Auto-scaling policy in Amazon GameLift

You can see how many available instances you have for your fleet here and request a limit increase here if necessary. In my case, I had only one available instance for my fleet (c4.large); the support team kindly increased the limit to 15.

Unity

The plugins

We will integrate the Unity client directly with a Lambda function, so we need some AWS classes and functions. In the original tutorial, Chris Blackwell recommends using the AWS Mobile SDK for Unity to achieve it. Unfortunately, the AWS Mobile SDK for Unity has been deprecated, and Amazon now recommends using the AWS SDK for .NET. Let’s see.

In previous articles, I recommended downloading the AWS packages from NuGet.org, the official package manager for .NET. Although that could work, the Amazon documentation recommends doing it here for Unity. Since Unity uses .NET Standard 2.1, download and unzip the .NET Standard 2.0 zip file (it also supports 2.1).

Then, copy the following DLL files to the Plugins folder of your Unity project: AWSSDK.Core, AWSSDK.Lambda, AWSSDK.CognitoIdentity, and AWSSDK.SecurityToken. The Lambda function call is asynchronous, so you also need the Microsoft AsyncInterfaces package.

Unity project settings

Then, we will need an extra plugin to connect to the game server: the GameLift Realtime Client SDK. Go to the GameLift page and download it. Once it is unzipped, you will see that it is a complete .NET project. Open the project with Visual Studio and compile it. Five DLL files will be generated: GameLiftRealtimeClientSdkNet45.dll, Google.Protobuf.dll, log4net.dll, SuperSocket.ClientEngine.dll, and WebSocket4Net.dll. Move those DLL files to the Plugins folder of your Unity project.

Plugins folder

Characters and animations

Since I’m not a designer or a 3D modeler, I had to look for a third party who could provide the animations I needed for my Unity project. I found out that Mixamo has a lot of fantastic free characters and animations.

I tried it by downloading the character Remy as an FBX file for Unity and moved it to my Unity project.

Character downloading

It looks good! After that, I downloaded multiple animations to give life to Remy. The animations should be downloaded without skin. In the editor, we then configure each animation to use the skin of the character we previously downloaded.

Animation downloading

Then I defined the animations flow inside a new Animator. It looks great!

Animations flow

Building the App

In this section, I will give you some hints about how I built the Unity app. Don’t worry about the code and the project hierarchy! The entire Unity project is available for download at the end of this article.

Character movements

Our character can run, walk back, and rotate. We use the function OnAnimatorMove to have total control over it.

Sending data to the game server

When one starts to build the client, the following question comes to mind:

When should I send data to the game server?

The game server needs to be aware of all actions a player performs to inform the other players, but sending too much information would saturate the game server and burn resources. Remember: don’t send data every frame; otherwise, your game server will die!

I found that data can be sent in these cases: when the player starts and ends any movement (running and walking back), any rotation (turning left and right), and any action (punching, etc.).

If you have played online, you may have experienced the famous “lag,” meaning that what you see lags behind reality. That can be caused by a slow internet connection or a high-latency game server, but it can also mean the server’s data is out of date. That is why it is important to “fix” the data occasionally by sending the server updated data. I decided to do it every second in the movement functions.
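The sending rule described above can be expressed as a small, engine-independent sketch. It is written in Python for illustration (the actual client is a Unity app), and the class name, method name, and state labels are all hypothetical:

```python
class UpdateThrottle:
    """Decide when the client should send its state to the game server."""

    def __init__(self, resync_interval=1.0):
        self.resync_interval = resync_interval  # seconds between "fix" updates
        self._last_state = None
        self._last_sent_at = None

    def should_send(self, state, now, moving):
        # Send whenever a movement, rotation, or action starts or ends.
        changed = state != self._last_state
        # While moving, also resend periodically to correct drift on the server.
        due = (
            moving
            and self._last_sent_at is not None
            and now - self._last_sent_at >= self.resync_interval
        )
        if changed or due:
            self._last_state = state
            self._last_sent_at = now
            return True
        return False


throttle = UpdateThrottle()
print(throttle.should_send("running", now=0.0, moving=True))   # True: state changed
print(throttle.should_send("running", now=0.5, moving=True))   # False: too early
print(throttle.should_send("running", now=1.0, moving=True))   # True: one second elapsed
```

Called once per frame, this keeps the traffic down to state changes plus one correction per second while the player is moving.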

In red, the player sends updated data to the server.

Player data

For the basic game I made, I used the following data:

Note that the PlayerCoord class has four components: I used this for position or rotation (quaternions).

Costs

Let’s consider the following case: you published a successful FPS game with the following characteristics:

Every game session has a limit of 30 players.
In the peak hour, your game receives approximately 40 concurrent connections from players located in the United States.
Your GameLift fleet uses c4.large Linux Spot instances, and each instance handles a single game server.

Based on the Steam statistics, we can speculate that a single game server is enough for most of the day, while a second game server is needed roughly 35% of the time, around the peak hour:

In red, the number of game servers used

The pricing sheet of Amazon GameLift indicates that a c4.large Linux Spot instance costs $0.07848 per hour, $56.51 per month. We can deduce the following calculation:

Total cost per month = $56.51 x 1.35 ≈ $76.28
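The calculation can be reproduced step by step. This sketch assumes a 30-day month and reads the 1.35 factor as one server running all day plus a second one running about 35% of the time:

```python
hourly_price = 0.07848       # c4.large Linux Spot, USD per hour (pricing sheet)
hours_per_month = 24 * 30    # assumes a 30-day month

monthly_per_instance = hourly_price * hours_per_month

# One server all day, plus a second server roughly 35% of the time.
average_servers = 1 + 0.35
fleet_cost = monthly_per_instance * average_servers

print(f"{monthly_per_instance:.2f} {fleet_cost:.2f}")  # prints 56.51 76.28
```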

GameLift is expensive, but hey, it is worth it! You save time and money compared with implementing your own solution and possibly struggling with scaling issues. If you are using GameLift for your game, you should find a way to monetize it (ads, in-app purchases, etc.).

Here are the other costs:

Unity: As always, Unity is free. The personal plan is a good fit unless you need specific advanced features for your game.
Cognito: There is no charge for Cognito Identity Pools. Awesome!
Lambda: The Lambda function is called only once when a new player joins the game. Estimating an exact number of monthly connections is complex, but let’s say 100,000. With 128 MB allocated memory, and each request duration of 500 ms, the cost would be $0.12.

Total: Our real-time multiplayer game bill would be approximately $77 per month.
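For reference, the Lambda figure can be approximated as follows. The unit prices are assumptions (typical on-demand Lambda pricing, ignoring the free tier) and are not quoted in the article:

```python
requests = 100_000
memory_gb = 128 / 1024   # 128 MB expressed in GB
duration_s = 0.5         # 500 ms per request

# Assumed unit prices (USD); check the Lambda pricing page for your region.
price_per_gb_second = 0.0000166667
price_per_million_requests = 0.20

compute = requests * duration_s * memory_gb * price_per_gb_second
request_fee = requests / 1_000_000 * price_per_million_requests
lambda_monthly = compute + request_fee

print(f"{lambda_monthly:.2f}")  # prints 0.12
```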

Final Thoughts

Building a real-time multiplayer game is relatively easy! Amazon GameLift provides all the infrastructure required for this type of application and allows game developers to focus on the visual part.

You are free to use all the resources I provided in this article. Please ping me if you use them to build your game; I would be delighted to see the result!

Resources

Here are all the resources mentioned in the article:

The Lambda function in Python
The Lambda policy attached to the Lambda role
The Cognito policy attached to the Cognito role
The GameLift Script in JavaScript
The Unity package of the game

I also used free Mixamo characters and free PolyHeaven textures.

The post Building a Real-Time Multiplayer Game With Unity3D and GameLift appeared first on ab.

Unleash the Full Potential of Your Data: Enhancing SQL with Python

Alexandre Bruffa — Tue, 18 Jul 2023 18:25:07 +0000

In this article, we will explain why using SQL with Python is an awesome combination for unleashing the full potential of your data! Read on to find out how learning and starting to use the most popular programming language can have a positive impact on your work.

You may be wondering why we write about Python on the LearnSQL.com blog. Well… they are just a very good couple. Both languages are basic tools in such fields as data science and data visualization. They complement each other perfectly and strengthen their individual capabilities.

That’s why I decided to show you how much you can gain by learning the basics of Python. I know that many people reading this article are still on the path of learning SQL and do not currently have the space to learn another skill. It’s usually best to master one skill – in this case, SQL – and then move on to the next challenge.

However, it is worth planning your development path in advance. That’s why I decided to describe how using SQL with Python will help you spread your wings and boost your career.

4 Ways Python Can Boost Your Marketing Activities

Alexandre Bruffa — Tue, 18 Jul 2023 18:19:16 +0000

In this article, we are going to see why and how you can use Python for marketing.

Most people probably think Python is only for programmers, that this is a skill reserved for a small group. This is not true; Python can be useful to everyone.

In this article, I’ll tell you why you should start using Python for marketing – even if you’ve had nothing to do with coding before.

Let’s Talk About Python

You’ve surely heard about Python, but let’s have a brief recap before we move on to how it can make marketing activities faster and easier.

Read more on LearnPython.com

The post 4 Ways Python Can Boost Your Marketing Activities appeared first on ab.

The Benefits of Learning Python for Business Professionals

Alexandre Bruffa — Tue, 18 Jul 2023 17:55:35 +0000

In this article, we discuss the benefits of learning Python for a professional.

What Is Python?

There are many benefits to learning Python programming. But what is Python? It is a high-level programming language, designed to be easy to read, write, and understand, with a focus on expressing complex concepts concisely and intuitively.

Python is also a general-purpose programming language. Developers use it for a wide range of applications and tasks, as opposed to domain-specific languages focused on specific use cases or fields. Python is used for web development, software development, machine learning, and scripting, among others.

You may not believe it, but Python is not so young! It was conceived in the late 80s by Guido Van Rossum, a Dutch programmer, who released the first version of Python in 1991. Python was originally used as a scripting language, but it is living a second life with the growth of the artificial intelligence and machine learning fields.

Read more on LearnPython.com

The post The Benefits of Learning Python for Business Professionals appeared first on ab.

Python Developer Career Path

Alexandre Bruffa — Tue, 18 Jul 2023 17:51:02 +0000

In this article, we are going to define what a Python developer is nowadays, what kind of skills they should have, and what a company should expect from them.

Maybe you’ve always loved experimenting with computers and you’d like to learn to write your own computer programs. Or maybe you’ve tried one or two other careers and you’re looking for something that’s interesting, challenging, and in demand. Or perhaps you’re working in another area of IT and you want to add Python to your toolkit. This article will explain how to make the transition to a professional Python developer.

A Brief Overview of Python

Python is a general-purpose programming language: almost all domains and applications can use Python. You can build a website with it, use it to train machine learning models, execute complex financial calculations, or write quick automation scripts with it; there is no limit.

Read more on LearnPython.com

The post Python Developer Career Path appeared first on ab.

Different Ways to Practice Python

Alexandre Bruffa — Tue, 18 Jul 2023 17:44:37 +0000

Learning Python means practicing Python. In this article, we’ll explore some of the most popular ways to practice your Python programming skills.

Learning almost any new skill requires not only gaining knowledge but experience. And this is what we acquire through practice.

This article will help anyone who has recently started learning Python or who already knows the basics of Python but cannot progress to the next level. Here are the best ways to practice Python.

What Is Python?

Python is a general-purpose programming language, which means it is used for a wide range of domains and applications, unlike domain-specific languages that are designed for a specific task or application (e.g., SQL for databases).

Read more on LearnPython.com

The post Different Ways to Practice Python appeared first on ab.

Generating AI Images With DALL-E, AWS, and Unity3D

Alexandre Bruffa — Sat, 11 Mar 2023 17:05:06 +0000

This article was initially published on my Medium Page.

In my last article, I showed how I built a talking chat thanks to ChatGPT, AWS, and Unity3D. We will reuse the main components to create a chat showing images generated by Dall-E, the OpenAI image generation system.

Here is a video of the final result:

General Architecture

This time, it will be much more straightforward. We use API Gateway to expose our endpoint, Cognito for the authentication, and Lambda to perform the OpenAI API requests:

Notes:

OpenAI stores the image created by Dall-E in its repository and returns the URL.
The Unity client app reads the image directly from the URL.

AWS Implementation

This is our new Lambda function:

Notes:

We re-use the openai layer we created in my previous article, and the same environment variable openai_api_key.
We use the image generation endpoint of OpenAI, as specified in the documentation.
The endpoint generates 1024×1024 images by default. Since we show the image inside a chat message, we can work with a lower resolution (512×512).
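The notes above might translate into a function along these lines. This is a sketch that assumes the legacy openai Python library (versions before 1.0, where openai.Image.create exists); the event shape and the injectable create_image parameter are assumptions added to keep the example testable:

```python
import os


def generate_image_url(prompt, create_image=None, size="512x512"):
    """Ask the OpenAI image endpoint for one image and return its URL."""
    if create_image is None:
        # Legacy openai (<1.0) interface, imported lazily from the Lambda layer.
        import openai

        openai.api_key = os.getenv("openai_api_key")
        create_image = openai.Image.create

    response = create_image(prompt=prompt, n=1, size=size)
    return response["data"][0]["url"]


def lambda_handler(event, context):
    # Hypothetical event shape: {"prompt": "..."}
    return {"url": generate_image_url(event["prompt"])}
```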

Unity implementation

Compared with my previous article, we use the same application with a few changes.

First, our Lambda function only returns the image URL, so we need to change the result data class:

Then, we need to show the image generated by the OpenAI API, so we replace the message Text component with an empty Image in the message object:

Note that the Image width is 512px, the same size we specify in the Lambda function.

And we retrieve the image thanks to the GetTexture function:

Note: Once we have received the texture, we convert it into a Sprite and assign it to the Image of the new instantiated chat message.

Costs

Let’s check with the AWS Calculator what our system cost would be for a very pessimistic scenario: you love the app and perform 50 requests a day, or 1,500 requests a month.

Cognito: This project only has one MAU (monthly active user). Cost: 0.00 USD
API Gateway: With 1,500 requests to our REST API, the monthly cost is 0.00 USD.
Lambda: With 1,500 requests with an average time of 5 seconds and 1,024 MB of memory allocation, the monthly cost is 0.00 USD; excellent!
OpenAI: According to the OpenAI pricing sheet, each image generated costs 0.018 USD for a 512×512 resolution. Then, for 1,500 images generated monthly, we have a cost of 27.00 USD.

Total: The total bill for our system would be 27.00 USD monthly in this pessimistic scenario. That is expensive, but image generation is a complex process requiring significant resources, so it is understandable.

Closing Thoughts

In this article, we saw how to build an entire cloud architecture on AWS and how easy it is to integrate the OpenAI API with a Lambda function. We also evaluated the cost of the whole system thanks to the AWS Calculator.

All the code in this article has been tested using Unity 2021.3.3 and Visual Studio Community 2022 for Mac. The mobile device I used to run the Unity app is a Galaxy Tab A7 Lite with Android 11.

All ids and tokens shown in this article are fake or expired; if you try to use them, you will not be able to establish any connections.

You can download the Unity package of the client app specially designed for this article.

A special thanks to Gianca Chavest for designing the fantastic illustration.

The post Generating AI Images With DALL-E, AWS, and Unity3D appeared first on ab.

I made ChatGPT talk using Unity3D and AWS

Alexandre Bruffa — Sat, 11 Mar 2023 06:38:20 +0000

This article was initially published on my Medium Page.

Over the last few weeks, the whole Internet has been turned upside down. A new player arrived and shook up the AI game: ChatGPT. If you have tried it, you have probably noticed that ChatGPT is incredible: it can give you a detailed answer to almost every question, create poems or jokes, and help developers to program, among other things.

After playing with the platform for a while, I remembered that years ago, I built a talking chatbot for a client. What about making ChatGPT talk? Is it even possible? Let’s see.

Spoiler alert: I made it! Here is a video showing the final result:

ChatGPT

The ChatGPT interface is fantastic, but even better, OpenAI has an API and an official library for Python, a gold mine for developers. Once logged in, we can see the following: OpenAI gives us two months of free trial usage with an $18 credit. Thanks, dudes, that’s cool.

ChatGPT account

The OpenAI console is straightforward; the only thing you have to do is generate a new API key to allow any external program to connect:

API key creation in the OpenAI console

General Architecture

Here is the general architecture of the project:

Notes:

Do you remember my article about the hotel platform? We will reuse the main components for authentication: we will connect a Unity app to AWS thanks to a login system with Cognito and API Gateway.
We will create a user pool and a new user with a name and a password in Cognito.
We will create an endpoint in API Gateway and an associated Authorizer so that only the users of the user pool can consume the API.
The Lambda function will receive the text from Unity and call the OpenAI API.
Once the OpenAI API has answered, we call Polly, the text-to-speech service of AWS, which will convert the answer into a voice stream.
We keep the audio file in an S3 bucket and generate a pre-signed URL to restrict access to the file.

AWS implementation

S3

As in my previous article, we create a private bucket:

Lambda

First, we create a Lambda layer with the OpenAI library. Do you remember my previous article about making a homemade CCTV? I explain there in detail how to create a Lambda layer from a local environment, so we will follow the same method with the OpenAI library:

Now, we create our Lambda function:

Don’t forget to add the openai Layer to the function:

Inside the Lambda function, we define a new environment variable called openai_api_key with the OpenAI API key value.

Inside the function’s role, we create 2 inline policies, one for Polly, and the other for S3.

Inline policy for Polly

Inline policy for S3

And here is the function:

Notes:

We store the OpenAI API key in a Lambda environment variable called openai_api_key, and we call it in Lambda thanks to the os.getenv function.
We parse the Lambda function’s entry parameters and retrieve the message sent from Unity.
We call the Create completion function of the OpenAI API with the message sent from Unity. We extract the answer, as specified in the OpenAI documentation, and trim it with the strip function to avoid spaces or line breaks.
When we call the Create completion function, we concatenate the message with the sentence “Please give me a short answer.” to be sure that the response given by ChatGPT will not be too elaborate.
We call the synthesize_speech function of Polly with boto3, passing the answer as a parameter. We choose the ogg format, the best choice for working with Unity, as I demonstrated in this previous article.
I chose Aria, a friendly New Zealand English voice of Polly, but it’s up to you to choose your favorite one!
We keep the audio stream locally as a file thanks to the open and write functions, and we upload it to an S3 bucket thanks to the upload_file function of the boto3 library. After finishing, we remove the local file thanks to the os.remove function.
We generate a pre-signed URL with a lifetime of 1 minute thanks to the generate_presigned_url function of the boto3 library, so only the user using the Unity app will be able to access the audio file.
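The notes above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the exact function of this article: the helper names (extract_message, build_prompt, speech_to_presigned_url, handle), the "message" key in the request body, and the shape of the JSON response are all hypothetical. The OpenAI call is passed in as a plain callable named complete, and the Polly and S3 clients are passed in as arguments (in Lambda they would typically come from boto3.client("polly") and boto3.client("s3")), so the sketch can be tried locally with stand-ins.

```python
import json
import os
import tempfile
import uuid

# Sentence appended to every question so the answer stays short.
PROMPT_SUFFIX = " Please give me a short answer."


def extract_message(event):
    """Return the message sent by the client.

    Assumption: the body is JSON with a "message" key, either as a string
    (proxy integration) or already parsed into a dict.
    """
    body = event.get("body", event)
    if isinstance(body, str):
        body = json.loads(body)
    return body["message"].strip()


def build_prompt(message):
    """Concatenate the user's message with the 'short answer' request."""
    return message + PROMPT_SUFFIX


def speech_to_presigned_url(text, polly, s3, bucket, expires_in=60):
    """Convert text to ogg audio, upload it, and return a short-lived URL."""
    speech = polly.synthesize_speech(
        Text=text,
        OutputFormat="ogg_vorbis",
        VoiceId="Aria",
        Engine="neural",  # assumption: Aria is served by the neural engine
    )
    key = f"{uuid.uuid4()}.ogg"  # hypothetical naming scheme
    local_path = os.path.join(tempfile.gettempdir(), key)
    with open(local_path, "wb") as audio_file:
        audio_file.write(speech["AudioStream"].read())
    try:
        s3.upload_file(local_path, bucket, key)
    finally:
        os.remove(local_path)  # always clean up the local copy
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds; 60 = 1 minute
    )


def handle(event, complete, polly, s3, bucket):
    """Glue the steps together; `complete` wraps the OpenAI call."""
    message = extract_message(event)
    answer = complete(build_prompt(message)).strip()
    url = speech_to_presigned_url(answer, polly, s3, bucket)
    # Assumed response shape: the text answer plus the audio URL.
    return {"statusCode": 200, "body": json.dumps({"answer": answer, "url": url})}
```

Passing the clients and the OpenAI call as arguments is only a convenience for this sketch: it keeps the flow readable and lets you run it without AWS credentials or an API key. In a real handler, the API key would be read with os.getenv("openai_api_key"), as described above.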

Cognito

As in the hotel platform article, we create a new user pool in Cognito:

In the Pool, we create a new user with a name and a password:

API Gateway

In API Gateway, we create a new REST API with a POST method that points to our Lambda function, and we deploy it:

And we create an authorizer to allow access to the endpoint only for the Cognito users from the Pool we have created:

Unity3D Implementation

The Audio Component

Our app will be able to talk, so we need an AudioSource component in our project!

AudioSource component

Note: We leave the AudioClip parameter empty; we will fill it with the audio file from S3.

The UI components

I usually say little about how I build the UI of my Unity apps because I’m neither a designer nor a front-end developer. Still, in this case, I found it interesting to explain how I built the client app, mainly because of the chat’s complex layout.

Here is how I built the client app:

Client application layout

Notes:

We use a Canvas with a ScrollView (without scrollbars) to show the messages.
We use a vertical Content Size Fitter to resize the content of the ScrollView automatically, and a Vertical Layout Group to place the messages vertically.
We use a combination of horizontal and vertical Content Size Fitter and Layout Groups to resize the box containing the message.
We use a sliced image for the box, so all the messages will always have the same rounded corners, no matter the text size.

Sliced image in the Sprite Editor

The code

Okay, so we have a functional chat in Unity. Let’s connect it with the backend!

First of all, we log in to Cognito when the application starts, and we store the ID token returned by Cognito in a PlayerPrefs parameter:

Please refer to my previous article for an extensive explanation of the above code.

Then, we write the functions to show and hide the user’s device keyboard:

Notes:

Unity works with the native keyboard of the device where the application is running. That means the keyboard will look different depending on whether you run it on iOS or Android.
We use the class TouchScreenKeyboard as specified in the Unity documentation and the related function Open.

Then, here is the most exciting part: we call our endpoint, and we pass the message written as a parameter:

Well, our endpoint returned a URL of the audio file, so we use it now to retrieve the file and play it:

Notes:

We use the function GetAudioClip of UnityWebRequestMultimedia to retrieve the audio stream in ogg format.
We assign the audio stream to the clip parameter of our AudioSource object.
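If you want to check the endpoint without opening Unity, the same exchange can be reproduced from a small Python script. This is only a sketch for testing purposes, not the Unity client: the function names are hypothetical, the URL is a placeholder, and it assumes that the authorizer reads the Cognito ID token from the Authorization header and that the body is JSON with a "message" key.

```python
import json
import urllib.request


def build_chat_request(endpoint_url, id_token, message):
    """Build the POST request that carries the user's message.

    Assumptions: the Cognito authorizer reads the ID token from the
    "Authorization" header, and the Lambda expects a JSON "message" key.
    """
    payload = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": id_token,
        },
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values: replace them with your own endpoint and token.
    request = build_chat_request(
        "https://example.invalid/prod/chat", "my-id-token", "Hello!"
    )
    print(request.get_method(), request.full_url)
```

Calling send_chat_request(request) performs the real call. Remember that the pre-signed URL it returns expires after 1 minute, so the audio file has to be downloaded right away.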

And now, we can add the message to the chat:

Notes:

We instantiate the user message object or the friend message object according to the needs.
We use the function ForceRebuildLayoutImmediate to refresh the ScrollView content and avoid graphical bugs.
We set the verticalNormalizedPosition parameter of the ScrollView to 0, so the scroll position is at the bottom, and we can see the last messages.

Costs

Let’s check with the AWS Calculator what our system would cost in a very pessimistic scenario: you love the app, and you perform 100 requests daily, that is, 3,000 requests a month.

Cognito: We only have one MAU (monthly active user) for this project. Cost: 0.00 USD
API Gateway: With 3,000 requests to our REST API, the monthly cost is 0.01 USD.
S3: Suppose that 50 KB could be the average size of an audio file; we would have 150 MB stored each month. Additionally, we would have 3,000 put requests and 3,000 get requests, leading to a monthly cost of 0.02 USD.
Lambda: With 3,000 requests with an average duration of 3 seconds and a memory allocation of 1,024 MB, the monthly cost is 0.00 USD; excellent!
Polly: Polly is undoubtedly the most expensive AWS service here. Let’s suppose ChatGPT answers have an average of 100 characters; the monthly bill will be 1.20 USD.
OpenAI: Based on the OpenAI tokenizer tool, suppose that every question we ask ChatGPT represents 15 tokens, so we use 45,000 tokens monthly. According to the OpenAI pricing, this gives us a total of 0.9 USD monthly.

Total: The total bill for our system would be 2.13 USD monthly. It’s totally affordable, taking into account that this is a very pessimistic scenario.
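To double-check the arithmetic, here is a small Python sketch that rebuilds the monthly bill from the figures above. The per-service costs are the ones given in this article; the unit prices printed at the end are only implied by those figures (cost divided by usage) and are not quoted from the official pricing pages, which may have changed.

```python
REQUESTS_PER_DAY = 100
DAYS_PER_MONTH = 30
requests_per_month = REQUESTS_PER_DAY * DAYS_PER_MONTH  # 3,000 requests

# Usage assumed in the scenario above.
audio_storage_mb = requests_per_month * 50 / 1000  # 50 KB per audio file
polly_characters = requests_per_month * 100        # 100 characters per answer
openai_tokens = requests_per_month * 15            # 15 tokens per question

# Monthly costs in USD, as given in the list above.
monthly_costs_usd = {
    "Cognito": 0.00,
    "API Gateway": 0.01,
    "S3": 0.02,
    "Lambda": 0.00,
    "Polly": 1.20,
    "OpenAI": 0.90,
}
total_usd = round(sum(monthly_costs_usd.values()), 2)

# Unit prices implied by the figures above (not official quotes).
polly_usd_per_million_chars = monthly_costs_usd["Polly"] / polly_characters * 1_000_000
openai_usd_per_thousand_tokens = monthly_costs_usd["OpenAI"] / openai_tokens * 1_000

print(f"Requests per month: {requests_per_month}")
print(f"Audio stored per month: {audio_storage_mb:.0f} MB")
print(f"Implied Polly price: {polly_usd_per_million_chars:.2f} USD per million characters")
print(f"Implied OpenAI price: {openai_usd_per_thousand_tokens:.2f} USD per 1,000 tokens")
print(f"Total: {total_usd:.2f} USD per month")
```

Changing REQUESTS_PER_DAY alone will not rescale the costs, since they are fixed amounts taken from the list; to explore other scenarios, multiply the implied unit prices by your own usage.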

Closing Thoughts

In this article, we saw how to build an entire cloud architecture on AWS and how easy it is to integrate the OpenAI API with a Lambda function. We also had the opportunity to discover Polly, the text-to-speech service of AWS. Furthermore, we evaluated the cost of the entire system thanks to the AWS Calculator.

All the code in this article was tested using Unity 2021.3.3 and Visual Studio Community 2022 for Mac. The mobile device I used to run the Unity app is a Galaxy Tab A7 Lite with Android 11.

All ids and tokens shown in this article are fake or expired; if you try to use them, you will not be able to establish any connections.

You can download the Unity package of the client app specially designed for this article.

A special thanks to Gianca Chavest for designing the amazing illustration.

The post I made ChatGPT talk using Unity3D and AWS appeared first on ab.