The Multilingual Video Chat accelerator adds live captioning, translation, and text-to-speech to video calls in OutSystems Applications.
The accelerator provides a MultilingualVideoChat_CW module that contains three main blocks:
CallStart. CallStart leverages Amazon Kinesis Video Streams to transport video and audio between participants. Use the CallStart block to start a video call.
CallJoin. CallJoin also leverages Amazon Kinesis Video Streams to transport video and audio between participants. Use CallJoin to join an existing video call.
TranscriptionPanel. TranscriptionPanel uses Amazon Transcribe, Amazon Translate, and Amazon Polly to transcribe the video call audio, translate the transcription, and synthesize the translation into speech.
To start using this component, follow these steps:
Configure the Multilingual Video Chat in Service Studio, and add it to your application.
Configure authentication through the Site Properties in Service Center so that the accelerator can connect to the Amazon services.
Install the Multilingual Video Chat from the Forge in your environment.
Add the MultilingualVideoChat_CW and the MultilingualVideoChat_CS as dependencies to your app.
To add the CallStart block to your application:
1. Start by creating a new screen, adding two columns to your screen, and adding the blocks CallStart and TranscriptionPanel.
2. The TranscriptionPanel will prompt for the GUID of the video call. This GUID gets automatically generated by the CallStart block, and you can fetch it on the VideoCall event.
3. Store it in a variable and assign this variable to the TranscriptionPanel input parameter.
4. Fill in the remaining parameters for the Transcription Panel:
The UserTypeId parameter identifies whether the video call participant is the call creator. For this screen, set it to Entities.UserType.Caller.
The ParticipantHasJoined flag should be set to True when the video call peer joins the call. The CallStart block signals this with the ParticipantJoined event.
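The wiring described in the steps above can be modeled outside OutSystems as a small state holder. The following Python sketch is only an illustration of the data flow: the class name, method names, and the placeholder GUID are hypothetical, and in a real app the same logic lives in screen variables and event handlers built in Service Studio.

```python
from dataclasses import dataclass


@dataclass
class CallStartScreen:
    """Hypothetical model of the local variables on the CallStart screen."""
    call_guid: str = ""                   # bound to the TranscriptionPanel GUID input
    user_type: str = "Caller"             # stands in for Entities.UserType.Caller
    participant_has_joined: bool = False  # bound to ParticipantHasJoined

    def on_video_call(self, guid: str) -> None:
        # Handler for the CallStart VideoCall event, which emits the call GUID.
        self.call_guid = guid

    def on_participant_joined(self) -> None:
        # Handler for the CallStart ParticipantJoined event.
        self.participant_has_joined = True

    def transcription_can_start(self) -> bool:
        # The panel needs the GUID, and transcription starts once the peer has joined.
        return bool(self.call_guid) and self.participant_has_joined


screen = CallStartScreen()
screen.on_video_call("example-guid")  # placeholder value, not a real GUID
before_join = screen.transcription_can_start()
screen.on_participant_joined()
after_join = screen.transcription_can_start()
```

In this model, `before_join` is false and `after_join` is true, mirroring how the TranscriptionPanel waits for ParticipantHasJoined before transcribing.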
The procedure for creating the page to join a video call is very similar to the CallStart procedure:
1. Start by creating a new screen, adding two columns to your screen, and adding the blocks CallJoin and TranscriptionPanel.
2. Both blocks will prompt for the GUID of the call. This GUID must match the one generated in the CallStart screen. Add a page input parameter to receive this value and assign this value to the input parameters.
3. On the TranscriptionPanel block set the remaining parameters:
UserTypeId parameter to Entities.UserType.Participant.
Finally, the ParticipantHasJoined flag should be set to True when the video call peer joins the call. On this screen, the CallJoin block signals this with its ParticipantJoined event.
To configure the accelerator to access Amazon services, you need the following AWS authentication information:
AWS access key ID
Secret access key of your AWS access key
AWS Region of the service endpoint to which you want to connect. To reduce latency, choose a region close to your application server. See the API documentation for a list of region names.
To use the above information and authenticate to AWS services:
In Service Center, on the Factory tab, search for the Video Call Validator app and open it.
Inside the app, open the MultilingualVideoChat_Lib (service) module.
Go to the Site Properties tab and add the Site Properties for:
Amazon_AccessKey
Amazon_SecretKey
Amazon_Region
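Before saving the three Site Properties, it can help to sanity-check the values. The sketch below is a hedged illustration only: the function name and the region pattern are assumptions (the pattern matches common names such as `eu-west-1` but is not an authoritative list), and the check does not contact AWS or prove that the keys are valid.

```python
import re

REQUIRED_PROPERTIES = ("Amazon_AccessKey", "Amazon_SecretKey", "Amazon_Region")

# Assumed shape of common region names such as "eu-west-1" or "us-gov-west-1".
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def find_config_problems(site_properties: dict) -> list:
    """Return human-readable problems found in the given property values."""
    problems = []
    for name in REQUIRED_PROPERTIES:
        value = str(site_properties.get(name) or "").strip()
        if not value:
            problems.append(f"{name} is empty")
    region = str(site_properties.get("Amazon_Region") or "").strip()
    if region and not REGION_PATTERN.match(region):
        problems.append(f"Amazon_Region '{region}' does not look like a region name")
    return problems


# Placeholder values for illustration; never hard-code real credentials.
example = {
    "Amazon_AccessKey": "EXAMPLE-ACCESS-KEY",
    "Amazon_SecretKey": "",
    "Amazon_Region": "Ireland",
}
for problem in find_config_problems(example):
    print(problem)
```

A check like this only catches empty or obviously malformed values; the authoritative test is still a successful call from the accelerator to the Amazon services.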
With these steps, you will have a simple app with multilingual video call capability. For details of a working video call app, check the sample app.
This accelerator depends on the following Forge components:
Amazon Kinesis Video Streams Connector: Used for setting up Kinesis Video Streams signaling channels and for creating the pre-signed URLs that each client uses to start the video call.
Amazon Translate: Used for translating the video call transcription with Amazon Translate.
AWS Polly Connector: Responsible for integrating with the Amazon Polly service for text-to-speech capabilities.
AWS Transcribe Connector: Used for generating the pre-signed URLs that Amazon Transcribe needs for audio stream transcription.
This accelerator consists of 3 modules:
MultilingualVideoChat_CW: Front-end module comprised of the blocks for starting, joining, and transcribing video calls
MultilingualVideoChat_Lib: Services module that manages configuration parameters (e.g. AWS authentication keys)
MultilingualVideoChat_CS: Services module containing entities and wrappers for interactions with AWS connectors
Block used to join an existing video call
GUID (Text): GUID that identifies the video call to join
CallStarted: The participant has clicked the button for joining the call
CallEnded: The participant has clicked the button to exit a call
Error: Dispatched whenever an error occurs while starting or ending a video call
ParticipantJoined: The video call peers are connected
ParticipantLeft: The video call peer leaves the call
VideoPlayerInfo: Event that emits the id of the video call HTML elements
Block used to start a new video call
ParticipantJoined: The video call peer joined the video call
ParticipantLeft: The video call peer left the video call
VideoCall: Emits the video call GUID
Block responsible for transcribing the sound, translating the text, and synthesizing voice on top of the video call
UserTypeId (UserType Record): Identifier of the type of user currently viewing the video call (Caller or Participant)
ParticipantHasJoined (Boolean): Flag indicating if the video call peer has already joined the call. If True, the video call transcription starts
VocabularyName: Name of the custom vocabulary used for video call transcription. Check the [AWS documentation](https://docs.aws.amazon.com/transcribe/latest/dg/custom-vocabulary.html) for more details
Polly: The participant has activated the video call text-to-speech capabilities using Amazon Polly
Contains logs of all video calls handled by the accelerator.
Id (Integer): Identifier of the call record
GUID (Text): GUID associated with the video call
Caller (Text): Name of the user that started the conversation
Receiver (Text): Name of the user participating in the conversation
CallStatusId (CallStatus identifier): Current status of the video call
StartedOn (Date time): Video call starting date
EndedOn (Date time): Video call ending date
CallerLanguageCode (Language identifier): Identifier of the language set by the video call creator
ReceiverLanguageCode (Language identifier): Identifier of the language defined by the video call receiver
Stores the list of messages transcribed in a conversation.
Id (Integer): Identifier of the message record
Message (Text): Transcribed content of the message
MessageTranslation (Text): Translation of the message text content
SenderName (Text): Name of the user associated with the message
VideoCallId (VideoCall identifier): Identifier of the video call the message belongs to
SentAt (Date time): Message sending date
UserTypeId (UserType identifier): Indicates if the message got transcribed from the call creator or participant
Is_Active (Boolean): Flag indicating if the record is active
The static entity with all the possible statuses of a video call. The possible values are:
Waiting: The call got created, but the peer hasn't joined yet
OnGoing: Both participants started chatting in the video call
Ended: The call creator left the video call
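The three statuses suggest a simple lifecycle. The sketch below models it as a small state machine; the allowed transitions are an assumption inferred from the descriptions above (including the guess that a caller may leave while still waiting), not something the accelerator documents, and the names `ALLOWED_TRANSITIONS` and `advance` are hypothetical.

```python
from enum import Enum


class CallStatus(Enum):
    WAITING = "Waiting"
    ONGOING = "OnGoing"
    ENDED = "Ended"


# Assumed transitions, inferred from the status descriptions.
ALLOWED_TRANSITIONS = {
    CallStatus.WAITING: {CallStatus.ONGOING, CallStatus.ENDED},
    CallStatus.ONGOING: {CallStatus.ENDED},
    CallStatus.ENDED: set(),
}


def advance(current: CallStatus, target: CallStatus) -> CallStatus:
    """Return the new status, or raise if the move is not in the assumed table."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target


status = CallStatus.WAITING
status = advance(status, CallStatus.ONGOING)  # the peer joined
status = advance(status, CallStatus.ENDED)    # the call creator left
```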
The static entity with the languages currently supported by this component. It contains the intersection of the languages supported by Amazon Transcribe (streaming service), Amazon Translate, and Amazon Polly.
Chinese (Simplified)
German
English (US/British/Australian)
Spanish (US)
French (French/Canadian)
Italian
Japanese
Korean
Portuguese (Brazilian)
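The idea of keeping only the languages that all three services support can be expressed as a set intersection. The sketch below is illustrative: the function name is hypothetical and the input lists are placeholders, not the services' actual language lists, which should be taken from the AWS documentation.

```python
def common_languages(*supported_lists):
    """Return the language codes present in every list, sorted for stable display."""
    if not supported_lists:
        return []
    common = set(supported_lists[0])
    for codes in supported_lists[1:]:
        common &= set(codes)
    return sorted(common)


# Placeholder inputs for illustration only; NOT the services' real language lists.
transcribe_streaming = ["en-US", "de-DE", "ja-JP", "placeholder-a"]
translate = ["de-DE", "en-US", "ja-JP", "placeholder-b"]
polly = ["ja-JP", "en-US", "de-DE", "placeholder-c"]

supported = common_languages(transcribe_streaming, translate, polly)
print(supported)
```

Only codes that appear in all three inputs survive, which is why a language supported by just one or two of the services does not show up in the static entity.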
The static entity with types of users in a video call. The existing values are:
Caller: Users that create video calls
Participant: Users joining created video calls
The diagram below illustrates the flow of the information in this accelerator.
For this accelerator to work properly, the AWS user used to authenticate against the AWS services needs the following set of permissions:
In Kinesis Video Streams
CreateSignalingChannel
DeleteSignalingChannel
DescribeSignalingChannel
GetIceServerConfig
GetSignalingChannelEndpoint
In Amazon Transcribe
StartStreamTranscriptionWebSocket
In Amazon Translate
TranslateText
In Amazon Polly
DescribeVoices
SynthesizeSpeech