About Xebia_Vocal Insight
Vocal Insight is a utility that converts voice files into text in a specified format and lets users run Q&A on the generated text. It uses OpenAI Whisper to transcribe MP3 files into text and Gemini Pro to handle the Q&A. The use case chosen here is turning meeting recordings into a comprehensive Minutes of Meeting (MOM) document, which can then be downloaded and summarized by the LLM. The tool can also answer questions about the meeting, which makes it a convenient meeting companion. It is multilingual as well, so it is accessible to diverse user groups.
With some modifications, you can adapt this utility for business-specific use cases.
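The flow described above (transcribe, build the MOM, then answer a question) can be sketched in a few lines of Python. This is only an illustration of the data flow, not the OutSystems implementation: `run_vocal_insight`, `transcribe`, and `generate` are hypothetical names, and the two callables stand in for whatever wraps the Whisper and Gemini Pro connectors.

```python
from typing import Callable, Tuple

# Hypothetical aliases: a function that turns an audio path into text,
# and a function that turns a prompt into a model reply.
Transcriber = Callable[[str], str]
Generator = Callable[[str], str]


def run_vocal_insight(
    audio_path: str,
    question: str,
    transcribe: Transcriber,
    generate: Generator,
) -> Tuple[str, str]:
    """Return (mom, answer) for one audio file and one user question."""
    # Step 1: speech to text (Whisper in the demo application).
    transcript = transcribe(audio_path)
    # Step 2: ask the LLM for the minutes (Gemini Pro in the demo application).
    mom = generate(f"Write minutes of the meeting for this transcript:\n{transcript}")
    # Step 3: answer the question using the minutes as context.
    answer = generate(f"Using these minutes:\n{mom}\n\nAnswer the question: {question}")
    return mom, answer
```

Passing the two services in as plain functions keeps the sketch independent of any particular SDK, so either model could be swapped when adapting the utility.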
Prerequisites
Follow OpenAI's step-by-step documentation to get an API key for the Whisper model.
Click the link to proceed: OpenAI
You also need to download the Whisper connector and the Gemini Pro connector to run Vocal Insight.
Configuring the OutSystems Demo Application:
Add the API key as a Site Property of the demo application so that it can call the service.
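Outside OutSystems, the closest equivalent of a Site Property is a value read from configuration at startup. The sketch below assumes the key is kept in an environment variable; `load_api_key` is a hypothetical helper, and the variable name is whatever you choose to store the key under.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a non-empty API key from the environment, or raise a clear error."""
    # `env` can be supplied for testing; by default the real environment is used.
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Missing API key: set {name} before starting the app")
    return key
```

Failing at startup when the key is absent is usually easier to diagnose than a rejected request later in the flow.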
About the Demo Application
Step 1: Access the Demo Page
1. Select Audio File: Use the "Select File" option and the upload icon to upload audio.
2. Dropdown Box: Uploaded audio files appear in the dropdown.
3. Delete Icon: Click the delete icon to remove the selected audio file from the dropdown list.
4. Clear Chat: Use the "Clear Chat" button to reset the "Answer" section.
This setup lets you upload, select, delete, and interact with audio files.
Step 2: Minutes of Meeting (MOM)
After the audio is uploaded, the Whisper translator utility processes it and you receive the Minutes of Meeting (MOM) for that audio file. Enter your question in the "User Question" box on the left side of the page, where the prompt says "How can we help you?" For example, type "What are they discussing?" and click the send icon to submit your query.
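One way to get minutes in a fixed format is to name the required sections in the prompt sent to the LLM. The sketch below is an assumption about how such a prompt might look: the text does not give the actual MOM layout, so the section names and the `build_mom_prompt` helper are hypothetical.

```python
from typing import Sequence

# Hypothetical section names; the real MOM layout is not specified in the text.
DEFAULT_SECTIONS = ("Attendees", "Agenda", "Decisions", "Action Items")


def build_mom_prompt(transcript: str, sections: Sequence[str] = DEFAULT_SECTIONS) -> str:
    """Build an LLM prompt that asks for minutes with fixed section headings."""
    cleaned = transcript.strip()
    if not cleaned:
        raise ValueError("Transcript is empty; nothing to summarize")
    headings = "\n".join(f"- {name}" for name in sections)
    return (
        "Write the Minutes of Meeting (MOM) for the transcript below.\n"
        "Use exactly these section headings, in this order:\n"
        f"{headings}\n\n"
        f"Transcript:\n{cleaned}"
    )
```

Changing the `sections` argument is the simplest way to adapt the output to a business-specific template.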
Step 3: Generate the answer
After submitting your query, Gemini Pro will process it and display the response in the "Answer" section, which also shows the chat history. To clear all chats, click the "Clear Chat" button, or delete a specific response using the delete icon next to it.
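The chat behaviour described here (a running history, a "Clear Chat" button, and a per-response delete icon) maps onto a small list-backed structure. The following is a minimal sketch under that assumption; `ChatEntry` and `ChatHistory` are hypothetical names, not part of the demo application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChatEntry:
    question: str
    answer: str


@dataclass
class ChatHistory:
    entries: List[ChatEntry] = field(default_factory=list)

    def add(self, question: str, answer: str) -> int:
        """Append a Q&A pair and return its position in the history."""
        self.entries.append(ChatEntry(question, answer))
        return len(self.entries) - 1

    def delete(self, index: int) -> None:
        """Remove one response, like the delete icon next to it."""
        if not 0 <= index < len(self.entries):
            raise IndexError(f"No chat entry at position {index}")
        del self.entries[index]

    def clear(self) -> None:
        """Remove every entry, like the "Clear Chat" button."""
        self.entries.clear()
```

Note that positions shift after a deletion, so a real UI would more likely identify entries by a stable id than by list index.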
Note:
Consider uploading a more substantial audio file to the Q&A system. A longer recording better demonstrates what GenAI can do and lets you experience its capabilities firsthand.
Use Cases:
1. Education and Language Learning:
Providing translations for non-English lectures, webinars, and tutorials. Helping language learners understand content in their native language by translating it into English.
2. Business and Meetings:
Translating multilingual meetings or conference calls for English-speaking participants.
3. Journalism and Research:
Translating interviews, speeches, or press conferences in foreign languages into English. Assisting researchers in accessing multilingual audio materials.
4. Customer Support and Services:
Translating customer queries or feedback from various languages into English. Assisting support teams in understanding and responding to non-English audio messages.
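The translation use cases above correspond to the audio translations endpoint of the OpenAI API, which returns English text for non-English speech. The sketch below assumes a client object shaped like `openai.OpenAI()` from the OpenAI Python SDK (v1 or later); whether the Whisper connector in the demo application calls this endpoint is an assumption, and `translate_audio_to_english` is a hypothetical helper.

```python
from typing import Any


def translate_audio_to_english(client: Any, audio_path: str, model: str = "whisper-1") -> str:
    """Send an audio file to the translations endpoint and return English text.

    `client` is assumed to behave like `openai.OpenAI()`; it is passed in
    so the function can be exercised with a stub instead of a live API call.
    """
    with open(audio_path, "rb") as audio_file:
        result = client.audio.translations.create(model=model, file=audio_file)
    return result.text
```

With the real SDK installed and an API key configured, the call would look like `translate_audio_to_english(OpenAI(), "interview.mp3")`, where the file name is a placeholder.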