Vocal-Insight is a utility that converts voice recordings into text in a specified format and lets users run Q&A on the generated text. It uses OpenAI Whisper to transcribe MP3 files and Gemini Pro to handle the Q&A. The use case we chose is turning meeting recordings into Minutes of Meeting (MOM), which can then be downloaded and summarized by the LLM. The tool can also answer questions about the meeting, which makes it a convenient meeting companion. It is multilingual as well, so it is accessible to diverse user groups.
With some modifications, you can adapt this utility for business-specific use cases.
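
The flow described above (transcribe, format as MOM, then answer questions) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the prompt wording, the function names, the `base` Whisper model size, and the `gemini-pro` model id are all assumptions, and it presumes the `openai-whisper` and `google-generativeai` packages are installed.

```python
from typing import Callable

# Hypothetical prompt templates; the real project may word these differently.
MOM_PROMPT = (
    "Rewrite the following meeting transcript as Minutes of Meeting with "
    "sections for attendees, discussion points, decisions, and action items.\n\n"
    "Transcript:\n{transcript}"
)
QA_PROMPT = (
    "Answer the question using only the transcript below. "
    "If the transcript does not contain the answer, say so.\n\n"
    "Transcript:\n{transcript}\n\nQuestion: {question}"
)


def transcribe_with_whisper(audio_path: str, model_size: str = "base") -> str:
    """Transcribe an audio file (for example an MP3) with OpenAI Whisper."""
    import whisper  # imported lazily so the prompt helpers work without it

    model = whisper.load_model(model_size)  # "base" is an assumed default
    return model.transcribe(audio_path)["text"]


def ask_gemini(prompt: str, api_key: str, model_name: str = "gemini-pro") -> str:
    """Send a prompt to Gemini and return the text of the reply."""
    import google.generativeai as genai  # imported lazily, see above

    genai.configure(api_key=api_key)
    # The model id is an assumption; use whichever Gemini model you have access to.
    return genai.GenerativeModel(model_name).generate_content(prompt).text


def build_minutes(transcript: str, llm: Callable[[str], str]) -> str:
    """Ask the LLM to reformat a raw transcript as Minutes of Meeting."""
    return llm(MOM_PROMPT.format(transcript=transcript))


def answer_question(transcript: str, question: str, llm: Callable[[str], str]) -> str:
    """Ask the LLM a question, with the transcript supplied as context."""
    return llm(QA_PROMPT.format(transcript=transcript, question=question))
```

To wire it together, you would call `transcribe_with_whisper("meeting.mp3")` once, then pass the transcript to `build_minutes` and `answer_question` with something like `lambda p: ask_gemini(p, api_key)` as the `llm` argument. Passing the LLM as a plain callable keeps the prompt logic independent of the provider, which is one way to make the business-specific adaptations mentioned above easier.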