speech2text

Introduction

The speech2text container provides automatic speech-to-text transcription using Google services. A Google API key must be available in the folder 00_custom/api_keys in the main project directory. Navigate to localhost:8000 to start the GUI and open the WebSocket connection to Google.

Before starting the software framework, adjust the settings in the file settings_stt.conf, for example:

STT_ONLY_CORRECTED: Set to false to use Google speech-to-text data as-is; set to true to validate each received text chunk in the revision interface.
STT_MODEL: Set to the transcription model to use.
LANGUAGE_S2T: Set to one of the supported languages.
MIN_SPEAKERS: Set to the expected minimum number of speakers; used for speaker diarization.
MAX_SPEAKERS: Set to the expected maximum number of speakers; used for speaker diarization.
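
The settings above can be sanity-checked before the container starts. The following is a minimal sketch, not part of the project: the helper names parseSettings and validateSettings are hypothetical, and the "KEY: value" (or "KEY=value") line format is an assumption, since the actual syntax of settings_stt.conf is not documented here. The values in the usage example are placeholders.

```javascript
// Hypothetical helper: parse "KEY: value" or "KEY=value" lines into an object.
// The real syntax of settings_stt.conf may differ; this format is an assumption.
function parseSettings(text) {
  const settings = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and lines treated here as comments.
    if (line === "" || line.startsWith("#")) continue;
    const sep = line.search(/[:=]/);
    if (sep === -1) continue;
    settings[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return settings;
}

// Hypothetical helper: return a list of human-readable problems (empty if none).
function validateSettings(settings) {
  const errors = [];
  if (!["true", "false"].includes(settings.STT_ONLY_CORRECTED)) {
    errors.push("STT_ONLY_CORRECTED must be true or false");
  }
  const min = Number(settings.MIN_SPEAKERS);
  const max = Number(settings.MAX_SPEAKERS);
  if (!Number.isInteger(min) || min < 1) {
    errors.push("MIN_SPEAKERS must be a positive integer");
  }
  if (!Number.isInteger(max) || max < min) {
    errors.push("MAX_SPEAKERS must be an integer >= MIN_SPEAKERS");
  }
  return errors;
}

// Usage with placeholder values (not recommended defaults).
const example = parseSettings(
  "STT_ONLY_CORRECTED: false\nMIN_SPEAKERS: 1\nMAX_SPEAKERS: 3\n"
);
console.log(validateSettings(example)); // an empty array means no problems were found
```

Returning a list of problems instead of throwing on the first one lets all misconfigured values be reported in a single pass.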

Dashboard

Revision Interface

Technical Framework

The backend is written in JavaScript.

The frontend is built with JavaScript, HTML, and CSS.

Main Contributor

Jan Paul Hölzl