The Diarization method automatically detects and separates speakers within a single audio channel, so it can handle more complex audio in which multiple speakers are mixed together. However, it is significantly slower: processing takes 2-3 times longer than transcription without diarization.
Key Features
Use Case: Suitable for mixed audio inputs where speakers are not on separate channels.
Accuracy: Variable, depending on the complexity of the audio and the number of speakers.
Performance: Slow, taking 2-3 times longer than non-diarized processing.
Recommendation: Use only when channel separation is not feasible, that is, when speakers cannot be provided on separate channels.
Transcribe with Diarization
Introduction
This section describes the various ways to use diarization with the Gowajee speech-to-text (STT) API. Diarization is the process of identifying and separating speakers within an audio input. Below are the options for configuring diarization to best suit your needs.
Methods to Enable Diarization
Enable Automatic Diarization
Set diarization to true to enable automatic detection of the number of speakers and speaker separation.
Alternatively, set diarization to true and define minSpeakers (Integer) and maxSpeakers (Integer) so that the number of speakers is detected automatically within the specified range.
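The two options above can be sketched as a small Python helper that builds the diarization fields of a request. This is an illustrative sketch only: the function name build_diarization_options is hypothetical, the validation rules (both bounds given together, minSpeakers at least 1, maxSpeakers not below minSpeakers) are assumptions rather than documented API behavior, and the way these fields are attached to an actual Gowajee request (endpoint, authentication, audio upload) is not covered here. Only the field names diarization, minSpeakers, and maxSpeakers come from the text above.

```python
import json
from typing import Optional


def build_diarization_options(
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> dict:
    """Build the diarization fields of a request (hypothetical helper).

    With no arguments, only automatic diarization is enabled. With both
    bounds, the speaker count is constrained to the given range.
    """
    options = {"diarization": True}

    # Assumption: the bounds are only meaningful as a pair.
    if (min_speakers is None) != (max_speakers is None):
        raise ValueError("minSpeakers and maxSpeakers must be given together")

    if min_speakers is not None and max_speakers is not None:
        # Assumption: a valid range has 1 <= minSpeakers <= maxSpeakers.
        if min_speakers < 1 or max_speakers < min_speakers:
            raise ValueError("expected 1 <= minSpeakers <= maxSpeakers")
        options["minSpeakers"] = min_speakers
        options["maxSpeakers"] = max_speakers

    return options


if __name__ == "__main__":
    # Automatic detection of the number of speakers.
    print(json.dumps(build_diarization_options()))
    # Detection constrained to between 2 and 4 speakers.
    print(json.dumps(build_diarization_options(2, 4)))
```

Keeping the range check on the client side is a design choice for this sketch: it surfaces an inverted or incomplete range before a slow diarization request is submitted.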